llmq

Name: llmq
Version: 0.0.2
Summary: High-Performance vLLM Job Queue Package
Upload time: 2025-08-02 16:10:07
Requires Python: >=3.9
License: MIT
Keywords: llm, queue, vllm, gpu, inference, rabbitmq, async
Requirements: no requirements were recorded

# llmq

High-Performance Inference Queueing

## Quick Start

```bash
# Install
pip install llmq

# Start RabbitMQ
docker run -d --name rabbitmq -p 5672:5672 rabbitmq:3

# Submit jobs
echo '{"id": "1", "prompt": "Say hello", "name": "world"}' > jobs.jsonl
llmq submit my-queue jobs.jsonl > output.jsonl

# Start a worker in another terminal (the dummy worker is for testing and needs no GPU)
llmq worker dummy my-queue
```
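
The same commands compose for larger runs. A minimal sketch, assuming a batch of generated jobs and a real vLLM worker in place of the dummy one; the model ID `facebook/opt-125m` is an illustrative placeholder, not something llmq prescribes:

```bash
# Generate a batch of jobs, one JSON object per line.
for i in $(seq 1 100); do
  printf '{"id": "%s", "prompt": "Say hello", "name": "user-%s"}\n' "$i" "$i"
done > jobs.jsonl

# Real inference worker (see Worker Types below); the model ID is a placeholder.
llmq worker run facebook/opt-125m my-queue &

# Submit the batch; results are written to stdout.
llmq submit my-queue jobs.jsonl > output.jsonl
```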

## Features

- **High-performance**: GPU-accelerated inference with vLLM
- **Scalable**: RabbitMQ-based job distribution
- **Simple**: Unix-friendly CLI with piped output (see the example after this list)
- **Async**: Non-blocking job processing
- **Flexible**: Support for multiple worker types
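
Because results arrive on stdout, they compose with standard Unix tools. A minimal sketch, assuming each output line is a JSON object with a `result` field (the field name is an assumption, not confirmed by these docs):

```bash
# Extract only the generated text from each result line.
# ".result" is an assumed field name, used here for illustration.
llmq submit my-queue jobs.jsonl | jq -r '.result' > results.txt
```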

## Worker Types

- `llmq worker run <model> <queue>` - vLLM worker for real inference
- `llmq worker dummy <queue>` - Testing worker
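
For example (the model identifier below is a placeholder assumption; any vLLM-compatible Hugging Face model ID should fit the `<model>` slot):

```bash
# Test the queue wiring without loading a model.
llmq worker dummy my-queue

# Run real inference; "facebook/opt-125m" is an illustrative model ID.
llmq worker run facebook/opt-125m my-queue
```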

## Configuration

Set via environment variables:

- `RABBITMQ_URL` - RabbitMQ connection URL
- `VLLM_GPU_MEMORY_UTILIZATION` - Fraction of GPU memory vLLM may use (0.0-1.0)
- `VLLM_QUEUE_PREFETCH` - Number of jobs each worker prefetches and processes concurrently
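
A typical worker environment might look like the sketch below; the values are illustrative assumptions, not documented defaults:

```bash
# Standard AMQP URL; guest/guest are RabbitMQ's default local credentials.
export RABBITMQ_URL="amqp://guest:guest@localhost:5672/"

# Let vLLM claim 90% of GPU memory (assumed reasonable, not a documented default).
export VLLM_GPU_MEMORY_UTILIZATION=0.9

# Prefetch 8 jobs per worker (illustrative value).
export VLLM_QUEUE_PREFETCH=8
```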

## Documentation

See the [GitHub repository](https://github.com/ipieter/llmq) for full documentation.

## License

MIT

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "llmq",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "llm, queue, vllm, gpu, inference, rabbitmq, async",
    "author": null,
    "author_email": "Pieter <pieter@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/13/1d/64aaed023de9f901541a30ee5b40046f1db608cb37de2bfb132c54a6883d/llmq-0.0.2.tar.gz",
    "platform": null,
    "description": "# llmq\n\nHigh-Performance Inference Queueing\n\n## Quick Start\n\n```bash\n# Install\npip install llmq\n\n# Start RabbitMQ\ndocker run -d --name rabbitmq -p 5672:5672 rabbitmq:3\n\n# Submit jobs\necho '{\"id\": \"1\", \"prompt\": \"Say hello\", \"name\": \"world\"}' > jobs.jsonl\nllmq submit my-queue jobs.jsonl > output.jsonl\n\n# Start worker on a GPU node (in another terminal)\nllmq worker dummy my-queue\n```\n\n## Features\n\n- **High-performance**: GPU-accelerated inference with vLLM\n- **Scalable**: RabbitMQ-based job distribution\n- **Simple**: Unix-friendly CLI with piped output\n- **Async**: Non-blocking job processing\n- **Flexible**: Support for multiple worker types\n\n## Worker Types\n\n- `llmq worker run <model> <queue>` - vLLM worker for real inference\n- `llmq worker dummy <queue>` - Testing worker\n\n## Configuration\n\nSet via environment variables:\n\n- `RABBITMQ_URL` - RabbitMQ connection\n- `VLLM_GPU_MEMORY_UTILIZATION` - GPU memory usage (0.0-1.0)\n- `VLLM_QUEUE_PREFETCH` - Concurrent jobs per worker\n\n## Documentation\n\nSee the [GitHub repository](https://github.com/ipieter/llmq) for full documentation.\n\n## License\n\nMIT\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "High-Performance vLLM Job Queue Package",
    "version": "0.0.2",
    "project_urls": {
        "Documentation": "https://github.com/ipieter/llmq#readme",
        "Homepage": "https://github.com/ipieter/llmq",
        "Issues": "https://github.com/ipieter/llmq/issues",
        "Repository": "https://github.com/ipieter/llmq"
    },
    "split_keywords": [
        "llm",
        " queue",
        " vllm",
        " gpu",
        " inference",
        " rabbitmq",
        " async"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0986076b8b1f48aafda13ca3106666a0624fe7c48b9c48a8cbc08b714a8511e8",
                "md5": "27a74a9d4e6d91516626d8f88c2256b9",
                "sha256": "e6bc85024fbd2ace037b008cbc49b46ae0860ab023dd720848f8dfc240d45a75"
            },
            "downloads": -1,
            "filename": "llmq-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "27a74a9d4e6d91516626d8f88c2256b9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 20525,
            "upload_time": "2025-08-02T16:10:06",
            "upload_time_iso_8601": "2025-08-02T16:10:06.144804Z",
            "url": "https://files.pythonhosted.org/packages/09/86/076b8b1f48aafda13ca3106666a0624fe7c48b9c48a8cbc08b714a8511e8/llmq-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "131d64aaed023de9f901541a30ee5b40046f1db608cb37de2bfb132c54a6883d",
                "md5": "365f72ed1aacf9086ceb0908da97a2e3",
                "sha256": "5d43d7466982c59ac1ef5d9d063d52b1cb4c8b5257742570e24fd0dab11e6cd5"
            },
            "downloads": -1,
            "filename": "llmq-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "365f72ed1aacf9086ceb0908da97a2e3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 30459,
            "upload_time": "2025-08-02T16:10:07",
            "upload_time_iso_8601": "2025-08-02T16:10:07.343125Z",
            "url": "https://files.pythonhosted.org/packages/13/1d/64aaed023de9f901541a30ee5b40046f1db608cb37de2bfb132c54a6883d/llmq-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-02 16:10:07",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ipieter",
    "github_project": "llmq#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "llmq"
}
        