| Field | Value |
|---|---|
| Name | llmq |
| Version | 0.0.2 |
| Summary | High-Performance vLLM Job Queue Package |
| upload_time | 2025-08-02 16:10:07 |
| home_page | None |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.9 |
| license | MIT |
| keywords | llm, queue, vllm, gpu, inference, rabbitmq, async |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# llmq
High-Performance Inference Queueing
## Quick Start
```bash
# Install
pip install llmq
# Start RabbitMQ
docker run -d --name rabbitmq -p 5672:5672 rabbitmq:3
# Submit jobs
echo '{"id": "1", "prompt": "Say hello", "name": "world"}' > jobs.jsonl
llmq submit my-queue jobs.jsonl > output.jsonl
# Start a worker in another terminal (the dummy worker needs no GPU)
llmq worker dummy my-queue
```
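Each line of `jobs.jsonl` is one self-contained JSON job. A slightly larger batch, as a sketch (field names follow the example above; the second job's values are purely illustrative):

```bash
# Build a small batch: one JSON object per line
cat > jobs.jsonl <<'EOF'
{"id": "1", "prompt": "Say hello", "name": "world"}
{"id": "2", "prompt": "Say goodbye", "name": "moon"}
EOF

# Submit the batch; results stream to stdout as JSONL
llmq submit my-queue jobs.jsonl > output.jsonl
```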
## Features
- **High-performance**: GPU-accelerated inference with vLLM
- **Scalable**: RabbitMQ-based job distribution
- **Simple**: Unix-friendly CLI with piped output
- **Async**: Non-blocking job processing
- **Flexible**: Support for multiple worker types
## Worker Types
- `llmq worker run <model> <queue>` - vLLM worker for real inference
- `llmq worker dummy <queue>` - Testing worker
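For example (the model identifier below is illustrative; any model that vLLM can load should work):

```bash
# Real inference on a GPU node (example model ID)
llmq worker run Qwen/Qwen2.5-0.5B-Instruct my-queue

# Lightweight test worker; runs anywhere, no GPU required
llmq worker dummy my-queue
```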
## Configuration
Set via environment variables:
- `RABBITMQ_URL` - RabbitMQ connection
- `VLLM_GPU_MEMORY_UTILIZATION` - GPU memory usage (0.0-1.0)
- `VLLM_QUEUE_PREFETCH` - Concurrent jobs per worker
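A minimal sketch of a worker environment, assuming RabbitMQ's stock defaults (the URL and values below are illustrative, not package defaults):

```bash
# Point workers and submitters at the broker
export RABBITMQ_URL="amqp://guest:guest@localhost:5672/"

# Let vLLM claim 90% of GPU memory; pull up to 8 jobs at a time
export VLLM_GPU_MEMORY_UTILIZATION=0.9
export VLLM_QUEUE_PREFETCH=8

llmq worker run <model> my-queue
```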
## Documentation
See the [GitHub repository](https://github.com/ipieter/llmq) for full documentation.
## License
MIT
## Raw data

```json
{
  "_id": null,
  "home_page": null,
  "name": "llmq",
  "maintainer": null,
  "docs_url": null,
  "requires_python": ">=3.9",
  "maintainer_email": null,
  "keywords": "llm, queue, vllm, gpu, inference, rabbitmq, async",
  "author": null,
  "author_email": "Pieter <pieter@example.com>",
  "download_url": "https://files.pythonhosted.org/packages/13/1d/64aaed023de9f901541a30ee5b40046f1db608cb37de2bfb132c54a6883d/llmq-0.0.2.tar.gz",
  "platform": null,
  "description": "# llmq\n\nHigh-Performance Inference Queueing\n\n## Quick Start\n\n```bash\n# Install\npip install llmq\n\n# Start RabbitMQ\ndocker run -d --name rabbitmq -p 5672:5672 rabbitmq:3\n\n# Submit jobs\necho '{\"id\": \"1\", \"prompt\": \"Say hello\", \"name\": \"world\"}' > jobs.jsonl\nllmq submit my-queue jobs.jsonl > output.jsonl\n\n# Start worker on a GPU node (in another terminal)\nllmq worker dummy my-queue\n```\n\n## Features\n\n- **High-performance**: GPU-accelerated inference with vLLM\n- **Scalable**: RabbitMQ-based job distribution\n- **Simple**: Unix-friendly CLI with piped output\n- **Async**: Non-blocking job processing\n- **Flexible**: Support for multiple worker types\n\n## Worker Types\n\n- `llmq worker run <model> <queue>` - vLLM worker for real inference\n- `llmq worker dummy <queue>` - Testing worker\n\n## Configuration\n\nSet via environment variables:\n\n- `RABBITMQ_URL` - RabbitMQ connection\n- `VLLM_GPU_MEMORY_UTILIZATION` - GPU memory usage (0.0-1.0)\n- `VLLM_QUEUE_PREFETCH` - Concurrent jobs per worker\n\n## Documentation\n\nSee the [GitHub repository](https://github.com/ipieter/llmq) for full documentation.\n\n## License\n\nMIT\n",
  "bugtrack_url": null,
  "license": "MIT",
  "summary": "High-Performance vLLM Job Queue Package",
  "version": "0.0.2",
  "project_urls": {
    "Documentation": "https://github.com/ipieter/llmq#readme",
    "Homepage": "https://github.com/ipieter/llmq",
    "Issues": "https://github.com/ipieter/llmq/issues",
    "Repository": "https://github.com/ipieter/llmq"
  },
  "split_keywords": [
    "llm",
    " queue",
    " vllm",
    " gpu",
    " inference",
    " rabbitmq",
    " async"
  ],
  "urls": [
    {
      "comment_text": null,
      "digests": {
        "blake2b_256": "0986076b8b1f48aafda13ca3106666a0624fe7c48b9c48a8cbc08b714a8511e8",
        "md5": "27a74a9d4e6d91516626d8f88c2256b9",
        "sha256": "e6bc85024fbd2ace037b008cbc49b46ae0860ab023dd720848f8dfc240d45a75"
      },
      "downloads": -1,
      "filename": "llmq-0.0.2-py3-none-any.whl",
      "has_sig": false,
      "md5_digest": "27a74a9d4e6d91516626d8f88c2256b9",
      "packagetype": "bdist_wheel",
      "python_version": "py3",
      "requires_python": ">=3.9",
      "size": 20525,
      "upload_time": "2025-08-02T16:10:06",
      "upload_time_iso_8601": "2025-08-02T16:10:06.144804Z",
      "url": "https://files.pythonhosted.org/packages/09/86/076b8b1f48aafda13ca3106666a0624fe7c48b9c48a8cbc08b714a8511e8/llmq-0.0.2-py3-none-any.whl",
      "yanked": false,
      "yanked_reason": null
    },
    {
      "comment_text": null,
      "digests": {
        "blake2b_256": "131d64aaed023de9f901541a30ee5b40046f1db608cb37de2bfb132c54a6883d",
        "md5": "365f72ed1aacf9086ceb0908da97a2e3",
        "sha256": "5d43d7466982c59ac1ef5d9d063d52b1cb4c8b5257742570e24fd0dab11e6cd5"
      },
      "downloads": -1,
      "filename": "llmq-0.0.2.tar.gz",
      "has_sig": false,
      "md5_digest": "365f72ed1aacf9086ceb0908da97a2e3",
      "packagetype": "sdist",
      "python_version": "source",
      "requires_python": ">=3.9",
      "size": 30459,
      "upload_time": "2025-08-02T16:10:07",
      "upload_time_iso_8601": "2025-08-02T16:10:07.343125Z",
      "url": "https://files.pythonhosted.org/packages/13/1d/64aaed023de9f901541a30ee5b40046f1db608cb37de2bfb132c54a6883d/llmq-0.0.2.tar.gz",
      "yanked": false,
      "yanked_reason": null
    }
  ],
  "upload_time": "2025-08-02 16:10:07",
  "github": true,
  "gitlab": false,
  "bitbucket": false,
  "codeberg": false,
  "github_user": "ipieter",
  "github_project": "llmq#readme",
  "travis_ci": false,
  "coveralls": false,
  "github_actions": true,
  "lcname": "llmq"
}
```