rocketrag

- Name: rocketrag
- Version: 0.1.4
- Summary: Fast, efficient, minimal, extendible and elegant RAG system
- Author email: Aleksander Obuchowski <obuchowskialeksander@gmail.com>
- Homepage: https://github.com/TheLion-ai/RocketRAG
- Requires Python: >=3.12
- License: CC-BY-4.0
- Keywords: rag, retrieval, llm, ai, vector-database, embeddings
- Uploaded: 2025-09-01 12:36:43
# πŸš€ RocketRAG

**Fast, efficient, minimal, extendible and elegant RAG system**

RocketRAG is a high-performance Retrieval-Augmented Generation (RAG) system designed with a focus on speed, simplicity, and extensibility. Built on top of state-of-the-art libraries, it provides both CLI and web server capabilities for seamless integration into any workflow.

## 🎯 Mission

RocketRAG aims to be the **fastest and most efficient RAG library** while maintaining:
- **Minimal footprint** - Clean, lightweight codebase
- **Maximum extensibility** - Pluggable architecture for all components
- **Peak performance** - Leveraging the best-in-class libraries
- **Ease of use** - Simple CLI and API interfaces

## ⚑ Performance-First Architecture

RocketRAG is built on top of cutting-edge, performance-optimized libraries:

- **[Chonkie](https://github.com/bhavnicksm/chonkie)** - Ultra-fast semantic chunking with model2vec
- **[Kreuzberg](https://github.com/mixedbread-ai/kreuzberg)** - Lightning-fast document loading and processing
- **[llama-cpp-python](https://github.com/abetlen/llama-cpp-python)** - Optimized LLM inference with GGUF support
- **[Milvus Lite](https://github.com/milvus-io/milvus-lite)** - High-performance vector database
- **[Sentence Transformers](https://github.com/UKPLab/sentence-transformers)** - State-of-the-art embeddings

## πŸš€ Quick Start

### Installation

#### Using pip
```bash
pip install rocketrag
```

#### Using uvx (recommended for CLI usage)
```bash
# Run directly without installation
uvx rocketrag --help

# Or install globally as a persistent tool
uv tool install rocketrag
```

### Basic Usage

```python
from rocketrag import RocketRAG

rag = RocketRAG("./data")  # Path to your data (supports PDF, TXT, MD, etc.)
rag.prepare()  # Build the vector database

# Ask questions
answer, sources = rag.ask("What is the main topic of the documents?")
print(answer)
```

### CLI Usage

```bash
# Prepare documents from a directory
rocketrag prepare --data-dir ./documents

# Ask questions via CLI
rocketrag ask "What are the key findings?"

# Start web server
rocketrag server --port 8000
```

#### Using uvx (no installation required)
```bash
# Same commands work with uvx
uvx rocketrag prepare --data-dir ./documents
uvx rocketrag ask "What are the key findings?"
uvx rocketrag server --port 8000

# Run as module
uvx --from rocketrag python -m rocketrag --help
```

## πŸ—οΈ Architecture

RocketRAG follows a modular, plugin-based architecture:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Document      │    │    Chunking     │    │   Vectorization │
│   Loaders       │───▢│   (Chonkie)     │───▢│ (SentenceTransf)│
│  (Kreuzberg)    │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                        │
┌─────────────────┐    ┌─────────────────┐             │
│      LLM        │    │   Vector DB     │◀────────────┘
│ (llama-cpp-py)  │◀───│ (Milvus Lite)   │
│                 │    │                 │
└─────────────────┘    └─────────────────┘
```
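The flow above can be sketched end-to-end in a few lines of plain Python. This is a toy illustration only: fixed-size chunking and bag-of-words "embeddings" stand in for Chonkie, sentence-transformers, and Milvus Lite, and none of the names below come from rocketrag's actual API.

```python
# Toy sketch of the pipeline in the diagram, with naive stand-in components.
from collections import Counter
from math import sqrt


def chunk(text: str, size: int = 40) -> list[str]:
    # Fixed-size chunking stand-in for Chonkie's semantic chunker.
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a sentence-transformers embedding.
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return Counter(cleaned.split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(question: str, store: list[str], k: int = 2) -> list[str]:
    # Brute-force similarity search stand-in for Milvus Lite's vector index.
    q = embed(question)
    return sorted(store, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


store = chunk("Milvus Lite stores vectors. Chonkie chunks documents semantically.")
context = retrieve("how are vectors stored", store, k=1)
# An LLM (llama-cpp-python in rocketrag) would then answer using `context`.
```

The real system swaps each stand-in for the optimized library named in the diagram while keeping the same chunk → embed → store → retrieve → generate shape.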

### Core Components

- **BaseLoader**: Pluggable document loading (PDF, TXT, MD, etc.)
- **BaseChunker**: Configurable chunking strategies (semantic, recursive, etc.)
- **BaseVectorizer**: Flexible embedding models
- **BaseLLM**: Swappable language models
- **MilvusLiteDB**: High-performance vector storage and retrieval

## πŸ”§ Configuration

### Custom Components

```python
from rocketrag import RocketRAG
from rocketrag.vectors import SentenceTransformersVectorizer
from rocketrag.chonk import ChonkieChunker
from rocketrag.llm import LLamaLLM
from rocketrag.loaders import KreuzbergLoader

# Configure high-performance components
vectorizer = SentenceTransformersVectorizer(
    model_name="minishlab/potion-multilingual-128M"  # Fast multilingual model
)

chunker = ChonkieChunker(
    method="semantic",  # Semantic chunking for better context
    embedding_model="minishlab/potion-multilingual-128M",
    chunk_size=512
)

llm = LLamaLLM(
    repo_id="unsloth/gemma-3n-E2B-it-GGUF",
    filename="*Q8_0.gguf"  # Quantized for speed
)

loader = KreuzbergLoader()  # Ultra-fast document processing

rag = RocketRAG(
    vectorizer=vectorizer,
    chunker=chunker,
    llm=llm,
    loader=loader
)
```

### CLI Configuration

```bash
# Custom chunking strategy
rocketrag prepare \
  --chonker chonkie \
  --chonker-args '{"method": "semantic", "chunk_size": 512}' \
  --vectorizer-args '{"model_name": "all-MiniLM-L6-v2"}'

# Custom LLM for inference
rocketrag ask "Your question" \
  --repo-id "microsoft/DialoGPT-medium" \
  --filename "*.gguf"
```
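Flags like `--chonker-args` presumably decode a JSON string into the component's constructor keyword arguments. A minimal sketch of that pattern (illustrative only; `build` and `DummyChunker` are not rocketrag names):

```python
# Sketch: decoding a JSON CLI flag value into constructor kwargs.
import json


def build(component_cls, args_json: str):
    kwargs = json.loads(args_json)  # '{"method": "semantic"}' -> dict
    return component_cls(**kwargs)


class DummyChunker:
    def __init__(self, method: str = "recursive", chunk_size: int = 256):
        self.method = method
        self.chunk_size = chunk_size


chunker = build(DummyChunker, '{"method": "semantic", "chunk_size": 512}')
```

This keeps the CLI generic: any keyword accepted by a component's constructor can be set from the shell without a dedicated flag.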

## 🌐 Web Server

RocketRAG includes a FastAPI-based web server with OpenAI-compatible endpoints:

```bash
# Start server
rocketrag server --port 8000 --host 0.0.0.0
```

### API Endpoints

- `GET /` - Interactive web interface
- `POST /ask` - Question answering
- `POST /ask/stream` - Streaming responses
- `GET /chat` - Chat interface
- `GET /browse` - Document browser
- `GET /visualize` - Vector visualization
- `GET /health` - Health check

### Example API Usage

```python
import requests

response = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What are the main findings?"}
)

result = response.json()
print(result["answer"])
print(result["sources"])
```

## 🎨 Features

### Core Features
- ⚑ **Ultra-fast document processing** with Kreuzberg
- 🧠 **Semantic chunking** with Chonkie and model2vec
- πŸ” **High-performance vector search** with Milvus Lite
- πŸ€– **Optimized LLM inference** with llama-cpp-python
- πŸ“Š **Rich CLI interface** with progress bars and formatting
- 🌐 **Web server** with interactive UI
- πŸ”Œ **Pluggable architecture** for easy customization

### Advanced Features
- πŸ“ˆ **Vector visualization** for debugging and analysis
- πŸ“š **Document browsing** interface
- πŸ’¬ **Streaming responses** for real-time interaction
- πŸ”„ **Batch processing** for large document sets
- πŸ“ **Metadata preservation** throughout the pipeline
- 🎯 **Context-aware chunking** for better retrieval

## πŸ› οΈ Development

### Installation for Development

```bash
git clone https://github.com/TheLion-ai/RocketRAG.git
cd rocketrag
pip install -e ".[dev]"
```

### Running Tests

```bash
pytest tests/
```

### Code Quality

```bash
ruff check .
ruff format .
```

## πŸ“Š Performance

RocketRAG is designed for speed:

- **Document Loading**: 10x faster with Kreuzberg's optimized parsers
- **Chunking**: Semantic chunking with model2vec for superior context preservation
- **Vectorization**: Optimized batch processing with sentence-transformers
- **Retrieval**: Sub-millisecond vector search with Milvus Lite
- **Generation**: GGUF quantization for 4x faster inference

## 🀝 Contributing

We welcome contributions! RocketRAG's modular architecture makes it easy to:

- Add new document loaders
- Implement custom chunking strategies
- Integrate different embedding models
- Support additional LLM backends
- Enhance the web interface

## πŸ™ Acknowledgments

RocketRAG builds upon the excellent work of:
- [Chonkie](https://github.com/bhavnicksm/chonkie) for semantic chunking
- [Kreuzberg](https://github.com/mixedbread-ai/kreuzberg) for document processing
- [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) for LLM inference
- [Milvus](https://github.com/milvus-io/milvus-lite) for vector storage
- [Sentence Transformers](https://github.com/UKPLab/sentence-transformers) for embeddings


            
