# gptme-rag
A powerful RAG (Retrieval-Augmented Generation) system that enhances AI interactions by providing relevant context from your local files. Built primarily for [gptme](https://github.com/ErikBjare/gptme), but can be used standalone.
RAG systems improve AI responses by retrieving and incorporating relevant information from a knowledge base into the generation process. This leads to more accurate, contextual, and factual responses.
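To illustrate the retrieve-then-generate idea (a toy sketch, not this package's internals): score documents against the query, take the best matches, and prepend them to the prompt. Word-overlap scoring stands in for the real embedding-based similarity, and all names here are illustrative.

```python
# Minimal sketch of the RAG idea: retrieve relevant documents, then feed
# them to the model as context. Toy word-overlap scoring stands in for
# embedding similarity; names are illustrative only.

def score(query: str, doc: str) -> float:
    """Fraction of query words that appear in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def retrieve(query: str, docs: list[str], n_results: int = 2) -> list[str]:
    """Return the n_results highest-scoring documents."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:n_results]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved context and the question into one prompt."""
    context = "\n---\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "The indexer stores document embeddings in ChromaDB.",
    "Releases are tagged and published to PyPI.",
    "Search returns the most relevant chunks for a query.",
]
print(build_prompt("how does search find relevant chunks", docs))
```

The generation step then sends this assembled prompt to the model, grounding its answer in the retrieved text.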
<p align="center">
<a href="https://github.com/ErikBjare/gptme-rag/actions/workflows/test.yml">
<img src="https://github.com/ErikBjare/gptme-rag/actions/workflows/test.yml/badge.svg" alt="Tests" />
</a>
<a href="https://pypi.org/project/gptme-rag/">
<img src="https://img.shields.io/pypi/v/gptme-rag" alt="PyPI version" />
</a>
<a href="https://github.com/ErikBjare/gptme-rag/blob/master/LICENSE">
<img src="https://img.shields.io/github/license/ErikBjare/gptme-rag" alt="License" />
</a>
<br>
<a href="https://github.com/ErikBjare/gptme">
<img src="https://img.shields.io/badge/built%20using-gptme%20%F0%9F%A4%96-5151f5?style=flat" alt="Built using gptme" />
</a>
</p>
## Features
- 📚 Document indexing with ChromaDB
  - Fast and efficient vector storage
  - Semantic search capabilities
  - Persistent storage
- 🔍 Semantic search with embeddings
  - Relevance scoring
  - Token-aware context assembly
  - Clean output formatting
- 📄 Smart document processing
  - Streaming large file handling
  - Automatic document chunking
  - Configurable chunk size/overlap
  - Document reconstruction
- 👀 File watching and auto-indexing
  - Real-time index updates
  - Pattern-based file filtering
  - Efficient batch processing
  - Automatic persistence
- 🛠️ CLI interface for testing and development
  - Index management
  - Search functionality
  - Context assembly
  - File watching
## Quick Start
```bash
# Install (requires Python 3.10+)
pipx install gptme-rag # or: pip install gptme-rag
# Index your documents
gptme-rag index ./docs --pattern "**/*.md"
# Search
gptme-rag search "What is the architecture of the system?"
```
For development installation:
```bash
git clone https://github.com/ErikBjare/gptme-rag.git
cd gptme-rag
poetry install
```
## Usage
### Indexing Documents
```bash
# Index markdown files in a directory
gptme-rag index /path/to/documents --pattern "**/*.md"
# Index with custom persist directory
gptme-rag index /path/to/documents --persist-dir ./index
```
### Searching
```bash
# Basic search
gptme-rag search "your query here"
# Advanced search with options
gptme-rag search "your query" \
--n-results 5 \
--persist-dir ./index \
--max-tokens 4000 \
--show-context
```
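The `--max-tokens` option reflects a common assembly pattern: add documents in relevance order until the token budget is exhausted. A rough sketch of that pattern (whitespace word count stands in for real tokenization, which is not what the tool itself uses):

```python
# Hedged sketch of token-aware context assembly: greedily pack documents
# (already sorted by relevance) until a token budget is reached. Word
# count is a crude stand-in for a real tokenizer.

def assemble_context(docs: list[str], max_tokens: int) -> tuple[list[str], bool]:
    """Return the documents that fit the budget and a truncation flag."""
    included: list[str] = []
    used = 0
    for doc in docs:
        cost = len(doc.split())  # crude token estimate
        if used + cost > max_tokens:
            return included, True  # budget exhausted: remaining docs dropped
        included.append(doc)
        used += cost
    return included, False

docs = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
ctx, truncated = assemble_context(docs, max_tokens=6)
print(ctx, truncated)
```

This greedy strategy is why the search output reports both a token total and a `Truncated` flag: the assembler knows exactly where the budget cut off.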
### File Watching
The watch command monitors directories for changes and automatically updates the index:
```bash
# Watch a directory with default settings
gptme-rag watch /path/to/documents
# Watch with custom pattern and ignore rules
gptme-rag watch /path/to/documents \
--pattern "**/*.{md,py}" \
--ignore-patterns "*.tmp" "*.log" \
--persist-dir ./index
```
Features:
- 🔄 Real-time index updates
- 🎯 Pattern matching for file types
- 🚫 Configurable ignore patterns
- 🔋 Efficient batch processing
- 💾 Automatic persistence
The watcher will:
- Perform initial indexing of existing files
- Monitor for file changes (create/modify/delete/move)
- Update the index automatically
- Handle rapid changes efficiently with debouncing
- Continue running until interrupted (Ctrl+C)
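Debouncing here means coalescing a burst of change events into a single re-index. A minimal, library-free sketch of the idea (illustrative only, not the watcher's actual implementation):

```python
# Sketch of event debouncing: a burst of changes within `window` seconds
# collapses into one batch. Time is passed in explicitly to keep the
# sketch deterministic; a real watcher would use a clock and a timer.

class Debouncer:
    def __init__(self, window: float):
        self.window = window
        self.pending: set[str] = set()
        self.last_event = 0.0

    def on_change(self, path: str, now: float) -> None:
        self.pending.add(path)
        self.last_event = now

    def flush_due(self, now: float) -> set[str]:
        """Return (and clear) pending paths once the burst has gone quiet."""
        if self.pending and now - self.last_event >= self.window:
            batch, self.pending = self.pending, set()
            return batch
        return set()

d = Debouncer(window=0.5)
d.on_change("a.md", now=0.0)
d.on_change("a.md", now=0.1)   # rapid re-save of the same file: coalesced
d.on_change("b.md", now=0.2)
print(d.flush_due(now=0.3))    # burst still active: nothing flushed yet
print(d.flush_due(now=1.0))    # quiet period elapsed: one combined batch
```

The payoff is that an editor saving a file several times per second triggers one index update, not several.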
### Performance Benchmarking
The benchmark commands help measure and optimize performance:
```bash
# Benchmark document indexing
gptme-rag benchmark indexing /path/to/documents \
--pattern "**/*.md" \
--persist-dir ./benchmark_index
# Benchmark search performance
gptme-rag benchmark search /path/to/documents \
--queries "python" "documentation" "example" \
--n-results 10
# Benchmark file watching
gptme-rag benchmark watch-perf /path/to/documents \
--duration 10 \
--updates-per-second 5
```
Features:
- 📊 Comprehensive metrics
  - Operation duration
  - Memory usage
  - Throughput
  - Custom metrics per operation
- 🔬 Multiple benchmark types
  - Document indexing
  - Search operations
  - File watching
- 📈 Performance tracking
  - Memory efficiency
  - Processing speed
  - System resource usage
Example benchmark output:
```plaintext
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓
┃ Operation     ┃ Duration(s) ┃ Memory(MB) ┃ Throughput ┃ Additional Metrics ┃
┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩
│ indexing      │ 0.523       │ 15.42      │ 19.12/s    │ files: 10          │
│ search        │ 0.128       │ 5.67       │ 23.44/s    │ queries: 3         │
│ file_watching │ 5.012       │ 8.91       │ 4.99/s     │ updates: 25        │
└───────────────┴─────────────┴────────────┴────────────┴────────────────────┘
```
### Document Chunking
The indexer supports automatic document chunking for efficient processing of large files:
```bash
# Index with custom chunk settings
gptme-rag index /path/to/documents \
--chunk-size 1000 \
--chunk-overlap 200
# Search with chunk grouping
gptme-rag search "your query" \
--group-chunks \
--n-results 5
```
Features:
- 🔄 Streaming processing
  - Handles large files efficiently
  - Minimal memory usage
  - Progress reporting
- 📑 Smart chunking
  - Configurable chunk size
  - Overlapping chunks for context
  - Token-aware splitting
- 🔍 Enhanced search
  - Chunk-aware relevance
  - Result grouping by document
  - Full document reconstruction
Example Output:
```plaintext
Most Relevant Documents:
1. documentation.md#chunk2 (relevance: 0.85)
Detailed section about configuration options, including chunk size and overlap settings.
[Part of: documentation.md]
2. guide.md#chunk5 (relevance: 0.78)
Example usage showing how to process large documents efficiently.
[Part of: guide.md]
3. README.md#chunk1 (relevance: 0.72)
Overview of the chunking system and its benefits for large document processing.
[Part of: README.md]
Full Context:
Total tokens: 850
Documents included: 3 (from 3 source documents)
Truncated: False
```
The chunking system automatically:
- Splits large documents into manageable pieces
- Maintains context across chunk boundaries
- Groups related chunks in search results
- Provides document reconstruction when needed
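The overlap idea can be sketched in a few lines. This toy version splits on words, whereas the real indexer is token-aware; `chunk_words` is a hypothetical helper, not part of the package API.

```python
# Toy sketch of overlapping chunking: fixed-size word windows that share
# `overlap` words with their neighbor, so content near a boundary appears
# in both adjacent chunks. The real indexer splits by tokens, not words.

def chunk_words(text: str, chunk_size: int, overlap: int) -> list[str]:
    assert 0 <= overlap < chunk_size
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start : start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already reached the end of the text
    return chunks

text = " ".join(f"w{i}" for i in range(10))
for c in chunk_words(text, chunk_size=4, overlap=2):
    print(c)
```

Because consecutive windows share words, a sentence straddling a boundary is still searchable as a whole in at least one chunk, which is the point of the `--chunk-overlap` setting.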
## Development
### Running Tests
```bash
# Run all tests
poetry run pytest
# Run with coverage
poetry run pytest --cov=gptme_rag
```
### Project Structure
```plaintext
gptme_rag/
├── __init__.py
├── cli.py                    # CLI interface
├── indexing/                 # Document indexing
│   ├── document.py           # Document model
│   └── indexer.py            # ChromaDB integration
├── query/                    # Search functionality
│   └── context_assembler.py  # Context assembly
└── utils/                    # Utility functions
```
### Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for new functionality
5. Run tests and linting
6. Submit a pull request
### Releases
Releases are automated through GitHub Actions. The process is:
1. Update version in pyproject.toml
2. Commit the change: `git commit -am "chore: bump version to x.y.z"`
3. Create and push a tag: `git tag vx.y.z && git push origin master vx.y.z`
4. Create a GitHub release (can be done with `gh release create vx.y.z`)
5. The publish workflow will automatically:
- Run tests
- Build the package
- Publish to PyPI
## Integration with gptme
This package is designed to integrate with [gptme](https://github.com/ErikBjare/gptme) to provide AI assistants with relevant context from your local files. When used with gptme, it:
- Automatically indexes your project files
- Enhances AI responses with relevant context
- Provides semantic search across your codebase
- Maintains a persistent knowledge base
- Assembles context intelligently within token limits
To use with gptme, install both packages; gptme will automatically detect and use gptme-rag for context management.
## License
MIT License. See [LICENSE](LICENSE) for details.