# ragpackai 📦
**Portable Retrieval-Augmented Generation Library**
ragpackai is a Python library for creating, saving, loading, and querying portable RAG (Retrieval-Augmented Generation) packs. It allows you to bundle documents, embeddings, vectorstores, and configuration into a single `.rag` file that can be easily shared and deployed across different environments.
## ✨ Features
- 🚀 **Portable RAG Packs**: Bundle everything into a single `.rag` file
- 🔄 **Provider Flexibility**: Support for OpenAI, Google, Groq, Cerebras, and HuggingFace
- 🔒 **Encryption Support**: Optional AES-GCM encryption for sensitive data
- 🎯 **Runtime Overrides**: Change embedding/LLM providers without rebuilding
- 📚 **Multiple Formats**: Support for PDF, TXT, MD, and more
- 🛠️ **CLI Tools**: Command-line interface for easy pack management
- 🔧 **Lazy Loading**: Efficient dependency management with lazy imports
## 🚀 Quick Start
### Installation
```bash
# Core installation
pip install ragpackai
# With optional providers
pip install ragpackai[google] # Google Vertex AI
pip install ragpackai[groq] # Groq
pip install ragpackai[cerebras] # Cerebras
pip install ragpackai[all] # All providers
```
### Basic Usage
```python
from ragpackai import ragpackai
# Create a pack from documents
pack = ragpackai.from_files([
    "docs/manual.pdf",
    "notes.txt",
    "knowledge_base/"
])
# Save the pack
pack.save("my_knowledge.rag")
# Load and query
pack = ragpackai.load("my_knowledge.rag")
# Simple retrieval (no LLM)
results = pack.query("How do I install this?", top_k=3)
print(results)
# Question answering with LLM
answer = pack.ask("What are the main features?")
print(answer)
```
### Provider Overrides
```python
# Load with different providers
pack = ragpackai.load(
    "my_knowledge.rag",
    embedding_config={
        "provider": "google",
        "model_name": "textembedding-gecko"
    },
    llm_config={
        "provider": "groq",
        "model_name": "mixtral-8x7b-32768"
    }
)
answer = pack.ask("Explain the architecture")
```
## 🛠️ Command Line Interface
### Create a RAG Pack
```bash
# From files and directories
ragpackai create docs/ notes.txt --output knowledge.rag
# With custom settings
ragpackai create docs/ \
    --embedding-provider openai \
    --embedding-model text-embedding-3-large \
    --chunk-size 1024 \
    --encrypt-key mypassword
```
### Query and Ask
```bash
# Simple retrieval
ragpackai query knowledge.rag "How to install?"
# Question answering
ragpackai ask knowledge.rag "What are the requirements?" \
--llm-provider openai \
--llm-model gpt-4o
# With provider overrides
ragpackai ask knowledge.rag "Explain the API" \
--embedding-provider google \
--embedding-model textembedding-gecko \
--llm-provider groq \
--llm-model mixtral-8x7b-32768
```
### Pack Information
```bash
ragpackai info knowledge.rag
```
## 🏗️ Architecture
### .rag File Structure
A `.rag` file is a structured zip archive:
```
mypack.rag
├── metadata.json          # Pack metadata
├── config.json            # Default configurations
├── documents/             # Original documents
│   ├── doc1.txt
│   └── doc2.pdf
└── vectorstore/           # Chroma vectorstore
    ├── chroma.sqlite3
    └── ...
```
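Since an unencrypted `.rag` pack is an ordinary zip archive, its contents can be inspected with Python's standard `zipfile` module. A minimal sketch, assuming an unencrypted pack named `mypack.rag`:
```python
import json
import zipfile

# List every entry inside the pack (metadata.json, documents/, vectorstore/, ...)
with zipfile.ZipFile("mypack.rag") as archive:
    for name in archive.namelist():
        print(name)

    # Read the pack metadata described in the layout above
    metadata = json.loads(archive.read("metadata.json"))
    print(metadata)
```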
### Supported Providers
**Embedding Providers:**
- `openai`: text-embedding-3-small, text-embedding-3-large
- `huggingface`: all-MiniLM-L6-v2, all-mpnet-base-v2 (offline)
- `google`: textembedding-gecko
**LLM Providers:**
- `openai`: gpt-4o, gpt-4o-mini, gpt-3.5-turbo
- `google`: gemini-pro, gemini-1.5-flash
- `groq`: mixtral-8x7b-32768, llama2-70b-4096
- `cerebras`: llama3.1-8b, llama3.1-70b
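The provider identifiers above are the same strings used in `embed_model` (as `"provider:model"`) and in the `embedding_config`/`llm_config` dictionaries. A short sketch tying the two forms together (model choices here are illustrative):
```python
from ragpackai import ragpackai

# "provider:model" string form at creation time
pack = ragpackai.from_files(
    ["docs/"],
    embed_model="huggingface:all-MiniLM-L6-v2",  # offline-capable embeddings
)
pack.save("knowledge.rag")

# Dictionary form as a load-time override
pack = ragpackai.load(
    "knowledge.rag",
    llm_config={"provider": "cerebras", "model_name": "llama3.1-8b"},
)
```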
## 📖 API Reference
### ragpackai Class
#### `ragpackai.from_files(files, embed_model="openai:text-embedding-3-small", **kwargs)`
Create a RAG pack from files.
**Parameters:**
- `files`: List of file paths or directories
- `embed_model`: Embedding model in format "provider:model"
- `chunk_size`: Text chunk size (default: 512)
- `chunk_overlap`: Chunk overlap (default: 50)
- `name`: Pack name
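A minimal creation sketch using these parameters (paths and values are illustrative):
```python
from ragpackai import ragpackai

# Larger chunks with more overlap than the defaults (512 / 50)
pack = ragpackai.from_files(
    ["docs/", "notes.txt"],
    embed_model="openai:text-embedding-3-small",
    chunk_size=1024,
    chunk_overlap=100,
    name="product-docs",
)
pack.save("product-docs.rag")
```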
#### `ragpackai.load(path, embedding_config=None, llm_config=None, **kwargs)`
Load a RAG pack from file.
**Parameters:**
- `path`: Path to .rag file
- `embedding_config`: Override embedding configuration
- `llm_config`: Override LLM configuration
- `reindex_on_mismatch`: Rebuild vectorstore if dimensions mismatch
- `decrypt_key`: Decryption password
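A minimal loading sketch combining an embedding override with the safety options (file name and password are illustrative):
```python
from ragpackai import ragpackai

# Re-embed the stored documents if the new model's vector size differs
pack = ragpackai.load(
    "my_knowledge.rag",
    embedding_config={"provider": "openai", "model_name": "text-embedding-3-large"},
    reindex_on_mismatch=True,
    decrypt_key="strong-password",  # only needed for encrypted packs
)
```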
#### `pack.save(path, encrypt_key=None)`
Save pack to .rag file.
#### `pack.query(question, top_k=3)`
Retrieve relevant chunks (no LLM).
#### `pack.ask(question, top_k=4, temperature=0.0)`
Ask a question and generate an answer with the LLM.
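The difference between the two is whether an LLM is invoked on top of retrieval; a short sketch, assuming `pack` is already loaded:
```python
# Retrieval only: returns the top-matching chunks without calling an LLM
chunks = pack.query("How do I enable encryption?", top_k=5)

# Retrieval plus generation: retrieved chunks are passed to the configured LLM
answer = pack.ask("How do I enable encryption?", top_k=4, temperature=0.0)
print(answer)
```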
### Provider Wrappers
```python
# Direct provider access
from ragpackai.embeddings import OpenAI, HuggingFace, Google
from ragpackai.llms import OpenAIChat, GoogleChat, GroqChat
# Create embedding provider
embeddings = OpenAI(model_name="text-embedding-3-large")
vectors = embeddings.embed_documents(["Hello world"])
# Create LLM provider
llm = OpenAIChat(model_name="gpt-4o", temperature=0.7)
response = llm.invoke("What is AI?")
```
## 🔧 Configuration
### Environment Variables
```bash
# API Keys
export OPENAI_API_KEY="your-key"
export GOOGLE_CLOUD_PROJECT="your-project"
export GROQ_API_KEY="your-key"
export CEREBRAS_API_KEY="your-key"
# Optional
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
```
### Configuration Files
```python
# Custom embedding config
embedding_config = {
    "provider": "huggingface",
    "model_name": "all-mpnet-base-v2",
    "device": "cuda"  # Use GPU
}
# Custom LLM config
llm_config = {
    "provider": "openai",
    "model_name": "gpt-4o",
    "temperature": 0.7,
    "max_tokens": 2000
}
```
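These dictionaries plug straight into the load-time overrides shown earlier; for example:
```python
from ragpackai import ragpackai

# Apply the configs defined above as runtime overrides
pack = ragpackai.load(
    "my_knowledge.rag",
    embedding_config=embedding_config,
    llm_config=llm_config,
)
print(pack.ask("Summarize the configuration options"))
```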
## 🔒 Security
### Encryption
ragpackai supports AES-GCM encryption for sensitive data:
```python
# Save with encryption
pack.save("sensitive.rag", encrypt_key="strong-password")
# Load encrypted pack
pack = ragpackai.load("sensitive.rag", decrypt_key="strong-password")
```
### Best Practices
- Use strong passwords for encryption
- Store API keys securely in environment variables
- Validate .rag files before loading in production (see the sketch below)
- Consider network security when sharing packs
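A minimal validation sketch, assuming an unencrypted pack; it only checks that the file is a well-formed zip archive with the expected top-level entries and is not part of the ragpackai API:
```python
import zipfile

def looks_like_rag_pack(path: str) -> bool:
    """Cheap sanity check before passing a file to ragpackai.load()."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        names = set(archive.namelist())
        # Expect the layout described in the Architecture section
        return "metadata.json" in names and "config.json" in names

if not looks_like_rag_pack("untrusted.rag"):
    raise ValueError("untrusted.rag does not look like a valid RAG pack")
```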
## 🧪 Examples
See the `examples/` directory for complete examples:
- `basic_usage.py` - Simple pack creation and querying
- `provider_overrides.py` - Using different providers
- `encryption_example.py` - Working with encrypted packs
- `cli_examples.sh` - Command-line usage examples
## 🤝 Contributing
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🆘 Support
- 📖 [Documentation](https://aimldev726.github.io/ragpackai/)
- 🐛 [Issue Tracker](https://github.com/AIMLDev726/ragpackai/issues)
- 💬 [Discussions](https://github.com/AIMLDev726/ragpackai/discussions)
## 🙏 Acknowledgments
Built with:
- [LangChain](https://langchain.com/) - LLM framework
- [ChromaDB](https://www.trychroma.com/) - Vector database
- [Sentence Transformers](https://www.sbert.net/) - Embedding models