<p align="center">
  <a href="https://cerevox.ai/lexa">
    <img height="120" src="https://raw.githubusercontent.com/CerevoxAI/assets/refs/heads/main/cerevox-python.png" alt="Cerevox Logo">
  </a>
</p>
<h1 align="center">Cerevox - The Data Layer for AI Agents 🧠 ⚡</h1>
<p align="center">
  <strong>Data Parsing (Lexa) • Data Search (Hippo) • Enterprise-grade • Built for AI</strong><br>
  <i>AI-powered • Highest Accuracy • Vector DB ready</i>
</p>
<p align="center">
  <a href="https://github.com/cerevoxAI/cerevox-python/actions"><img src="https://img.shields.io/github/actions/workflow/status/CerevoxAI/cerevox-python/ci.yml" alt="CI Status"></a>
  <a href="https://codecov.io/gh/CerevoxAI/cerevox-python"><img src="https://codecov.io/gh/CerevoxAI/cerevox-python/branch/main/graph/badge.svg" alt="Code Coverage"></a>
  <a href="https://github.com/cerevoxAI/cerevox-python"><img src="https://qlty.sh/badges/8be43bff-101e-4701-a522-84b27c9e0f9b/maintainability.svg" alt="Maintainability"></a>
  <a href="https://pypi.org/project/cerevox/"><img src="https://img.shields.io/pypi/v/cerevox?color=blue" alt="PyPI version"></a>
  <a href="https://pypi.org/project/cerevox/"><img src="https://img.shields.io/pypi/pyversions/cerevox" alt="Python versions"></a>
  <a href="https://github.com/cerevoxAI/cerevox-python/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License"></a>
</p>

**Official Python SDK for:**
- **[Lexa](https://cerevox.ai/lexa) - Parse documents into structured data**
  - > 🎯 **Perfect for**: RAG applications, document analysis, data extraction, and vector database preparation
- **[Hippo](https://cerevox.ai/) - Search and query your document collections**
  - > 🎯 **Perfect for**: AI-powered Q&A, semantic search, and drawing insights from document collections
- **[Account](https://cerevox.ai/) - Enterprise user management and authentication**
  - > 🎯 **Perfect for**: User authentication, account management, and usage tracking
### Table of Contents
- <a href="#-installation">Installation</a>
- <a href="#-lexa-quick-start">Lexa Quick Start</a>
- <a href="#-hippo-getting-started">Hippo Getting Started</a>
- <a href="#-features">Features</a>
- <a href="#-examples">Examples</a>
- <a href="#-documentation">Documentation</a>
- <a href="#-support--community">Support</a>
## 📦 Installation
```bash
pip install cerevox
```
### 📋 Requirements
- Python 3.9+
- API key from [Cerevox](https://cerevox.ai)
## 🚀 Lexa Quick Start
### Basic Usage
```python
from cerevox import Lexa
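# Create a client with your API key
client = Lexa(api_key="your-api-key")
# Parse one or more documents (parse() takes a list of files)
documents = client.parse(["document.pdf"])
# Inspect the first parsed document
print(f"Extracted {len(documents[0].content)} characters")
print(f"Found {len(documents[0].tables)} tables")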
```
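Hard-coding the key is fine for a quick test. The Run Examples section below exports it as `CEREVOX_API_KEY` instead, and a minimal sketch of reading it from the environment might look like this. Note that `load_api_key` is a hypothetical helper written for illustration, not part of the SDK.

```python
import os


def load_api_key(env_var: str = "CEREVOX_API_KEY") -> str:
    """Return the API key stored in the environment, or raise a clear error."""
    # Hypothetical helper: strips stray whitespace and rejects empty values.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} before creating a client")
    return api_key


# Usage (requires the cerevox package and a valid key):
# from cerevox import Lexa
# client = Lexa(api_key=load_api_key())
```

Failing early with a named variable tends to be easier to debug than an authentication error from the first API call.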
### Async Processing (Recommended)
```python
import asyncio
from cerevox import AsyncLexa
async def main():
    async with AsyncLexa(api_key="your-api-key") as client:
        documents = await client.parse(["document.pdf", "report.docx"])
        
        # Get chunks optimized for vector databases
        chunks = documents.get_all_text_chunks(target_size=500)
        print(f"Ready for embedding: {len(chunks)} chunks")
asyncio.run(main())
```
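Embedding APIs and vector databases usually accept inputs in batches. Below is a minimal sketch of grouping chunks before upload, assuming the chunks are a list of strings (the return type is not stated above). `batch_chunks` and the batch size are hypothetical and not part of the SDK; the sample list stands in for real parser output.

```python
from typing import Iterator, List, Sequence


def batch_chunks(chunks: Sequence[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive batches of non-empty chunks, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Drop empty or whitespace-only chunks so they are never embedded.
    cleaned = [c for c in chunks if c.strip()]
    for start in range(0, len(cleaned), batch_size):
        yield cleaned[start:start + batch_size]


# Stand-in data; in practice this would be the `chunks` list from the example above.
sample_chunks = ["alpha", "", "beta", "gamma", "   ", "delta", "epsilon"]
batches = list(batch_chunks(sample_chunks, batch_size=2))
print(f"{len(batches)} batches")  # prints "3 batches"
```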
## 🚀 Hippo Getting Started
- Create a folder and upload files
- Start a chat to ask questions about the folder's data

See the guide: [hippo-getting-started.md](docs/hippo-getting-started.md)
## ✨ Features
### 🚀 **Performance & Scale**
- **10x Faster** than traditional solutions
- **Native Async Support** with concurrent processing
- **Enterprise-grade** reliability with automatic retries
### 🧠 **AI-Powered Extraction**
- **SOTA Accuracy** with cutting-edge ML models
- **Advanced Table Extraction** preserving structure and formatting
- **12+ File Formats** including PDF, DOCX, PPTX, HTML, and more
### 🔗 **Integration Ready**
- **Vector Database Optimized** chunks for RAG applications
- **7+ Cloud Storage** integrations (S3, SharePoint, Google Drive, etc.)
- **Framework Agnostic**: works with Django, Flask, and FastAPI
- **Rich Metadata** extraction including images, formatting, and structure
## 📋 Examples
Explore comprehensive examples in the `examples/` directory:
### Lexa
| Example | Description |
|---------|-------------|
| **[`lexa_examples.py`](examples/lexa_examples.py)** | Complete SDK functionality demonstration |
| **[`lexa_async_examples.py`](examples/lexa_async_examples.py)** | Advanced async processing techniques |
| **[`lexa_cloud_integrations.py`](examples/lexa_cloud_integrations.py)** | Cloud storage service integrations |
### Document
| Example | Description |
|---------|-------------|
| **[`document_examples.py`](examples/document_examples.py)** | Document analysis and manipulation features |
| **[`document_vector_db_preparation.py`](examples/document_vector_db_preparation.py)** | Vector database chunking and integration patterns |
### 🚀 Run Examples
```bash
# Clone and explore
git clone https://github.com/CerevoxAI/cerevox-python.git
cd cerevox-python
export CEREVOX_API_KEY="your-api-key"
# Run demos
python examples/lexa_examples.py            # Basic usage
python examples/lexa_async_examples.py      # Async features
python examples/lexa_cloud_integrations.py  # Cloud Integrations Coming Soon!
python examples/document_examples.py               # Document analysis
python examples/document_vector_db_preparation.py  # Vector DB integration
```
## 📚 Documentation
### 📖 **API References**
- **[API Reference](docs/apis)** - Complete API documentation
### 📖 **Guides & Tutorials**
- **[Vector Database Integration](docs/vector-database-integration.md)** - RAG and vector DB setup
- **[Advanced Examples](docs/advanced-examples.md)** - Real-world usage patterns
- **[Migration Guide](docs/migration-guide.md)** - Migrate from other tools
### 🔗 **External Resources**
- **[Full Documentation](https://docs.cerevox.ai)** - Comprehensive guides
- **[Interactive API Docs](https://data.cerevox.ai/docs)** - Try the API
- **[Discord Community](https://discord.gg/cerevox)** - Get help and discuss
## 🤝 Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🆘 Support & Community
<table>
<tr>
<td>

**📖 Resources**
- [Documentation](https://docs.cerevox.ai)
- [API Reference](docs/apis/)
- [Examples](examples/)
- [Changelog](CHANGELOG.md)

</td>
<td>

**💬 Get Help**
- [Discord Community](https://discord.gg/cerevox)
- [GitHub Discussions](https://github.com/CerevoxAI/cerevox-python/discussions)
- [Stack Overflow](https://stackoverflow.com/questions/tagged/cerevox)
- [Email Support](mailto:support@cerevox.ai)

</td>
<td>

**🐛 Issues**
- [Bug Reports](https://github.com/CerevoxAI/cerevox-python/issues/new?template=bug_report.yml)
- [Feature Requests](https://github.com/CerevoxAI/cerevox-python/issues/new?template=feature_request.yml)
- [Performance](https://github.com/CerevoxAI/cerevox-python/issues/new?template=performance.yml)
- [Security Issues](mailto:security@cerevox.ai)

</td>
</tr>
</table>

---
<strong>⭐ Star us on GitHub if Cerevox helped your project!</strong><br>
Made with ❤️ by the Cerevox team<br>
Happy Building! 🔍 🦛 ✨
            
         