# Thoth Qdrant

- **Name**: thoth-qdrant
- **Version**: 0.1.3
- **Summary**: Native Qdrant implementation for Thoth Vector Database
- **Author**: Marco Pancotti
- **Requires Python**: >=3.9
- **Upload time**: 2025-08-13 10:32:58
- **Keywords**: vector database, qdrant, embeddings, thoth, ai

A native Qdrant implementation for the Thoth Vector Database system, providing high-performance vector storage and similarity search capabilities without Haystack dependencies.

## Features

- **Native Qdrant Integration**: Direct use of Qdrant client without Haystack
- **Full API Compatibility**: Same interface as thoth_vdb2 for seamless integration
- **External Embeddings**: Support for OpenAI, Cohere, Mistral, and HuggingFace
- **Document Types**: EvidenceDocument, SqlDocument, ColumnNameDocument
- **Similarity Search**: Native Qdrant search with document type filtering
- **Batch Operations**: Efficient bulk document insertion
- **Caching**: Intelligent embedding cache for performance
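
To make the caching feature concrete, here is a minimal sketch of an embedding cache keyed by a hash of the model name and the input text. The `EmbeddingCache` class, the `embed_fn` callable, and `fake_embed` are hypothetical names for illustration; the README does not describe how thoth-qdrant implements its cache internally.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Hypothetical in-memory cache; not thoth-qdrant's actual implementation."""

    def __init__(self, embed_fn: Callable[[str], List[float]], model: str) -> None:
        self._embed_fn = embed_fn  # any function that maps text to a vector
        self._model = model
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    def _key(self, text: str) -> str:
        # Include the model name so vectors from different models never collide.
        return hashlib.sha256(f"{self._model}\x00{text}".encode("utf-8")).hexdigest()

    def get(self, text: str) -> List[float]:
        key = self._key(text)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        vector = self._embed_fn(text)  # only called on a cache miss
        self._store[key] = vector
        return vector


def fake_embed(text: str) -> List[float]:
    # Stand-in for a real provider call (assumption: real calls are slow or billed).
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


cache = EmbeddingCache(fake_embed, model="text-embedding-3-small")
first = cache.get("user email authentication")
second = cache.get("user email authentication")  # served from the cache
```

The second lookup returns the stored vector without calling `embed_fn` again, which is the saving a cache of this kind is meant to provide.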

## Installation

```bash
# Basic installation
pip install thoth-qdrant

# With OpenAI embeddings support
# (quotes keep shells such as zsh from expanding the brackets)
pip install "thoth-qdrant[openai]"

# With all embedding providers
pip install "thoth-qdrant[all-providers]"
```

## Configuration

### Environment Variables

```bash
# Embedding provider configuration
export EMBEDDING_PROVIDER=openai
export EMBEDDING_MODEL=text-embedding-3-small

# API key for the selected provider (set only the one you use)
export OPENAI_API_KEY=sk-...
export COHERE_API_KEY=...
export MISTRAL_API_KEY=...
```
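
As an illustration of how an application might validate these variables before creating a store, the sketch below reads them into a small settings object and fails early when something is missing. The `EmbeddingSettings` class, the `load_embedding_settings` function, and the `API_KEY_VARS` mapping are hypothetical helpers, not part of thoth-qdrant, and the README does not say how the library itself reads the environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: each provider reads its key from "<PROVIDER>_API_KEY", matching the
# variable names listed above. HuggingFace is omitted because the README names no
# key variable for it.
API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "cohere": "COHERE_API_KEY",
    "mistral": "MISTRAL_API_KEY",
}


@dataclass(frozen=True)
class EmbeddingSettings:
    provider: str
    model: str
    api_key: Optional[str]


def load_embedding_settings(env: Mapping[str, str] = os.environ) -> EmbeddingSettings:
    """Read embedding settings from the environment and fail early if incomplete."""
    provider = env.get("EMBEDDING_PROVIDER", "").strip().lower()
    model = env.get("EMBEDDING_MODEL", "").strip()
    if not provider or not model:
        raise ValueError("EMBEDDING_PROVIDER and EMBEDDING_MODEL must both be set")
    key_var = API_KEY_VARS.get(provider)
    api_key = env.get(key_var) if key_var else None
    if key_var and not api_key:
        raise ValueError(f"{key_var} must be set for provider {provider!r}")
    return EmbeddingSettings(provider=provider, model=model, api_key=api_key)


# Passing a plain dict instead of os.environ keeps the example reproducible.
settings = load_embedding_settings({
    "EMBEDDING_PROVIDER": "openai",
    "EMBEDDING_MODEL": "text-embedding-3-small",
    "OPENAI_API_KEY": "sk-test",
})
```

Calling `load_embedding_settings()` with no argument reads the real process environment.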

### Qdrant Setup

Ensure Qdrant is running locally, for example with Docker (6333 is Qdrant's default HTTP port):

```bash
docker run -p 6333:6333 qdrant/qdrant
```
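
The container can take a moment to start accepting connections. The following sketch waits until the port is reachable before the application continues. `wait_for_port` is a hypothetical helper that uses only the standard library, and it checks TCP reachability only, not that the service behind the port is a healthy Qdrant instance.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 6333, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.2)  # brief pause before retrying
```

A typical call is `wait_for_port("localhost", 6333)` right after `docker run`, creating the store only if it returns `True`.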

## Usage

```python
from thoth_qdrant import VectorStoreFactory
from thoth_qdrant.core.base import (
    ColumnNameDocument,
    SqlDocument,
    EvidenceDocument,
    ThothType,
)

# Create vector store
store = VectorStoreFactory.create(
    backend="qdrant",
    collection="my_collection",
    host="localhost",
    port=6333,
    embedding_provider="openai",
    embedding_model="text-embedding-3-small"
)

# Add documents
column_doc = ColumnNameDocument(
    table_name="users",
    column_name="email",
    original_column_name="email_address",
    column_description="User email for authentication",
    value_description="Valid email format"
)
doc_id = store.add_column_description(column_doc)

sql_doc = SqlDocument(
    question="How to find recent users?",
    sql="SELECT * FROM users WHERE created_at > NOW() - INTERVAL '30 days'",
    evidence="Filter by date using interval"
)
store.add_sql(sql_doc)

# Search similar documents
results = store.search_similar(
    query="user email authentication",
    doc_type=ThothType.COLUMN_NAME,
    top_k=5,
    score_threshold=0.7
)

# Bulk operations (reuses the documents created above purely to illustrate the call)
documents = [column_doc, sql_doc]
doc_ids = store.bulk_add_documents(documents)

# Get document by ID
doc = store.get_document(doc_id)

# Delete document
store.delete_document(doc_id)

# Get all documents by type
all_columns = store.get_all_column_documents()
all_sql = store.get_all_sql_documents()

# Collection info
info = store.get_collection_info()
print(info)
```
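
For large imports it can help to split the input into fixed-size batches before calling `bulk_add_documents`, so that a single failure does not affect the whole load. The `chunked` helper below is a hypothetical utility, not part of thoth-qdrant, and the batch size of 100 in the comment is an arbitrary assumption; the README does not state whether the library batches internally.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Plain strings stand in for documents so the example runs without a store.
batches = list(chunked(["doc1", "doc2", "doc3", "doc4", "doc5"], batch_size=2))

# With a real store, the same helper could drive the bulk call:
#   doc_ids = []
#   for batch in chunked(documents, 100):
#       doc_ids.extend(store.bulk_add_documents(batch))
```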

## API Reference

### VectorStoreInterface Methods

- `add_column_description(doc: ColumnNameDocument) -> str`
- `add_sql(doc: SqlDocument) -> str`
- `add_evidence(doc: EvidenceDocument) -> str`
- `search_similar(query: str, doc_type: ThothType, top_k: int = 5, score_threshold: float = 0.7) -> List[BaseThothDocument]`
- `get_document(doc_id: str) -> Optional[BaseThothDocument]`
- `delete_document(doc_id: str) -> None`
- `bulk_add_documents(documents: List[BaseThothDocument]) -> List[str]`
- `delete_collection(thoth_type: ThothType) -> None`
- `get_all_column_documents() -> List[ColumnNameDocument]`
- `get_all_sql_documents() -> List[SqlDocument]`
- `get_all_evidence_documents() -> List[EvidenceDocument]`
- `get_collection_info() -> Dict[str, Any]`
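
The `top_k` and `score_threshold` parameters of `search_similar` can be pictured with a small pure-Python sketch: score every candidate, drop those below the threshold, and return the best `top_k`. This assumes cosine similarity and that the threshold is a lower bound; Qdrant collections can be configured with other distance metrics, and the actual search runs inside Qdrant rather than in Python. `cosine_similarity` and `filter_and_rank` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # treat zero vectors as dissimilar


def filter_and_rank(
    query: Sequence[float],
    candidates: List[Tuple[str, Sequence[float]]],
    top_k: int = 5,
    score_threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Keep candidates scoring at least `score_threshold`, best first, at most `top_k`."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates]
    kept = [item for item in scored if item[1] >= score_threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


ranked = filter_and_rank(
    query=[1.0, 0.0],
    candidates=[("a", [1.0, 0.0]), ("b", [1.0, 1.0]), ("c", [0.0, 1.0])],
    top_k=2,
)
```

Here candidate `c` is orthogonal to the query and falls below the default threshold of 0.7, so only `a` and `b` are returned, in that order.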

## Testing

```bash
# Run tests with local Qdrant
pytest tests/

# Run specific test
pytest tests/test_qdrant_adapter.py -v

# With coverage
pytest --cov=thoth_qdrant tests/
```

## Development

```bash
# Install development dependencies
pip install -e ".[dev,test]"

# Format code
black thoth_qdrant tests
isort thoth_qdrant tests

# Type checking
mypy thoth_qdrant

# Linting
ruff check thoth_qdrant
```

## License

Apache License 2.0 - See LICENSE.md for details

## Compatibility

This library is fully compatible with the thoth_vdb2 API, allowing seamless migration from Haystack-based implementations to native Qdrant.

            
