lateness

Name: lateness
Version: 0.1.0
Home page: https://github.com/moderncolbert/lateness
Summary: Modern ColBERT for Late Interaction with native multi-vector support
Upload time: 2025-07-17 07:49:50
Maintainer: None
Docs URL: None
Author: donkey stereotype
Requires Python: >=3.8
License: None
Keywords: colbert, retrieval, embeddings, information-retrieval, nlp, onnx, pytorch, qdrant
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
# Lateness - Modern ColBERT for Late Interaction

A Python package for Modern ColBERT (late interaction) embeddings with native multi-vector support for efficient retrieval using Qdrant vector database.

## Features

- **Dual Backend Architecture**: ONNX for fast retrieval, PyTorch for GPU indexing
- **Native Multi-Vector Support**: Optimized for Qdrant's MaxSim comparator
- **Smart Installation**: Lightweight retrieval or heavy indexing based on your needs
- **Production Ready**: Separate deployment targets for different workloads

## Quick Start

### Installation

```bash
# Lightweight retrieval (ONNX + Qdrant)
pip install lateness

# Heavy indexing (PyTorch + Transformers + ONNX + Qdrant)
pip install lateness[index]
```

### Basic Usage

**Default Installation (ONNX Backend):**
```python
# pip install lateness
from lateness import ModernColBERT
colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")
# Output:
# 🚀 Using ONNX backend (default; for GPU-accelerated indexing, install lateness[index] and set LATENESS_USE_TORCH=true)
# 🔄 Downloading model: prithivida/modern_colbert_base_en_v1
# ✅ ONNX ColBERT loaded with providers: ['CPUExecutionProvider']
# Query max length: 256, Document max length: 300
```

**Index Installation (PyTorch Backend):**
```python
# pip install lateness[index]
import os
os.environ['LATENESS_USE_TORCH'] = 'true'
from lateness import ModernColBERT

colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")
# Output:
# 🚀 Using PyTorch backend (LATENESS_USE_TORCH=true)
# 🔄 Downloading model: prithivida/modern_colbert_base_en_v1
# Loading model from: /root/.cache/huggingface/hub/models--prithivida--modern_colbert_base_en_v1/...
# ✅ PyTorch ColBERT loaded on cuda
# Query max length: 256, Document max length: 300
```
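Whichever backend is selected, the loaded model exposes the same encoding calls shown in the API Reference below. As a quick sanity check you can inspect the outputs right after loading; note that the exact return types are an assumption here (array-like, one token-embedding matrix per input text), since the README does not pin them down:

```python
# Quick shape check; assumes encode_* return one (num_tokens, dim) matrix per input text.
queries = ["What is late interaction?"]
docs = ["ColBERT compares queries and documents token by token."]

q_embs = colbert.encode_queries(queries)
d_embs = colbert.encode_documents(docs)

print(len(q_embs), len(d_embs))              # one embedding matrix per input text
print(getattr(q_embs[0], "shape", None))     # e.g. (num_query_tokens, embedding_dim)
print(getattr(d_embs[0], "shape", None))     # e.g. (num_doc_tokens, embedding_dim)
```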

**Complete Example with Qdrant:**

For a complete working example with Qdrant integration, environment setup, and testing instructions, see the [examples/qdrant folder](./examples/qdrant/).

The examples include:
- Environment setup and testing
- Local Qdrant server management
- Complete indexing and retrieval workflows
- Both ONNX and PyTorch backend examples

## Architecture

### Two Deployment Models

**Retrieval Service (Lightweight)**
```bash
pip install lateness
```
- ONNX backend (fast CPU inference)
- Qdrant integration
- ~50MB total dependencies
- Perfect for user-facing search APIs

**Indexing Service (Heavy)**
```bash
pip install lateness[index]
```
- PyTorch backend (GPU acceleration)
- Full Transformers support
- ~2GB+ dependencies
- Perfect for batch document processing

### Backend Selection

The backend is selected with an environment variable:

- **Default behavior** → ONNX backend (CPU retrieval)
- **`LATENESS_USE_TORCH=true`** → PyTorch backend (GPU indexing)

**Note:** PyTorch backend requires `pip install lateness[index]` to install PyTorch dependencies.

## API Reference

### ModernColBERT

```python
from lateness import ModernColBERT

# Initialize
colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")

# Encode queries
query_embeddings = colbert.encode_queries(["What is AI?"])

# Encode documents  
doc_embeddings = colbert.encode_documents(["AI is artificial intelligence"])

# Compute similarity
scores = ModernColBERT.compute_similarity(query_embeddings, doc_embeddings)
```
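`compute_similarity` scores query/document pairs with ColBERT-style late interaction: each query token embedding is matched against its best-matching document token embedding (the same MaxSim operation Qdrant exposes as a comparator), and the per-token maxima are summed. The snippet below is a minimal NumPy sketch of that formula for illustration only; the shapes, normalization, and random data are assumptions, and it is not a description of the library's internals:

```python
# Minimal NumPy illustration of ColBERT-style MaxSim scoring.
# Assumes L2-normalized token embeddings so dot products act as cosine similarities.
import numpy as np

def maxsim(query_tokens: np.ndarray, doc_tokens: np.ndarray) -> float:
    """score(Q, D) = sum over query tokens of max over doc tokens of q . d"""
    sim = query_tokens @ doc_tokens.T        # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())      # best doc token per query token, then sum

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 128))            # 4 query tokens, 128-dim (dims illustrative)
D = rng.standard_normal((20, 128))           # 20 document tokens
Q /= np.linalg.norm(Q, axis=1, keepdims=True)
D /= np.linalg.norm(D, axis=1, keepdims=True)

print(maxsim(Q, D))
```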

### Qdrant Integration

```python
from lateness import QdrantIndexer, QdrantRetriever
from qdrant_client import QdrantClient

client = QdrantClient("localhost", port=6333)

# Indexing
indexer = QdrantIndexer(client, "documents")
indexer.create_collection()
indexer.index_documents_simple(documents)

# Retrieval
retriever = QdrantRetriever(client, "documents")
results = retriever.search_simple("query", top_k=10)
```
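The `QdrantIndexer`/`QdrantRetriever` helpers wrap Qdrant's native multi-vector support. For readers who want to see how ColBERT token embeddings map onto that feature directly, here is a hand-rolled sketch using the plain `qdrant_client` API (Qdrant and qdrant-client 1.10 or newer). The collection name, the 128-dimension vector size, and the assumption that `encode_documents`/`encode_queries` return one token-embedding matrix per input are all illustrative; this is not a description of what the helpers do internally:

```python
# Illustrative only: indexing ColBERT multi-vectors with plain qdrant_client.
# Assumes 128-dim token embeddings and array-like outputs from lateness; adjust as needed.
from qdrant_client import QdrantClient, models
from lateness import ModernColBERT

colbert = ModernColBERT("prithivida/modern_colbert_base_en_v1")
client = QdrantClient("localhost", port=6333)

client.create_collection(
    collection_name="documents_manual",
    vectors_config=models.VectorParams(
        size=128,                                 # assumed embedding dimension
        distance=models.Distance.COSINE,
        multivector_config=models.MultiVectorConfig(
            comparator=models.MultiVectorComparator.MAX_SIM,  # late-interaction scoring
        ),
    ),
)

texts = ["AI is artificial intelligence", "ColBERT uses late interaction"]
doc_embs = colbert.encode_documents(texts)        # one token-embedding matrix per text (assumed)

client.upsert(
    collection_name="documents_manual",
    points=[
        models.PointStruct(id=i, vector=emb.tolist(), payload={"text": text})
        for i, (text, emb) in enumerate(zip(texts, doc_embs))
    ],
)

query_emb = colbert.encode_queries(["What is AI?"])[0]
hits = client.query_points(
    collection_name="documents_manual",
    query=query_emb.tolist(),
    limit=10,
).points

for hit in hits:
    print(hit.score, hit.payload["text"])
```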

## License

Apache License 2.0

## Contributing

Contributions welcome! Please check our [contributing guidelines](CONTRIBUTING.md).

            
