litepali


Name: litepali
Version: 0.0.5
home_page: None
Summary: Lightweight ColPali-based retrieval for cloud
upload_time: 2024-09-15 18:02:09
maintainer: None
docs_url: None
author: None
requires_python: >=3.8
license: MIT License Copyright (c) 2024 Simeon Emanuilov Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords: colpali, image retrieval, rag, semantic search, vision-language-models, document-ai, multi-modal, information retrieval, machine learning, cloud-optimized
<div align="center">
  <img src="assets/logo.png" alt="LitePali Logo" width="100"/>
  <h1>LitePali</h1>
  <p>Lightweight Document Retrieval with Vision Language Models</p>
</div>

<p align="center">
  <img src="https://img.shields.io/badge/Python-3.8%2B-blue?logo=python" alt="Python Version">
  <img src="https://img.shields.io/badge/License-MIT-green" alt="License">
  <img src="https://img.shields.io/badge/PyTorch-1.8%2B-orange?logo=pytorch" alt="PyTorch Version">
</p>

<p align="center">
  <a href="#features">🚀 Features</a> •
  <a href="#model">🧠 Model</a> •
  <a href="#installation">💻 Installation</a> •
  <a href="#usage">📘 Usage</a> •
  <a href="#why-litepali">❓ Why LitePali</a> •
  <a href="#contributing">🤝 Contributing</a> •
  <a href="#todo">🏗 TODO</a>
</p>

---

# LitePali

LitePali is a lightweight document retrieval system I created, inspired by the ColPali model and optimized for cloud
deployment. It's designed to efficiently process and search through document images using state-of-the-art
vision-language models.

## 🚀 Features

- 📦 Minimal dependencies
- 🖼️ Direct image processing without complex PDF parsing
- 🔄 Deterministic file processing
- ⚡ Batch processing for multiple files
- ☁️ Optimized for cloud environments

## 🧠 Model

LitePali is built on the ColPali architecture, which uses Vision Language Models (VLMs) for efficient document
retrieval.

Key features include:

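1. **Late Interaction Mechanism**: Scores a page by comparing every query-token embedding with every page-patch embedding and summing the best match per query token, so fine-grained context is kept at query time.
2. **Multi-Vector Representations**: Represents each query and each page as a set of embedding vectors (one per token or image patch) rather than a single pooled vector.
3. **Visual and Textual Understanding**: Processes document images directly, understanding both content and layout.
4. **Efficient Indexing**: Indexes a corpus faster than pipelines built on traditional PDF parsing, because pages are embedded straight from images.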

![ColPali Architecture](assets/colpali-architecture.png)

This approach allows LitePali to perform efficient retrieval while capturing complex document structures and content.
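The late-interaction scoring described above can be sketched in a few lines. This is a minimal illustration of the general MaxSim idea, not LitePali's or `colpali-engine`'s actual code: the function names and the toy 2-dimensional embeddings are assumptions made for the example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """MaxSim: for each query vector take its best match on the page, then sum."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


# Toy embeddings (hypothetical values); real models use far more dimensions.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
page_b = [[0.1, 0.1], [0.3, 0.2]]

scores = {
    "page_a": late_interaction_score(query, page_a),
    "page_b": late_interaction_score(query, page_b),
}
best = max(scores, key=scores.get)
print(best, scores)
```

Because each page keeps one vector per patch, a query token only needs to match one region of the page well, which is what lets this scheme pick up small details in a large document image.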

## Inspiration

This library is inspired by [byaldi](https://github.com/AnswerDotAI/byaldi), but with several key differences:

- **Focus on images**: LitePali works exclusively with images, allowing PDF processing to be handled separately on
  CPU-only environments.
- **Simplified dependencies**: No need for Poppler or other PDF-related dependencies.
- **Updated engine**: Utilizes `colpali-engine` >=0.3.0 for improved performance.
- **Deterministic processing**: Implements deterministic file processing for consistent results.
- **Efficient batch processing**: Employs batch processing when adding multiple files, enhancing performance.
- **Customized functionality**: Tailored for specific needs while building upon the excellent foundation laid by
  byaldi.

These differences make LitePali a more streamlined and focused tool for image-based document retrieval, offering
flexibility in deployment and integration with existing PDF processing pipelines.
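One plausible reading of "deterministic processing" combined with batching is to visit files in a stable sorted order and cut that order into fixed-size batches. The sketch below shows that idea only; `sorted_image_paths`, `batched`, and the extension set are hypothetical helpers for illustration, not part of LitePali's API.

```python
import tempfile
from pathlib import Path
from typing import Iterator, List

# Assumed set of image extensions, chosen for this example only.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def sorted_image_paths(folder: Path) -> List[Path]:
    """Return image files in a stable, name-sorted order."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def batched(items: List[Path], batch_size: int) -> Iterator[List[Path]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Demo on a throwaway directory so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ["page_3.png", "page_1.jpg", "notes.txt", "page_2.PNG"]:
        (folder / name).touch()
    batches = [[p.name for p in batch] for batch in batched(sorted_image_paths(folder), 2)]

print(batches)
```

Sorting is lexicographic, so `page_10` would sort before `page_2`; zero-padded page numbers avoid that surprise.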

## Installation

Install LitePali using pip:

```bash
pip install litepali
```

## Usage

Here's a simple example of how to use LitePali:

```python
from litepali import LitePali, ImageFile

# Initialize LitePali
litepali = LitePali()

# Add some images with metadata and page information
litepali.add(ImageFile(
    path="path/to/image1.jpg",
    document_id=1,
    page_id=1,
    metadata={"title": "Introduction", "author": "John Doe"}
))
litepali.add(ImageFile(
    path="path/to/image2.png",
    document_id=1,
    page_id=2,
    metadata={"title": "Results", "author": "John Doe"}
))
litepali.add(ImageFile(
    path="path/to/image3.jpg",
    document_id=2,
    page_id=1,
    metadata={"title": "Abstract", "author": "Jane Smith"}
))

# Process the added images
litepali.process()

# Perform a search
results = litepali.search("Your query here", k=5)

# Print results
for result in results:
    print(f"Image: {result['image'].path}, Score: {result['score']}")

# Save the index
litepali.save_index("path/to/save/index")

# Later, load the index
new_litepali = LitePali()
new_litepali.load_index("path/to/save/index")
```

This example demonstrates how to add images, process them, perform a search, and save/load the index.
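Search results can be post-processed with ordinary Python. The sketch below keeps only the best-scoring page per document. It assumes, based on the example above, that each result is a dict with `image` and `score` keys, and further assumes that the image object exposes `document_id` and `page_id` as attributes. `PageStub` is a hypothetical stand-in so the snippet runs without the package installed, and the scores are made up.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class PageStub:
    """Stand-in for an indexed page; mirrors the fields passed to ImageFile above."""
    path: str
    document_id: int
    page_id: int
    metadata: Dict[str, Any] = field(default_factory=dict)


def best_page_per_document(results: List[Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Keep the highest-scoring result for each document_id."""
    best: Dict[int, Dict[str, Any]] = {}
    for result in results:
        doc_id = result["image"].document_id
        if doc_id not in best or result["score"] > best[doc_id]["score"]:
            best[doc_id] = result
    return best


# Made-up results in the shape the usage example prints.
results = [
    {"image": PageStub("path/to/image1.jpg", 1, 1), "score": 11.2},
    {"image": PageStub("path/to/image2.png", 1, 2), "score": 14.7},
    {"image": PageStub("path/to/image3.jpg", 2, 1), "score": 9.5},
]

top = best_page_per_document(results)
for doc_id, hit in sorted(top.items()):
    print(doc_id, hit["image"].page_id, hit["score"])
```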

## Why LitePali?

I created LitePali to address the need for a lightweight, efficient document retrieval system that could work directly
with images. By leveraging the power of vision-language models like ColPali, LitePali can understand both textual and
visual elements in documents, making it ideal for complex document retrieval tasks.

LitePali is designed to be easy to use and deploy in cloud environments, making it a great choice for researchers and
developers working on document retrieval systems.

## Contributing

Contributions are welcome! Feel free to submit issues or pull requests if you have any improvements or bug fixes.

## TODO

Future improvements and features planned for LitePali:

- [ ] **Enhanced index storage**
    - Implement storage of base64-encoded versions of images within the index.
    - This will allow for quick retrieval and display of images without needing to access the original files.

- [ ] **Performance optimizations**
    - Test flash-attention integration.
    - Flash-attention is expected to speed up processing, especially for large batches of images.

- [ ] **Quantization support**
    - Add support for lower precision (e.g., int8, int4) to reduce memory footprint and increase inference speed.

- [ ] **API enhancements**
    - Develop a more comprehensive API for advanced querying and filtering options.

- [ ] **Documentation expansion**
    - Create more detailed documentation, including advanced usage examples and best practices.
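
The planned base64 storage could look roughly like the sketch below. It only illustrates the encoding round trip with the standard library; the entry layout and the `image_b64` key are assumptions, since this feature is not implemented yet.

```python
import base64
import json


def encode_image_bytes(raw: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string that JSON can hold."""
    return base64.b64encode(raw).decode("ascii")


def decode_image_bytes(encoded: str) -> bytes:
    """Recover the original bytes from a base64 string."""
    return base64.b64decode(encoded)


# The 8-byte PNG signature, used here as stand-in image data.
fake_image = b"\x89PNG\r\n\x1a\n"

# Hypothetical index entry; the field names are illustrative only.
entry = {"path": "path/to/image2.png", "image_b64": encode_image_bytes(fake_image)}

serialized = json.dumps(entry)
restored = decode_image_bytes(json.loads(serialized)["image_b64"])
print(restored == fake_image)
```

Base64 grows the stored data by about one third, so this trades index size for not having to reopen the original files.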

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "litepali",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "colpali, image retrieval, rag, semantic search, vision-language-models, document-ai, multi-modal, information retrieval, machine learning, cloud-optimized",
    "author": null,
    "author_email": "Simeon Emanuilov <simeon.emanuilov@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/fe/86/85b50544d69d359c0c42626279e9b352281ccc69d2fcbc95bc1409f6e6fe/litepali-0.0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2024 Simeon Emanuilov  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Lightweight ColPali-based retrieval for cloud",
    "version": "0.0.5",
    "project_urls": {
        "Bug Tracker": "https://github.com/s-emanuilov/litepali/issues",
        "Homepage": "https://github.com/s-emanuilov/litepali",
        "Website": "https://litepali.com"
    },
    "split_keywords": [
        "colpali",
        " image retrieval",
        " rag",
        " semantic search",
        " vision-language-models",
        " document-ai",
        " multi-modal",
        " information retrieval",
        " machine learning",
        " cloud-optimized"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "73958b8418308124071b62651fc9803e9be778100c0a629e76fca7c293fcf836",
                "md5": "7b857039cf627da8dce72a2fb770021b",
                "sha256": "38144236b24a9632e8f1170720cbb0e14333524f642d604e846b3ac944a6815c"
            },
            "downloads": -1,
            "filename": "litepali-0.0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7b857039cf627da8dce72a2fb770021b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 8158,
            "upload_time": "2024-09-15T18:02:08",
            "upload_time_iso_8601": "2024-09-15T18:02:08.238765Z",
            "url": "https://files.pythonhosted.org/packages/73/95/8b8418308124071b62651fc9803e9be778100c0a629e76fca7c293fcf836/litepali-0.0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fe8685b50544d69d359c0c42626279e9b352281ccc69d2fcbc95bc1409f6e6fe",
                "md5": "5f44fa0baa1fdad965c18f7c0c77bce9",
                "sha256": "937ca6a402a73f5d8d212c6f8b0372f579f0b8213ef2b7394eddf710ce7eab50"
            },
            "downloads": -1,
            "filename": "litepali-0.0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "5f44fa0baa1fdad965c18f7c0c77bce9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 10959,
            "upload_time": "2024-09-15T18:02:09",
            "upload_time_iso_8601": "2024-09-15T18:02:09.906708Z",
            "url": "https://files.pythonhosted.org/packages/fe/86/85b50544d69d359c0c42626279e9b352281ccc69d2fcbc95bc1409f6e6fe/litepali-0.0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-15 18:02:09",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "s-emanuilov",
    "github_project": "litepali",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "litepali"
}
        