# LoLLMsVectorDB
**LoLLMsVectorDB**: A modular text-based database manager for retrieval-augmented generation (RAG), seamlessly integrating with the LoLLMs ecosystem. Supports various vectorization methods and directory bindings for efficient text data management.
## Features
- **Flexible Vectorization**: Supports multiple vectorization methods including TF-IDF and Word2Vec.
- **Directory Binding**: Automatically updates the vector store with text data from a specified directory.
- **Efficient Search**: Returns search results with metadata that locates each matching chunk in its original text.
- **Modular Design**: Easily extendable to support new vectorization methods and functionalities.
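
To make the directory-binding and chunk-metadata ideas concrete, here is a minimal sketch that uses only the standard library. It is not the library's implementation: the helper names `chunk_text` and `scan_directory`, the fixed-size character chunking, and the `.txt`-only filter are all assumptions chosen for illustration.

```python
from pathlib import Path


def chunk_text(text, source, chunk_size=200):
    """Split text into fixed-size character chunks, keeping offsets as metadata."""
    chunks = []
    for start in range(0, len(text), chunk_size):
        piece = text[start:start + chunk_size]
        chunks.append({
            "source": source,            # where the chunk came from
            "start": start,              # character offset of the first character
            "end": start + len(piece),   # offset one past the last character
            "text": piece,
        })
    return chunks


def scan_directory(directory, chunk_size=200):
    """Collect chunks from every .txt file under `directory` (hypothetical helper)."""
    chunks = []
    for path in sorted(Path(directory).rglob("*.txt")):
        text = path.read_text(encoding="utf-8")
        chunks.extend(chunk_text(text, str(path), chunk_size))
    return chunks
```

Each chunk carries its source path and character offsets, which is the kind of metadata a search result needs in order to point back to the original text.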
## Installation
```bash
pip install lollmsvectordb
```
## Usage
### Example with TFIDFVectorizer
```python
from lollmsvectordb import TFIDFVectorizer, VectorDatabase, DirectoryBinding
# Initialize the vectorizer
tfidf_vectorizer = TFIDFVectorizer()
tfidf_vectorizer.fit(["This is a sample text.", "Another sample text."])
# Create the vector database
db = VectorDatabase("vector_db.sqlite", tfidf_vectorizer)
# Bind a directory to the vector database
directory_binding = DirectoryBinding("path_to_your_directory", db)
# Update the vector store with text data from the directory
directory_binding.update_vector_store()
# Search for a query in the vector database
results = directory_binding.search("This is a sample text.")
print(results)
```
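
For readers unfamiliar with TF-IDF retrieval, the following standalone sketch shows roughly what fitting a vectorizer and running a search involve. It is a conceptual illustration in pure Python, not the code behind `TFIDFVectorizer` or `VectorDatabase`: the function names, the tokenizer, and the smoothed IDF formula are assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [t.strip(".,!?").lower() for t in text.split() if t.strip(".,!?")]


def fit_idf(corpus):
    """Smoothed inverse document frequency for each term in the corpus."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1
            for term, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector (term -> weight); terms unseen at fit time are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, corpus, idf, top_k=1):
    """Return the top_k (score, document) pairs, best match first."""
    q = tfidf_vector(query, idf)
    scored = [(cosine(q, tfidf_vector(doc, idf)), doc) for doc in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


corpus = ["This is a sample text.", "Another sample text."]
idf = fit_idf(corpus)
best_score, best_doc = search("This is a sample text.", corpus, idf)[0]
print(best_doc)
```

A query identical to a stored document scores a cosine similarity of about 1.0 against it, so that document is ranked first.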
### Adding New Vectorization Methods
To add a new vectorization method, create a subclass of the `Vectorizer` class and implement the `vectorize` method.
```python
from lollmsvectordb import Vectorizer

class CustomVectorizer(Vectorizer):
    def vectorize(self, data):
        # Implement your custom vectorization logic here
        pass
```
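
As a fuller illustration of the subclassing pattern, the sketch below implements a simple feature-hashing vectorizer. To keep it runnable without the package installed, it defines a stand-in `Vectorizer` base class. The real base class may require additional methods or a different return type, so treat the interface here (a list of strings in, a list of float vectors out) as an assumption. The class name `HashedBagOfWordsVectorizer` is likewise hypothetical.

```python
import hashlib
from abc import ABC, abstractmethod


class Vectorizer(ABC):
    """Stand-in for the library's base class (assumed interface, illustration only)."""

    @abstractmethod
    def vectorize(self, data):
        ...


class HashedBagOfWordsVectorizer(Vectorizer):
    """Maps each text to a fixed-size bag-of-words vector via feature hashing."""

    def __init__(self, dimensions=16):
        self.dimensions = dimensions

    def _bucket(self, token):
        # Use a stable hash: Python's built-in hash() is randomized per process.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % self.dimensions

    def vectorize(self, data):
        vectors = []
        for text in data:
            vector = [0.0] * self.dimensions
            for token in text.lower().split():
                vector[self._bucket(token)] += 1.0
            vectors.append(vector)
        return vectors
```

Feature hashing needs no fitting step, which makes it a convenient example; the trade-off is that distinct tokens can collide in the same bucket.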
## Contributing
Contributions are welcome! Please fork the repository and submit a pull request.
## License
This project is licensed under the MIT License.
## Contact
For any questions or suggestions, feel free to reach out to the author:
- **Twitter**: [@ParisNeo_AI](https://twitter.com/ParisNeo_AI)
- **Discord**: [Join our Discord](https://discord.gg/BDxacQmv)
- **Sub-Reddit**: [r/lollms](https://www.reddit.com/r/lollms/)
- **Instagram**: [spacenerduino](https://www.instagram.com/spacenerduino/)
## Acknowledgements
Special thanks to the LoLLMs community for their continuous support and contributions.
## Raw data
{
"_id": null,
"home_page": "https://github.com/ParisNeo/LoLLMsVectorDB",
"name": "lollmsvectordb",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "ParisNeo",
"author_email": "parisneoai@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/e8/17/7ae422cbe1dae4509b3e4cb67ccef38458fffa7dd77039817180bb0848ff/lollmsvectordb-1.1.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A modular text-based database manager for retrieval-augmented generation (RAG), seamlessly integrating with the LoLLMs ecosystem.",
"version": "1.1.5",
"project_urls": {
"Homepage": "https://github.com/ParisNeo/LoLLMsVectorDB"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "69158dc5bc9c963d64bab00e44e05be7fedca1757d6ab1624707e3fe76f769f9",
"md5": "466e28c4980f5b9a45519899bfb4a9c3",
"sha256": "ab7bd1ae66f1e29914f515a1834d446aa1c7b4503ffc92c2274c3c5a7c3df7d4"
},
"downloads": -1,
"filename": "lollmsvectordb-1.1.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "466e28c4980f5b9a45519899bfb4a9c3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 34423,
"upload_time": "2024-10-09T09:02:15",
"upload_time_iso_8601": "2024-10-09T09:02:15.355828Z",
"url": "https://files.pythonhosted.org/packages/69/15/8dc5bc9c963d64bab00e44e05be7fedca1757d6ab1624707e3fe76f769f9/lollmsvectordb-1.1.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e8177ae422cbe1dae4509b3e4cb67ccef38458fffa7dd77039817180bb0848ff",
"md5": "9b9cf9cf564d5c96ccff8919284c3f44",
"sha256": "34140e86e653e6153c187c0c17e5c3fca209769d52613f7195c4b31273c0b3aa"
},
"downloads": -1,
"filename": "lollmsvectordb-1.1.5.tar.gz",
"has_sig": false,
"md5_digest": "9b9cf9cf564d5c96ccff8919284c3f44",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 29825,
"upload_time": "2024-10-09T09:02:17",
"upload_time_iso_8601": "2024-10-09T09:02:17.771888Z",
"url": "https://files.pythonhosted.org/packages/e8/17/7ae422cbe1dae4509b3e4cb67ccef38458fffa7dd77039817180bb0848ff/lollmsvectordb-1.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-09 09:02:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ParisNeo",
"github_project": "LoLLMsVectorDB",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "numpy",
"specs": [
[
"==",
"1.26.3"
]
]
},
{
"name": "tiktoken",
"specs": []
},
{
"name": "ascii_colors",
"specs": []
},
{
"name": "tqdm",
"specs": []
},
{
"name": "transformers",
"specs": []
},
{
"name": "PyPDF2",
"specs": []
},
{
"name": "openai",
"specs": []
},
{
"name": "dpkt",
"specs": []
},
{
"name": "pipmaster",
"specs": []
},
{
"name": "python-docx",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "openpyxl",
"specs": []
},
{
"name": "beautifulsoup4",
"specs": []
},
{
"name": "extract-msg",
"specs": []
}
],
"lcname": "lollmsvectordb"
}