# just-semantic-search
[![PyPI version](https://badge.fury.io/py/just-semantic-search.svg)](https://badge.fury.io/py/just-semantic-search)
[![Python Version](https://img.shields.io/pypi/pyversions/just-semantic-search.svg)](https://pypi.org/project/just-semantic-search/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Downloads](https://static.pepy.tech/badge/just-semantic-search)](https://pepy.tech/project/just-semantic-search)
LLM-agnostic semantic-search library with hybrid search support and multiple backends.
## Features
- 🔍 Hybrid search combining semantic and keyword search
- 🚀 Multiple backend support (Meilisearch, more coming soon)
- 📄 Smart document splitting with semantic awareness
- 🔌 LLM-agnostic - works with any embedding model
- 🎯 Optimized for scientific and technical content
- 🛠 Easy-to-use API and CLI tools
## Installation
Make sure you have at least Python 3.11 installed.
### Using pip
```bash
pip install just-semantic-search # Core package
pip install just-semantic-search-meili # Meilisearch backend
```
### Using Poetry
```bash
poetry add just-semantic-search # Core package
poetry add just-semantic-search-meili # Meilisearch backend
```
### From Source
```bash
# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -
# Clone the repository
git clone https://github.com/your-username/just-semantic-search.git
cd just-semantic-search
# Install dependencies and create virtual environment
poetry install
# Activate the virtual environment
poetry shell
```
### Docker Setup for Meilisearch
The project includes a Docker Compose configuration for running Meilisearch. Simply run:
```bash
./bin/meili.sh
```
This will start a Meilisearch instance with vector search enabled and persistent data storage.
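To confirm the instance is reachable before indexing, one option is to poll Meilisearch's `/health` endpoint, which reports `{"status": "available"}` when the server is ready. The following is a minimal sketch using only the standard library; the helper names `parse_health` and `meili_is_up` are hypothetical, and the default host and port are assumed to match the Quick Start configuration below rather than taken from `bin/meili.sh`.

```python
import json
import urllib.request


def parse_health(body: str) -> bool:
    """Return True if a /health response body reports status 'available'."""
    try:
        return json.loads(body).get("status") == "available"
    except (json.JSONDecodeError, AttributeError):
        # Not JSON, or JSON that is not an object
        return False


def meili_is_up(host: str = "127.0.0.1", port: int = 7700, timeout: float = 2.0) -> bool:
    """Query the Meilisearch health endpoint; return False on any connection error."""
    # Host and port are assumptions; adjust them to your Docker Compose setup.
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except OSError:
        # Covers refused connections, timeouts, and HTTP errors
        return False
```

Calling `meili_is_up()` after `./bin/meili.sh` should return `True` once the container has finished starting, assuming the instance listens on the default port.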
## Quick Start
### Document Splitting
```python
from just_semantic_search.article_semantic_splitter import ArticleSemanticSplitter
from sentence_transformers import SentenceTransformer
# Initialize model and splitter
model = SentenceTransformer('thenlper/gte-base')
splitter = ArticleSemanticSplitter(model)
# Split document with metadata
documents = splitter.split_file(
    "path/to/document.txt",
    embed=True,
    title="Document Title",
    source="https://source.url"
)
```
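For intuition about what "semantic awareness" can mean when splitting, here is a small conceptual sketch. It is not the algorithm used by `ArticleSemanticSplitter`; it only illustrates one common idea, which is to start a new chunk when the embedding similarity between consecutive sentences drops below a threshold. The function names, the toy 2-dimensional vectors, and the threshold value are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def split_by_similarity(sentences, vectors, threshold=0.5):
    """Group consecutive sentences, opening a new chunk when similarity
    to the previous sentence falls below the threshold."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine(prev, cur) < threshold:
            chunks.append([sentence])    # topic shift: start a new chunk
        else:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
    return [" ".join(chunk) for chunk in chunks]


# Toy embeddings standing in for real model output
sentences = [
    "Genes encode proteins.",
    "Proteins fold into shapes.",
    "Meilisearch is a search engine.",
]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
chunks = split_by_similarity(sentences, vectors)
```

With these toy vectors the first two sentences stay together and the third starts its own chunk. In practice the vectors would come from an embedding model such as the `SentenceTransformer` used above.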
### Hybrid Search with Meilisearch
```python
from just_semantic_search.meili.rag import MeiliConfig, MeiliRAG
# Configure Meilisearch
config = MeiliConfig(
    host="127.0.0.1",
    port=7700,
    api_key="your_api_key"
)
# Initialize RAG
rag = MeiliRAG(
    "test_index",
    "thenlper/gte-base",
    config,
    create_index_if_not_exists=True
)
# Add documents and search
# (`documents` and `model` come from the Document Splitting example above)
rag.add_documents_sync(documents)
results = rag.search(
    text_query="What are CAD-genes?",
    vector=model.encode("What are CAD-genes?")
)
```
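The example above passes both a text query and a query vector, which is what makes the search hybrid. As a rough illustration of how a keyword signal and a semantic signal can be blended into one ranking, here is a self-contained sketch. It is an assumption-laden toy, not how Meilisearch or `MeiliRAG` scores results: the word-overlap keyword score, the `semantic_ratio` weight, and the sample documents are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query words that also appear in the text (case-insensitive)."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def hybrid_rank(query, query_vec, docs, semantic_ratio=0.5):
    """Rank (text, vector) pairs by a weighted sum of semantic and keyword scores."""
    scored = []
    for text, vec in docs:
        score = (semantic_ratio * cosine(query_vec, vec)
                 + (1 - semantic_ratio) * keyword_score(query, text))
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


# Toy documents with made-up 2-dimensional embeddings
docs = [
    ("weather report for today", [0.1, 0.9]),
    ("genes overview", [0.7, 0.3]),
    ("CAD genes and coronary artery disease", [0.9, 0.1]),
]
ranked = hybrid_rank("cad genes", [1.0, 0.0], docs)
```

Setting `semantic_ratio` to `1.0` in this sketch ranks purely by vector similarity, while `0.0` ranks purely by keyword overlap.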
## Project Structure
The project consists of multiple components:
- `core`: Core interfaces for hybrid search implementations
- `meili`: Meilisearch backend implementation
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
## License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
## Citation
If you use this software in your research, please cite:
```bibtex
@software{just_semantic_search,
title = {just-semantic-search: LLM-agnostic semantic search library},
author = {Karmazin, Alex and Kulaga, Anton},
year = {2024},
url = {https://github.com/your-username/just-semantic-search}
}
```