# just-semantic-search
LLM-agnostic semantic-search library with hybrid search support and multiple backends.
## Features
- 🔍 Hybrid search combining semantic and keyword search
- 🚀 Support for multiple backends (currently Meilisearch, with more planned)
- 📄 Smart document splitting with semantic awareness
- 🔌 LLM-agnostic - works with any embedding model
- 🎯 Optimized for scientific and technical content
- 🛠 Easy-to-use API and CLI tools
## Installation
Make sure you have Python 3.10 or newer (and below 3.15) installed; the published package metadata declares `requires_python` as `>=3.10,<3.15`.
### Using pip
```bash
pip install just-semantic-search # Core package
pip install just-semantic-search-meili # Meilisearch backend
```
### Using Poetry
```bash
poetry add just-semantic-search # Core package
poetry add just-semantic-search-meili # Meilisearch backend
```
### From Source
```bash
# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -
# Clone the repository
git clone https://github.com/your-username/just-semantic-search.git
cd just-semantic-search
# Install dependencies and create virtual environment
poetry install
# Activate the virtual environment
poetry shell
```
### Docker Setup for Meilisearch
The project includes a Docker Compose configuration for running Meilisearch. Simply run:
```bash
./bin/meili.sh
```
This will start a Meilisearch instance with vector search enabled and persistent data storage.
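To confirm that the instance is reachable before indexing, a small check against Meilisearch's `/health` endpoint can help. The sketch below is not part of this library; the helper names `health_url` and `is_meili_available` are hypothetical, and it assumes the instance listens on `127.0.0.1:7700`, the host and port used in the Quick Start configuration.

```python
import json
import urllib.request


def health_url(host: str = "127.0.0.1", port: int = 7700) -> str:
    """Build the URL of the Meilisearch health endpoint (host/port are assumptions)."""
    return f"http://{host}:{port}/health"


def is_meili_available(host: str = "127.0.0.1", port: int = 7700, timeout: float = 2.0) -> bool:
    """Return True if the server answers the health check with status 'available'."""
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts; ValueError covers invalid JSON.
        return False
    return isinstance(payload, dict) and payload.get("status") == "available"
```

Calling `is_meili_available()` after running the script should return `True` once the container has finished starting; adjust the host and port if your Docker Compose setup maps them differently.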
## Quick Start
### Document Splitting
```python
from just_semantic_search.article_semantic_splitter import ArticleSemanticSplitter
from sentence_transformers import SentenceTransformer
# Initialize model and splitter
model = SentenceTransformer('thenlper/gte-base')
splitter = ArticleSemanticSplitter(model)
# Split document with metadata
documents = splitter.split_file(
    "path/to/document.txt",
    embed=True,
    title="Document Title",
    source="https://source.url"
)
```
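For intuition about what "semantic awareness" in splitting can mean, here is a toy sketch that starts a new chunk whenever two adjacent sentences stop being similar. It is an illustration only, not the algorithm used by `ArticleSemanticSplitter`: the `embed` function is a bag-of-words stand-in for a real embedding model, and the function names and the `threshold` value are assumptions made for this example.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def split_by_similarity(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Start a new chunk whenever adjacent sentences are less similar than `threshold`."""
    chunks: list[list[str]] = []
    previous = None
    for sentence in sentences:
        vector = embed(sentence)
        if previous is not None and cosine(previous, vector) >= threshold:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
        else:
            chunks.append([sentence])  # topic shift (or first sentence): open a new chunk
        previous = vector
    return chunks
```

With a real model, `embed` would be replaced by dense vectors (for example from `SentenceTransformer.encode`) and the similarity computed on those; the chunking loop itself stays the same.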
### Hybrid Search with Meilisearch
```python
from just_semantic_search.meili.rag import MeiliConfig, MeiliRAG
# Configure Meilisearch
config = MeiliConfig(
    host="127.0.0.1",
    port=7700,
    api_key="your_api_key"
)
# Initialize RAG
rag = MeiliRAG(
    "test_index",
    "thenlper/gte-base",
    config,
    create_index_if_not_exists=True
)
# Add documents and search
# `documents` and `model` are the objects created in the Document Splitting example above
rag.add_documents_sync(documents)
query = "What are CAD-genes?"
results = rag.search(
    text_query=query,
    vector=model.encode(query)
)
```
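The search call above passes both a text query and a query vector, which is what makes the search hybrid. As a rough illustration of the idea, the sketch below blends a keyword-overlap score with a cosine-similarity score using a weight. This is a simplified, self-contained toy and not how Meilisearch or `MeiliRAG` rank results internally; the function names and the `semantic_ratio` parameter are assumptions made for this example.

```python
import math


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that also appear in the text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(
    query: str,
    query_vec: list[float],
    docs: list[tuple[str, list[float]]],
    semantic_ratio: float = 0.5,
) -> list[str]:
    """Rank (text, vector) pairs by a weighted blend of semantic and keyword scores."""
    scored = []
    for text, vec in docs:
        score = (
            semantic_ratio * cosine(query_vec, vec)
            + (1.0 - semantic_ratio) * keyword_score(query, text)
        )
        scored.append((score, text))
    # Highest blended score first
    return [text for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True)]
```

Setting `semantic_ratio` to `1.0` in this sketch ranks purely by vector similarity, while `0.0` ranks purely by keyword overlap.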
## Project Structure
The project consists of multiple components:
- `core`: Core interfaces for hybrid search implementations
- `meili`: Meilisearch backend implementation
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
## License
This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
## Citation
If you use this software in your research, please cite:
```bibtex
@software{just_semantic_search,
title = {just-semantic-search: LLM-agnostic semantic search library},
author = {Karmazin, Alex and Kulaga, Anton},
year = {2024},
url = {https://github.com/your-username/just-semantic-search}
}
```
Raw data
{
"_id": null,
"home_page": null,
"name": "just-semantic-search",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.15,>=3.10",
"maintainer_email": null,
"keywords": "python, llm, science, review, hybrid search, semantic search, gpu, cuda",
"author": "Alex Karmazin",
"author_email": "karmazinalex@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/af/ac/98a673534505f12b8d4e5a6a5afbd7285d52df1886503fc2bb23aa41743c/just_semantic_search-0.2.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Core interfaces for hybrid search implementations (CUDA version)",
"version": "0.2.4",
"project_urls": null,
"split_keywords": [
"python",
" llm",
" science",
" review",
" hybrid search",
" semantic search",
" gpu",
" cuda"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9372b75330f46c0c512afcd7bb0096776c30571dcda5aaa6c5fb4784996aba8d",
"md5": "6594ca16d8bfa19b294c8019fedcda2c",
"sha256": "d0627d544b3a36cbd0902d8965f2102a4abdd4634041179dd18981de5c94ab2f"
},
"downloads": -1,
"filename": "just_semantic_search-0.2.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6594ca16d8bfa19b294c8019fedcda2c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.15,>=3.10",
"size": 21899,
"upload_time": "2025-02-25T16:44:44",
"upload_time_iso_8601": "2025-02-25T16:44:44.304397Z",
"url": "https://files.pythonhosted.org/packages/93/72/b75330f46c0c512afcd7bb0096776c30571dcda5aaa6c5fb4784996aba8d/just_semantic_search-0.2.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "afac98a673534505f12b8d4e5a6a5afbd7285d52df1886503fc2bb23aa41743c",
"md5": "62d9ccde55cfdb6248de820b171ea888",
"sha256": "2f2024a9f64e7b06bf1a08fa0d229f992b177031bdb5075da2de5693335f82ca"
},
"downloads": -1,
"filename": "just_semantic_search-0.2.4.tar.gz",
"has_sig": false,
"md5_digest": "62d9ccde55cfdb6248de820b171ea888",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.15,>=3.10",
"size": 18342,
"upload_time": "2025-02-25T16:44:45",
"upload_time_iso_8601": "2025-02-25T16:44:45.782760Z",
"url": "https://files.pythonhosted.org/packages/af/ac/98a673534505f12b8d4e5a6a5afbd7285d52df1886503fc2bb23aa41743c/just_semantic_search-0.2.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-25 16:44:45",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "just-semantic-search"
}