# Unified Hyperbolic Spectral Retrieval (UHSR)
Unified Hyperbolic Spectral Retrieval (UHSR) is a **hybrid text retrieval model** that combines **lexical search (BM25)** with **semantic search (FAISS/Pinecone)** and applies **spectral re-ranking** to produce **interpretable, normalized** relevance scores in the **[0,1] range**.
## 🚀 Key Features
- **🔍 Hybrid Retrieval:** Combines **BM25** for lexical scoring and **dense vector** semantic similarity for contextual understanding.
- **🎯 Multi-Metric Similarity:** Supports **cosine, euclidean, mahalanobis, manhattan, chebyshev, jaccard, and hamming** metrics for scoring similarity.
- **🔬 Spectral Re-Ranking:** Uses **graph Laplacian & Fiedler vector** to boost highly relevant candidates.
- **⚡ AI-powered Reranking:** Supports **Hugging Face Cross-Encoders & OpenAI API-based Reranking**.
- **📈 Interpretable Scores:** Final relevance scores are **logistic-normalized** in **[0,1]** for **easy ranking**.
- **🚀 Scalable & Efficient:** Works with **FAISS (local)** for fast retrieval and **Pinecone (cloud-based)** for large-scale vector search.
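
To make the spectral re-ranking feature above more concrete, here is a minimal, dependency-free sketch of how a Fiedler vector can be computed for a small candidate graph. It is an illustration only, not UHSR's implementation: the function names, the shifted power iteration (a production version would more likely call an eigensolver), and the example weights are all assumptions. How UHSR converts the vector into a score boost is not described here, so the sketch stops at the vector itself. The sign of the result is arbitrary; only the split between positive and negative entries is meaningful.

```python
import math

def laplacian(weights):
    """Unnormalized graph Laplacian L = D - W of a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]

def _center_and_normalize(v):
    # Subtracting the mean removes the constant eigenvector (eigenvalue 0).
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered]

def fiedler_vector(weights, iterations=500):
    """Approximate the eigenvector of the second-smallest Laplacian eigenvalue.

    Assumes a connected graph. Runs power iteration on (shift * I - L), whose
    dominant eigenvector, once constants are projected out, is the Fiedler vector.
    """
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two nodes")
    lap = laplacian(weights)
    # Gershgorin bound: every eigenvalue of L is at most 2 * (max degree).
    shift = 2.0 * max(lap[i][i] for i in range(n)) + 1.0
    v = _center_and_normalize([float(i) for i in range(n)])
    for _ in range(iterations):
        v = [shift * v[i] - sum(lap[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = _center_and_normalize(v)
    return v

# Hypothetical similarity graph: candidates 0-2 are close to each other,
# candidates 3-4 are close to each other, and one weak edge (2-3) links the groups.
W = [
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.9, 0.0, 0.85, 0.0, 0.0],
    [0.8, 0.85, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0],
]
v = fiedler_vector(W)
group_a = [i for i, x in enumerate(v) if x * v[0] > 0]  # same side as candidate 0
group_b = [i for i, x in enumerate(v) if x * v[0] < 0]  # opposite side
```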
---
## 🛠️ **How It Works**
UHSR **enhances traditional retrieval** by blending **BM25-based keyword matching** with **semantic vector representations** using the following pipeline:
| Step | Description |
|------|-------------|
| 1️⃣ **Lexical Filtering** | Uses **BM25** to rank documents by keyword relevance |
| 2️⃣ **Semantic Scoring** | Computes similarity using **FAISS or Pinecone** |
| 3️⃣ **Fusion Process** | Blends scores via **logistic normalization & harmonic fusion** |
| 4️⃣ **Spectral Re-Ranking** | Uses **graph Laplacian analysis** to boost central candidates |
| 5️⃣ **(Optional) AI Reranking** | Uses **OpenAI API or Hugging Face Cross-Encoders** |
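
The fusion step in the table can be sketched as follows. This is a hedged illustration, not the package's code: it assumes "logistic normalization" means squashing raw BM25 scores through a logistic function, that "harmonic fusion" means the harmonic mean of the two normalized scores, and that semantic scores already lie in [0,1]. The names `logistic`, `harmonic_fusion`, and `fuse`, and the choice of the mean BM25 score as the logistic midpoint, are hypothetical. A harmonic mean stays low unless both inputs are high, which is why it is a plausible choice when a candidate should match both lexically and semantically.

```python
import math

def logistic(score, midpoint=0.0, steepness=1.0):
    """Numerically stable logistic squashing of a raw score into (0, 1)."""
    z = steepness * (score - midpoint)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def harmonic_fusion(lexical, semantic, eps=1e-12):
    """Harmonic mean of two scores in [0, 1]; low if either input is low."""
    return 2.0 * lexical * semantic / (lexical + semantic + eps)

def fuse(bm25_scores, semantic_scores):
    """Fuse per-candidate BM25 and semantic scores into one score in [0, 1]."""
    # Assumption: center the logistic on the mean BM25 score of the candidates.
    midpoint = sum(bm25_scores) / len(bm25_scores)
    return [
        harmonic_fusion(logistic(b, midpoint), s)
        for b, s in zip(bm25_scores, semantic_scores)
    ]

# Made-up scores for three candidates.
fused = fuse([12.4, 3.1, 7.8], [0.82, 0.35, 0.10])
best = max(range(len(fused)), key=fused.__getitem__)  # index of top candidate
```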
---
## 🌍 **Supported Retrieval Methods**
- ✅ **BM25 (Lexical Matching)**
- ✅ **FAISS (Local Vector Search)**
- ✅ **Pinecone (Cloud Vector Search)**
- ✅ **Hugging Face Rerankers**
- ✅ **OpenAI API-based Reranking**
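
For readers unfamiliar with the first method in the list, the following is a small sketch of Okapi BM25 scoring over pre-tokenized documents. It is a generic textbook formulation, not UHSR's own code: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the non-negative IDF variant are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every document in corpus_tokens against query_tokens with BM25."""
    if not corpus_tokens:
        return []
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))
    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Non-negative IDF variant (the +1 inside the log avoids negatives).
            idf = math.log(1.0 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / denom
        scores.append(score)
    return scores

# Toy corpus, already lower-cased and tokenized.
corpus = [
    ["hybrid", "search", "with", "bm25"],
    ["dense", "vector", "search"],
    ["cooking", "recipes"],
]
scores = bm25_scores(["bm25", "search"], corpus)
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
```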
---
## 📌 **Why UHSR?**
- **Better Search Results:** Combines **exact keyword matching (BM25)** with **contextual embeddings (Semantic Search)**.
- **Fast & Scalable:** Uses **FAISS for local retrieval** or **Pinecone for cloud-based vector search**.
- **Interpretable Ranking:** Outputs **normalized scores in [0,1]**, so candidates can be **compared on a common scale**.
- **Multi-Metric Similarity:** Supports **cosine, euclidean, mahalanobis, manhattan, chebyshev, jaccard, and hamming** metrics.
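
Several of the metrics named in the last bullet can be written in a few lines each. The sketch below uses only the standard library; the function names are illustrative and are not UHSR's API. Mahalanobis is left out because it also needs a covariance matrix, and `distance_to_similarity` is just one common way (an assumption here) to map a distance onto a bounded similarity.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev_distance(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def jaccard_similarity(a, b):
    """Overlap of two token collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0

def distance_to_similarity(distance):
    """Map a non-negative distance to (0, 1]; identical points score 1.0."""
    return 1.0 / (1.0 + distance)

# Example: compare a query vector with a document vector.
query_vec, doc_vec = [0.0, 0.0], [3.0, 4.0]
similarity = distance_to_similarity(euclidean_distance(query_vec, doc_vec))
```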
---
## 🎯 **Intended Use**
UHSR is designed for:
- **Information Retrieval Research**
- **Search Engines & Recommendation Systems**
- **NLP Applications in AI & Machine Learning**
- **Academic & Industry-scale Document Ranking**
---
## 📂 **Code & Documentation**
For complete documentation, usage examples, and implementation details, visit the **[GitHub repository](https://github.com/vedaant00/uhsr)**.
_Learn more about this package on [Medium](https://vedaantsingh706.medium.com/revolutionizing-text-retrieval-with-uhsr-a-hybrid-approach-combining-lexical-semantic-spectral-6c7e28c3e7d9)._
---
### 🔥 **Try UHSR today and revolutionize your search engine!** 🚀
Raw data
{
"_id": null,
"home_page": "https://github.com/vedaant00/uhsr",
"name": "uhsr",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "uhsr, text retrieval, BM25, FAISS, Pinecone, vector search, semantic search, lexical search, spectral re-ranking, machine learning, NLP",
"author": "Vedaant Singh",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/18/44/b8ce9ba8a8ee6fbb9c54e35327ddb2d10886356589d3857ddf6d537ef3ef/uhsr-0.2.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Unified Hyperbolic Spectral Retrieval (UHSR) - a novel text retrieval algorithm combining lexical and semantic search.",
"version": "0.2.8",
"project_urls": {
"Homepage": "https://github.com/vedaant00/uhsr"
},
"split_keywords": [
"uhsr",
" text retrieval",
" bm25",
" faiss",
" pinecone",
" vector search",
" semantic search",
" lexical search",
" spectral re-ranking",
" machine learning",
" nlp"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "bfbac779915b6ec33ffb441f3fca2bdb56350933038b7d46cc25a4d224ee2adc",
"md5": "d02b1798b9d55a66ddd732081eb09de6",
"sha256": "3eb68e81c6295dac80951dca9c1e55a15e12375aa729b03c7506235b828b4a2d"
},
"downloads": -1,
"filename": "uhsr-0.2.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d02b1798b9d55a66ddd732081eb09de6",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 12480,
"upload_time": "2025-02-19T04:41:37",
"upload_time_iso_8601": "2025-02-19T04:41:37.665850Z",
"url": "https://files.pythonhosted.org/packages/bf/ba/c779915b6ec33ffb441f3fca2bdb56350933038b7d46cc25a4d224ee2adc/uhsr-0.2.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "1844b8ce9ba8a8ee6fbb9c54e35327ddb2d10886356589d3857ddf6d537ef3ef",
"md5": "4022ac36c90d30d8e2d1bb1ceb0b4be3",
"sha256": "aa86e239d48196db4e1e8fc539956574bc4f658abd93caed5a071b45df9c788d"
},
"downloads": -1,
"filename": "uhsr-0.2.8.tar.gz",
"has_sig": false,
"md5_digest": "4022ac36c90d30d8e2d1bb1ceb0b4be3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 13986,
"upload_time": "2025-02-19T04:41:42",
"upload_time_iso_8601": "2025-02-19T04:41:42.441544Z",
"url": "https://files.pythonhosted.org/packages/18/44/b8ce9ba8a8ee6fbb9c54e35327ddb2d10886356589d3857ddf6d537ef3ef/uhsr-0.2.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-19 04:41:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "vedaant00",
"github_project": "uhsr",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "sentence-transformers",
"specs": []
},
{
"name": "pinecone-client",
"specs": []
},
{
"name": "openai",
"specs": []
}
],
"lcname": "uhsr"
}