uhsr

| Field | Value |
|-------|-------|
| Name | uhsr |
| Version | 0.2.8 |
| home_page | https://github.com/vedaant00/uhsr |
| Summary | Unified Hyperbolic Spectral Retrieval (UHSR) - a novel text retrieval algorithm combining lexical and semantic search. |
| upload_time | 2025-02-19 04:41:42 |
| maintainer | None |
| docs_url | None |
| author | Vedaant Singh |
| requires_python | >=3.6 |
| license | None |
| keywords | uhsr, text retrieval, bm25, faiss, pinecone, vector search, semantic search, lexical search, spectral re-ranking, machine learning, nlp |
| VCS | |
| bugtrack_url | |
| requirements | numpy, sentence-transformers, pinecone-client, openai |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Unified Hyperbolic Spectral Retrieval (UHSR)

Unified Hyperbolic Spectral Retrieval (UHSR) is a **hybrid text retrieval model** that integrates **lexical search (BM25)** with **semantic search (FAISS/Pinecone)**, applies **spectral re-ranking**, and outputs **interpretable, normalized** relevance scores in the **[0,1] range**.

## 🚀 Key Features

- **🔍 Hybrid Retrieval:** Combines **BM25** for lexical scoring and **dense vector** semantic similarity for contextual understanding.
- **🎯 Multi-Metric Similarity:** Supports **cosine, euclidean, mahalanobis, manhattan, chebyshev, jaccard, and hamming** similarity.
- **🔬 Spectral Re-Ranking:** Uses **graph Laplacian & Fiedler vector** to boost highly relevant candidates.
- **⚡ AI-powered Reranking:** Supports **Hugging Face Cross-Encoders & OpenAI API-based Reranking**.
- **📈 Interpretable Scores:** Final relevance scores are **logistic-normalized** in **[0,1]** for **easy ranking**.
- **🚀 Scalable & Efficient:** Works with **FAISS (local)** for fast retrieval and **Pinecone (cloud-based)** for large-scale vector search.
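
The spectral re-ranking feature listed above can be illustrated with a small NumPy sketch. This is not UHSR's actual implementation, which this README does not spell out. The function name `fiedler_rerank`, the `boost` weight, the cosine-similarity graph, and the rule for orienting the Fiedler vector are all assumptions made for illustration.

```python
import numpy as np

def fiedler_rerank(embeddings, base_scores, boost=0.2):
    """Hypothetical helper: blend base scores with a Fiedler-vector signal."""
    X = np.asarray(embeddings, dtype=float)
    scores = np.asarray(base_scores, dtype=float)
    if len(scores) < 2:
        return scores.copy()  # a graph needs at least two candidates
    # Build a cosine-similarity graph over the candidates (weights clipped to >= 0).
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    W = np.clip(X @ X.T, 0.0, None)
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian: L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order, so column 1 is the Fiedler vector.
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]
    # An eigenvector's sign is arbitrary: orient it toward the top-scoring candidate.
    if fiedler[np.argmax(scores)] < 0:
        fiedler = -fiedler
    # Rescale to [0, 1] and blend with the base scores (assumed blending rule).
    span = fiedler.max() - fiedler.min()
    signal = (fiedler - fiedler.min()) / span if span > 0 else np.zeros_like(fiedler)
    return (1.0 - boost) * scores + boost * signal

# Two loose clusters of candidates; the first cluster holds the best base score.
candidates = [[1.0, 0.1], [1.0, 0.2], [0.1, 1.0], [0.2, 1.0]]
reranked = fiedler_rerank(candidates, [0.9, 0.5, 0.5, 0.1])
```

In this toy input, candidates 1 and 2 start with equal base scores, and the candidate that shares a cluster with the top result ends up ranked higher. Because the output is a convex blend of two values in [0,1], it stays in [0,1] whenever the base scores do.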

---

## 🛠️ **How It Works**

UHSR **enhances traditional retrieval** by blending **BM25-based keyword matching** with **semantic vector representations** using the following pipeline:

| Step | Description |
|------|-------------|
| 1️⃣ **Lexical Filtering** | Uses **BM25** to rank documents by keyword relevance |
| 2️⃣ **Semantic Scoring** | Computes similarity using **FAISS or Pinecone** |
| 3️⃣ **Fusion Process** | Blends scores via **logistic normalization & harmonic fusion** |
| 4️⃣ **Spectral Re-Ranking** | Uses **graph Laplacian analysis** to boost central candidates |
| 5️⃣ **(Optional) AI Reranking** | Uses **OpenAI API or Hugging Face Cross-Encoders** |
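
Step 3 of the pipeline above can be sketched as follows. The README names the techniques but not the formulas, so the z-score centring inside `logistic_normalize` and the plain harmonic mean in `harmonic_fusion` are assumptions, and both function names are hypothetical rather than part of the `uhsr` API.

```python
import numpy as np

def logistic_normalize(raw_scores):
    """Squash raw scores into (0, 1) with a logistic curve centred on their mean."""
    s = np.asarray(raw_scores, dtype=float)
    std = s.std()
    # Assumed scaling: standardize first so BM25 and cosine scores are comparable.
    z = (s - s.mean()) / std if std > 0 else np.zeros_like(s)
    return 1.0 / (1.0 + np.exp(-z))

def harmonic_fusion(lexical, semantic, eps=1e-12):
    """Harmonic mean of two normalized score arrays; the result stays in [0, 1]."""
    lex = np.asarray(lexical, dtype=float)
    sem = np.asarray(semantic, dtype=float)
    return 2.0 * lex * sem / (lex + sem + eps)

# Made-up raw scores for three candidate documents.
bm25_raw = [7.2, 3.1, 0.4]       # lexical scores (unbounded)
cosine_raw = [0.82, 0.35, 0.60]  # semantic similarities

fused = harmonic_fusion(logistic_normalize(bm25_raw), logistic_normalize(cosine_raw))
ranking = np.argsort(-fused)     # indices of documents, best first
```

A harmonic mean never exceeds the smaller of its two inputs by much, so a document needs both a reasonable lexical score and a reasonable semantic score to rank highly. That property is one plausible reason to prefer it over an arithmetic mean for hybrid retrieval.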

---

## 🌍 **Supported Retrieval Methods**
- ✅ **BM25 (Lexical Matching)**
- ✅ **FAISS (Local Vector Search)**
- ✅ **Pinecone (Cloud Vector Search)**
- ✅ **Hugging Face Rerankers**
- ✅ **OpenAI API-based Reranking**
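
For readers unfamiliar with BM25, the first method listed above, here is a minimal standard-library sketch of Okapi BM25 scoring. It is a generic textbook formulation and not necessarily how UHSR computes its lexical scores. The function name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are illustrative assumptions, and the sketch expects a non-empty document list.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (whitespace tokens)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / denom
        scores.append(score)
    return scores

docs = [
    "hybrid search with bm25",
    "dense vector search",
    "cooking pasta at home",
]
scores = bm25_scores("bm25 search", docs)
```

The first document matches both query terms, the second matches only one, and the third matches none, so its score is exactly zero.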

---

## 📌 **Why UHSR?**
- **Better Search Results:** Combines **exact keyword matching (BM25)** with **contextual embeddings (Semantic Search)**.
- **Fast & Scalable:** Uses **FAISS for local retrieval** or **Pinecone for cloud-based vector search**.
- **Interpretable Ranking:** Outputs **normalized scores in [0,1]**, which are easy to **compare and threshold**.
- **Multi-Metric Similarity:** Supports **cosine, euclidean, mahalanobis, manhattan, chebyshev, jaccard, and hamming**.
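
To make the multi-metric idea concrete, the sketch below maps several of the listed metrics onto a common "higher is more similar" scale. The function name `similarity` and the `1 / (1 + distance)` conversion are assumptions, not UHSR's documented behaviour. Mahalanobis is left out because it needs a covariance matrix estimated from data.

```python
import numpy as np

def similarity(a, b, metric="cosine"):
    """Hypothetical helper: similarity of two vectors under a named metric."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if metric == "cosine":
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    if metric == "jaccard":
        # Treat non-zero entries as set membership.
        a_set, b_set = a != 0, b != 0
        union = np.logical_or(a_set, b_set).sum()
        return float(np.logical_and(a_set, b_set).sum() / union) if union else 1.0
    if metric == "hamming":
        # Fraction of positions where the two vectors agree.
        return float(np.mean(a == b))
    distances = {
        "euclidean": lambda d: np.sqrt(np.sum(d ** 2)),
        "manhattan": lambda d: np.sum(np.abs(d)),
        "chebyshev": lambda d: np.max(np.abs(d)),
    }
    if metric in distances:
        # Assumed conversion from a distance to a similarity in (0, 1].
        return float(1.0 / (1.0 + distances[metric](a - b)))
    raise ValueError(f"unsupported metric: {metric}")

euclid = similarity([3.0, 4.0], [0.0, 0.0], metric="euclidean")   # distance 5
overlap = similarity([1, 1, 0], [1, 0, 1], metric="jaccard")      # 1 shared of 3
```

Every branch returns 1.0 for identical non-zero vectors, which keeps the metrics loosely comparable, although their scales still differ and should not be mixed without further normalization.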

---

## 🎯 **Intended Use**

UHSR is designed for:
- **Information Retrieval Research**
- **Search Engines & Recommendation Systems**
- **NLP Applications in AI & Machine Learning**
- **Academic & Industry-scale Document Ranking**

---

## 📂 **Code & Documentation**
For complete documentation, usage examples, and implementation details, visit the **[GitHub repository](https://github.com/vedaant00/uhsr).**

_Learn more about this package on [Medium](https://vedaantsingh706.medium.com/revolutionizing-text-retrieval-with-uhsr-a-hybrid-approach-combining-lexical-semantic-spectral-6c7e28c3e7d9)._

---

### 🔥 **Try UHSR today and revolutionize your search engine!** 🚀

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/vedaant00/uhsr",
    "name": "uhsr",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "uhsr, text retrieval, BM25, FAISS, Pinecone, vector search, semantic search, lexical search, spectral re-ranking, machine learning, NLP",
    "author": "Vedaant Singh",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/18/44/b8ce9ba8a8ee6fbb9c54e35327ddb2d10886356589d3857ddf6d537ef3ef/uhsr-0.2.8.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Unified Hyperbolic Spectral Retrieval (UHSR) - a novel text retrieval algorithm combining lexical and semantic search.",
    "version": "0.2.8",
    "project_urls": {
        "Homepage": "https://github.com/vedaant00/uhsr"
    },
    "split_keywords": [
        "uhsr",
        " text retrieval",
        " bm25",
        " faiss",
        " pinecone",
        " vector search",
        " semantic search",
        " lexical search",
        " spectral re-ranking",
        " machine learning",
        " nlp"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "bfbac779915b6ec33ffb441f3fca2bdb56350933038b7d46cc25a4d224ee2adc",
                "md5": "d02b1798b9d55a66ddd732081eb09de6",
                "sha256": "3eb68e81c6295dac80951dca9c1e55a15e12375aa729b03c7506235b828b4a2d"
            },
            "downloads": -1,
            "filename": "uhsr-0.2.8-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d02b1798b9d55a66ddd732081eb09de6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 12480,
            "upload_time": "2025-02-19T04:41:37",
            "upload_time_iso_8601": "2025-02-19T04:41:37.665850Z",
            "url": "https://files.pythonhosted.org/packages/bf/ba/c779915b6ec33ffb441f3fca2bdb56350933038b7d46cc25a4d224ee2adc/uhsr-0.2.8-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1844b8ce9ba8a8ee6fbb9c54e35327ddb2d10886356589d3857ddf6d537ef3ef",
                "md5": "4022ac36c90d30d8e2d1bb1ceb0b4be3",
                "sha256": "aa86e239d48196db4e1e8fc539956574bc4f658abd93caed5a071b45df9c788d"
            },
            "downloads": -1,
            "filename": "uhsr-0.2.8.tar.gz",
            "has_sig": false,
            "md5_digest": "4022ac36c90d30d8e2d1bb1ceb0b4be3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 13986,
            "upload_time": "2025-02-19T04:41:42",
            "upload_time_iso_8601": "2025-02-19T04:41:42.441544Z",
            "url": "https://files.pythonhosted.org/packages/18/44/b8ce9ba8a8ee6fbb9c54e35327ddb2d10886356589d3857ddf6d537ef3ef/uhsr-0.2.8.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-19 04:41:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "vedaant00",
    "github_project": "uhsr",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "sentence-transformers",
            "specs": []
        },
        {
            "name": "pinecone-client",
            "specs": []
        },
        {
            "name": "openai",
            "specs": []
        }
    ],
    "lcname": "uhsr"
}
        