ragoon


| Field | Value |
| --- | --- |
| Name | ragoon |
| Version | 0.0.3 |
| Home page | None |
| Summary | Improve large language models (LLM) retrieval using dynamic web-search based on blazingly fast query generation from Groq chips. |
| Upload time | 2024-05-26 22:07:21 |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| Requires Python | None |
| License | None |
| Keywords | language-models, retrieval, web-scraping, few-shot-learning, nlp, machine-learning, retrieval-augmented-generation, rag, groq, generative-ai, llama, mistral |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

![Plot](https://github.com/louisbrulenaudet/ragoon/blob/main/thumbnail.png?raw=true)

# RAGoon: Improve Large Language Model retrieval using dynamic web search ⚡
[![Python](https://img.shields.io/pypi/pyversions/tensorflow.svg)](https://badge.fury.io/py/tensorflow) [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) ![Maintainer](https://img.shields.io/badge/maintainer-@louisbrulenaudet-blue)

RAGoon is a Python library that aims to improve language model output by supplying contextually relevant information through retrieval-based querying, web scraping, and data augmentation. It integrates several APIs so that users can retrieve information from the web, enrich it with domain-specific knowledge, and pass it to a language model for better-informed responses.

RAGoon's core functionality builds on few-shot learning, in which a language model is given a small set of high-quality examples to improve its understanding and produce more accurate outputs. By retrieving and curating relevant data from the web, RAGoon gives the model the context and knowledge it needs to handle complex queries.
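
To make the idea of supplying retrieved context concrete, here is a minimal sketch of how web snippets might be placed ahead of a question in a prompt. The function name `build_prompt`, its parameters, and the prompt wording are assumptions for illustration only; they are not taken from RAGoon's source.

```python
from typing import Sequence


def build_prompt(question: str, snippets: Sequence[str], max_chars: int = 2000) -> str:
    """Assemble a prompt that places retrieved snippets ahead of the question.

    Hypothetical helper: illustrates the general pattern, not RAGoon's internals.
    """
    context_lines = []
    used = 0
    for index, snippet in enumerate(snippets, start=1):
        text = " ".join(snippet.split())  # collapse runs of whitespace and newlines
        if not text:
            continue  # ignore empty snippets
        if used + len(text) > max_chars:
            break  # stop once the context budget would be exceeded
        context_lines.append(f"[{index}] {text}")
        used += len(text)
    context = "\n".join(context_lines) if context_lines else "(no context retrieved)"
    return (
        "Use the numbered context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The character budget is a crude stand-in for a token limit; a real pipeline would more likely count tokens with the model's own tokenizer.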

## Usage Example
Here's an example of how to use RAGoon:

```python
from groq import Groq
# from openai import OpenAI
from ragoon import RAGoon

# Initialize RAGoon instance
ragoon = RAGoon(
    google_api_key="your_google_api_key",
    google_cx="your_google_cx",
    completion_client=Groq(api_key="your_groq_api_key")
)

# Search and get results
query = "I want to do a left join in python polars"
results = ragoon.search(
    query=query,
    completion_model="Llama3-70b-8192",
    max_tokens=512,
    temperature=1,
)

# Print results
print(results)
```
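
The example above hardcodes placeholder keys. One common alternative is to read them from environment variables, sketched below. The variable names `GOOGLE_API_KEY`, `GOOGLE_CX`, and `GROQ_API_KEY` and the helper `load_credentials` are assumptions chosen for this illustration; RAGoon itself does not prescribe them.

```python
import os


def load_credentials(env=None):
    """Return the three keys the usage example needs, read from environment variables.

    Hypothetical helper; the variable names are illustrative, not required by RAGoon.
    """
    env = os.environ if env is None else env
    names = ("GOOGLE_API_KEY", "GOOGLE_CX", "GROQ_API_KEY")
    # Treat unset and empty variables the same way.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name.lower(): env[name] for name in names}
```

The returned values could then be passed as `google_api_key`, `google_cx`, and `Groq(api_key=...)` in place of the string literals. Failing early with the list of missing names is usually easier to debug than an authentication error from a remote API.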

## Key Features
- **Query Generation**: RAGoon generates search queries tailored to the user's intent, so that the retrieved results give the language model more relevant context.
- **Web Scraping and Data Retrieval**: RAGoon scrapes relevant content from websites, providing language models with domain-specific knowledge.
- **Parallel Processing**: RAGoon scrapes and retrieves data from multiple URLs in parallel.
- **Language Model Integration**: RAGoon integrates with language models, such as OpenAI's GPT-3 or Llama 3 on Groq Cloud, so that users can build on their natural language processing capabilities.
- **Extensible Design**: RAGoon's modular architecture allows new data sources, retrieval methods, and language models to be integrated.
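
As an illustration of the parallel processing feature, the sketch below fetches several URLs concurrently with a thread pool from the standard library. It is not RAGoon's implementation: the names `fetch_text` and `fetch_all` are hypothetical, and `urllib` is used here only to keep the example dependency-free.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable
from urllib.request import urlopen


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Download one page and decode it, falling back to UTF-8."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_text,
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch every URL concurrently and return {url: text} for the ones that succeed."""
    url_list = list(urls)
    results: Dict[str, str] = {}
    if not url_list:
        return results
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be keyed correctly.
        futures = {pool.submit(fetch, url): url for url in url_list}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # One failing page should not abort the whole batch.
                continue
    return results
```

Threads suit this workload because the time is spent waiting on the network rather than on computation. Passing `fetch` as a parameter also makes the function easy to test with a stub instead of real HTTP calls.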

## Dependencies
- `dotenv`: A Python library that loads environment variables from a `.env` file (distributed on PyPI as `python-dotenv`).
- `groq`: A Python client library for the Groq API, which provides access to language models.
- `openai`: A Python library for the OpenAI API, including access to GPT-3 and other language models.
- `requests`: A popular Python library for making HTTP requests and interacting with web services.
- `beautifulsoup4`: A Python library for parsing and navigating HTML and XML documents, commonly used for web scraping (imported as `bs4`).
- `httpx`: An HTTP client library for Python 3 with both synchronous and asynchronous APIs.
- `googleapiclient`: A Python client library for accessing Google APIs (distributed on PyPI as `google-api-python-client`).
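
Because the package metadata records no requirements, it can help to check which of these modules are importable before running RAGoon. The sketch below is one way to do that with the standard library. The module list uses import names, so `bs4` stands in for `beautifulsoup4`; the helper `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names for the dependencies listed above (an assumption based on that list;
# `bs4` is the import name of the `beautifulsoup4` distribution).
REQUIRED_MODULES = (
    "dotenv",
    "groq",
    "openai",
    "requests",
    "bs4",
    "httpx",
    "googleapiclient",
)


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```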

## Citing this project
If you use this code in your research, please use the following BibTeX entry.

```BibTeX
@misc{louisbrulenaudet2024,
	author = {Louis Brulé Naudet},
	title = {RAGoon : Improve Large Language Models retrieval using dynamic web-search},
	howpublished = {\url{https://github.com/louisbrulenaudet/ragoon}},
	year = {2024}
}
```

## Feedback
If you have any feedback, please reach out at [louisbrulenaudet@icloud.com](mailto:louisbrulenaudet@icloud.com).

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "ragoon",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "language-models, retrieval, web-scraping, few-shot-learning, nlp, machine-learning, retrieval-augmented-generation, RAG, groq, generative-ai, llama, Mistral",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/4f/3c/340f18f530a960cd0dd2b755d7110ebf1de7336a98ae2b8154f28e7be658/ragoon-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Improve large language models (LLM) retrieval using dynamic web-search based on blazingly fast query generation from Groq chips.",
    "version": "0.0.3",
    "project_urls": null,
    "split_keywords": [
        "language-models",
        " retrieval",
        " web-scraping",
        " few-shot-learning",
        " nlp",
        " machine-learning",
        " retrieval-augmented-generation",
        " rag",
        " groq",
        " generative-ai",
        " llama",
        " mistral"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bc4aefaa5a71ee2166d77303ae1000631f33a4af62cf9477af72f57b75cb5ba7",
                "md5": "9173e5bbeefbb33856cd26dcceb57755",
                "sha256": "f45da5ca9f14566616af8ba2c9ea02c5c817ec9e08786dc7ba0eff47674c93bf"
            },
            "downloads": -1,
            "filename": "ragoon-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9173e5bbeefbb33856cd26dcceb57755",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 14020,
            "upload_time": "2024-05-26T22:07:19",
            "upload_time_iso_8601": "2024-05-26T22:07:19.767184Z",
            "url": "https://files.pythonhosted.org/packages/bc/4a/efaa5a71ee2166d77303ae1000631f33a4af62cf9477af72f57b75cb5ba7/ragoon-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4f3c340f18f530a960cd0dd2b755d7110ebf1de7336a98ae2b8154f28e7be658",
                "md5": "b6ff83e844c8e2903225de541550a8f0",
                "sha256": "6f50d62d71aa9350070612b942f59a1a9070a45182984b080bdbc7e37cd49135"
            },
            "downloads": -1,
            "filename": "ragoon-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "b6ff83e844c8e2903225de541550a8f0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 13354,
            "upload_time": "2024-05-26T22:07:21",
            "upload_time_iso_8601": "2024-05-26T22:07:21.742695Z",
            "url": "https://files.pythonhosted.org/packages/4f/3c/340f18f530a960cd0dd2b755d7110ebf1de7336a98ae2b8154f28e7be658/ragoon-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-26 22:07:21",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "ragoon"
}
        