rag_webquery

- Name: rag_webquery
- Version: 0.1.1
- Home page: https://github.com/robert-mcdermott/rag_webquery
- Summary: A command line utility to query websites using a local LLM
- Author: Robert McDermott
- License: Apache-2.0
- Requires Python: >=3.9,<4.0
- Keywords: llm
- Upload time: 2024-01-08 08:08:09
# Rag WebQuery

![rag-webquery.png](https://raw.githubusercontent.com/robert-mcdermott/rag_webquery/main/images/rag-webquery.png)

## Description
**rag_webquery** is a command-line tool that lets you use a local Large Language Model (LLM) to answer questions about a website's contents. The utility extracts all textual content from the given URL, splits it into chunks, converts the chunks into embeddings stored in an in-memory vector store, and then retrieves the most relevant chunks to use as context for answering the supplied question.
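
To make that flow concrete, below is a minimal Python sketch of the same retrieve-then-answer loop, written directly against Ollama's HTTP API. It is illustrative only and not the package's actual implementation: the helper names, whitespace-based word chunking, and cosine-similarity ranking are simplifying assumptions.

```python
# Illustrative sketch of the retrieve-then-answer flow; NOT rag_webquery's internals.
# Assumes Ollama is running locally with the default zephyr:latest model pulled.
import requests
import numpy as np
from bs4 import BeautifulSoup

OLLAMA = "http://localhost:11434"
MODEL = "zephyr:latest"

def fetch_text(url: str) -> str:
    """Download the page and strip it down to visible text."""
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    return soup.get_text(separator=" ", strip=True)

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split the text into overlapping word chunks (roughly what --chunk_size/--chunk_overlap control)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(text: str) -> np.ndarray:
    """Ask Ollama for an embedding of a piece of text."""
    r = requests.post(f"{OLLAMA}/api/embeddings", json={"model": MODEL, "prompt": text})
    return np.array(r.json()["embedding"])

def answer(url: str, question: str, top_matches: int = 4) -> str:
    chunks = chunk(fetch_text(url))
    vectors = np.stack([embed(c) for c in chunks])   # in-memory "vector store"
    q = embed(question)
    # Rank chunks by cosine similarity to the question and keep the best matches.
    scores = vectors @ q / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q))
    context = "\n\n".join(chunks[i] for i in np.argsort(scores)[-top_matches:])
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": MODEL, "prompt": prompt, "stream": False})
    return r.json()["response"]

print(answer("https://en.wikipedia.org/wiki/Ukraine", "What was the Holodomor?"))
```

The real tool exposes the chunk size, overlap, and number of retrieved chunks as the command-line flags shown under Usage below.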

## Requirements

- **Python**: The utility is written in Python, so you'll need Python 3.9 or greater installed.
- **Ollama**: To host the local LLMs, you'll need [Ollama](https://ollama.ai/) installed and running.
- **LLM(s)**: The LLM(s) you want to answer your questions must be downloaded (pulled) with Ollama.
- **GPU**: A GPU is recommended, but the utility will also run on a CPU-only system, albeit slowly.
- **RAM**: Enough system RAM to run the selected model; the default model (Zephyr 7B) requires a minimum of 8GB, with 16GB recommended.
- **OS**: Ollama currently runs only on macOS and Linux; Windows support is coming soon.

By default, **rag-webquery** uses the [Zephyr](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) model, a fine-tuned version of Mistral 7B. After Ollama is installed and running, download it with the following command:

```bash
ollama pull zephyr:latest
```

Ollama supports several other models that you can choose from in the [library](https://ollama.ai/library). If you want to use a model other than Zephyr, you'll need to pull it with Ollama and specify it with the **rag-webquery** '--model' flag.
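
For example, switching to another model from the library might look like this (the model name here is just an illustration; substitute whichever model you pulled):

```bash
# Pull the alternative model first, then point rag-webquery at it
ollama pull llama2:latest
rag-webquery https://en.wikipedia.org/wiki/Ukraine \
             "What is the capital of Ukraine?" \
             --model llama2:latest
```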


## Installation

Assuming you already have Python installed on your system, you can install **rag_webquery** with pip:

```bash
pip install -U rag_webquery
``` 

## Usage 

**Usage documentation provided by the '--help' flag:**

```text
usage: rag-webquery [-h] [--model MODEL] [--base_url BASE_URL] [--chunk_size CHUNK_SIZE] [--chunk_overlap CHUNK_OVERLAP]
                    [--top_matches TOP_MATCHES] [--system SYSTEM] [--temp TEMP]
                    website question

Query a webpage with a local LLM

positional arguments:
  website               The website URL to retrieve data from
  question              The question to ask about the website's content

options:
  -h, --help            show this help message and exit
  --model MODEL         The model to use (default: zephyr:latest)
  --base_url BASE_URL   The base URL for the Ollama (default: http://localhost:11434)
  --chunk_size CHUNK_SIZE
                        The document token chunk size (default: 200)
  --chunk_overlap CHUNK_OVERLAP
                        The amount of chunk overlap (default: 50)
  --top_matches TOP_MATCHES
                        The number the of top matching document chunks to retrieve (default: 4)
  --system SYSTEM       The system message provided to the LLM
  --temp TEMP           The model temperature setting (default: 0.0)
```
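
The retrieval behavior can be tuned with these flags. As a rough illustration (the values below are arbitrary starting points, not recommendations from the package), larger chunks plus more retrieved matches give the model broader context at the cost of speed:

```bash
rag-webquery https://en.wikipedia.org/wiki/Ukraine \
             "Summarize the history of Ukraine's independence." \
             --chunk_size 500 \
             --chunk_overlap 100 \
             --top_matches 6 \
             --temp 0.2
```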


### **Most basic usage**:

```bash
rag-webquery https://en.wikipedia.org/wiki/Ukraine "What was holomodor? What was its root cause?"
```

**Output**:

```
### Answer:
The Holodomor was a major famine that took place in Soviet Ukraine during
1932 and 1933. It led to the death by starvation of millions of Ukrainians,
particularly peasants. The root cause of the Holodomor was the forced
collectivization of crops and their confiscation by Soviet authorities. This
policy aimed to centralize agricultural production but instead resulted in
widespread food shortages and devastating consequences for the local
population. Some countries have recognized this event as an act of genocide
perpetrated by Joseph Stalin and other Soviet notables.
```

### **More complicated usage**:

In this example I'll use the more powerful "Mixtral 8x7B" model. First, I need to pull it with Ollama (if it isn't already downloaded):

```bash
ollama pull mixtral:latest 
```

Next, I specify a custom system message that instructs the LLM to act as a data extraction expert that responds only with JSON-formatted output. I then ask about Ukraine's demographics, which the model extracts from the website contents and returns as a JSON document.

```bash
rag-webquery https://en.wikipedia.org/wiki/Ukraine \
             "What are Ukraine's demographics?" \
             --model mixtral \
             --system "You are a data extraction expert. \
                        You take information and return a valid JSON document \
                        that captures the information " \
             --chunk_size 1500
```

**Output**:

```json
{
    "Population": {
        "Estimated before 2022 Russian invasion": 41000000,
        "Decrease from 1993 to 2014": -6.6,
        "Percentage decrease": 12.8,
        "Urban population": 67,
        "Population density": 69.5,
        "Overall life expectancy at birth": 73,
        "Life expectancy at birth for males": 68,
        "Life expectancy at birth for females": 77.8
    },
    "Ethnic composition (2001 Census)": {
        "Ukrainians": 77.8,
        "Russians": 17.3,
        "Romanians and Moldovans": 0.8,
        "Belarusians": 0.6,
        "Crimean Tatars": 0.5,
        "Bulgarians": 0.4,
        "Hungarians": 0.3,
        "Poles": 0.3,
        "Other": 2
    },
    "Minority populations": {
        "Belarusians": 0.6,
        "Moldovans": 0.5,
        "Crimean Tatars": 0.5,
        "Bulgarians": 0.4,
        "Hungarians": 0.3,
        "Romanians": 0.3,
        "Poles": 0.3,
        "Jews": 0.3,
        "Armenians": 0.2,
        "Greeks": 0.2,
        "Tatars": 0.2,
        "Koreans": {
            "Estimate": "10-40000",
            "Location": "mostly in the south of the country"
        },
        "Roma": {
            "Estimate (official)": 47600,
            "Estimate (Council of Europe)": 260000
        }
    },
    "Internally displaced and refugees": {
        "Due to war in Donbas (late 2010s)": 1400000,
        "Due to Russian invasion (early 2022)": 4100000
    }
}
```

            
