rag-cli

- Name: rag-cli
- Version: 0.3.1
- Summary: A project to show good CLI practices with a fully fledged RAG system.
- Author: Oliver Kenyon Wilkins
- License: GNU GPL v3
- Requires Python: <4.0,>=3.9
- Keywords: cli, rag, llm, vector database, ollama
- Upload time: 2024-06-13 22:44:04
- Requirements: none recorded
            <p align="center">
  <img height="100" src="https://github.com/okwilkins/rag-cli/raw/main/docs/images/logo.png" alt="RAG CLI">
</p>

<p align="center">
    <b>A project to show good CLI practices with a fully fledged RAG system.</b>
</p>

<p align="center">
    <a href="https://pypi.org/project/rag-cli/"><img src="https://img.shields.io/pypi/pyversions/rag-cli" alt="Python version"></a>
    <a href="https://pypi.org/project/rag-cli/"><img src="https://img.shields.io/pypi/v/rag-cli" alt="PyPI version"></a>
    <a href="https://github.com/okwilkins/rag-cli/raw/main/LICENSE"><img src="https://img.shields.io/badge/License-GNU%20GPL-success" alt="GNU GPL"></a>
</p>

# RAG CLI

## Installation

```bash
pip install rag-cli
```

## Features

- CLI tooling for RAG
- Embedder (Ollama)
- Vector store (Qdrant)

## Usage

### Docker

If you don't have running instances of [Qdrant](https://qdrant.tech/) or [Ollama](https://ollama.com/), you can use the provided docker-compose file to start them.

```bash
docker-compose up --build -d
```

This will start Ollama on `http://localhost:11434` and Qdrant on `http://localhost:6333`.

#### Development

This project uses a dev container, which is the easiest way to set up a consistent development environment. Dev containers provide all the necessary tools, dependencies, and configuration, so you can focus on coding right away.

##### Using Dev Containers

To get started:

1. Open the project in Visual Studio Code.
2. On Windows/Linux, press `Ctrl+Shift+P` and run the command `Remote-Containers: Reopen in Container`. On Mac, press `Cmd+Shift+P` and run the same command.
3. VS Code will build and start the dev container, providing access to the project's codebase and dependencies.

Other editors may have similar functionality, but this project is optimised for Visual Studio Code.

### Embedder

Before running this command, make sure you have a running instance of [Ollama](https://ollama.com/) and the nomic-embed-text:v1.5 model is available:

```bash
ollama pull nomic-embed-text:v1.5
```

```bash
rag-cli embed --ollama-url http://localhost:11434 --file <INPUT_FILE>
```

You can alternatively use stdin to pass the text:

```bash
cat <INPUT_FILE> | rag-cli embed --ollama-url http://localhost:11434
```
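The embed command writes a JSON object to stdout; later examples pipe it through `jq` and read an `.embedding` field. Assuming output shaped like `{"embedding": [...]}`, extracting just the vector looks like this (the sample values below are made up):

```shell
# Made-up sample of the embedder's JSON output; a real vector has
# hundreds of dimensions.
sample='{"embedding": [0.12, -0.03, 0.88]}'

# Pull out the raw vector array
echo "$sample" | jq ".embedding"

# Count its dimensions
echo "$sample" | jq ".embedding | length"
```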

### Vector store

```bash
rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name <COLLECTION_NAME> \
--data '{<JSON_DATA>}' \
--embedding <EMBEDDING_FILE>
```

You can alternatively use stdin to pass embeddings:

```bash
cat <INPUT_FILE> | \
rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name <COLLECTION_NAME> \
--data '{<JSON_DATA>}'
```

### RAG Chat

```bash
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5 \
--file <INPUT_FILE>
```

You can alternatively use stdin to pass the text:

```bash
cat <INPUT_FILE> | \
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5
```

### End-to-end Pipeline For Storing Embeddings

Here is an example of an end-to-end pipeline for storing embeddings. It performs the following steps:

- Get a random Wikipedia article
- Embed the article
- Store the embedding in Qdrant

Before running the pipeline, make sure you have the following tools installed:

```bash
sudo apt-get update && sudo apt-get install parallel jq curl
```
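Before kicking off the pipeline, it can help to confirm those tools are actually on `PATH`. This small check (a convenience sketch, not part of the project) reports anything missing:

```shell
# Report any of the required tools that are not installed
for tool in parallel jq curl; do
  command -v "$tool" > /dev/null || echo "missing: $tool"
done
```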

Also make sure that the `data/articles` and `data/embeddings` directories exist:

```bash
mkdir -p data/articles data/embeddings
```

Then run the pipeline:

```bash
bash scripts/run_pipeline.sh
```

#### Parallel Pipeline

The script `scripts/run_pipeline.sh` can be run in parallel with [GNU Parallel](https://www.gnu.org/software/parallel/) to speed up the process. Below, `-j 5` runs up to five jobs at once, and `-n0` tells Parallel not to pass the input values as arguments, so the `::: {0..10}` list only controls the number of runs (11 in this case).

```bash
parallel -j 5 -n0 bash scripts/run_pipeline.sh ::: {0..10}
```
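If GNU Parallel is not available, the same work can run serially with a plain shell loop (a fallback sketch; `echo` stands in here for the `bash scripts/run_pipeline.sh` invocation, which takes no arguments, matching `-n0` above):

```shell
# Serial stand-in for: parallel -j 5 -n0 bash scripts/run_pipeline.sh ::: {0..10}
# Replace the echo with the real pipeline invocation.
for i in $(seq 1 10); do
  echo "pipeline run $i"
done
```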

## Examples

### Get 10 Random Wikipedia Articles

```bash
parallel -n0 -j 10 '
curl -L -s "https://en.wikipedia.org/api/rest_v1/page/random/summary" | \
jq -r ".title, .description, .extract" | \
tee data/articles/$(cat /proc/sys/kernel/random/uuid).txt
' ::: {1..10}
```

### Run Embedder On All Articles

```bash
parallel '
rag-cli embed --ollama-url http://localhost:11434 --file {1} 2>> output.log | \
jq ".embedding" | \
tee data/embeddings/$(basename {1} .txt) 1> /dev/null
' ::: $(find data/articles/*.txt)
```
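The embedding filenames mirror the article filenames with the `.txt` suffix stripped; that is what `basename` with a suffix argument does:

```shell
# basename removes the directory part and, if given, a trailing suffix
basename data/articles/abc123.txt .txt   # prints: abc123
```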

### Store All Embeddings In Qdrant

```bash
parallel "rag-cli vector-store --qdrant-url http://localhost:6333 --collection-name nomic-embed-text-v1.5 --embedding {} 2>> output.log" ::: $(find data/embeddings/*)
```

### Run RAG Chat On A Query

```bash
echo "Who invented the blue LED?" | \
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5 \
2>> output.log
```

This example requires that articles relevant to the query have already been embedded and stored in Qdrant. You can do this with the example found in the next section.

### End-to-end Pipeline For A Single Article

```bash
wikipedia_data=$(curl -L -s "https://en.wikipedia.org/api/rest_v1/page/summary/Shuji_Nakamura") && \
payload_data=$(jq "{title: .title, description: .description, extract: .extract}" <(echo "$wikipedia_data")) && \
text_to_embed=$(jq -r ".title, .description, .extract" <(echo "$wikipedia_data")) && \
echo "$text_to_embed" | \
rag-cli embed --ollama-url http://localhost:11434 | \
jq -r ".embedding" | \
rag-cli vector-store \
  --qdrant-url http://localhost:6333 \
  --collection-name nomic-embed-text-v1.5 \
  --data "$payload_data"
```


            
