argilla-llama-index


- Name: argilla-llama-index
- Version: 0.0.1a0
- Summary: Argilla-Llama Index Integration
- Upload time: 2024-01-23 14:59:11
- Requires Python: >=3.8
- Keywords: annotation, llm, monitoring

# Argilla-Llama-Index

Argilla is an open-source platform for data-centric LLM development. It integrates human and model feedback loops for continuous LLM refinement and oversight.

With Argilla's Python SDK and adaptable UI, you can create human and model-in-the-loop workflows for:

- Supervised fine-tuning
- Preference tuning (RLHF, DPO, RLAIF, and more)
- Small, specialized NLP models
- Scalable evaluation

## Getting Started

First, install `argilla-llama-index` with pip:

```bash
pip install argilla-llama-index
```

You also need a running Argilla server to monitor the LLM. You can either install the server locally or host it on Hugging Face Spaces. For a complete guide to installing and initializing the server, see the [Quickstart Guide](https://docs.argilla.io/en/latest/getting_started/quickstart_installation.html).
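
Before wiring up the handler, it can help to confirm that something is actually listening at the server address. The snippet below is a minimal sketch using only the standard library. The helper name `argilla_is_reachable` is hypothetical, and the default URL `http://localhost:6900` is an assumption about a local installation, so adjust it to your own deployment.

```python
import urllib.error
import urllib.request


def argilla_is_reachable(api_url: str = "http://localhost:6900", timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP requests at api_url.

    The default URL is an assumption about a local setup, not a documented value.
    """
    try:
        with urllib.request.urlopen(api_url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status such as 401 or 404.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: nothing usable is listening.
        return False


if __name__ == "__main__":
    print("Argilla reachable:", argilla_is_reachable())
```

This only checks that an HTTP server responds. It does not verify credentials or that the responder is really Argilla.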

## Usage

Logging your data to Argilla from a LlamaIndex workflow takes a single step: register the Argilla handler before you start sending queries to your LLM.

We will use GPT-3.5 from OpenAI as our LLM, which requires a valid OpenAI API key. You can find more information and get a key via [this link](https://openai.com/blog/openai-api).

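Once you have your API key, load it from the environment or enter it at the prompt.

```python
import os
from getpass import getpass

# Read the key from the environment, or prompt for it without echoing it.
openai_api_key = os.getenv("OPENAI_API_KEY") or getpass("Enter OpenAI API key:")
# Export it so that OpenAI clients created later in this process can find it.
os.environ["OPENAI_API_KEY"] = openai_api_key
```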

Let us make the necessary imports.

```python
from argilla_llama_index import ArgillaCallbackHandler
from llama_index import VectorStoreIndex, ServiceContext, SimpleDirectoryReader
from llama_index.llms import OpenAI
from llama_index import set_global_handler
```

Next, set Argilla as the global handler, as shown below. The handler needs the name of the dataset to log to. If the dataset does not exist, it is created with the given name. You can also set the API key, the API URL, and the workspace name. If you do not provide them, the default values are used.

```python
set_global_handler("argilla", dataset_name="query_model")
```
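
If you prefer not to hard-code connection settings, one option is to collect them from environment variables and pass only the ones that are set, so the defaults still apply for the rest. This is a sketch under assumptions: the keyword names `api_url`, `api_key`, and `workspace_name`, the environment variable names, and the helper `argilla_handler_kwargs` are not confirmed by this README, so check them against the handler's signature before relying on them.

```python
import os


def argilla_handler_kwargs(dataset_name: str) -> dict:
    """Collect handler settings, skipping any that are not configured.

    The keyword and environment variable names below are assumptions.
    """
    candidates = {
        "dataset_name": dataset_name,
        "api_url": os.getenv("ARGILLA_API_URL"),
        "api_key": os.getenv("ARGILLA_API_KEY"),
        "workspace_name": os.getenv("ARGILLA_WORKSPACE"),
    }
    # Drop unset or empty values so the handler falls back to its own defaults.
    return {name: value for name, value in candidates.items() if value}


kwargs = argilla_handler_kwargs("query_model")
# Hypothetical usage, assuming the handler accepts these keyword names:
# set_global_handler("argilla", **kwargs)
```

Leaving unset values out of the dictionary, rather than passing `None`, keeps the call equivalent to the one-argument form above when nothing is configured.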

Let us create the LLM.

```python
llm = OpenAI(model="gpt-3.5-turbo", temperature=0.8)
```

The code snippet below creates a basic LlamaIndex workflow. It expects at least one `.txt` file as the data source in a folder named `data`. For a sample data file and more information on using LlamaIndex, see the [LlamaIndex documentation](https://docs.llamaindex.ai/en/stable/getting_started/starter_example.html).
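
Because the workflow depends on that folder, a quick check up front can give a clearer error than a failure during indexing. The following is a minimal sketch with the standard library only. The helper name `list_text_files` is hypothetical and is not part of LlamaIndex or Argilla.

```python
from pathlib import Path


def list_text_files(folder: str = "data") -> list:
    """Return the sorted .txt files directly inside folder, failing early if none exist."""
    data_dir = Path(folder)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Expected a folder at {folder!r}.")
    files = sorted(data_dir.glob("*.txt"))
    if not files:
        raise FileNotFoundError(f"No .txt files found in {folder!r}.")
    return files


# Example usage (raises FileNotFoundError if the folder is missing or empty):
# for path in list_text_files("data"):
#     print(path.name)
```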


```python
service_context = ServiceContext.from_defaults(llm=llm)
docs = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(docs, service_context=service_context)
query_engine = index.as_query_engine()
```

Now, let us run the `query_engine` to get a response from the model.

```python
response = query_engine.query("What did the author do growing up?")
print(response)
```

```text
The author worked on two main things outside of school before college: writing and programming. They wrote short stories and tried writing programs on an IBM 1401. They later got a microcomputer, built it themselves, and started programming on it.
```

The prompt and the response are logged to the Argilla server. You can inspect the data in the Argilla UI.

![Argilla Dataset](docs/argilla-ui-dataset.png)

The metadata properties are logged to Argilla as well. You can find them under Filters in the upper left and filter your data by any of them.

![Argilla Dataset](docs/argilla-ui-dataset-2.png)





            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "argilla-llama-index",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "annotation,llm,monitoring",
    "author": null,
    "author_email": "Argilla <admin@argilla.io>",
    "download_url": "https://files.pythonhosted.org/packages/d5/9c/61b0a7b1176dfacce6855a836a590b0e6f9b1bb64451016d456edf9c6ae8/argilla_llama_index-0.0.1a0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Argilla-Llama Index Integration",
    "version": "0.0.1a0",
    "project_urls": {
        "Documentation": "https://github.com/argilla-io/argilla-llama-index",
        "Issues": "https://github.com/argilla-io/argilla-llama-index/issues",
        "Source": "https://github.com/argilla-io/argilla-llama-index"
    },
    "split_keywords": [
        "annotation",
        "llm",
        "monitoring"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a7d3e36d180c638e9c2376d30979cc10e1788858767c9e265322c7e4df1feaad",
                "md5": "091051cb20ba5694f48644968fe38231",
                "sha256": "d9eb1aeced7f084a1605fdff6559bbff88023d65e0575f299810151f305855b9"
            },
            "downloads": -1,
            "filename": "argilla_llama_index-0.0.1a0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "091051cb20ba5694f48644968fe38231",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 15252,
            "upload_time": "2024-01-23T14:59:13",
            "upload_time_iso_8601": "2024-01-23T14:59:13.342778Z",
            "url": "https://files.pythonhosted.org/packages/a7/d3/e36d180c638e9c2376d30979cc10e1788858767c9e265322c7e4df1feaad/argilla_llama_index-0.0.1a0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d59c61b0a7b1176dfacce6855a836a590b0e6f9b1bb64451016d456edf9c6ae8",
                "md5": "37db3919021709e8f0da845c56beb056",
                "sha256": "ee9c726bc9682ed6804bedf2f6e775c037d1071bd7f2a7e8780ac57e459215fa"
            },
            "downloads": -1,
            "filename": "argilla_llama_index-0.0.1a0.tar.gz",
            "has_sig": false,
            "md5_digest": "37db3919021709e8f0da845c56beb056",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 523176,
            "upload_time": "2024-01-23T14:59:11",
            "upload_time_iso_8601": "2024-01-23T14:59:11.658009Z",
            "url": "https://files.pythonhosted.org/packages/d5/9c/61b0a7b1176dfacce6855a836a590b0e6f9b1bb64451016d456edf9c6ae8/argilla_llama_index-0.0.1a0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-23 14:59:11",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "argilla-io",
    "github_project": "argilla-llama-index",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "argilla-llama-index"
}
        