llama-index-packs-multidoc-autoretrieval

Name: llama-index-packs-multidoc-autoretrieval
Version: 0.3.0
Summary: llama-index packs multidoc_autoretrieval integration
Upload time: 2024-11-18 01:30:35
Maintainer: jerryjliu
Author: Your Name
Requires Python: >=3.9, <4.0
License: MIT
Keywords: autoretrieval, document, multi, multidoc, retrieval
Requirements: none recorded

# Multi-Document AutoRetrieval (with Weaviate) Pack

This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate (@weaviate_io) collections.

## CLI Usage

You can download llamapacks directly using `llamaindex-cli`, which comes installed with the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
```

You can then inspect the files at `./multidoc_autoretrieval_pack` and use them as a template for your own project!

## Code Usage

You can download the pack to the `./multidoc_autoretrieval_pack` directory:

```python
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
    "MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
```

From here, you can use the pack. To initialize it, you need to define a few arguments (see below).

Then, you can set up the pack like so:

```python
# setup pack arguments
from llama_index.core.schema import Document, TextNode
from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo

import weaviate

# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
    "https://<cluster>.weaviate.network",
    auth_client_secret=auth_config,
)

vector_store_info = VectorStoreInfo(
    content_info="Github Issues",
    metadata_info=[
        MetadataInfo(
            name="state",
            description="Whether the issue is `open` or `closed`",
            type="string",
        ),
        ...,
    ],
)

# metadata_nodes is a set of nodes with metadata representing each document
# docs is the source docs
# metadata_nodes and docs must be the same length
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]

pack = MultiDocAutoRetrieverPack(
    client,
    "<metadata_index_name>",
    "<doc_chunks_index_name>",
    metadata_nodes,
    docs,
    vector_store_info,
    auto_retriever_kwargs={
        # any kwargs for the auto-retriever
        ...
    },
)
```
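
For concreteness, here is a minimal, hypothetical sketch of how `metadata_nodes` and `docs` might be built from a small hand-written list of GitHub issues. The issue records, field names, and metadata keys are assumptions for illustration only, not part of the pack:

```python
from llama_index.core.schema import Document, TextNode

# hypothetical issue records, used only to illustrate the pairing
issues = [
    {"number": 1, "title": "Crash on startup", "state": "open", "body": "The app crashes when ..."},
    {"number": 2, "title": "Typo in docs", "state": "closed", "body": "Fix the typo in the README ..."},
]

# one metadata node per document: a short summary text plus the filterable metadata
metadata_nodes = [
    TextNode(
        text=issue["title"],
        metadata={"state": issue["state"], "number": issue["number"]},
    )
    for issue in issues
]

# the corresponding source documents, in the same order and of the same length
docs = [
    Document(text=issue["body"], metadata={"number": issue["number"]})
    for issue in issues
]
```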

The `run()` function is a light wrapper around `query_engine.query()`.

```python
response = pack.run("Tell me about a music celebrity.")
```
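
`response` is a standard LlamaIndex response object, so (as a small usage sketch, assuming the default response type) you can print the synthesized answer and inspect which chunks it was grounded on:

```python
# synthesized answer text
print(str(response))

# source chunks that were retrieved to answer the query
for source in response.source_nodes:
    print(source.score, source.node.get_content()[:100])
```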

You can also use modules individually.

```python
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")

# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
```
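
`retrieve()` returns scored nodes, so a quick way to see what the auto-retriever selected (a sketch assuming the standard `NodeWithScore` return type, with a made-up query string) is:

```python
nodes = retriever.retrieve("open issues about installation errors")

for node_with_score in nodes:
    # each result carries a similarity score and the retrieved text
    print(node_with_score.score, node_with_score.node.get_content()[:80])
```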

            
