llama-index-packs-multidoc-autoretrieval


Name: llama-index-packs-multidoc-autoretrieval
Version: 0.4.0
Home page: None
Summary: llama-index packs multidoc_autoretrieval integration
Upload time: 2025-07-30 21:40:01
Maintainer: jerryjliu
Docs URL: None
Author: None
Requires Python: <4.0,>=3.9
License: None
Keywords: autoretrieval, document, multi, multidoc, retrieval
# Multi-Document AutoRetrieval (with Weaviate) Pack

This LlamaPack implements structured hierarchical retrieval over multiple documents, using multiple Weaviate collections.

## CLI Usage

You can download LlamaPacks directly using `llamaindex-cli`, which is installed with the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack MultiDocAutoRetrieverPack --download-dir ./multidoc_autoretrieval_pack
```

You can then inspect the files at `./multidoc_autoretrieval_pack` and use them as a template for your own project!

## Code Usage

You can download the pack to the `./multidoc_autoretrieval_pack` directory:

```python
from llama_index.core.llama_pack import download_llama_pack

# download and install dependencies
MultiDocAutoRetrieverPack = download_llama_pack(
    "MultiDocAutoRetrieverPack", "./multidoc_autoretrieval_pack"
)
```
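The pack is also published on PyPI (see the metadata above), so as an alternative to downloading the source you can likely install it as a regular dependency and import the class directly. A minimal sketch, assuming the standard `llama_index.packs.*` module layout used by published packs:

```python
# Alternative to download_llama_pack: install the published package first, e.g.
#   pip install llama-index-packs-multidoc-autoretrieval
# The module path below assumes the usual llama_index.packs.* convention.
from llama_index.packs.multidoc_autoretrieval import MultiDocAutoRetrieverPack
```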

From here, you can use the pack. To initialize it, you need to define a few arguments (described below).

Then, you can set up the pack like so:

```python
# setup pack arguments
from llama_index.core.schema import Document, TextNode
from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo

import weaviate

# cloud
auth_config = weaviate.AuthApiKey(api_key="<api_key>")
client = weaviate.Client(
    "https://<cluster>.weaviate.network",
    auth_client_secret=auth_config,
)

vector_store_info = VectorStoreInfo(
    content_info="Github Issues",
    metadata_info=[
        MetadataInfo(
            name="state",
            description="Whether the issue is `open` or `closed`",
            type="string",
        ),
        ...,
    ],
)

# metadata_nodes is a set of nodes with metadata representing each document
# docs is the source docs
# metadata_nodes and docs must be the same length
metadata_nodes = [TextNode(..., metadata={...}), ...]
docs = [Document(...), ...]

pack = MultiDocAutoRetrieverPack(
    client,
    "<metadata_index_name>",
    "<doc_chunks_index_name>",
    metadata_nodes,
    docs,
    vector_store_info,
    auto_retriever_kwargs={
        # any kwargs for the auto-retriever
        ...
    },
)
```
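For concreteness, here is a minimal sketch of how the parallel `metadata_nodes` and `docs` lists might be assembled. The GitHub-issue fields used below are purely illustrative, not part of the pack's API; adapt them to your own data:

```python
from llama_index.core.schema import Document, TextNode

# Hypothetical source data: one entry per GitHub issue.
issues = [
    {"number": 1, "title": "Bug in retriever", "state": "open", "body": "..."},
    {"number": 2, "title": "Docs typo", "state": "closed", "body": "..."},
]

# One metadata node per document, carrying the fields declared in vector_store_info.
metadata_nodes = [
    TextNode(text=issue["title"], metadata={"state": issue["state"]})
    for issue in issues
]

# The source documents themselves, in the same order as metadata_nodes.
docs = [
    Document(text=issue["body"], metadata={"number": issue["number"]})
    for issue in issues
]
```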

The `run()` function is a light wrapper around `query_engine.query()`.

```python
response = pack.run("Tell me about a music celebrity.")
```
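The result is a standard LlamaIndex response object, so the synthesized answer can be printed directly and the supporting chunks are available on `source_nodes` (a small usage sketch):

```python
print(str(response))               # synthesized answer text
print(len(response.source_nodes))  # number of chunks used as context
```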

You can also use modules individually.

```python
# use the retriever
retriever = pack.retriever
nodes = retriever.retrieve("query_str")

# use the query engine
query_engine = pack.query_engine
response = query_engine.query("query_str")
```
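Retriever results come back as `NodeWithScore` objects; a minimal sketch of inspecting their scores and metadata (the query string here is just an example):

```python
# Inspect what the auto-retriever pulled back for a query
for node_with_score in retriever.retrieve("What are the open issues about retrieval?"):
    print(node_with_score.score, node_with_score.node.metadata)
```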

            
