llama-index-packs-subdoc-summary


Name: llama-index-packs-subdoc-summary
Version: 0.2.0
Home page: None
Summary: llama-index packs subdoc-summary implementation
Upload time: 2024-08-22 17:58:50
Maintainer: None
Docs URL: None
Author: Your Name
Requires Python: <4.0,>=3.8.1
License: MIT
Requirements: No requirements were recorded.
            # LlamaIndex Packs Integration: Subdoc-Summary

This LlamaPack provides an advanced technique for injecting each chunk with "sub-document" metadata. This context augmentation technique is helpful for both retrieving relevant context and for synthesizing correct answers.

It is a step beyond simply adding a summary of the document as the metadata to each chunk. Within a long document, there can be multiple distinct themes, and we want each chunk to be grounded in global but relevant context.
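To make the idea concrete, here is a minimal, LLM-free sketch of the injection step. The helper names are illustrative, not the pack's API: `naive_summary` is a hypothetical stand-in for the pack's actual LLM summarization, and `chunk` uses plain character slicing rather than LlamaIndex's node parsers.

```python
# Sketch: inject each parent chunk's summary into its child chunks as metadata.
# All helpers here are hypothetical stand-ins, not the pack's real internals.


def chunk(text: str, size: int) -> list[str]:
    """Split text into fixed-size character chunks."""
    return [text[i : i + size] for i in range(0, len(text), size)]


def naive_summary(text: str, max_len: int = 40) -> str:
    """Placeholder summarizer: truncate; the real pack would call an LLM."""
    return text[:max_len]


def build_child_nodes(document: str, parent_size: int, child_size: int) -> list[dict]:
    """Attach each parent chunk's summary to its child chunks as metadata."""
    nodes = []
    for parent in chunk(document, parent_size):
        summary = naive_summary(parent)
        for child in chunk(parent, child_size):
            nodes.append({"text": child, "metadata": {"context_summary": summary}})
    return nodes


# A toy document with two distinct themes.
doc = "theme one. " * 30 + "theme two. " * 30
nodes = build_child_nodes(doc, parent_size=220, child_size=55)
```

Each child node now carries a summary of its surrounding parent chunk, so later chunks of the second theme are grounded in "theme two" context rather than a single whole-document summary.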

This technique was inspired by our "Practical Tips and Tricks" video: https://www.youtube.com/watch?v=ZP1F9z-S7T0.

## Installation

```bash
pip install llama-index llama-index-packs-subdoc-summary
```

## CLI Usage

You can download LlamaPacks directly using `llamaindex-cli`, which is installed alongside the `llama-index` Python package:

```bash
llamaindex-cli download-llamapack SubDocSummaryPack --download-dir ./subdoc_summary_pack
```

You can then inspect the files at `./subdoc_summary_pack` and use them as a template for your own project.

## Code Usage

You can download the pack to the `./subdoc_summary_pack` directory:

```python
from llama_index.core.llama_pack import download_llama_pack
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

# download and install dependencies
SubDocSummaryPack = download_llama_pack(
    "SubDocSummaryPack", "./subdoc_summary_pack"
)

# You can use any llama-hub loader to get documents!
subdoc_summary_pack = SubDocSummaryPack(
    documents,  # a list of Document objects from any loader
    parent_chunk_size=8192,  # default
    child_chunk_size=512,  # default
    llm=OpenAI(model="gpt-3.5-turbo"),
    embed_model=OpenAIEmbedding(),
)
```

Initializing the pack splits documents into parent chunks and child chunks, injects each parent chunk's summary into its child chunks as metadata, and indexes the child chunks.

Running the pack will run the query engine over the vectorized child chunks.

```python
response = subdoc_summary_pack.run("<query>", similarity_top_k=2)
```
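For intuition about what `similarity_top_k` does, retrieval over the child chunks can be sketched with a toy word-overlap scorer standing in for real embedding similarity. All names below are illustrative, not the pack's API:

```python
# Toy retrieval over child nodes: rank by word overlap with the query and keep
# the top k. The real pack uses vector similarity over embedded child chunks.


def overlap_score(query: str, text: str) -> int:
    """Count shared lowercase words between query and chunk text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query: str, nodes: list[dict], similarity_top_k: int) -> list[dict]:
    """Return the top-k nodes ranked by overlap score."""
    ranked = sorted(nodes, key=lambda n: overlap_score(query, n["text"]), reverse=True)
    return ranked[:similarity_top_k]


nodes = [
    {"text": "llamas are domesticated camelids", "metadata": {"context_summary": "animal facts"}},
    {"text": "alpacas are smaller than llamas", "metadata": {"context_summary": "animal facts"}},
    {"text": "rust has a borrow checker", "metadata": {"context_summary": "programming notes"}},
]
top = retrieve("how big are llamas", nodes, similarity_top_k=2)
```

Because each retrieved node carries its injected `context_summary`, the synthesizer sees both the matching chunk and its surrounding sub-document context.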

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "llama-index-packs-subdoc-summary",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8.1",
    "maintainer_email": null,
    "keywords": null,
    "author": "Your Name",
    "author_email": "you@example.com",
    "download_url": "https://files.pythonhosted.org/packages/fd/af/782fe11fef4e541c492d117093941db04da4a1ce12cb40c735ab7b0fe0b0/llama_index_packs_subdoc_summary-0.2.0.tar.gz",
    "platform": null,
    "description": "# LlamaIndex Packs Integration: Subdoc-Summary\n\nThis LlamaPack provides an advanced technique for injecting each chunk with \"sub-document\" metadata. This context augmentation technique is helpful for both retrieving relevant context and for synthesizing correct answers.\n\nIt is a step beyond simply adding a summary of the document as the metadata to each chunk. Within a long document, there can be multiple distinct themes, and we want each chunk to be grounded in global but relevant context.\n\nThis technique was inspired by our \"Practical Tips and Tricks\" video: https://www.youtube.com/watch?v=ZP1F9z-S7T0.\n\n## Installation\n\n```bash\npip install llama-index llama-index-packs-subdoc-summary\n```\n\n## CLI Usage\n\nYou can download llamapacks directly using `llamaindex-cli`, which comes installed with the `llama-index` python package:\n\n```bash\nllamaindex-cli download-llamapack SubDocSummaryPack --download-dir ./subdoc_summary_pack\n```\n\nYou can then inspect the files at `./subdoc_summary_pack` and use them as a template for your own project.\n\n## Code Usage\n\nYou can download the pack to a the `./subdoc_summary_pack` directory:\n\n```python\nfrom llama_index.core.llama_pack import download_llama_pack\n\n# download and install dependencies\nSubDocSummaryPack = download_llama_pack(\n    \"SubDocSummaryPack\", \"./subdoc_summary_pack\"\n)\n\n# You can use any llama-hub loader to get documents!\nsubdoc_summary_pack = SubDocSummaryPack(\n    documents,\n    parent_chunk_size=8192,  # default,\n    child_chunk_size=512,  # default\n    llm=OpenAI(model=\"gpt-3.5-turbo\"),\n    embed_model=OpenAIEmbedding(),\n)\n```\n\nInitializing the pack will split documents into parent chunks and child chunks. It will inject parent chunk summaries into child chunks, and index the child chunks.\n\nRunning the pack will run the query engine over the vectorized child chunks.\n\n```python\nresponse = subdoc_summary_pack.run(\"<query>\", similarity_top_k=2)\n```\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "llama-index packs subdoc-summary implementation",
    "version": "0.2.0",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "85aee1aad51f7cca2be875d2b8fc0097ced6ae6435f3aa306787cd632137b0d1",
                "md5": "40845d37f7e17a1755fb5e88b7508ac9",
                "sha256": "c8d50e5d72dacd3257be82ec9c57404d86b5b54e25c33267323976b5a11a43c7"
            },
            "downloads": -1,
            "filename": "llama_index_packs_subdoc_summary-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "40845d37f7e17a1755fb5e88b7508ac9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8.1",
            "size": 3577,
            "upload_time": "2024-08-22T17:58:49",
            "upload_time_iso_8601": "2024-08-22T17:58:49.471487Z",
            "url": "https://files.pythonhosted.org/packages/85/ae/e1aad51f7cca2be875d2b8fc0097ced6ae6435f3aa306787cd632137b0d1/llama_index_packs_subdoc_summary-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fdaf782fe11fef4e541c492d117093941db04da4a1ce12cb40c735ab7b0fe0b0",
                "md5": "3e79258ba8637eb8c3d1d8954093b16b",
                "sha256": "24c6558a4a17bd2fe8cf46c2b41895d6c0e09b8922e76c571039ca7ea4fc9e0d"
            },
            "downloads": -1,
            "filename": "llama_index_packs_subdoc_summary-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "3e79258ba8637eb8c3d1d8954093b16b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8.1",
            "size": 3239,
            "upload_time": "2024-08-22T17:58:50",
            "upload_time_iso_8601": "2024-08-22T17:58:50.403373Z",
            "url": "https://files.pythonhosted.org/packages/fd/af/782fe11fef4e541c492d117093941db04da4a1ce12cb40c735ab7b0fe0b0/llama_index_packs_subdoc_summary-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-22 17:58:50",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "llama-index-packs-subdoc-summary"
}
        