bulk-chain


| Field | Value |
| --- | --- |
| Name | bulk-chain |
| Version | 0.25.0 |
| home_page | https://github.com/nicolay-r/bulk-chain |
| Summary | A lightweight, no-strings-attached Chain-of-Thought framework for your LLM, ensuring reliable results for bulk input requests. |
| upload_time | 2024-12-24 11:16:03 |
| maintainer | None |
| docs_url | None |
| author | Nicolay Rusnachenko |
| requires_python | >=3.6 |
| license | MIT License |
| keywords | natural language processing, chain-of-thought, reasoning |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# bulk-chain 0.25.0
![](https://img.shields.io/badge/Python-3.9-brightgreen.svg)
[![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nicolay-r/bulk-chain/blob/master/bulk_chain_tutorial.ipynb)
[![twitter](https://img.shields.io/twitter/url/https/shields.io.svg?style=social)](https://x.com/nicolayr_/status/1847969224636961033)
[![PyPI downloads](https://img.shields.io/pypi/dm/bulk-chain.svg)](https://pypistats.org/packages/bulk-chain)

<p align="center">
    <img src="logo.png"/>
</p>

A lightweight, no-strings-attached **framework** for your LLM that applies a [Chain-of-Thought](https://arxiv.org/abs/2201.11903) prompt `schema` (see the [related section](#chain-of-thought-schema)) to massive textual collections.

### Main Features
* ✅ **No-strings**: you're free from LLM dependencies and can customize your `venv` flexibly.
* ✅ **Supports schema descriptions** for the Chain-of-Thought concept.
* ✅ **Provides an iterator over an unbounded number of input contexts** served in `CSV`/`JSONL`.
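As a rough illustration of what iterating over `CSV`/`JSONL` input without loading it all into memory can look like, here is a minimal sketch built only on the standard library. The `iter_records` helper is hypothetical and is not part of the bulk-chain API; it only shows the general generator-based approach.

```python
import csv
import json
from pathlib import Path
from typing import Dict, Iterator


def iter_records(path: str) -> Iterator[Dict[str, str]]:
    """Yield one record at a time from a CSV or JSONL file (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    with open(path, newline="", encoding="utf-8") as f:
        if suffix == ".csv":
            # DictReader reads rows lazily, so the file is never fully loaded.
            yield from csv.DictReader(f)
        elif suffix == ".jsonl":
            for line in f:
                if line.strip():  # skip blank lines
                    yield json.loads(line)
        else:
            raise ValueError(f"Unsupported format: {suffix}")
```

Because the function is a generator, records are produced on demand, so the size of the input file does not bound memory use.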

### Extra Features
* ✅ **Progress caching [for remote LLMs]**: withstands exceptions during LLM calls by caching LLM answers with the `sqlite3` engine.
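
The idea behind this kind of caching can be sketched as follows. This is a simplified illustration under assumptions, not the library's actual implementation: the `cached_ask` function, the `answers` table, and its columns are all hypothetical names chosen for the example.

```python
import sqlite3
from typing import Callable


def cached_ask(conn: sqlite3.Connection, ask: Callable[[str], str], prompt: str) -> str:
    """Return a stored answer for `prompt`, calling `ask` only on a cache miss."""
    # Hypothetical table layout: one row per distinct prompt.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers (prompt TEXT PRIMARY KEY, answer TEXT)"
    )
    row = conn.execute(
        "SELECT answer FROM answers WHERE prompt = ?", (prompt,)
    ).fetchone()
    if row is not None:
        return row[0]
    answer = ask(prompt)  # may raise; nothing is stored in that case
    conn.execute(
        "INSERT INTO answers (prompt, answer) VALUES (?, ?)", (prompt, answer)
    )
    conn.commit()  # persist immediately so a later failure keeps this answer
    return answer
```

If a remote call fails partway through a large collection, rerunning the job with the same database file would skip every prompt that was already answered.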


# Installation

From PyPI: 

```bash
pip install bulk-chain
```

or the latest version from GitHub:

```bash
pip install git+https://github.com/nicolay-r/bulk-chain@master
```

## Chain-of-Thought Schema

To declare a Chain-of-Thought (CoT) schema, this project uses the `JSON` format.
The format has a `name` field for the schema name and a `schema` field that holds a list of CoT instructions for the Large Language Model.

Each step is a dictionary with `prompt` and `out` keys, which correspond to the input prompt and the output variable name, respectively.
All variable names must be written inside `{}`.

Below is an example of how to declare your own schema:

```json
{
"name": "schema-name",
"schema": [
    {"prompt": "Given the question '{text}', let's think step-by-step.",
     "out": "steps"},
    {"prompt": "For the question '{text}' the reasoning steps are '{steps}'. What would be an answer?",
     "out": "answer"}
]
}
```

Other templates are available [here](/ext/schema/).
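
To make the data flow concrete, here is a minimal sketch of how such a schema could be applied to one input record: each prompt is filled from the variables known so far, and the answer is stored under the step's `out` name for later steps. The `run_chain` function and the stub `ask` callable are assumptions made for illustration; they are not the bulk-chain API.

```python
from typing import Callable, Dict, List

schema = {
    "name": "schema-name",
    "schema": [
        {"prompt": "Given the question '{text}', let's think step-by-step.",
         "out": "steps"},
        {"prompt": "For the question '{text}' the reasoning steps are '{steps}'. "
                   "What would be an answer?",
         "out": "answer"},
    ],
}


def run_chain(steps: List[Dict[str, str]], record: Dict[str, str],
              ask: Callable[[str], str]) -> Dict[str, str]:
    """Apply each step in order, storing every answer under its `out` name."""
    ctx = dict(record)  # copy so the input record is not mutated
    for step in steps:
        prompt = step["prompt"].format(**ctx)  # fill `{...}` from known variables
        ctx[step["out"]] = ask(prompt)
    return ctx


# Stub in place of a real LLM call: it only reports the prompt length.
result = run_chain(schema["schema"], {"text": "What is 2 + 2?"},
                   ask=lambda p: f"<reply to {len(p)} chars>")
print(sorted(result))  # prints ['answer', 'steps', 'text']
```

Note that this sketch relies on `str.format`, so a prompt containing literal braces would need them doubled (`{{` and `}}`); the library may handle placeholders differently.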

# Usage

Preliminary steps:

1. Define your [schema](#chain-of-thought-schema) ([Example for Sentiment Analysis](/ext/schema/thor_cot_schema.json))
2. Wrap your own **LLM model** or pick one from the [list of presets](/ext/).

## API

Please take a look at the [**related Wiki page**](https://github.com/nicolay-r/bulk-chain/wiki).

## Shell

> **NOTE:** You have to install the `source-iter` package.

```bash
python3 -m bulk_chain.infer \
    --src "<PATH-TO-YOUR-CSV-or-JSONL>" \
    --schema "ext/schema/default.json" \
    --adapter "dynamic:ext/replicate.py:Replicate" \
    %%m \
    --api_token "<REPLICATE-API-TOKEN>" \
    --temp 0.1
```

# Embed your LLM

All you have to do is implement the `BaseLM` class, which includes:
* `__init__` -- sets up *batching mode support* and the (optional) *model name*;
* `ask(prompt)` -- runs inference with your model on the given `prompt`.

See examples with models [here](/ext).
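
The shape of such a wrapper can be sketched as below. To keep the snippet runnable without the package installed, it defines a local stand-in for `BaseLM`; the constructor parameter names (`name`, `support_batching`) are assumptions for illustration and may differ from the real class, so check the presets linked above for the actual signature.

```python
class BaseLM:
    """Local stand-in for the library's base class (assumed shape, not the real one)."""

    def __init__(self, name=None, support_batching=False):
        self.name = name  # optional model name
        self.support_batching = support_batching  # batching mode flag

    def ask(self, prompt):
        raise NotImplementedError


class EchoLM(BaseLM):
    """Toy model that returns the prompt reversed, so no API key is needed."""

    def __init__(self):
        super().__init__(name="echo", support_batching=False)

    def ask(self, prompt):
        # A real wrapper would call a local model or a remote API here.
        return prompt[::-1]


print(EchoLM().ask("abc"))  # prints "cba"
```

A real adapter would replace the body of `ask` with the call to your model and pass any credentials or sampling settings through `__init__`.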

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/nicolay-r/bulk-chain",
    "name": "bulk-chain",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "natural language processing, chain-of-thought, reasoning",
    "author": "Nicolay Rusnachenko",
    "author_email": "rusnicolay@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/58/b7/6792af8820f73352e092771006c8140580f681e987b893351c75b6a32364/bulk_chain-0.25.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A lightweight, no-strings-attached Chain-of-Thought framework for your LLM, ensuring reliable results for bulk input requests.",
    "version": "0.25.0",
    "project_urls": {
        "Homepage": "https://github.com/nicolay-r/bulk-chain"
    },
    "split_keywords": [
        "natural language processing",
        " chain-of-thought",
        " reasoning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8cd58d20684e63738d97d59377628fc2336fa595fb25866b04770ea8d3aa06e9",
                "md5": "f0cea9118bd978e4e21c6a1fcb5e0ec1",
                "sha256": "549ec4f15a0e689d7ae6d7217a147996a1ee2d9cff20f0349e5e0bb9bd9d22ea"
            },
            "downloads": -1,
            "filename": "bulk_chain-0.25.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f0cea9118bd978e4e21c6a1fcb5e0ec1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 14585,
            "upload_time": "2024-12-24T11:16:00",
            "upload_time_iso_8601": "2024-12-24T11:16:00.946632Z",
            "url": "https://files.pythonhosted.org/packages/8c/d5/8d20684e63738d97d59377628fc2336fa595fb25866b04770ea8d3aa06e9/bulk_chain-0.25.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "58b76792af8820f73352e092771006c8140580f681e987b893351c75b6a32364",
                "md5": "488ed6bf95aa4b06ef09be7bb413d760",
                "sha256": "f69d7ca0282081603e22386c832fed191bfb0842819ba3c8b40b2cd5b52a00a3"
            },
            "downloads": -1,
            "filename": "bulk_chain-0.25.0.tar.gz",
            "has_sig": false,
            "md5_digest": "488ed6bf95aa4b06ef09be7bb413d760",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 13965,
            "upload_time": "2024-12-24T11:16:03",
            "upload_time_iso_8601": "2024-12-24T11:16:03.253325Z",
            "url": "https://files.pythonhosted.org/packages/58/b7/6792af8820f73352e092771006c8140580f681e987b893351c75b6a32364/bulk_chain-0.25.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-24 11:16:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nicolay-r",
    "github_project": "bulk-chain",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "bulk-chain"
}
        