collie-bench

- Name: collie-bench
- Version: 0.1.0
- Summary: Official Implementation of "COLLIE: Systematic Construction of Constrained Text Generation Tasks"
- Upload time: 2023-07-18 00:52:46
- Requires Python: >=3.7
- License: MIT License
- Keywords: large language model, llm, constrained generation, benchmark

# COLLIE: Systematic Construction of Constrained Text Generation Tasks ([Website](https://collie-benchmark.github.io/))

![teaser](./teaser.png)

We propose the COLLIE framework for easy constraint structure specification, example extraction, instruction rendering, and model evaluation.

## Install
We recommend using Python 3.9 (as of now, 3.10 might be incompatible with certain dependencies).

To install in development mode (in cloned project directory):
```bash
pip install -e .
```
After installation you can access the functionalities through `import collie`.

We will add COLLIE to PyPI soon.

## Overview
There are two main ways to use COLLIE:
1. Use the [dataset we constructed](#dataset) to compare the performance of your prompting/modeling methods to the ones reported in the paper
2. Write your own constraints (make them harder, more compositional, etc.) to explore the limits of models and probe failure cases by following the steps in [COLLIE framework](#the-collie-framework)


## Dataset

The dataset used in the paper is at `data/all_data.dill` and can be loaded with:
```python
import dill  # third-party serializer used for the dataset files

with open("data/all_data.dill", "rb") as f:
    all_data = dill.load(f)
```

`all_data` is a dictionary whose keys combine the data source and constraint type and whose values are lists of entries. For example, `all_data['wiki_c07'][0]` is

```python
{
    'example': 'Black market prices for weapons and ammunition in the Palestinian Authority-controlled areas have been rising, necessitating outside funding for the operation.', 
    'targets': ['have', 'rising', 'the'], 
    'constraint': ..., 
    'prompt': "Please generate a sentence containing the word 'have', 'rising', 'the'.", 
    ...
}
```
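
As a quick sanity check after loading, the sketch below counts the entries under each key and collects the field names they carry. `summarize_dataset` is a hypothetical helper, not part of COLLIE, and the small `sample` dictionary is stand-in data that only mimics the shape shown above; with the real file you would pass `all_data` instead.

```python
def summarize_dataset(all_data):
    """Return {key: (number of entries, sorted field names)} for a COLLIE-style dict."""
    summary = {}
    for key, entries in all_data.items():
        # Collect every field name that appears in any entry under this key.
        fields = sorted({name for entry in entries for name in entry})
        summary[key] = (len(entries), fields)
    return summary


# Stand-in data with the same shape as the entry shown above (made up for this sketch).
sample = {
    "wiki_c07": [
        {
            "example": "Prices have been rising in the area.",
            "targets": ["have", "rising", "the"],
            "prompt": "Please generate a sentence containing the word 'have', 'rising', 'the'.",
        }
    ]
}

for key, (count, fields) in summarize_dataset(sample).items():
    print(f"{key}: {count} entries, fields={fields}")
```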

Reproducing the results reported in the paper:
- Our model results can be found in the `logs/` folder
- To plot the figures/tables in the paper, check out `scripts/analysis.ipynb`
- To run the models to reproduce the results, run `python scripts/run_api_models.py` and `python scripts/run_gpu_models.py`


## The COLLIE Framework 
The framework follows a 4-step process:
1. [Constraint Specification](#step-1-constraint-specification-complete-guide)
2. [Extraction](#step-2-extraction-complete-guide)
3. [Rendering](#step-3-rendering)
4. [Evaluation](#step-4-evaluation)


### Step 1: Constraint Specification ([Complete Guide](docs/constraint_spec.md))

To specify a constraint, you need the following concepts defined as classes in `collie/constraints.py`:
1. `Level`: the base class of `InputLevel` (the basic unit of the input) and `TargetLevel` (the level for comparing to the target value); levels include `'character'`, `'word'`, `'sentence'`, etc.
2. `Transformation`: defines how the input text is transformed into values that can be compared against the provided target value; its subclasses include `Count`, `Position`, `ForEach`, etc.
3. `Logic`: `And`, `Or`, and `All`, which can be used to combine constraints
4. `Relation`: a relation such as `'=='` or `'in'` for comparing against the target value
5. `Reduction`: when the target has multiple values, you need to specify how the transformed values from the input are reduced, such as `'all'`, `'any'`, or `'at least'`
6. `Constraint`: the main class for combining all the above specifications

To specify a constraint, you need to provide at least the `TargetLevel`, `Transformation`, and `Relation`.
These are passed to the `c = Constraint(...)` initialization. Once the constraint is specified, you can use `c.check(input_text, target_value)` to verify any given pair of text and target value.

Below is an example of specifying a "number of words" constraint.
```python
>>> from collie.constraints import Constraint, TargetLevel, Count, Relation

>>> # A very simple "number of words" constraint.
>>> c = Constraint(
...     target_level=TargetLevel('word'),
...     transformation=Count(),
...     relation=Relation('=='),
... )
>>> print(c)
Constraint(
    InputLevel(None),
    TargetLevel(word),
    Transformation('Count()'),
    Relation(==),
    Reduction(None)
)
```
Check out the [guide](docs/constraint_spec.md) to explore more examples.
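
To make the roles of level, transformation, and relation concrete, here is a library-free toy version of the same word-count constraint. It is only a conceptual sketch: `split_words`, `toy_check`, and the `RELATIONS` table are made-up names, whitespace splitting is a simplifying assumption, and none of this reflects how `collie/constraints.py` is actually implemented.

```python
import operator

# Relation: how the transformed value is compared with the target
# (an illustrative set of operators chosen for this sketch).
RELATIONS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def split_words(text):
    """Level: break the text into word units (naive whitespace split)."""
    return text.split()


def toy_check(text, target, relation="=="):
    """Transformation (count the units) followed by the relation test."""
    value = len(split_words(text))
    return RELATIONS[relation](value, target)


print(toy_check("This is a good sentence.", 5))                 # True
print(toy_check("This is a good sentence.", 4))                 # False
print(toy_check("This is a good sentence.", 3, relation=">="))  # True
```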


### Step 2: Extraction ([Complete Guide](./docs/extraction.md))
Once the constraints are defined, you can extract examples that satisfy them from the data sources (e.g., Gutenberg, Wikipedia).

To download the necessary data files, including the `Gutenberg, dammit` corpus, to the `data` folder, run from the root project directory:
```bash
bash download.sh
```

Run extraction:
```bash
python -m collie.examples.extract
```
This will sweep over all constraints and data sources defined in `collie/examples/`. To add more examples, add them to the appropriate Python files.
Extracted examples can be found in the folder `sample_data`. The files are named `{source}_{level}.dill`. The `data/all_data.dill` file is simply a concatenation of all these source-level dill files.
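
One plausible reading of the concatenation described above is a dictionary keyed by file stem, which would match keys such as `wiki_c07` in the dataset section. The sketch below assumes exactly that layout; `merge_extracted` and its `load_file` parameter are hypothetical names that are not part of COLLIE, and the project's actual merging logic may differ.

```python
from pathlib import Path


def merge_extracted(folder, load_file):
    """Map each '<source>_<level>.dill' file stem to its loaded contents."""
    merged = {}
    for path in sorted(Path(folder).glob("*.dill")):
        merged[path.stem] = load_file(path)
    return merged


# Example loader (requires the third-party `dill` package):
# def load_dill(path):
#     import dill
#     with open(path, "rb") as f:
#         return dill.load(f)
#
# all_data = merge_extracted("sample_data", load_dill)
```

Passing the loader in as an argument keeps the merging step independent of the serialization library, which also makes it easy to try on a handful of files first.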

### Step 3: Rendering

To render a constraint, simply run:
```python
>>> from collie.constraint_renderer import ConstraintRenderer
>>> renderer = ConstraintRenderer(
...     constraint=c,  # Defined in step one
...     constraint_value=5
... )
>>> print(renderer.prompt)
Please generate a sentence with exactly 5 words.
```
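
Conceptually, rendering turns a constraint's structure plus a target value into a natural-language instruction. The template-based sketch below only illustrates that idea; `toy_render`, its phrase table, and the naive pluralization are assumptions made for this example and are unrelated to the internals of `ConstraintRenderer`.

```python
# Hypothetical phrase table: relation symbol -> wording used in the prompt.
RELATION_PHRASES = {"==": "exactly", ">=": "at least", "<=": "at most"}


def toy_render(level, relation, value):
    """Build a prompt for a count constraint on `level` units (e.g. 'word')."""
    unit = level if value == 1 else level + "s"  # naive English plural
    return f"Please generate a sentence with {RELATION_PHRASES[relation]} {value} {unit}."


print(toy_render("word", "==", 5))
# Please generate a sentence with exactly 5 words.
```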

### Step 4: Evaluation

To check constraint satisfaction, simply run:
```python
>>> text = 'This is a good sentence.'
>>> print(c.check(text, 5))
True
>>> print(c.check(text, 4))
False
```
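
When scoring a batch of model outputs, a natural summary is the fraction of generations that satisfy their constraint. The sketch below shows that aggregation; `satisfaction_rate` is a hypothetical helper, and `word_count_check` is a whitespace-based stand-in so the block runs without COLLIE installed. With the library available, you would pass `c.check` instead.

```python
def satisfaction_rate(check, generations, targets):
    """Fraction of (text, target) pairs for which check(text, target) is true."""
    if len(generations) != len(targets):
        raise ValueError("generations and targets must have the same length")
    if not generations:
        return 0.0
    passed = sum(bool(check(text, target)) for text, target in zip(generations, targets))
    return passed / len(generations)


def word_count_check(text, target):
    """Stand-in for c.check: exact word count using a whitespace split."""
    return len(text.split()) == target


generations = ["This is a good sentence.", "Too short.", "One two three four five"]
targets = [5, 5, 5]
print(f"{satisfaction_rate(word_count_check, generations, targets):.2f}")
```
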
## Citation
Please cite our paper if you use COLLIE in your work:

```bibtex
@inproceedings{yao2023collie,
    title = {COLLIE: Systematic Construction of Constrained Text Generation Tasks},
    author = {Yao, Shunyu and Chen, Howard and Wang, Austin and Yang, Runzhe and Narasimhan, Karthik},
    booktitle = {ArXiv},
    year = {2023},
    html = {}
}
```

## License
MIT. Note that this is the license for our code; each data source retains its own license.

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "collie-bench",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "large language model,llm,constrained generation,benchmark",
    "author": "",
    "author_email": "Howard Chen <howardchen@cs.princeton.edu>",
    "download_url": "https://files.pythonhosted.org/packages/98/2a/babc8d2684e3522f21f5b8f075fae72cb0c6aa85d63baf1a68700a68db83/collie-bench-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Official Implementation of \"COLLIE: Systematic Construction of Constrained Text Generation Tasks\"",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/princeton-nlp/Collie"
    },
    "split_keywords": [
        "large language model",
        "llm",
        "constrained generation",
        "benchmark"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8841a6db844794033c5a8094508af72e4403320133e3835ffe43bc334cd5cd2d",
                "md5": "11b833266fac6dffe9bdfc8672068516",
                "sha256": "3cc5d59c70d91130f1d6b7cf67b01854fb648e69827f5fd98ef96600976f06d0"
            },
            "downloads": -1,
            "filename": "collie_bench-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "11b833266fac6dffe9bdfc8672068516",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 33342,
            "upload_time": "2023-07-18T00:52:44",
            "upload_time_iso_8601": "2023-07-18T00:52:44.222921Z",
            "url": "https://files.pythonhosted.org/packages/88/41/a6db844794033c5a8094508af72e4403320133e3835ffe43bc334cd5cd2d/collie_bench-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "982ababc8d2684e3522f21f5b8f075fae72cb0c6aa85d63baf1a68700a68db83",
                "md5": "9803995dec83fe81bc3a9cbe6ada6a97",
                "sha256": "f2c183b2dee5ae1791f47a19cadcd8be4702f2ebb0c0df6867d5275ffcf3a3d6"
            },
            "downloads": -1,
            "filename": "collie-bench-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "9803995dec83fe81bc3a9cbe6ada6a97",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 29512,
            "upload_time": "2023-07-18T00:52:46",
            "upload_time_iso_8601": "2023-07-18T00:52:46.192609Z",
            "url": "https://files.pythonhosted.org/packages/98/2a/babc8d2684e3522f21f5b8f075fae72cb0c6aa85d63baf1a68700a68db83/collie-bench-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-18 00:52:46",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "princeton-nlp",
    "github_project": "Collie",
    "github_not_found": true,
    "lcname": "collie-bench"
}
        