| Field | Value |
| --- | --- |
| Name | cycleformers |
| Version | 0.1.0 |
| Home page | None |
| Summary | A comprehensive implementation of the cycle-consistency training paradigm, extending the Huggingface Transformers trainer API to accommodate arbitrary combinations of generative models. |
| Upload time | 2024-12-08 07:42:28 |
| Maintainer | None |
| Docs URL | None |
| Author | William Thorne |
| Requires Python | <3.13,>=3.11 |
| License | Attribution 4.0 International |
| Keywords | None |
| VCS | None |
| Bugtrack URL | None |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |
# Cycleformers
<div align="center">
[](https://www.python.org/downloads/)
[](https://creativecommons.org/licenses/by/4.0/)
<!--  -->
<!-- [](https://github.com/wrmthorne/cycleformers/actions) -->
</div>
A Python library for efficient cycle-consistency training of transformer models. Cycleformers simplifies iterative back-translation with support for both causal and seq2seq architectures. We also implement Multi-Adapter Cycle-Consistency Training (MACCT), which trains LoRA adapters on a frozen base model, allowing `7.5x` larger model capacity at the same memory footprint.
## Features
- 🤗 Seamless integration with Hugging Face Transformers
- 🚀 PEFT/LoRA support for memory-efficient training
- 🤖 Compatible with both causal and seq2seq models
- 🔥 Optimized for various hardware configurations
## Quick Tour
### Installation
```bash
pip install cycleformers
```
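Cycleformers targets Python 3.11 and 3.12 (`requires-python >=3.11,<3.13`, per the package metadata above).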
### Training
The `CycleTrainer` class is an extension, and significant redesign, of the 🤗 Transformers trainer, designed to abstract away the specifics of cycle-consistency training while remaining configurable. Both seq2seq and causal architectures are supported, and each can train via PEFT adapter swapping for memory-efficient configurations. Check the [docs] for [usage] details and [examples].
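Conceptually, one cycle step generates synthetic data with one model and trains the counterpart model to reconstruct the original input from it. The sketch below is illustrative only and is not `CycleTrainer`'s internal code; `cycle_step` and its arguments are hypothetical names, and it assumes two seq2seq models with their tokenizers (in practice, pad tokens in the labels would also be masked to `-100`):

```python
import torch


def cycle_step(model_a, tok_a, model_b, tok_b, texts_a, texts_b):
    """Illustrative single cycle-consistency step (hypothetical helper, not library code).

    model_a maps domain A -> B and model_b maps domain B -> A.
    """
    # A -> B: model_a produces synthetic B-domain text from real A-domain text (no gradients).
    with torch.no_grad():
        enc_a = tok_a(texts_a, return_tensors="pt", padding=True)
        synth_b_ids = model_a.generate(**enc_a, max_new_tokens=64)
    synth_b_texts = tok_a.batch_decode(synth_b_ids, skip_special_tokens=True)

    # Train model_b to reconstruct the original A-domain text from the synthetic B-domain text.
    inputs_b = tok_b(synth_b_texts, return_tensors="pt", padding=True)
    labels_b = tok_b(texts_a, return_tensors="pt", padding=True).input_ids
    loss_b = model_b(**inputs_b, labels=labels_b).loss
    loss_b.backward()

    # B -> A: the mirror direction, training model_a on synthetic A-domain text.
    with torch.no_grad():
        enc_b = tok_b(texts_b, return_tensors="pt", padding=True)
        synth_a_ids = model_b.generate(**enc_b, max_new_tokens=64)
    synth_a_texts = tok_b.batch_decode(synth_a_ids, skip_special_tokens=True)

    inputs_a = tok_a(synth_a_texts, return_tensors="pt", padding=True)
    labels_a = tok_a(texts_b, return_tensors="pt", padding=True).input_ids
    loss_a = model_a(**inputs_a, labels=labels_a).loss
    loss_a.backward()

    return loss_a.item(), loss_b.item()
```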
To train two identical models on two datasets, the following sample code can be used:
```python
from cycleformers import CycleTrainer, CycleTrainingArguments
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

args = CycleTrainingArguments(output_dir="gpt2-cct")
trainer = CycleTrainer(
    args,
    models=model,
    tokenizers=tokenizer,
    train_dataset_A=dataset_A,
    train_dataset_B=dataset_B,
)
trainer.train()
```
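The example assumes `dataset_A` and `dataset_B` already exist: two unpaired datasets, one per text domain (e.g. two languages for back-translation). A minimal, hypothetical way to construct such datasets with 🤗 Datasets might look like this (the `"text"` column name is an assumption, not the documented interface; see the docs for the exact format `CycleTrainer` expects):

```python
from datasets import Dataset

# Toy, unpaired text datasets, one per domain.
dataset_A = Dataset.from_dict({"text": [
    "The quick brown fox jumps over the lazy dog.",
    "An example sentence from domain A.",
]})
dataset_B = Dataset.from_dict({"text": [
    "Der schnelle braune Fuchs springt über den faulen Hund.",
    "Ein Beispielsatz aus Domäne B.",
]})
```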
Any two models (🚧 currently both seq2seq or both causal) can be combined for completely customisable training:
```python
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

model_A = AutoModelForCausalLM.from_pretrained("gpt2", device_map="auto")
model_B = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base", device_map="auto")
tokenizer_A = AutoTokenizer.from_pretrained("gpt2")
tokenizer_B = AutoTokenizer.from_pretrained("google/flan-t5-base")

trainer = CycleTrainer(
    args,
    models={
        "A": model_A,
        "B": model_B,
    },
    tokenizers={
        "A": tokenizer_A,
        "B": tokenizer_B,
    },
    train_dataset_A=dataset_A,
    train_dataset_B=dataset_B,
)
```
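As before, `args` is the `CycleTrainingArguments` instance from the previous example, and training is again launched with `trainer.train()`.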
### Multi-Adapter Cycle-Consistency Training (MACCT)
The `CycleTrainer` class is also set up to accept a single base model and train two PEFT adapters on top of it, switching between them to emulate the two-model setup. This allows training of `7.5x` larger models for the same memory footprint:
```python
from peft import LoraConfig

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    inference_mode=False,
    bias="none",
)

args = CycleTrainingArguments(output_dir="gpt2-macct")
trainer = CycleTrainer(
    args,
    model=model,          # single base model, e.g. the gpt2 model from above
    tokenizer=tokenizer,
    peft_configs=peft_config,  # or a {"A": ..., "B": ...} dict of configs
)
```
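The memory saving comes from hosting both LoRA adapters on a single frozen base model and activating them one at a time. The snippet below is a rough illustration of that mechanism using the public `peft` API, not Cycleformers' internal adapter-swapping code:

```python
from peft import get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Attach two independent LoRA adapters, "A" and "B", to one frozen base model,
# reusing the peft_config defined above.
peft_model = get_peft_model(base_model, peft_config, adapter_name="A")
peft_model.add_adapter("B", peft_config)

# Only the active adapter participates in the forward pass; the base weights
# are shared and stored once, which is where the memory saving comes from.
peft_model.set_adapter("A")  # steps in the A direction
# ... later ...
peft_model.set_adapter("B")  # then switch to the B direction
```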
## Citing
If you use Cycleformers in your research, please cite:
```bibtex
add once zenodo/paper citation is available
```