# terge

- **Version**: 0.1.1
- **Summary**: An easy-to-use Python library for merging PyTorch models.
- **Upload time**: 2024-06-13 05:05:57
- **Requires Python**: >=3.9
- **License**: MIT
- **Keywords**: ai, artificial intelligence, machine learning, merge, merger, merging, ml, model, models, neural net, neural nets, neural network, neural networks, nn, nns, pytorch, torch
            <a href="https://github.com/umarbutler/terge" alt="terge logo"><img src="https://raw.githubusercontent.com/umarbutler/terge/main/assets/banner.svg"></a>

-----------------------------------------------------------------------------
<p align="center"><a href="https://pypi.org/project/terge/" alt="pypi version"><img src="https://img.shields.io/pypi/v/terge"></a> <a href="https://github.com/umarbutler/terge/actions/workflows/ci.yml" alt="build status"><img src="https://img.shields.io/github/actions/workflow/status/umarbutler/terge/ci.yml?branch=main"></a> <a href="https://app.codecov.io/gh/umarbutler/terge" alt="code coverage"><img src="https://img.shields.io/codecov/c/github/umarbutler/terge"></a> <!-- <a href="https://pypistats.org/packages/terge" alt="Downloads"><img src="https://img.shields.io/pypi/dm/terge"></a> --> </p>

terge is an *easy-to-use* Python library for merging PyTorch models. It works with models of any size and architecture, including Hugging Face 🤗 Transformers.

## Features 🎯
- **👌 Easy-to-use**: a single line of code is all you need to get started.
- **⚡ Lightning-fast**: billions of parameters can be merged in mere seconds.
- **📐 Architecture-agnostic**: models of any size and architecture can be merged, provided they share at least one parameter with the same name and shape.
- **🛠️ Hyper-customizable**: parameters can be filtered in or out with regex, and custom weights can be assigned to models or even to their individual parameters.
- **🌳 Lineage tracking**: maps of merged parameter names to models' weightings can be produced to document precisely how models were merged.
- **🤗 Hugging Face-friendly**: Hugging Face 🤗 Transformers are supported out of the box.

## Installation 🧑‍🔧
`terge` can be installed with `pip`:
```bash
pip install terge
```

## Usage 👩‍💻
The following code snippet demonstrates how you can get started with `terge`:
```python
import re
import torch
import terge

from transformers import AutoModel # NOTE `transformers` isn't required; it's used here only for demo purposes.

# A single line is all it takes to merge any number of models.
model = terge.merge([torch.nn.Linear(10, 1) for _ in range(3)])

# This also works for models of different architectures...
model = terge.merge([torch.nn.LSTM(10, 1, num_layers = 1), torch.nn.LSTM(10, 1, num_layers = 2)])

# And models of different sizes...
model = terge.merge([torch.nn.LSTM(10, 1, num_layers = 1), torch.nn.LSTM(100, 1, num_layers = 2)])

# And even Hugging Face 🤗 Transformers...
model = terge.merge([AutoModel.from_pretrained('umarbutler/emubert'),
                     AutoModel.from_pretrained('roberta-base')],
                     progress = True)

# Just make sure there's at least one shared named parameter in there.
model = terge.merge([torch.nn.Linear(10, 1), torch.nn.Linear(1, 10)]) # -> terge.NoParametersToMergeWarning
```

If you want even greater control over the merging process, `terge` has got you covered:
```python
# Changing how parameters are merged and what model serves as the base is trivial.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    base = torch.nn.Linear(10, 1), # The base model doesn't even need to be getting merged! You can also
    # use the index of a model in the input models. The default is 0.
    weights = [1, 2, 3], # Weights are relative and correspond to the order of the input models such that,
    # here, the second model is weighted double the weight of the first model and the third model is weighted
    # triple the weight of the first model. The default is [1, 1, ...].
)

# Assigning custom weights to individual parameters is also easy.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    weights = {re.compile(r'weight'): [1, 2, 3], 'bias': [3, 2, 1]}, # Anything that doesn't match this map
    # will get a weight of 1. You can change that by adding `re.compile(r'.*'): [...]` to the *end* of your
    # weights map.
)

# If you want to filter specific parameters in or out, that can be done too.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    included = re.compile(r'weight'), # Only parameters with 'weight' in their name will be merged.
    # You could also pass a string for an exact match.
    excluded = ['bias', re.compile(r'bias')], # Lists of strings and regex patterns work as well.
    # NOTE Exclusions execute after inclusions, so this isn't actually necessary.
)

# You can also enable lineage tracking to understand exactly how models got merged.
model, lineage = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    lineage = True,
) # -> {'weight': ('arithmetic', [(0, 0.3333333333333333), (1, 0.3333333333333333), (2, 0.3333333333333333)]),
  #     'bias': ('arithmetic', [(0, 0.3333333333333333), (1, 0.3333333333333333), (2, 0.3333333333333333)])}

# Finally, for an extra speed boost, you can merge in-place (just keep in mind, this will modify your base model).
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    inplace = True,
)
```

## API 🧩
### `merge()`
```python
def merge(
    models: list[torch.nn.Module],
    base: torch.nn.Module | int = 0,
    method: Literal['arithmetic'] | dict[str | re.Pattern, Literal['arithmetic']] = 'arithmetic',
    weights: list[float] | dict[str | re.Pattern, list[float]] = None,
    included: re.Pattern | str | list[str | re.Pattern] = None,
    excluded: re.Pattern | str | list[str | re.Pattern] = None,
    inplace: bool = False,
    dtype: torch.dtype = torch.float64,
    lineage: bool = False,
    progress: bool = False,
) -> torch.nn.Module | tuple[torch.nn.Module, dict[str, tuple[str, list[tuple[int, float]]]]]
```

`merge()` merges PyTorch models.

`models` represents the models to be merged.

`base` represents the model whose parameters will be used as defaults and that, if `inplace` is set to `True`, will be merged into; or the index of such a model in `models`. It defaults to `0`, that is, the index of the first model in `models`.

`method` represents the method to be used for merging the models' parameters, or a map of parameter names or regex patterns matching parameter names to the methods to be used to merge them. Currently, only the `'arithmetic'` method is supported (that is, the merging of parameters by taking their ordinary or weighted arithmetic mean). `method` defaults to `'arithmetic'`.
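Since only the weighted arithmetic mean is currently supported, the per-parameter operation can be illustrated in pure Python (a sketch of the underlying math only, not terge's actual tensor-wise implementation):

```python
def weighted_arithmetic_mean(values: list[float], weights: list[float]) -> float:
    """Weighted arithmetic mean: the operation the 'arithmetic' method applies
    to each shared parameter (terge applies it element-wise across tensors)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# With equal weights, this is the ordinary mean:
weighted_arithmetic_mean([1.0, 2.0, 3.0], [1, 1, 1])  # -> 2.0

# With relative weights [1, 2, 3]: (1*1 + 2*2 + 3*3) / 6 = 14/6
weighted_arithmetic_mean([1.0, 2.0, 3.0], [1, 2, 3])  # -> 2.333...
```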

`weights` represents a list of all of the relative weights to be assigned to the models' parameters, or a map of parameter names or regex patterns matching parameter names to lists of weights. If set to `None`, all models will be weighted equally. If a dictionary is provided and there are any parameters to be merged that do not match any of the keys of that dictionary, they will also be weighted equally. `weights` defaults to `None`.

`included` represents a regex pattern, string or list of regex patterns and strings matching parameter names to be merged. If set to `None`, all parameters will be merged. `included` defaults to `None`.

`excluded` represents a regex pattern, string or list of regex patterns and strings matching parameter names to be excluded from merging. If set to `None`, no parameters will be excluded. If `included` is provided, this argument will apply to the subset of parameters that match `included`. `excluded` defaults to `None`.

`inplace` represents whether, for the sake of expediency or memory conservation, the `base` should be merged into in place instead of being deep copied. It defaults to `False`.

`dtype` represents the data type to be used for storing the weightings. It defaults to `torch.float64`.

`lineage` represents whether to output a tuple containing the merged model along with a dictionary mapping the names of merged parameters to a tuple containing the names of merge methods and a list of tuples containing the indices of merged models that contributed to those parameters and the weights they were assigned. It defaults to `False`.

`progress` represents whether to display a progress bar. It defaults to `False`.

`merge()` will return either a merged model, or, if `lineage` is `True`, a tuple containing the merged model along with a dictionary mapping the names of merged parameters to a tuple containing the names of merge methods and a list of tuples containing the indices of merged models that contributed to those parameters and the weights they were assigned, which looks like this:
```python
{
    'parameter_name': ('method', [(model_index, weight), ...]),
    ...
}
```

## Changelog 🔄
terge adheres to [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and [Semantic Versioning](https://semver.org/spec/v2.0.0.html). All notable changes to terge are documented in its [Changelog 🔄](https://github.com/umarbutler/terge/blob/main/CHANGELOG.md).

## License 📜
terge is licensed under the [MIT License](https://github.com/umarbutler/terge/blob/main/LICENSE).
            
