<a href="https://github.com/umarbutler/terge" alt="terge logo"><img src="https://raw.githubusercontent.com/umarbutler/terge/main/assets/banner.svg"></a>
-----------------------------------------------------------------------------
<p align="center"><a href="https://pypi.org/project/terge/" alt="pypi version"><img src="https://img.shields.io/pypi/v/terge"></a> <a href="https://github.com/umarbutler/terge/actions/workflows/ci.yml" alt="build status"><img src="https://img.shields.io/github/actions/workflow/status/umarbutler/terge/ci.yml?branch=main"></a> <a href="https://app.codecov.io/gh/umarbutler/terge" alt="code coverage"><img src="https://img.shields.io/codecov/c/github/umarbutler/terge"></a> <!-- <a href="https://pypistats.org/packages/terge" alt="Downloads"><img src="https://img.shields.io/pypi/dm/terge"></a> --> </p>
terge is an *easy-to-use* Python library for merging PyTorch models. It works with models of any size and architecture, including Hugging Face 🤗 Transformers.
## Features 🎯
- **👌 Easy-to-use**: a single line of code is all you need to get started.
- **⚡ Lightning-fast**: billions of parameters can be merged in mere seconds.
- **📐 Architecture-agnostic**: models of any size and architecture can be merged, provided they share at least one parameter with the same name and shape.
- **🛠️ Hyper-customizable**: parameters can be filtered in or out with regex, and custom weights can be assigned to models or even to their individual parameters.
- **🌳 Lineage tracking**: maps of merged parameter names to models' weightings can be produced to document precisely how models were merged.
- **🤗 Hugging Face-friendly**: Hugging Face 🤗 Transformers are supported out of the box.
## Installation 🧑‍🔧
`terge` can be installed with `pip`:
```bash
pip install terge
```
## Usage 👩‍💻
The following code snippet demonstrates how you can get started with `terge`:
```python
import re
import torch
import terge
from transformers import AutoModel # NOTE `transformers` isn't required; it's imported here purely for demo purposes.
# A single line is all it takes to merge any number of models.
model = terge.merge([torch.nn.Linear(10, 1) for _ in range(3)])
# This also works for models of different architectures...
model = terge.merge([torch.nn.LSTM(10, 1, num_layers = 1), torch.nn.LSTM(10, 1, num_layers = 2)])
# And models of different sizes...
model = terge.merge([torch.nn.LSTM(10, 1, num_layers = 1), torch.nn.LSTM(100, 1, num_layers = 2)])
# And even Hugging Face 🤗 Transformers...
model = terge.merge([AutoModel.from_pretrained('umarbutler/emubert'),
                     AutoModel.from_pretrained('roberta-base')],
                    progress = True)
# Just make sure there's at least one shared named parameter in there.
model = terge.merge([torch.nn.Linear(10, 1), torch.nn.Linear(1, 10)]) # -> terge.NoParametersToMergeWarning
```
If you want even greater control over the merging process, `terge` has got you covered:
```python
# Changing how parameters are merged and what model serves as the base is trivial.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    base = torch.nn.Linear(10, 1), # The base model doesn't even need to be one of the models being merged!
    # You can also pass the index of a model in the input models. The default is 0.
    weights = [1, 2, 3], # Weights are relative and correspond to the order of the input models: here,
    # the second model gets double the weight of the first model and the third model gets triple the
    # weight of the first model. The default is [1, 1, ...].
)
# Assigning custom weights to individual parameters is also easy.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    weights = {re.compile(r'weight'): [1, 2, 3], 'bias': [3, 2, 1]}, # Anything that doesn't match this map
    # will get a weight of 1. You can change that by adding `re.compile(r'.*'): [...]` to the *end* of your
    # weights map.
)
# If you want to filter specific parameters in or out, that can be done too.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    included = re.compile(r'weight'), # Only parameters with 'weight' in their name will be merged.
    # You could also pass a string for an exact match.
    excluded = ['bias', re.compile(r'bias')], # Lists of strings and regex patterns work as well.
    # NOTE Exclusions execute after inclusions, so this isn't actually necessary.
)
# You can also enable lineage tracking to understand exactly how models got merged.
model, lineage = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    lineage = True,
) # -> {'weight': ('arithmetic', [(0, 0.3333333333333333), (1, 0.3333333333333333), (2, 0.3333333333333333)]),
  #     'bias': ('arithmetic', [(0, 0.3333333333333333), (1, 0.3333333333333333), (2, 0.3333333333333333)])}
# Finally, for an extra speed boost, you can merge in-place (just keep in mind, this will modify your base model).
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    inplace = True,
)
```
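As a small follow-on (this uses the `transformers` API rather than anything terge-specific), a merged Hugging Face model remains an ordinary `transformers` model, so it can be persisted with `save_pretrained()` like any other; a minimal sketch with a hypothetical output directory:
```python
from transformers import AutoModel

import terge

# Merge two RoBERTa-style checkpoints, then persist the result using the standard
# Hugging Face API (save_pretrained comes from transformers, not from terge).
merged = terge.merge([AutoModel.from_pretrained('umarbutler/emubert'),
                      AutoModel.from_pretrained('roberta-base')])
merged.save_pretrained('merged-roberta') # Hypothetical output directory.
```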
## API 🧩
### `merge()`
```python
def merge(
    models: list[torch.nn.Module],
    base: torch.nn.Module | int = 0,
    method: Literal['arithmetic'] | dict[str | re.Pattern, Literal['arithmetic']] = 'arithmetic',
    weights: list[float] | dict[str | re.Pattern, list[float]] = None,
    included: re.Pattern | str | list[str | re.Pattern] = None,
    excluded: re.Pattern | str | list[str | re.Pattern] = None,
    inplace: bool = False,
    dtype: torch.dtype = torch.float64,
    lineage: bool = False,
    progress: bool = False,
) -> torch.nn.Module | tuple[torch.nn.Module, dict[str, tuple[str, list[tuple[int, float]]]]]
```
`merge()` merges PyTorch models.
`models` represents the models to be merged.
`base` represents the model whose parameters will be used as defaults and that, if `inplace` is set to `True`, will be merged into; or the index of such a model in `models`. It defaults to `0`, that is, the index of the first model in `models`.
`method` represents the method to be used for merging the models' parameters, or a map of parameter names or regex patterns matching parameter names to the methods to be used to merge them. Currently, only the `'arithmetic'` method is supported (that is, the merging of parameters by taking their ordinary or weighted arithmetic mean). `method` defaults to `'arithmetic'`.
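Because `method` also accepts a map, the merge method can be pinned per parameter. A minimal sketch (since `'arithmetic'` is the only valid value at present, this is equivalent to the default):
```python
import re
import torch
import terge

# Map parameter names (or regex patterns matching them) to merge methods.
# With only 'arithmetic' supported today, this simply makes the default explicit.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    method = {re.compile(r'weight'): 'arithmetic', 'bias': 'arithmetic'},
)
```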
`weights` represents a list of relative weights, one per model, to be applied when merging the models' parameters, or a map of parameter names or regex patterns matching parameter names to such lists of weights. If set to `None`, all models will be weighted equally. If a dictionary is provided, any parameters to be merged that do not match any of its keys will also be weighted equally. `weights` defaults to `None`.
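To see how relative weights are normalised, `weights` can be combined with `lineage` (described below); a sketch, with the resulting weightings inferred from the documented behaviour rather than verified:
```python
import torch
import terge

# Relative weights of [1, 2, 3] should normalise to roughly 1/6, 1/3 and 1/2.
_, lineage = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    weights = [1, 2, 3],
    lineage = True,
)
# Expected shape of the output (the exact values are an inference, not verified):
# {'weight': ('arithmetic', [(0, 0.166...), (1, 0.333...), (2, 0.5)]),
#  'bias': ('arithmetic', [(0, 0.166...), (1, 0.333...), (2, 0.5)])}
```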
`included` represents a regex pattern, string or list of regex patterns and strings matching parameter names to be merged. If set to `None`, all parameters will be merged. `included` defaults to `None`.
`excluded` represents a regex pattern, string or list of regex patterns and strings matching parameter names to be excluded from merging. If set to `None`, no parameters will be excluded. If `included` is provided, this argument will apply to the subset of parameters that match `included`. `excluded` defaults to `None`.
`inplace` represents whether, to save time or memory, the `base` model should be merged into in place instead of being deep copied. It defaults to `False`.
`dtype` represents the data type to be used for storing the weightings. It defaults to `torch.float64`.
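If the default `torch.float64` is more precision than you need, a smaller dtype can be passed; a minimal sketch (how much this saves in practice is not quantified here):
```python
import torch
import terge

# Store the merge weightings in float32 rather than the default float64.
model = terge.merge(
    [torch.nn.Linear(10, 1) for _ in range(3)],
    dtype = torch.float32,
)
```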
`lineage` represents whether to output a tuple containing the merged model along with a dictionary mapping the names of merged parameters to a tuple containing the names of merge methods and a list of tuples containing the indices of merged models that contributed to those parameters and the weights they were assigned. It defaults to `False`.
`progress` represents whether to display a progress bar. It defaults to `False`.
`merge()` will return either a merged model, or, if `lineage` is `True`, a tuple containing the merged model along with a dictionary mapping the names of merged parameters to a tuple containing the names of merge methods and a list of tuples containing the indices of merged models that contributed to those parameters and the weights they were assigned, which looks like this:
```python
{
    'parameter_name': ('method', [(model_index, weight), ...]),
    ...
}
```
## Changelog 🔄
terge adheres to [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and [Semantic Versioning](https://semver.org/spec/v2.0.0.html). All notable changes to terge are documented in its [Changelog 🔄](https://github.com/umarbutler/terge/blob/main/CHANGELOG.md).
## License 📜
terge is licensed under the [MIT License](https://github.com/umarbutler/terge/blob/main/LICENSE).