Introduction
------------
A crucial element of any data-analysis system is laying out all of its
hyperparameters so they can be easily examined and modified.
HyperPyYAML adds a few useful extensions to YAML (YAML Ain't Markup Language),
a popular human-readable data-serialization language. These extensions support
a rather expansive idea of what constitutes a hyperparameter, and they reduce
the Python files for data analysis to just the bare algorithm.
### Table of Contents
* [YAML basics](#yaml-basics)
* [HyperPyYAML](#hyperpyyaml)
    * [Objects](#objects)
    * [Aliases](#aliases)
    * [Tuples](#tuples)
* [How to use HyperPyYAML](#how-to-use-hyperpyyaml)
* [Conclusion](#conclusion)
YAML basics
-----------
YAML is a data-serialization language, similar to JSON, and it supports
three basic types of nodes: scalar, sequence, and mapping. PyYAML converts
sequence nodes to Python lists and mapping nodes to Python dicts.
Scalar nodes can take one of the following forms:
```yaml
string: abcd # No quotes needed
integer: 1
float: 1.3
bool: True
none: null
```
Note that we've used a simple mapping to demonstrate the scalar nodes. A mapping
is a set of `key: value` pairs, defined so that the key can be used to easily
retrieve the corresponding value. In addition to the format above, mappings
can also be specified in a similar manner to JSON:
```yaml
{foo: 1, bar: 2.5, baz: "abc"}
```
Sequences, or lists of items, can also be specified in two ways:
```yaml
- foo
- bar
- baz
```
or
```yaml
[foo, bar, baz]
```
Note that when not using the inline version, YAML uses whitespace to denote
nested items:
```yaml
foo:
    a: 1
    b: 2
bar:
    - c
    - d
```
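For orientation, a YAML loader such as PyYAML would turn the nested example above into ordinary Python containers. The sketch below writes out that equivalent structure by hand (it does not call a YAML parser); the variable name `nested` is chosen only for illustration:

```python
# Hand-written Python equivalent of the nested YAML example:
# mappings become dicts and sequences become lists.
nested = {
    "foo": {"a": 1, "b": 2},
    "bar": ["c", "d"],
}

# Values are retrieved by key, then by sub-key or list index.
print(nested["foo"]["b"])  # 2
print(nested["bar"][0])    # c
```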
YAML has a few more advanced features (such as
[aliases](https://pyyaml.org/wiki/PyYAMLDocumentation#aliases) and
[merge keys](https://yaml.org/type/merge.html)) that you may want to explore
on your own. We will briefly discuss one here since it is relevant for our
extensions: [YAML tags](https://pyyaml.org/wiki/PyYAMLDocumentation#tags).
Tags are added with a `!` prefix, and they specify the type of the node. This
allows types beyond the simple types listed above to be used. PyYAML supports a
few additional types, such as:
```yaml
!!set # set
!!timestamp # datetime.datetime
!!python/tuple # tuple
!!python/complex # complex
!!python/name:module.name # A class or function
!!python/module:package.module # A module
!!python/object/new:module.cls # An instance of a class
```
These can all be quite useful. However, we found this system a bit
cumbersome, especially given how frequently we were using these tags. So
we implemented some shortcuts for these features, which we call
"HyperPyYAML".
HyperPyYAML
-----------
We make several extensions to YAML, including easier object creation, nicer
aliases, and tuples.
### Objects
Our first extension is to simplify the structure for specifying an instance,
module, class, or function. As an example:
```yaml
model: !new:collections.Counter
```
This tag, prefixed with `!new:`, constructs an instance of the specified class.
If the node is a mapping node, all the items are passed as keyword arguments
to the class when the instance is created. A list can similarly be used to
pass positional arguments. See the following examples:
```yaml
foo: !new:collections.Counter
    - abracadabra
bar: !new:collections.Counter
    a: 2
    b: 1
    c: 5
```
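To make the argument-passing rules concrete, here is a minimal standard-library sketch of what a `!new:` or `!name:` style tag has to do: import a dotted path, then call the class with positional or keyword arguments depending on the node type. The helpers `resolve_name` and `construct_new` are hypothetical names for this illustration, not HyperPyYAML's actual implementation:

```python
import importlib


def resolve_name(dotted_path):
    """Import `module.attr` and return the attribute (hypothetical helper)."""
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def construct_new(dotted_path, node=None):
    """Instantiate a class from a dotted path, mimicking the `!new:` rules."""
    cls = resolve_name(dotted_path)
    if isinstance(node, dict):   # mapping node -> keyword arguments
        return cls(**node)
    if isinstance(node, list):   # sequence node -> positional arguments
        return cls(*node)
    return cls()                 # empty node -> no arguments


foo = construct_new("collections.Counter", ["abracadabra"])
bar = construct_new("collections.Counter", {"a": 2, "b": 1, "c": 5})
print(foo["a"])  # 5
print(bar["c"])  # 5
```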
We also simplify the interface for specifying a function or class or other
static Python entity:
```yaml
add: !name:operator.add
```
This code stores the `add` function. It can later be used in the usual way:
```python
>>> from hyperpyyaml import load_hyperpyyaml
>>> loaded_yaml = load_hyperpyyaml("add: !name:operator.add")
>>> loaded_yaml["add"](2, 4)
6
```
### Aliases
Another extension is a nicer alias system that supports things like
string interpolation. We've added a `!ref` tag that takes keys in
angle brackets and looks them up inside the YAML file itself.
As an example:
```yaml
folder1: abc/def
folder2: ghi/jkl
folder3: !ref <folder1>/<folder2>
foo: 1024
bar: 512
baz: !ref <foo> // <bar> + 1
```
This allows us to change some values and have the dependent values
update automatically.
You can also refer to other references, and to sub-nodes using brackets:
```yaml
block_index: 1
cnn1:
    out_channels: !ref <block_index> * 64
    kernel_size: (3, 3)
cnn2:
    out_channels: !ref <cnn1[out_channels]>
    kernel_size: (3, 3)
```
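The lookup and interpolation behaviour described above can be sketched with a regular expression over an already-loaded dictionary. This is only an illustration under simplifying assumptions: `lookup`, `resolve_ref`, and `REF_PATTERN` are hypothetical names, arithmetic such as `<foo> // <bar> + 1` is not evaluated, and the real library may resolve references differently:

```python
import re

# Matches <key> or <key[sub_key]> style references.
REF_PATTERN = re.compile(r"<([\w\[\]]+)>")


def lookup(config, path):
    """Follow a path such as 'cnn1[out_channels]' through nested dicts."""
    value = config
    for key in re.findall(r"\w+", path):
        value = value[key]
    return value


def resolve_ref(config, expression):
    """Substitute every <key> in `expression` with its value from `config`."""
    whole = REF_PATTERN.fullmatch(expression)
    if whole:
        # A lone reference keeps the type of the referenced value.
        return lookup(config, whole.group(1))
    # Otherwise interpolate the values into the string.
    return REF_PATTERN.sub(lambda m: str(lookup(config, m.group(1))), expression)


config = {"folder1": "abc/def", "folder2": "ghi/jkl", "cnn1": {"out_channels": 64}}
print(resolve_ref(config, "<folder1>/<folder2>"))   # abc/def/ghi/jkl
print(resolve_ref(config, "<cnn1[out_channels]>"))  # 64
```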
Finally, you can make references to nodes that are objects, not just scalars.
```python
from hyperpyyaml import load_hyperpyyaml

yaml_string = """
foo: !new:collections.Counter
    a: 4
bar: !ref <foo>
baz: !copy <foo>
"""
loaded_yaml = load_hyperpyyaml(yaml_string)
loaded_yaml["foo"].update({"b": 10})
print(loaded_yaml["bar"])
print(loaded_yaml["baz"])
```
This provides the output:
```
Counter({'b': 10, 'a': 4})
Counter({'a': 4})
```
Note that `!ref` does not copy the object: `bar` refers to the same object
as `foo`, so updating `foo` also updates `bar`. If you want an independent
deep copy, use the `!copy` tag.
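
The same distinction can be reproduced in plain Python. The sketch below assumes that `!ref` behaves like binding a second name to one object and that `!copy` behaves like `copy.deepcopy`, which matches the output shown above but is not taken from the library's source:

```python
import copy
from collections import Counter

foo = Counter(a=4)
bar = foo                  # same object, analogous to `!ref <foo>`
baz = copy.deepcopy(foo)   # independent copy, analogous to `!copy <foo>`

foo.update({"b": 10})
print(bar is foo)  # True
print(bar)         # Counter({'b': 10, 'a': 4})
print(baz)         # Counter({'a': 4})
```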
Some issues (#7, #11) report that `!ref` cannot refer to the return value of an `!apply` call.
We therefore provide a separate `!applyref` tag that works with `!ref` and can be used in four ways:
```yaml
# 1. Pass the positional and keyword arguments at the same time. Like `!!python/object/apply:module.function` in pyyaml
c: !applyref:sorted
    _args:
        - [3, 4, 1, 2]
    _kwargs:
        reverse: False
d: !ref <c>-<c>

# 2. Only pass the keyword arguments
e: !applyref:random.randint
    a: 1
    b: 3
f: !ref <e><e>

# 3. Only pass the positional arguments
g: !applyref:random.randint
    - 1
    - 3
h: !ref <g><g>

# 4. No arguments
i: !applyref:random.random
j: !ref <i><i>
```
Note that `!applyref` cannot return an arbitrary Python object; if it does, a `RepresenterError` is raised.
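
The four calling conventions can be summarised as a small dispatch on the node type. The following is a hedged sketch using only the standard library; `apply_node` is a hypothetical helper that mirrors the rules listed above and is not HyperPyYAML's own code:

```python
import random


def apply_node(func, node=None):
    """Call `func` with arguments taken from a YAML-like node (hypothetical helper)."""
    if node is None:                                   # 4. no arguments
        return func()
    if isinstance(node, list):                         # 3. positional only
        return func(*node)
    if isinstance(node, dict) and ("_args" in node or "_kwargs" in node):
        # 1. positional and keyword arguments together
        return func(*node.get("_args", []), **node.get("_kwargs", {}))
    if isinstance(node, dict):                         # 2. keyword only
        return func(**node)
    raise TypeError(f"unsupported node type: {type(node).__name__}")


c = apply_node(sorted, {"_args": [[3, 4, 1, 2]], "_kwargs": {"reverse": False}})
e = apply_node(random.randint, {"a": 1, "b": 3})
g = apply_node(random.randint, [1, 3])
i = apply_node(random.random)
print(c)  # [1, 2, 3, 4]
```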
### Tuples
One last minor extension to the YAML syntax is to implicitly
resolve any string starting with `(` and ending with `)` to a tuple.
For example, `kernel_size: (3, 3)` loads as the tuple `(3, 3)`.
This makes YAML more intuitive for Python users.
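
One way to picture this implicit resolution is with `ast.literal_eval` from the standard library. The helper `resolve_tuple` below is hypothetical and handles only Python literals; wrapping a single parenthesised value into a one-element tuple is a choice made for this sketch, not documented library behaviour:

```python
import ast


def resolve_tuple(text):
    """Return a tuple for strings like '(3, 3)'; leave other strings unchanged."""
    stripped = text.strip()
    if stripped.startswith("(") and stripped.endswith(")"):
        value = ast.literal_eval(stripped)
        # '(3)' evaluates to the int 3, so wrap non-tuples explicitly.
        return value if isinstance(value, tuple) else (value,)
    return text


print(resolve_tuple("(3, 3)"))  # (3, 3)
print(resolve_tuple("abcd"))    # abcd
```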
How to use HyperPyYAML
----------------------
All of the listed extensions are available by loading YAML with the
`load_hyperpyyaml` function. This function returns an object in a similar
manner to PyYAML and other YAML libraries.
`load_hyperpyyaml` also takes an optional argument, `overrides`,
which allows changes to any of the parameters listed in the YAML.
The following example overrides `block_index`, which in turn changes
the `out_channels` of both CNN layers:
```python
>>> from hyperpyyaml import load_hyperpyyaml
>>> yaml_string = """
... block_index: 1
... cnn1:
...     out_channels: !ref <block_index> * 64
...     kernel_size: (3, 3)
... cnn2:
...     out_channels: !ref <cnn1[out_channels]>
...     kernel_size: (3, 3)
... """
>>> overrides = {"block_index": 2}
>>> hyperparameters = load_hyperpyyaml(yaml_string, overrides)
>>> hyperparameters["block_index"]
2
>>> hyperparameters["cnn2"]["out_channels"]
128
```
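Because the overridden `block_index` propagates to `out_channels`, overrides must take effect before references are resolved. A rough way to picture that first step is a recursive dictionary merge, sketched below. The helper `apply_overrides` is hypothetical, and support for nested overrides such as `{"cnn1": {...}}` is an assumption of this sketch rather than something stated above:

```python
def apply_overrides(params, overrides):
    """Recursively merge `overrides` into a copy of `params` (hypothetical helper)."""
    merged = dict(params)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Unresolved parameters: the reference is still a plain string at this stage.
raw = {
    "block_index": 1,
    "cnn1": {"out_channels": "!ref <block_index> * 64", "kernel_size": (3, 3)},
}
updated = apply_overrides(raw, {"block_index": 2, "cnn1": {"kernel_size": (5, 5)}})
print(updated["block_index"])          # 2
print(updated["cnn1"]["kernel_size"])  # (5, 5)
```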
Conclusion
----------
We've defined a number of extensions to the YAML syntax, designed to
make it easier to use for hyperparameter specification. Feedback is welcome!
Raw data
{
"_id": null,
"home_page": "https://github.com/speechbrain/HyperPyYAML",
"name": "HyperPyYAML",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Peter Plantinga, Aku Rouhe",
"author_email": "speechbrain@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/52/e3/3ac46d9a662b037f699a6948b39c8d03bfcff0b592335d5953ba0c55d453/HyperPyYAML-1.2.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Extensions to YAML syntax for better python interaction",
"version": "1.2.2",
"project_urls": {
"Homepage": "https://github.com/speechbrain/HyperPyYAML"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "33c9751b6401887f4b50f9307cc1e53d287b3dc77c375c126aeb6335aff73ccb",
"md5": "660b1ca888f8d319057be6856f82f50c",
"sha256": "3c5864bdc8864b2f0fbd7bc495e7e8fdf2dfd5dd80116f72da27ca96a128bdeb"
},
"downloads": -1,
"filename": "HyperPyYAML-1.2.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "660b1ca888f8d319057be6856f82f50c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 16118,
"upload_time": "2023-09-21T14:45:25",
"upload_time_iso_8601": "2023-09-21T14:45:25.101199Z",
"url": "https://files.pythonhosted.org/packages/33/c9/751b6401887f4b50f9307cc1e53d287b3dc77c375c126aeb6335aff73ccb/HyperPyYAML-1.2.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "52e33ac46d9a662b037f699a6948b39c8d03bfcff0b592335d5953ba0c55d453",
"md5": "e16a5bbc3bcf3dc0d0c90f6490a99e73",
"sha256": "bdb734210d18770a262f500fe5755c7a44a5d3b91521b06e24f7a00a36ee0f87"
},
"downloads": -1,
"filename": "HyperPyYAML-1.2.2.tar.gz",
"has_sig": false,
"md5_digest": "e16a5bbc3bcf3dc0d0c90f6490a99e73",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 17085,
"upload_time": "2023-09-21T14:45:27",
"upload_time_iso_8601": "2023-09-21T14:45:27.779460Z",
"url": "https://files.pythonhosted.org/packages/52/e3/3ac46d9a662b037f699a6948b39c8d03bfcff0b592335d5953ba0c55d453/HyperPyYAML-1.2.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-21 14:45:27",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "speechbrain",
"github_project": "HyperPyYAML",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "hyperpyyaml"
}