| Field | Value |
| --- | --- |
| Name | ucumvert |
| Version | 0.2.1 |
| Summary | Python parser & interface for UCUM (Unified Code for Units of Measure). |
| upload_time | 2024-09-10 22:35:54 |
| requires_python | >=3.8 |
| license | MIT |
| keywords | ucum, parser, units of measurement |
[CI workflow](https://github.com/dalito/ucumvert/actions/workflows/ci.yml)
[PyPI project](https://pypi.org/project/ucumvert)
# Easier access to UCUM from Python
> **Feedback welcome!**
> Currently only the conversion direction from UCUM to pint is supported.
> Please review the definitions before you trust them.
> While we have many tests in place and reviewed the mappings carefully, bugs may still be present.
[UCUM](https://ucum.org/) (Unified Code for Units of Measure) is a code system intended to cover all units of measure.
It provides a formalism to express units in an unambiguous way suitable for electronic communication.
Note that UCUM does not provide a canonical representation, e.g. `m/s` and `m.s-1` express the same unit in two ways.
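To illustrate the missing canonical form, the sketch below reduces both spellings to the same exponent map. It is a deliberately simplified toy: the `to_exponents` helper is hypothetical, handles only `.`, `/` and integer exponents on plain letter atoms (no prefixes, parentheses or annotations), and is not how ucumvert parses.

```python
import re
from collections import Counter

# One component: a run of letters plus an optional signed integer exponent.
_COMPONENT = re.compile(r"([A-Za-z]+)([+-]?\d+)?")


def to_exponents(code: str) -> dict:
    """Reduce a very simple UCUM-like string to {atom: exponent}."""
    exponents = Counter()
    sign = 1  # +1 after "." (multiply), -1 after "/" (divide)
    # Splitting with a capturing group keeps the operators as tokens.
    for token in re.split(r"([./])", code):
        if token == ".":
            sign = 1
        elif token == "/":
            sign = -1
        else:
            match = _COMPONENT.fullmatch(token)
            if match is None:
                raise ValueError(f"unsupported component: {token!r}")
            atom = match.group(1)
            exponent = int(match.group(2) or 1)
            exponents[atom] += sign * exponent
    # Drop atoms that cancelled out completely.
    return {atom: exp for atom, exp in exponents.items() if exp != 0}


print(to_exponents("m/s") == to_exponents("m.s-1"))
```

Comparing such normalized forms, rather than the raw strings, is one way to decide whether two codes denote the same unit.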
**ucumvert** is a pip-installable Python package. Features:
- Parser for UCUM unit strings that implements the full grammar.
- Converter for creating [pint](https://pypi.org/project/pint/) units from UCUM unit strings.
- A pint unit definition file [pint_ucum_defs.txt](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/pint_ucum_defs.txt) that extends pint's default units with UCUM units. All UCUM units from the new version 2.2 of the specification (June 2024) are included.
**ucumvert** generates the UCUM grammar by filling a template with unit codes, prefixes etc. from the official [ucum-essence.xml](https://github.com/ucum-org/ucum/blob/main/ucum-essence.xml) file (a copy is included in this repo).
Updating the parser for new UCUM releases is therefore straightforward.
The parser is built with the great [lark](https://pypi.org/project/lark/) parser toolkit.
The generated lark grammar file for case-sensitive UCUM codes is included in the repository, see [ucum_grammar.lark](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/ucum_grammar.lark).
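The template-filling idea can be sketched roughly as follows. This is an illustration under assumptions, not the generator shipped with ucumvert: `SAMPLE_XML` is a tiny hand-written stand-in for `ucum-essence.xml` (the real file is much larger and namespaced), and `GRAMMAR_TEMPLATE`, `quoted_alternatives` and `build_grammar` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hand-written stand-in for ucum-essence.xml (assumption: simplified, no namespace).
SAMPLE_XML = """
<root>
  <prefix Code="k"/>
  <prefix Code="m"/>
  <base-unit Code="m"/>
  <base-unit Code="s"/>
  <unit Code="Hz"/>
</root>
"""

# Hypothetical grammar template with placeholders for the generated alternatives.
GRAMMAR_TEMPLATE = Template("PREFIX: $prefixes\nATOM: $atoms\n")


def quoted_alternatives(codes):
    """Render codes as quoted alternatives, e.g. "k" | "m"."""
    return " | ".join(f'"{code}"' for code in codes)


def build_grammar(xml_text: str) -> str:
    """Collect prefix and unit codes from the XML and fill the template."""
    root = ET.fromstring(xml_text)
    prefixes = [el.get("Code") for el in root.findall("prefix")]
    atoms = [
        el.get("Code")
        for tag in ("base-unit", "unit")
        for el in root.findall(tag)
    ]
    return GRAMMAR_TEMPLATE.substitute(
        prefixes=quoted_alternatives(prefixes),
        atoms=quoted_alternatives(atoms),
    )


grammar = build_grammar(SAMPLE_XML)
print(grammar)
```

Because the codes come from the XML file rather than being typed into the grammar by hand, a new UCUM release only requires swapping the input file and regenerating.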
Some of the UCUM unit atoms are invalid unit names in pint, for example `cal_[15]`, `m[H2O]`, `10*`, `[in_i'H2O]`.
For all of them we define mappings to valid pint unit names in [ucum_pint.py](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/ucum_pint.py), e.g. `{"cal_[15]": "cal_15"}`.
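A minimal sketch of such a lookup is shown below. The two entries are the ones that appear in this README (`cal_[15]` above, `m[H2O]` in the demo output); the dictionary name `UCUM_TO_PINT_NAME` and the helper `pint_name` are hypothetical and do not mirror the actual contents of `ucum_pint.py`.

```python
# Mapping from UCUM atoms that pint cannot use as names to pint-safe names.
# Only entries that appear in this README are listed here.
UCUM_TO_PINT_NAME = {
    "cal_[15]": "cal_15",
    "m[H2O]": "m_H2O",
}


def pint_name(ucum_atom: str) -> str:
    """Return the mapped pint name, or the atom itself if no mapping is needed."""
    return UCUM_TO_PINT_NAME.get(ucum_atom, ucum_atom)


print(pint_name("cal_[15]"))
print(pint_name("m"))
```

Falling back to the unchanged atom keeps the table small, since only the atoms with problematic characters need an entry.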
## Install
ucumvert is available as a Python package from [PyPI](https://pypi.org/project/ucumvert) and can be pip-installed in the usual way.
```bash
pip install ucumvert
```
To install the most recent code from git in developer mode, including the creation of a virtual environment, use:
Linux
```bash
git clone https://github.com/dalito/ucumvert.git
cd ucumvert
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -e .[dev]
```
Windows
```cmd
git clone https://github.com/dalito/ucumvert.git
cd ucumvert
py -m venv .venv
.venv\Scripts\activate.bat
py -m pip install --upgrade pip
pip install -e .[dev]
```
Optionally, you can visualize the parse trees with [Graphviz](https://www.graphviz.org/) as shown below. This requires the additional package [pydot](https://pypi.org/project/pydot/), which you can install with `pip install pydot`.
## Demo
We provide a basic command line interface.
```cmd
(.venv) $ ucumvert
```
It has an interactive mode to test parsing UCUM codes:
```cmd
(.venv) $ ucumvert -i
Enter UCUM unit code to parse, or 'q' to quit.
> m/s2.kg
Created visualization of parse tree (parse_tree.png).
main_term
  term
    term
      simple_unit m
      /
      annotatable
        simple_unit s
        2
    .
    simple_unit
      k
      g
--> Pint <Quantity(1.0, 'kilogram * meter / second ** 2')>
> q
```
The intermediate result is a parse tree, which is then traversed to convert its elements to pint quantities (or pint-compatible strings).
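The traversal step can be pictured with a small recursive function. In this sketch, `TREE` is a hand-written tuple stand-in for the parse tree of `m/s2.kg` printed above (its nesting is an assumption), and `to_expression` is a hypothetical helper; ucumvert itself works on lark trees and may be organized differently.

```python
# Each node is (node_name, child, child, ...); plain strings are leaves.
TREE = (
    "main_term",
    (
        "term",
        (
            "term",
            ("simple_unit", "m"),
            "/",
            ("annotatable", ("simple_unit", "s"), "2"),
        ),
        ".",
        ("simple_unit", "k", "g"),
    ),
)

# UCUM operators and their arithmetic counterparts.
OPERATORS = {".": "*", "/": "/"}


def to_expression(node) -> str:
    """Recursively turn the tuple tree into an arithmetic expression string."""
    if isinstance(node, str):
        return node
    name, *children = node
    if name == "main_term":
        return to_expression(children[0])
    if name == "term":
        left, op, right = children
        return f"({to_expression(left)} {OPERATORS[op]} {to_expression(right)})"
    if name == "annotatable":
        unit, exponent = children
        return f"{to_expression(unit)}**{exponent}"
    if name == "simple_unit":
        # Prefix and atom are joined, e.g. "k" + "g" gives "kg".
        return "".join(children)
    raise ValueError(f"unexpected node: {name!r}")


expression = to_expression(TREE)
print(expression)
```

Handling each node type in its own branch keeps the conversion rules local, which is the main appeal of a tree-walking design.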

The package includes a UCUM-aware pint UnitRegistry which loads all definitions for UCUM units on instantiation.
It comes with an additional method `from_ucum` to convert UCUM codes to pint.
```python
>>> from ucumvert import PintUcumRegistry
>>> ureg = PintUcumRegistry()
>>> ureg.from_ucum("m/s2.kg")
<Quantity(1.0, 'kilogram * meter / second ** 2')>
>>> ureg.from_ucum("m[H2O]{35Cel}") # UCUM code with annotation
<Quantity(1, 'm_H2O')>
>>> _.to("mbar")
<Quantity(98.0665, 'millibar')>
>>> ureg("degC") # a standard pint unit
<Quantity(1, 'degree_Celsius')>
>>>
```
## Tests
The unit tests include parsing and converting all common UCUM unit codes from the official repo. Run the test suite with:
```bash
pytest
```
The common UCUM unit codes are available only in binary form (xlsx, docs, pdf).
Here we keep a copy in tsv-format `ucum_examples.tsv`.
To (re)generate this tsv-file from the official xlsx-file in the [UCUM repository](https://github.com/ucum-org/ucum/tree/main/common-units), run:
```bash
pip install openpyxl
python src/ucumvert/vendor/get_ucum_example_as_tsv.py
```
## Useful links
- UCUM [online-validator](https://ucum.nlm.nih.gov/ucum-lhc/demo.html)
- Issue in pint that motivated this work: [To what extent is pint compatible with UCUM?](https://github.com/hgrecco/pint/issues/1769)
## License
The code in this repository is distributed under the MIT license, with the exception of the `ucum-*.*` files in the directory `src/ucumvert/vendor`,
which fall under the [UCUM Copyright Notice and License](https://github.com/ucum-org/ucum/blob/main/LICENSE.md) (Version 1.0).
In accordance with §1.3 of that license, we do not consider **ucumvert** a "Derivative Work" of UCUM, because **ucumvert** only *"interoperates with an unmodified instance of the Work"*.
Raw data
{
"_id": null,
"home_page": null,
"name": "ucumvert",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "David Linke <david.linke@catalysis.de>",
"keywords": "UCUM, parser, units of measurement",
"author": null,
"author_email": "David Linke <david.linke@catalysis.de>",
"download_url": "https://files.pythonhosted.org/packages/6e/af/885220dcbb45664424573a8cd4046e9af1675c8218c04e040b6690886f8e/ucumvert-0.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python parser & interface for UCUM (Unified Code for Units of Measure).",
"version": "0.2.1",
"project_urls": {
"Changelog": "https://github.com/dalito/ucumvert/releases",
"Documentation": "https://github.com/dalito/ucumvert",
"GitHub": "https://github.com/dalito/ucumvert"
},
"split_keywords": [
"ucum",
" parser",
" units of measurement"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9f833bc090b81240377a2b393702d48375fe5c037be290baae5b46afbefa1eac",
"md5": "5372bdee87f54d68713e17a8c0c7c59e",
"sha256": "22bc803903981a34f1e9485f594031603ab64eae6ff2fd519fa15b6f284ba308"
},
"downloads": -1,
"filename": "ucumvert-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5372bdee87f54d68713e17a8c0c7c59e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 53071,
"upload_time": "2024-09-10T22:35:53",
"upload_time_iso_8601": "2024-09-10T22:35:53.257024Z",
"url": "https://files.pythonhosted.org/packages/9f/83/3bc090b81240377a2b393702d48375fe5c037be290baae5b46afbefa1eac/ucumvert-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6eaf885220dcbb45664424573a8cd4046e9af1675c8218c04e040b6690886f8e",
"md5": "1ed257bb05b84f27669a6b3765d02b8c",
"sha256": "9a9f360ca04df870463ea04583282e24a9d8b839d50a3aa62db9a243495b0c21"
},
"downloads": -1,
"filename": "ucumvert-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "1ed257bb05b84f27669a6b3765d02b8c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 52205,
"upload_time": "2024-09-10T22:35:54",
"upload_time_iso_8601": "2024-09-10T22:35:54.723709Z",
"url": "https://files.pythonhosted.org/packages/6e/af/885220dcbb45664424573a8cd4046e9af1675c8218c04e040b6690886f8e/ucumvert-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-10 22:35:54",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dalito",
"github_project": "ucumvert",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "ucumvert"
}