ucumvert


- Name: ucumvert
- Version: 0.1.2
- Summary: Python parser & interface for UCUM (Unified Code for Units of Measure).
- Author: David Linke <david.linke@catalysis.de>
- Upload time: 2024-02-12 15:11:11
- Requires Python: >=3.8
- License: MIT
- Keywords: ucum, parser, units of measurement

[![CI - main](https://github.com/dalito/ucumvert/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/dalito/ucumvert/actions/workflows/ci.yml)
[![PyPI - Version](https://img.shields.io/pypi/v/ucumvert)](https://pypi.org/project/ucumvert)

# Easier access to UCUM from Python

> **Feedback welcome!**
> Currently, only the conversion direction from UCUM to pint is supported.
> Please review the definitions before you trust them.
> While we have many tests in place and reviewed the mappings carefully, bugs may still be present.

[UCUM](https://ucum.org/) (Unified Code for Units of Measure) is a code system intended to cover all units of measure.
It provides a formalism to express units in an unambiguous way suitable for electronic communication.
Note that UCUM does not provide a canonical representation, e.g. `m/s` and `m.s-1` express the same unit in two ways.
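
To illustrate what the missing canonical form means in practice, here is a minimal sketch that reduces a tiny subset of UCUM codes to a dictionary of exponents. It is not ucumvert's parser: `unit_exponents` is a hypothetical helper that only understands `.`, `/` and integer exponents on plain letter atoms (no prefixes, parentheses or annotations).

```python
import re
from collections import Counter

# One component: a unit atom made of letters, optionally followed by a signed integer exponent.
_COMPONENT = re.compile(r"([A-Za-z]+)([+-]?\d+)?")


def unit_exponents(code: str) -> dict:
    """Reduce a very simple UCUM code to a {atom: exponent} mapping (illustration only)."""
    exponents = Counter()
    sign = 1
    # Split on the operators but keep them, e.g. "m/s2.kg" -> ["m", "/", "s2", ".", "kg"].
    for token in re.split(r"([./])", code):
        if not token:
            continue  # empty piece, e.g. before a leading "/"
        if token == ".":
            sign = 1  # multiplication: next component counts positively
        elif token == "/":
            sign = -1  # division: next component counts negatively
        else:
            match = _COMPONENT.fullmatch(token)
            if match is None:
                raise ValueError(f"unsupported component: {token!r}")
            atom, exponent = match.group(1), int(match.group(2) or 1)
            exponents[atom] += sign * exponent
    return {atom: exp for atom, exp in exponents.items() if exp != 0}


# Both spellings reduce to the same exponents.
print(unit_exponents("m/s") == unit_exponents("m.s-1"))
```

Comparing the reduced form instead of the raw strings is one way to treat `m/s` and `m.s-1` as equal. Converting both codes to pint units achieves the same for the full grammar.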

**ucumvert** is a pip-installable Python package. Features:

- Parser for UCUM unit strings that implements the full grammar.
- Converter for creating [pint](https://pypi.org/project/pint/) units from UCUM unit strings.
- A pint unit definition file [pint_ucum_defs.txt](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/pint_ucum_defs.txt) that extends pint's default units with UCUM units. All UCUM units from Version 2.1 of the specification are included.

**ucumvert** generates the UCUM grammar by filling a template with unit codes, prefixes etc. from the official [ucum-essence.xml](https://github.com/ucum-org/ucum/blob/main/ucum-essence.xml) file (a copy is included in this repo).
Updating the parser for new UCUM releases is therefore straightforward.
The parser is built with the great [lark](https://pypi.org/project/lark/) parser toolkit.
The generated lark grammar file for case-sensitive UCUM codes is included in the repository, see [ucum_grammar.lark](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/ucum_grammar.lark).
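
The template-filling idea can be sketched with the standard library alone. The example below is an assumption-laden illustration, not the code ucumvert uses: the XML snippet is a heavily shortened stand-in for `ucum-essence.xml` (the real file is much larger and may use XML namespaces), and the template text, the terminal names and the helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template

# Shortened stand-in for ucum-essence.xml; element and attribute names are assumptions.
ESSENCE_SNIPPET = """<root>
  <prefix Code="k"/>
  <prefix Code="m"/>
  <base-unit Code="m"/>
  <base-unit Code="s"/>
  <base-unit Code="g"/>
</root>
"""

# Hypothetical grammar template; the terminal names are illustrative only.
GRAMMAR_TEMPLATE = Template("PREFIX: $prefixes\nUNIT_ATOM: $atoms\n")


def quoted_alternatives(codes):
    """Render codes as quoted string literals joined by '|', longest first."""
    ordered = sorted(set(codes), key=lambda code: (-len(code), code))
    return " | ".join(f'"{code}"' for code in ordered)


def build_grammar(xml_text: str) -> str:
    """Collect the codes from the XML text and substitute them into the template."""
    root = ET.fromstring(xml_text)
    prefixes = [element.get("Code") for element in root.iter("prefix")]
    atoms = [element.get("Code") for element in root.iter("base-unit")]
    return GRAMMAR_TEMPLATE.substitute(
        prefixes=quoted_alternatives(prefixes),
        atoms=quoted_alternatives(atoms),
    )


print(build_grammar(ESSENCE_SNIPPET))
```

Because the codes come from the XML file rather than being typed into the grammar by hand, a new UCUM release mainly requires swapping the input file and regenerating. A real implementation would also have to escape codes that contain quote characters.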

Some of the UCUM unit atoms are invalid unit names in pint, for example `cal_[15]`, `m[H2O]`, `10*`, `[in_i'H2O]`.
For all of them we define mappings to valid pint unit names in [ucum_pint.py](https://github.com/dalito/ucumvert/blob/main/src/ucumvert/ucum_pint.py), e.g. `{"cal_[15]": "cal_15"}`.
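
A minimal sketch of such a lookup is shown here. Only the `cal_[15]` entry is quoted from this README; the `m[H2O]` entry is inferred from the registry output in the demo section, and the complete table lives in `ucum_pint.py`. The names `UCUM_TO_PINT_NAME`, `pint_name` and `looks_pint_safe` are hypothetical, and the identifier check is a rough stand-in, not pint's actual naming rule.

```python
# Illustrative subset of a UCUM-atom -> pint-name table (not the package's real table).
UCUM_TO_PINT_NAME = {
    "cal_[15]": "cal_15",  # quoted in this README
    "m[H2O]": "m_H2O",     # inferred from the demo output
}


def looks_pint_safe(name: str) -> bool:
    """Rough heuristic (assumption): treat Python identifiers as safe unit names."""
    return name.isidentifier()


def pint_name(ucum_atom: str) -> str:
    """Return the mapped name for a UCUM atom, or the atom itself if none is defined."""
    return UCUM_TO_PINT_NAME.get(ucum_atom, ucum_atom)


for atom in ("cal_[15]", "m[H2O]", "m"):
    print(atom, "->", pint_name(atom), looks_pint_safe(pint_name(atom)))
```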

## Install

ucumvert is available as a Python package from [PyPI](https://pypi.org/project/ucumvert) and can be pip-installed in the usual way.

```bash
pip install ucumvert
```

To install the most recent code from git in developer mode, including the creation of a virtual environment, use:

Linux

```bash
git clone https://github.com/dalito/ucumvert.git
cd ucumvert
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
pip install -e .[dev]
```

Windows

```cmd
git clone https://github.com/dalito/ucumvert.git
cd ucumvert
py -m venv .venv
.venv\Scripts\activate.bat
py -m pip install --upgrade pip
pip install -e .[dev]
```

Optionally, you can visualize the parse trees with [Graphviz](https://www.graphviz.org/), as shown below. This requires the additional package [pydot](https://pypi.org/project/pydot/), which you can install by running `pip install pydot`.

## Demo

We provide a basic command line interface.

```cmd
(.venv) $ ucumvert
```

It has an interactive mode to test parsing UCUM codes:

```cmd
(.venv) $ ucumvert -i
Enter UCUM unit code to parse, or 'q' to quit.
> m/s2.kg
Created visualization of parse tree (parse_tree.png).
main_term
  term
    term
      simple_unit       m
      /
      annotatable
        simple_unit     s
        2
    .
    simple_unit
      k
      g
--> Pint <Quantity(1.0, 'kilogram * meter / second ** 2')>
> q
```

The intermediate result is a tree, which is then traversed to convert the elements to pint quantities (or pint-compatible strings):

![parse tree kg*m*s**-2](https://raw.githubusercontent.com/dalito/ucumvert/main/parse_tree.png)
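
The traversal step can be sketched in a few lines. The node names below follow the tree printed by the interactive mode, but everything else is an assumption for illustration: the nested-tuple encoding, the small lookup tables and the function `to_expression` are hypothetical, whereas ucumvert itself works on the parse tree produced by lark.

```python
# Parse tree for "m/s2.kg", encoded as (rule, child, ...) tuples; leaves are strings.
TREE = (
    "main_term",
    (
        "term",
        (
            "term",
            ("simple_unit", "m"),
            "/",
            ("annotatable", ("simple_unit", "s"), "2"),
        ),
        ".",
        ("simple_unit", "k", "g"),
    ),
)

# Hypothetical lookup tables covering only what this example needs.
PREFIXES = {"k": "kilo"}
UNITS = {"m": "meter", "s": "second", "g": "gram"}


def to_expression(node) -> str:
    """Recursively turn a tuple-encoded parse tree into an expression string."""
    rule, *children = node
    if rule == "main_term":
        (child,) = children
        return to_expression(child)
    if rule == "term":
        left, operator, right = children
        symbol = "*" if operator == "." else "/"
        return f"({to_expression(left)} {symbol} {to_expression(right)})"
    if rule == "annotatable":
        unit, exponent = children
        return f"{to_expression(unit)} ** {exponent}"
    if rule == "simple_unit":
        *prefix, atom = children  # the prefix is optional
        return "".join(PREFIXES[p] for p in prefix) + UNITS[atom]
    raise ValueError(f"unexpected rule: {rule!r}")


print(to_expression(TREE))  # ((meter / second ** 2) * kilogram)
```

Each rule only needs to know how to combine the results of its children, so the conversion logic stays separate from the grammar.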

The package includes a UCUM-aware pint UnitRegistry, which loads all definitions for UCUM units on instantiation.
It comes with an additional method `from_ucum` to convert UCUM codes to pint.

```python
>>> from ucumvert import PintUcumRegistry
>>> ureg = PintUcumRegistry()
>>> ureg.from_ucum("m/s2.kg")
<Quantity(1.0, 'kilogram * meter / second ** 2')>
>>> ureg.from_ucum("m[H2O]{35Cel}")  # UCUM code with annotation
<Quantity(1, 'm_H2O')>
>>> _.to("mbar")
<Quantity(98.0665, 'millibar')>
>>> ureg("degC")   # a standard pint unit
<Quantity(1, 'degree_Celsius')>
>>>
```

## Tests

The unit tests include parsing and converting all common UCUM unit codes from the official repo. Run the test suite with:

```bash
pytest
```

The common UCUM unit codes are available only in binary form (xlsx, docs, pdf).
This repository keeps a copy in TSV format, `ucum_examples.tsv`.
To (re)generate this TSV file from the official xlsx file in the [UCUM repository](https://github.com/ucum-org/ucum/tree/main/common-units), run:

```bash
pip install openpyxl
python src/ucumvert/vendor/get_ucum_example_as_tsv.py
```

## Useful links

- UCUM [online-validator](https://ucum.nlm.nih.gov/ucum-lhc/demo.html)
- Issue in pint that motivated this work: [To what extent is pint compatible with UCUM?](https://github.com/hgrecco/pint/issues/1769)

## License

The code in this repository is distributed under the MIT license, with the exception of the `ucum-*.*` files in the directory `src/ucumvert/vendor`,
which fall under the [UCUM Copyright Notice and License](https://github.com/ucum-org/ucum/blob/main/LICENSE.md) (Version 1.0).
According to §1.3 of that license, we do not consider **ucumvert** a "Derivative Work" of UCUM because **ucumvert** only *"interoperates with an unmodified instance of the Work"*.

            
