rxn-utils

* Name: rxn-utils
* Version: 2.0.0
* Home page: https://github.com/rxn4chemistry/rxn-utilities
* Summary: General utilities (not related to chemistry)
* Upload time: 2024-02-13 20:37:41
* Author: IBM RXN team
* Requires Python: >=3.7
* License: MIT

# RXN utilities package

[![Actions tests](https://github.com/rxn4chemistry/rxn-utilities/actions/workflows/tests.yaml/badge.svg)](https://github.com/rxn4chemistry/rxn-utilities/actions)

This repository contains general Python utilities commonly used in the RXN universe.
For utilities related to chemistry, see our other repository [`rxn-chemutils`](https://github.com/rxn4chemistry/rxn-chemutils).

Links:
* [GitHub repository](https://github.com/rxn4chemistry/rxn-utilities)
* [Documentation](https://rxn4chemistry.github.io/rxn-utilities/)
* [PyPI package](https://pypi.org/project/rxn-utils/)

## System Requirements

This package is supported on all operating systems.
It has been tested on the following systems:

+ macOS: Big Sur (11.1)

+ Linux: Ubuntu 18.04.4

Python 3.7 or greater is required.

## Installation guide

The package can be installed from PyPI:

```bash
pip install rxn-utils
```

For local development, the package can be installed with:

```bash
pip install -e ".[dev]"
```

## Package highlights

### File-related utilities

* [`load_list_from_file`](./src/rxn/utilities/files.py): read a file into a list of strings.
* [`iterate_lines_from_file`](./src/rxn/utilities/files.py): same as `load_list_from_file`, but produces an iterator instead of a list. This can be much more memory-efficient.
* [`dump_list_to_file`](./src/rxn/utilities/files.py) and [`append_to_file`](./src/rxn/utilities/files.py): write an iterable of strings to a file (one per line).
* [`named_temporary_path`](./src/rxn/utilities/files.py) and [`named_temporary_directory`](./src/rxn/utilities/files.py): provide a context with a file or directory that will be deleted when the context closes. Useful for unit tests.
  ```pycon
  >>> from rxn.utilities.files import named_temporary_path
  >>> with named_temporary_path() as temporary_path:
  ...     # Do something with the temporary path here.
  ...     # The file or directory at that path will be deleted at the
  ...     # end of the context, unless delete=False.
  ...     pass
  ```
* ... and others.
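
To illustrate why iterating over lines can be more memory-efficient than loading a list, here is a minimal standard-library sketch. The helper names `read_lines_as_list` and `read_lines_lazily` are hypothetical and are not the package's implementation; they only mirror the idea behind `load_list_from_file` and `iterate_lines_from_file`.

```python
import tempfile
from pathlib import Path
from typing import Iterator, List


def read_lines_as_list(path: Path) -> List[str]:
    """Load every line into memory at once (line endings stripped)."""
    with open(path, "rt") as f:
        return [line.rstrip("\r\n") for line in f]


def read_lines_lazily(path: Path) -> Iterator[str]:
    """Yield one line at a time, so only the current line is held in memory."""
    with open(path, "rt") as f:
        for line in f:
            yield line.rstrip("\r\n")


with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "example.txt"
    path.write_text("first\nsecond\nthird\n")

    # The list variant materializes all lines.
    all_lines = read_lines_as_list(path)

    # The lazy variant can feed an aggregation without storing all lines.
    longest_line = max(read_lines_lazily(path), key=len)
```

For small files the two approaches behave the same; the difference only matters when the file does not fit comfortably in memory.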

### CSV-related functionality

* The function [`iterate_csv_column`](./src/rxn/utilities/csv/column_iterator.py) and the related executable `rxn-extract-csv-column` provide an easy way to extract a single column from a CSV file.
* The [`StreamingCsvEditor`](./src/rxn/utilities/csv/streaming_csv_editor.py) applies a series of operations to a CSV file without loading it fully into memory.
  This is for instance used in [`rxn-reaction-preprocessing`](https://github.com/rxn4chemistry/rxn-reaction-preprocessing).
  See a few examples in the [unit tests](./tests/csv/test_streaming_csv_editor.py).
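
The idea of extracting a single column without loading the whole CSV file can be sketched with the standard `csv` module. This is only an illustration under assumptions: `iterate_column` is a hypothetical helper, not the signature of `iterate_csv_column`, and the column names are made up.

```python
import csv
import io
from typing import Iterator, TextIO


def iterate_column(stream: TextIO, column: str) -> Iterator[str]:
    """Yield the values of one column, reading the CSV row by row."""
    reader = csv.DictReader(stream)
    for row in reader:
        yield row[column]


# An in-memory stream stands in for an open CSV file.
csv_text = "id,name\n1,alpha\n2,beta\n"
names = list(iterate_column(io.StringIO(csv_text), "name"))
```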

### Stable shuffling

For reproducible shuffling, or for shuffling two files of identical length so that the same permutation is obtained, one can use the [`stable_shuffle`](./src/rxn/utilities/files.py) function.
The executable `rxn-stable-shuffle` is also provided for this purpose.

Both also work with CSV files if the appropriate flag is provided.
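
The principle behind obtaining the same permutation for two sequences of identical length can be sketched with a seeded random generator. The function `shuffled_with_seed` below is a hypothetical stand-in, not the implementation or signature of `stable_shuffle`.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shuffled_with_seed(values: Sequence[T], seed: int) -> List[T]:
    """Return a shuffled copy; the permutation depends only on seed and length."""
    rng = random.Random(seed)  # private generator, global state is untouched
    result = list(values)
    rng.shuffle(result)
    return result


sources = ["a", "b", "c", "d", "e"]
targets = ["A", "B", "C", "D", "E"]

# Same seed and same length, so both lists are permuted identically.
shuffled_sources = shuffled_with_seed(sources, seed=42)
shuffled_targets = shuffled_with_seed(targets, seed=42)
```

Because the pairs stay aligned after shuffling, this approach suits parallel files such as inputs and their labels.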

### `chunker` and `remove_duplicates`

For batching an iterable into lists of a specified size, `chunker` comes in handy. 
It also does so in a memory-efficient way.
```pycon
>>> from rxn.utilities.containers import chunker
>>> for chunk in chunker(range(1, 10), chunk_size=4):
...     print(chunk)
[1, 2, 3, 4]
[5, 6, 7, 8]
[9]
```
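
As a rough sketch of how such memory-efficient batching can work, the following standard-library generator pulls only one chunk at a time from the underlying iterator. The name `chunk_iterable` is hypothetical, and this is not necessarily how `chunker` is implemented.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk_iterable(iterable: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice consumes at most chunk_size items from the shared iterator.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


chunks = list(chunk_iterable(range(1, 10), chunk_size=4))
```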

[`remove_duplicates`](./src/rxn/utilities/containers.py) (or [`iterate_unique_values`](./src/rxn/utilities/containers.py), its memory-efficient variant) removes duplicates from a container, possibly based on a callable instead of the values:
```pycon
>>> from rxn.utilities.containers import remove_duplicates
>>> remove_duplicates([3, 6, 9, 2, 3, 1, 9])
[3, 6, 9, 2, 1]
>>> remove_duplicates(["ab", "cd", "efg", "hijk", "", "lmn"], key=lambda x: len(x))
['ab', 'efg', 'hijk', '']
```
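
Order-preserving deduplication with an optional key can be sketched as a generator that remembers the keys it has already seen. The function `unique_values` is a hypothetical illustration, not the package's code.

```python
from typing import Callable, Hashable, Iterable, Iterator, Optional, Set, TypeVar

T = TypeVar("T")


def unique_values(
    values: Iterable[T], key: Optional[Callable[[T], Hashable]] = None
) -> Iterator[T]:
    """Yield the first value seen for each distinct key, in input order."""
    seen: Set[Hashable] = set()
    for value in values:
        marker = value if key is None else key(value)
        if marker not in seen:
            seen.add(marker)
            yield value


unique_numbers = list(unique_values([3, 6, 9, 2, 3, 1, 9]))
unique_by_length = list(unique_values(["ab", "cd", "efg", "hijk", "", "lmn"], key=len))
```

Only the set of seen keys is stored, which is what makes an iterator-based variant lighter on memory than building the full output list first.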

### Regex utilities

[`regex.py`](./src/rxn/utilities/regex.py) provides a few functions that make it easier to build regex strings (considering whether segments should be optional, capturing, etc.).
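
To give a flavor of what composing regex strings from small helpers can look like, here is a minimal sketch. The helper names `capture_group`, `plain_group`, and `maybe` are hypothetical and may not match the functions actually provided in `regex.py`.

```python
import re


def capture_group(pattern: str) -> str:
    """Wrap a pattern in a capturing group."""
    return f"({pattern})"


def plain_group(pattern: str) -> str:
    """Wrap a pattern in a non-capturing group."""
    return f"(?:{pattern})"


def maybe(pattern: str) -> str:
    """Make a whole pattern optional without adding a capturing group."""
    return plain_group(pattern) + "?"


# Major version, optionally followed by a dot and a minor version.
version_regex = capture_group(r"\d+") + maybe(r"\." + capture_group(r"\d+"))

with_minor = re.fullmatch(version_regex, "3.7")
without_minor = re.fullmatch(version_regex, "3")
```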

### Others

* A custom, more general enum class, [`RxnEnum`](./src/rxn/utilities/types.py).
* [`remove_prefix`](./src/rxn/utilities/strings.py), [`remove_postfix`](./src/rxn/utilities/strings.py).
* Initialization of loggers, in a `logging`-compatible way: [`logging.py`](./src/rxn/utilities/logging.py).
* [`sandboxed_random_context`](./src/rxn/utilities/basic.py) and [`temporary_random_seed`](./src/rxn/utilities/basic.py), to create a context with a specific random state that will not have side effects. 
  Especially useful for testing purposes (unit tests).
* ... and others.
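
The idea of a random context without side effects can be sketched by saving and restoring the state of the standard `random` module. The context manager `isolated_random_seed` below is a hypothetical illustration, not the implementation of `sandboxed_random_context` or `temporary_random_seed`.

```python
import random
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def isolated_random_seed(seed: int) -> Iterator[None]:
    """Seed the global generator inside the context, then restore its state."""
    state = random.getstate()
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)


# Reference: the first value drawn after seeding the global generator with 0.
random.seed(0)
expected_next = random.random()

random.seed(0)
with isolated_random_seed(123):
    first_draw = random.random()
with isolated_random_seed(123):
    second_draw = random.random()

# The outer sequence continues as if the contexts had never run.
actual_next = random.random()
```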

            
