randonneur


Namerandonneur JSON
Version 0.0.4 PyPI version JSON
download
home_pagehttps://github.com/brightway-lca/randonneur
SummaryMigrate LCA data. Compatible with the Brightway framework.
upload_time2023-03-14 22:14:14
maintainerChris Mutel
docs_urlNone
authorChris Mutel
requires_python>=3.8
licenseBSD-3-Clause
keywords "brightway" "development"
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # randonneur

<img src="randonneur.jpg" alt="https://www.flickr.com/photos/jswg/35681111281/">

Keep moving forward.

Randonneur is a library to make changes to life cycle inventory databases. You can use it to re-link your data to the latest version of a background database, to update existing databases with new data, or to perform other data transformations. Randonneur uses JSON files to describe these changes; contrast this with [wurst](https://github.com/polca/wurst), which can do these manipulations and more, but documents its manipulations in code.

Although designed to work with [Brightway](https://brightway.dev/), this library is not Brightway-specific.

[![PyPI](https://img.shields.io/pypi/v/randonneur.svg)][pypi status]
[![Status](https://img.shields.io/pypi/status/randonneur.svg)][pypi status]
[![Python Version](https://img.shields.io/pypi/pyversions/randonneur)][pypi status]
[![License](https://img.shields.io/pypi/l/randonneur)][license]

[![Tests](https://github.com/brightway-lca/randonneur/workflows/Tests/badge.svg)][tests]
[![Codecov](https://codecov.io/gh/brightway-lca/randonneur/branch/main/graph/badge.svg)][codecov]

[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)][pre-commit]
[![Black](https://img.shields.io/badge/code%20style-black-000000.svg)][black]

[pypi status]: https://pypi.org/project/randonneur/
[tests]: https://github.com/brightway-lca/randonneur/actions?workflow=Tests
[codecov]: https://app.codecov.io/gh/brightway-lca/randonneur
[pre-commit]: https://github.com/pre-commit/pre-commit
[black]: https://github.com/psf/black

## Usage

### Basic pattern

* Load a `randonneur` data migration file.
* Load an inventory database in the [`wurst` format](https://wurst.readthedocs.io/#internal-data-format)
* Do any necessary data preparation steps, e.g. nomenclature harmonization between migration and database fields
* Decide on what data types you want to apply changes to; `exchanges`, `datasets`, or both.
* Decide on what types of changes you want to apply: `create`, `replace`, `update`, `delete`, and `disaggregate`
* Decide whether you want to apply the changes in the migration file to all datasets/exchanges or just a subset; build `dataset_filter` and/or `exchange_filter` functions if necessary.
* Apply `migrate_datasets` or `migrate_exchanges`
* Write the resulting database to disk

### Data format

Migration data is specified in a JSON file as a single dictionary. This file **must** include the following keys:

* `name`: Follows the [data package specification](https://specs.frictionlessdata.io/data-package/#name)
* `licenses`: Follows the [data package specification](https://specs.frictionlessdata.io/data-package/#licenses). Must be a list.

In addition, the following properties should follow the [data package specification](https://specs.frictionlessdata.io/data-package/) if provided:

* `version`
* `description`
* `sources`
* `homepage`
* `contributors`
* `created`

Finally, at least one change type should be included. The change types are:

* `create-datasets`
* `create-exchanges`
* `replace`
* `update`
* `udpate`
* `delete`
* `disaggregate`

See the directory `examples` for real-world implementations.




For migrating exchanges: Given a database, iterate through the datasets. If a `dataset_filter` is given, ignore any datasets which don't pass the filter. In each dataset, iterate through the exchanges. If an `exchange_filter` is given, ignore any exchanges which don't pass the filter. For each exchange, look at the following possible transformations in order: `delete`, `replace`, `update`, and `disaggregate`. Only one transformation can be done to an exchange. Each transformation will change or delete the exchange under consideration, and maybe add some new exchanges to the dataset, though this addition will only happen after the original exchanges have been examined. After looking at all the exchanges, apply the `create` transformation to add more exchanges if provided.

For each exchange and transformation, we need to decide if that transformation should be applied. We do this based on the attributes of the dataset and exchange, and the attributes given in the transformation data. We compare the attribute values for a given set of fields, and these attributes must match exactly. The default fields are `name`, `reference product`, `product`, `location`, `unit`; you can specify your own fields.

Migrating datasets works the same way, except that we operate directly on the datasets instead of the exchanges.

### Migrating exchanges

Exchanges are the consumption or production of a good or service. Exchanges link two datasets (two activities, one product and one activity, one activity and one biosphere flow, or even other dataset types). We support the following types of exchange changes:

* `delete`
* `replace`
* `update`
* `disaggregate`
* `create`

#### Create

Creates a new exchange in all datasets, or in one specific dataset.

Because we are specifying a new exchange, we need to list **all** information needed to define an exchange, **including** the exchange `amount`. This is different than the other modification types, where *relative* amounts are given with the key `allocation`. We can't give relative amounts here because we have no exchange to refer to, and we don't have a surefire way to identify the reference production exchange (and there might not be one in any case).

If you want to add an exchange to all datasets:

```python
{
    "create": [{
        "targets": [{
            # All fields needed to define an exchange
        }]
    }]
}
```

If you only want to create an exchange in one dataset:

```python
{
    "create": [{
        "targets": [{
            # All fields needed to define an exchange
        }],
        "dataset": {
            # All fields needed to identify the dataset
        }
    }]
}
```

`dataset` must be a `dict`, not a list; it can only identify one dataset.

Note that in the `wurst` format, `dataset` use the key `reference product` while exchanges use the key `product`; these are two different concepts, so have different keys.

### Replace

Replacement substitutes an exchange one-to-one; as such, the new exchange must be completely defined. **However**, the `amount` should not be specified; rather, an `allocation` factor should be given, and the `amount` of the original exchange will be multiplied by `allocation`.

If `allocation` is not given, a default value of 1.0 is used.

**Note**: `randonneur` currently does not adjust uncertainty when rescaling.

Aside from the quantitative values, no other data from the original exchange is taken over to the new exchange. If you only want to change a few fields, use an `update` instead. If you don't want the exchange amount re-scaled, use a combination of `delete` and `create`.

The data format for `replace` type is:

```python
{
    "replace": [{
        "source": {
            # All fields needed to identify the exchange to be replaced
        },
        "target": {
            # All fields needed to define the new exchange
        },
        # `dataset` is optional
        "dataset": {
            # All fields needed to identify the dataset to change
        }
    }]
}
```

### Update

`update` differs from `replace` in that it changes attributes of the original exchange instead of creating a completely new object; otherwise, its behaviour is the same as `replace`. The data format is:

```python
{
    "update": [{
        "source": {
            # All fields needed to identify the exchange to be modified
        },
        "target": {
            # Some fields which you want to change
        },
        # `dataset` is optional
        "dataset": {
            # All fields needed to identify the dataset to change
        }
    }]
}
```

### Delete

Delete exchanges. Follows the same patterns as `replace` and `update`:

```python
{
    "delete": [{
        "source": {
            # All fields needed to identify the exchange to be deleted
        },
        # `dataset` is optional
        "dataset": {
            # All fields needed to identify the dataset to change
        }
    }]
}
```

### Disaggregate

Disaggregation is splitting one exchange into many. The `allocation` field is used to determine how much of the exchange passes to each new exchange.

The new exchanges start as **copies** of the original exchange, and are updating using the additional data provided. In other words, this functions more like an `update` than a `replace`. This is because the most common use case for disaggregation is to split one input or output among several regions, where almost all metadata for the child exchanges would be identical.

`allocation` fields do not have to sum to one.

The data format includes a list of new exchanges for each matched source:

```python
{
    "disaggregate": [{
        "source": {
            # All fields needed to identify the exchange to be disaggregated
        },
        "targets": [{
            # Some fields which you want to change
        }],
        # `dataset` is optional
        "dataset": {
            # All fields needed to identify the dataset to change
        }
    }]
}
```

## Contributing

Contributions are very welcome.
To learn more, see the [Contributor Guide].

## License

Distributed under the terms of the [BSD 3 Clause license][license],
`randonneur` is free and open source software.

## Issues

If you encounter any problems,
please [file an issue](https://github.com/cmutel/randonneur/issues/new/choose) along with a detailed description.


<!-- github-only -->

[command-line reference]: https://randonneur.readthedocs.io/en/latest/usage.html
[license]: https://github.com/brightway-lca/randonneur/blob/main/LICENSE
[contributor guide]: https://github.com/brightway-lca/randonneur/blob/main/CONTRIBUTING.md

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/brightway-lca/randonneur",
    "name": "randonneur",
    "maintainer": "Chris Mutel",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "<cmutel@gmail.com>",
    "keywords": "\"brightway\",\"development\"",
    "author": "Chris Mutel",
    "author_email": "<cmutel@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/ff/86/cac722fdc2966c62d942a01e6d3a26fdd93e9bca188156b571882bf01515/randonneur-0.0.4.tar.gz",
    "platform": "any",
    "description": "# randonneur\n\n<img src=\"randonneur.jpg\" alt=\"https://www.flickr.com/photos/jswg/35681111281/\">\n\nKeep moving forward.\n\nRandonneur is a library to make changes to life cycle inventory databases. You can use it to re-link your data to the latest version of a background database, to update existing databases with new data, or to perform other data transformations. Randonneur uses JSON files to describe these changes; contrast this with [wurst](https://github.com/polca/wurst), which can do these manipulations and more, but documents its manipulations in code.\n\nAlthough designed to work with [Brightway](https://brightway.dev/), this library is not Brightway-specific.\n\n[![PyPI](https://img.shields.io/pypi/v/randonneur.svg)][pypi status]\n[![Status](https://img.shields.io/pypi/status/randonneur.svg)][pypi status]\n[![Python Version](https://img.shields.io/pypi/pyversions/randonneur)][pypi status]\n[![License](https://img.shields.io/pypi/l/randonneur)][license]\n\n[![Tests](https://github.com/brightway-lca/randonneur/workflows/Tests/badge.svg)][tests]\n[![Codecov](https://codecov.io/gh/brightway-lca/randonneur/branch/main/graph/badge.svg)][codecov]\n\n[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)][pre-commit]\n[![Black](https://img.shields.io/badge/code%20style-black-000000.svg)][black]\n\n[pypi status]: https://pypi.org/project/randonneur/\n[tests]: https://github.com/brightway-lca/randonneur/actions?workflow=Tests\n[codecov]: https://app.codecov.io/gh/brightway-lca/randonneur\n[pre-commit]: https://github.com/pre-commit/pre-commit\n[black]: https://github.com/psf/black\n\n## Usage\n\n### Basic pattern\n\n* Load a `randonneur` data migration file.\n* Load an inventory database in the [`wurst` format](https://wurst.readthedocs.io/#internal-data-format)\n* Do any necessary data preparation steps, e.g. nomenclature harmonization between migration and database fields\n* Decide on what data types you want to apply changes to; `exchanges`, `datasets`, or both.\n* Decide on what types of changes you want to apply: `create`, `replace`, `update`, `delete`, and `disaggregate`\n* Decide whether you want to apply the changes in the migration file to all datasets/exchanges or just a subset; build `dataset_filter` and/or `exchange_filter` functions if necessary.\n* Apply `migrate_datasets` or `migrate_exchanges`\n* Write the resulting database to disk\n\n### Data format\n\nMigration data is specified in a JSON file as a single dictionary. This file **must** include the following keys:\n\n* `name`: Follows the [data package specification](https://specs.frictionlessdata.io/data-package/#name)\n* `licenses`: Follows the [data package specification](https://specs.frictionlessdata.io/data-package/#licenses). Must be a list.\n\nIn addition, the following properties should follow the [data package specification](https://specs.frictionlessdata.io/data-package/) if provided:\n\n* `version`\n* `description`\n* `sources`\n* `homepage`\n* `contributors`\n* `created`\n\nFinally, at least one change type should be included. The change types are:\n\n* `create-datasets`\n* `create-exchanges`\n* `replace`\n* `update`\n* `udpate`\n* `delete`\n* `disaggregate`\n\nSee the directory `examples` for real-world implementations.\n\n\n\n\nFor migrating exchanges: Given a database, iterate through the datasets. If a `dataset_filter` is given, ignore any datasets which don't pass the filter. In each dataset, iterate through the exchanges. If an `exchange_filter` is given, ignore any exchanges which don't pass the filter. For each exchange, look at the following possible transformations in order: `delete`, `replace`, `update`, and `disaggregate`. Only one transformation can be done to an exchange. Each transformation will change or delete the exchange under consideration, and maybe add some new exchanges to the dataset, though this addition will only happen after the original exchanges have been examined. After looking at all the exchanges, apply the `create` transformation to add more exchanges if provided.\n\nFor each exchange and transformation, we need to decide if that transformation should be applied. We do this based on the attributes of the dataset and exchange, and the attributes given in the transformation data. We compare the attribute values for a given set of fields, and these attributes must match exactly. The default fields are `name`, `reference product`, `product`, `location`, `unit`; you can specify your own fields.\n\nMigrating datasets works the same way, except that we operate directly on the datasets instead of the exchanges.\n\n### Migrating exchanges\n\nExchanges are the consumption or production of a good or service. Exchanges link two datasets (two activities, one product and one activity, one activity and one biosphere flow, or even other dataset types). We support the following types of exchange changes:\n\n* `delete`\n* `replace`\n* `update`\n* `disaggregate`\n* `create`\n\n#### Create\n\nCreates a new exchange in all datasets, or in one specific dataset.\n\nBecause we are specifying a new exchange, we need to list **all** information needed to define an exchange, **including** the exchange `amount`. This is different than the other modification types, where *relative* amounts are given with the key `allocation`. We can't give relative amounts here because we have no exchange to refer to, and we don't have a surefire way to identify the reference production exchange (and there might not be one in any case).\n\nIf you want to add an exchange to all datasets:\n\n```python\n{\n    \"create\": [{\n        \"targets\": [{\n            # All fields needed to define an exchange\n        }]\n    }]\n}\n```\n\nIf you only want to create an exchange in one dataset:\n\n```python\n{\n    \"create\": [{\n        \"targets\": [{\n            # All fields needed to define an exchange\n        }],\n        \"dataset\": {\n            # All fields needed to identify the dataset\n        }\n    }]\n}\n```\n\n`dataset` must be a `dict`, not a list; it can only identify one dataset.\n\nNote that in the `wurst` format, `dataset` use the key `reference product` while exchanges use the key `product`; these are two different concepts, so have different keys.\n\n### Replace\n\nReplacement substitutes an exchange one-to-one; as such, the new exchange must be completely defined. **However**, the `amount` should not be specified; rather, an `allocation` factor should be given, and the `amount` of the original exchange will be multiplied by `allocation`.\n\nIf `allocation` is not given, a default value of 1.0 is used.\n\n**Note**: `randonneur` currently does not adjust uncertainty when rescaling.\n\nAside from the quantitative values, no other data from the original exchange is taken over to the new exchange. If you only want to change a few fields, use an `update` instead. If you don't want the exchange amount re-scaled, use a combination of `delete` and `create`.\n\nThe data format for `replace` type is:\n\n```python\n{\n    \"replace\": [{\n        \"source\": {\n            # All fields needed to identify the exchange to be replaced\n        },\n        \"target\": {\n            # All fields needed to define the new exchange\n        },\n        # `dataset` is optional\n        \"dataset\": {\n            # All fields needed to identify the dataset to change\n        }\n    }]\n}\n```\n\n### Update\n\n`update` differs from `replace` in that it changes attributes of the original exchange instead of creating a completely new object; otherwise, its behaviour is the same as `replace`. The data format is:\n\n```python\n{\n    \"update\": [{\n        \"source\": {\n            # All fields needed to identify the exchange to be modified\n        },\n        \"target\": {\n            # Some fields which you want to change\n        },\n        # `dataset` is optional\n        \"dataset\": {\n            # All fields needed to identify the dataset to change\n        }\n    }]\n}\n```\n\n### Delete\n\nDelete exchanges. Follows the same patterns as `replace` and `update`:\n\n```python\n{\n    \"delete\": [{\n        \"source\": {\n            # All fields needed to identify the exchange to be deleted\n        },\n        # `dataset` is optional\n        \"dataset\": {\n            # All fields needed to identify the dataset to change\n        }\n    }]\n}\n```\n\n### Disaggregate\n\nDisaggregation is splitting one exchange into many. The `allocation` field is used to determine how much of the exchange passes to each new exchange.\n\nThe new exchanges start as **copies** of the original exchange, and are updating using the additional data provided. In other words, this functions more like an `update` than a `replace`. This is because the most common use case for disaggregation is to split one input or output among several regions, where almost all metadata for the child exchanges would be identical.\n\n`allocation` fields do not have to sum to one.\n\nThe data format includes a list of new exchanges for each matched source:\n\n```python\n{\n    \"disaggregate\": [{\n        \"source\": {\n            # All fields needed to identify the exchange to be disaggregated\n        },\n        \"targets\": [{\n            # Some fields which you want to change\n        }],\n        # `dataset` is optional\n        \"dataset\": {\n            # All fields needed to identify the dataset to change\n        }\n    }]\n}\n```\n\n## Contributing\n\nContributions are very welcome.\nTo learn more, see the [Contributor Guide].\n\n## License\n\nDistributed under the terms of the [BSD 3 Clause license][license],\n`randonneur` is free and open source software.\n\n## Issues\n\nIf you encounter any problems,\nplease [file an issue](https://github.com/cmutel/randonneur/issues/new/choose) along with a detailed description.\n\n\n<!-- github-only -->\n\n[command-line reference]: https://randonneur.readthedocs.io/en/latest/usage.html\n[license]: https://github.com/brightway-lca/randonneur/blob/main/LICENSE\n[contributor guide]: https://github.com/brightway-lca/randonneur/blob/main/CONTRIBUTING.md\n",
    "bugtrack_url": null,
    "license": "BSD-3-Clause",
    "summary": "Migrate LCA data. Compatible with the Brightway framework.",
    "version": "0.0.4",
    "split_keywords": [
        "\"brightway\"",
        "\"development\""
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "11edf116ed9c850e696cdaae7c15dd34c7edc2d4981d338f14503227ad3b6973",
                "md5": "12c8ecaf6f559f5fb59519a8ed86f6ae",
                "sha256": "9d9b07ecc52e830143773141071f11890d96343eb4bf4565485405d74b92f8a0"
            },
            "downloads": -1,
            "filename": "randonneur-0.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "12c8ecaf6f559f5fb59519a8ed86f6ae",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 9697,
            "upload_time": "2023-03-14T22:14:12",
            "upload_time_iso_8601": "2023-03-14T22:14:12.713932Z",
            "url": "https://files.pythonhosted.org/packages/11/ed/f116ed9c850e696cdaae7c15dd34c7edc2d4981d338f14503227ad3b6973/randonneur-0.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ff86cac722fdc2966c62d942a01e6d3a26fdd93e9bca188156b571882bf01515",
                "md5": "2f08418f8ffbfb3d28c395c961670022",
                "sha256": "f42660443b01726c021a6485f93471468aa9251f1e1825a692b776889b994eec"
            },
            "downloads": -1,
            "filename": "randonneur-0.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "2f08418f8ffbfb3d28c395c961670022",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 12371,
            "upload_time": "2023-03-14T22:14:14",
            "upload_time_iso_8601": "2023-03-14T22:14:14.454061Z",
            "url": "https://files.pythonhosted.org/packages/ff/86/cac722fdc2966c62d942a01e6d3a26fdd93e9bca188156b571882bf01515/randonneur-0.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-14 22:14:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "brightway-lca",
    "github_project": "randonneur",
    "lcname": "randonneur"
}
        
Elapsed time: 0.04272s