optimade-maker


- Name: optimade-maker
- Version: 0.3.0
- Summary: Tools for making OPTIMADE APIs from raw structural data.
- Upload time: 2024-07-30 16:47:42
- Requires Python: >=3.10
- License: MIT
- Keywords: optimade, jsonapi, materials

<div align="center" style="padding: 2em;">
<span style="padding: 1em">
<img height="70px" align="center" src="https://matsci.org/uploads/default/original/2X/b/bd2f59b3bf14fb046b74538750699d7da4c19ac1.svg">
</span>
</div>

# <div align="center">optimade-maker</div>

[![PyPI - Version](https://img.shields.io/pypi/v/optimade-maker?color=4CC61E)](https://pypi.org/project/optimade-maker/)

Tools for making [OPTIMADE APIs](https://optimade.org) from various formats of structural data (e.g. an archive of CIF files).

This repository contains the `src/optimade-maker` Python package and the corresponding CLI tool `optimake`, which work towards this aim. Features include:

- definition of a config file format (`optimade.yaml`) for annotating data archives to be used in the OPTIMADE ecosystem;
- conversion of the raw data into corresponding OPTIMADE types using pre-existing parsers (e.g., ASE for structures);
- conversion of the annotated data archive into an intermediate JSONLines file format that can be ingested into a database and used to serve a full OPTIMADE API;
- serving either an annotated data archive or a JSONLines file as an OPTIMADE API (using the [`optimade-python-tools`](https://github.com/Materials-Consortia/optimade-python-tools/) reference server implementation).

## Usage

See `./examples` for a more complete set of supported formats and corresponding `optimade.yaml` config files.

### Annotating with `optimade.yaml`

To annotate your structural data for `optimade-maker`, the data archive must be accompanied by an `optimade.yaml` config file. The following is a simple example for a zip archive (`structures.zip`) of CIF files together with an optional property file (`data.csv`):

```yaml
config_version: 0.1.0
database_description: Simple database

entries:
  - entry_type: structures
    entry_paths:
      - file: structures.zip
        matches:
          - cifs/*/*.cif
    # (optional) property file and definitions:
    property_paths:
      - file: data.csv
    property_definitions:
      - name: energy
        title: Total energy per atom
        description: The total energy per atom as computed by DFT
        unit: eV/atom
        type: float
```

### Structure `id`s and property files

`optimade-maker` assigns an `id` to each structure based on its full path in the archive, following a simple deterministic rule: from the set of all archive paths, the longest common path prefix and suffix (including file extensions) are removed. For example, an archive containing

```
structures.zip/cifs/set1/101.cif
structures.zip/cifs/set2/102.cif
```

produces `["set1/101", "set2/102"]`.
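The rule above can be sketched in a few lines of Python. This is a minimal illustration and not the `optimade-maker` implementation: the function names `common_prefix_len` and `assign_ids` are hypothetical, and the sketch assumes that the common prefix is computed on whole path components and that the common suffix is simply a shared file extension.

```python
from pathlib import PurePosixPath


def common_prefix_len(sequences):
    """Count the leading items shared by every sequence."""
    n = 0
    for items in zip(*sequences):
        if len(set(items)) != 1:
            break
        n += 1
    return n


def assign_ids(paths):
    """Hypothetical helper: strip the shared leading directories and extension."""
    parts = [PurePosixPath(p).parts for p in paths]
    # Never strip the final component, so each ID keeps at least the file name.
    n_prefix = min(common_prefix_len(parts), min(len(p) for p in parts) - 1)
    trimmed = [PurePosixPath(*p[n_prefix:]) for p in parts]
    # Assumption: the extension is dropped only when all files share it.
    extensions = {t.suffix for t in trimmed}
    if len(extensions) == 1 and "" not in extensions:
        trimmed = [t.with_suffix("") for t in trimmed]
    return [str(t) for t in trimmed]


ids = assign_ids([
    "structures.zip/cifs/set1/101.cif",
    "structures.zip/cifs/set2/102.cif",
])
print(ids)
```

Working on path components rather than raw characters matters here: a character-level prefix would also swallow the shared `set` in `set1` and `set2`, which does not match the documented result.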

To be associated with a structure, each row of a property file must refer either to one of these `id`s or to the full path in the archive. For example, a property CSV file could be:

```csv
id,energy
set1/101,2.5
structures.zip/cifs/set2/102.cif,3.2
```
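How rows keyed by either form might be resolved to a single `id` can be sketched as follows. This is an illustrative assumption about the matching logic, not code from `optimade-maker`: the helper `load_properties` and the `id_by_path` mapping are hypothetical, and every property column is assumed to hold floats.

```python
import csv
import io


def load_properties(csv_text, id_by_path):
    """Hypothetical helper: map each CSV row to a structure id.

    `id_by_path` maps full archive paths to their assigned ids.
    """
    known_ids = set(id_by_path.values())
    properties = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row.pop("id")
        if key in known_ids:
            entry_id = key  # the row already uses the short id
        elif key in id_by_path:
            entry_id = id_by_path[key]  # the row uses the full archive path
        else:
            raise KeyError(f"No structure matches {key!r}")
        # Assumption for this sketch: all remaining columns are floats.
        properties[entry_id] = {name: float(value) for name, value in row.items()}
    return properties


id_by_path = {
    "structures.zip/cifs/set1/101.cif": "set1/101",
    "structures.zip/cifs/set2/102.cif": "set2/102",
}
csv_text = "id,energy\nset1/101,2.5\nstructures.zip/cifs/set2/102.cif,3.2\n"
properties = load_properties(csv_text, id_by_path)
print(properties)
```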

### Installing and running `optimake`

Install with

```bash
pip install optimade-maker
```

This also makes the `optimake` CLI utility available.

For a folder containing the data archive and the `optimade.yaml` file (such as those in `./examples`), run

- `optimake convert .` to just convert the entry into the JSONL format (see below);
- `optimake serve .` to start the OPTIMADE API (this also first converts the entry, if needed).

For more detailed information, see `optimake --help`.

## `optimade-maker` JSONLines Format

As described above, `optimade-maker` works via an intermediate JSONLines file representation of an OPTIMADE API (see also the [corresponding issue in the specification](https://github.com/Materials-Consortia/OPTIMADE/issues/471)).
This file should provide enough metadata to spin up an OPTIMADE API with many different entry types.
The format is as follows:

- The first line must be a dictionary with the key `x-optimade`, containing a sub-dictionary of metadata (such as the OPTIMADE API version).
- The second line contains the `info/structures` endpoint.
- The third line contains the `info/references` endpoint, if present.
- Each subsequent line contains one entry from the corresponding individual structure/reference endpoints.

```json
{"x-optimade": {"meta": {"api_version": "1.1.0"}}}
{"type": "info", "id": "structures", "properties": {...}}
{"type": "info", "id": "references", "properties": {...}}
{"type": "structures", "id": "1234", "attributes": {...}}
{"type": "structures", "id": "1235", "attributes": {...}}
{"type": "references", "id": "sfdas", "attributes": {...}}
```

NOTE: the `info/` endpoints in [OPTIMADE v1.2.0](https://www.optimade.org/specification/#entry-listing-info-endpoints) will include `type` and `id` as well.
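A reader for this layout could look roughly like the sketch below. It is an illustration only and does not reflect how `optimade-maker` or `optimade-python-tools` ingest the file: the function `parse_optimade_jsonl` is hypothetical, and the sample records use made-up, minimal `properties` and `attributes` in place of the elided `{...}` contents.

```python
import json
from collections import defaultdict


def parse_optimade_jsonl(lines):
    """Hypothetical reader: split a JSONL stream into header, info and entries."""
    records = (json.loads(line) for line in lines if line.strip())
    header = next(records)
    if "x-optimade" not in header:
        raise ValueError("First line must contain the 'x-optimade' key")
    info = {}
    entries = defaultdict(list)
    for record in records:
        if record.get("type") == "info":
            info[record["id"]] = record  # e.g. info/structures
        else:
            entries[record["type"]].append(record)  # grouped by entry type
    return header["x-optimade"], info, dict(entries)


# Made-up sample data for illustration only.
sample = [
    '{"x-optimade": {"meta": {"api_version": "1.1.0"}}}',
    '{"type": "info", "id": "structures", "properties": {}}',
    '{"type": "structures", "id": "1234", "attributes": {"nsites": 2}}',
    '{"type": "structures", "id": "1235", "attributes": {"nsites": 4}}',
]
meta, info, entries = parse_optimade_jsonl(sample)
print(meta["meta"]["api_version"], sorted(info), len(entries["structures"]))
```

Keying on the `type` field rather than on line position lets the same loop cope with a missing `info/references` line.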

## Relevant links

- [Roadmap and meeting notes](https://docs.google.com/document/d/1cIpwuX6Ty5d3ZHKYWktQaBBQcI9fYmgG_hsD1P1UpO4/edit)
- [OPTIMADE serialization format notes](https://docs.google.com/document/d/1vf8_qxSRP5lCSb0P3M9gTr6nqkERxgOoSDno6YLcCjo/edit)
- [Flow diagram](https://excalidraw.com/#json=MBNl66sARCQekVrKZXDg8,K35f5FwmiS46vlsYGMJdrw)

## Contributors

The initial prototype was created at the Paul Scherrer Institute, Switzerland, during the week of
12th-16th June 2023.

Authors (alphabetical):

- Kristjan Eimre
- Matthew Evans
- Giovanni Pizzi
- Gian-Marco Rignanese
- Xing Wang
- Jusong Yu

            
