# FPdataViewer
Reads first-principles atomic structure files and graphs various statistics with matplotlib to provide a quick overview of a file's contents.
Built around [VASP MLFF](https://www.vasp.at/wiki/index.php/Machine_learning_force_field_calculations:_Basics) [input and output files](https://www.vasp.at/wiki/index.php/ML_AB): ML_AB and ML_ABN.
Either saves to a PDF file (`plot`) or launches matplotlib (`plot --interactive`).
Also provides some tools for converting between file types using [ASE](https://wiki.fysik.dtu.dk/ase/) (`convert`), repairing broken files (`validate`), and quickly inspecting the contents of files (`inspect`). Some of these tools are also provided by the ASE CLI, and will become obsolete when the _vasp-mlab_ format is implemented there.
| ![front page](images/image_a.png) | ![image page](images/image_b.png) |
|---------------------------------------------------|--------------------------------------------------|
| ![atom type page for bismuth](images/image_c.png) | ![atom type page for oxygen](images/image_d.png) |
## Table of contents
- [Installation](#installation)
- [Requirements](#requirements)
- [Usage](#usage)
  - [plot](#fpdataviewer-plot)
  - [inspect](#fpdataviewer-inspect)
  - [convert](#fpdataviewer-convert)
  - [validate](#fpdataviewer-validate)
- [Config](#config-file)
## Installation
### pip
The easiest method is to install through pip.
```shell
pip install fpdataviewer
```
> Installation through pip is the preferred method, but it pulls in a number of large analysis libraries, some of which may not be supported on Windows.
> If this is undesirable, consider installing with `--no-deps` and using `--skip` to avoid those libraries (see [requirements](#requirements) and the [plot options](#fpdataviewer-plot)).
### conda
```shell
# NOT CURRENTLY AVAILABLE
```
## Requirements
Not all dependencies are required when `--skip` is used.
| Component | Dependencies (immediate) |
|-----------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **required** | **[numpy](https://pypi.org/project/numpy/) [pandas](https://pypi.org/project/pandas/) [matplotlib](https://pypi.org/project/matplotlib/) [seaborn](https://pypi.org/project/seaborn/)** |
| **radial distribution functions** | **[numba](https://pypi.org/project/numba/)** |
| **descriptors** | **[scikit-learn](https://pypi.org/project/scikit-learn/) [dscribe](https://pypi.org/project/dscribe/) (possible compatibility issues)** |
| **rendering** | **[ovito](https://pypi.org/project/ovito/) [PySide6](https://pypi.org/project/PySide6/) [Pillow](https://pypi.org/project/Pillow/)** |
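
As a rough illustration of how the table maps onto `--skip`, the sketch below checks which optional libraries can be found and suggests the matching `--skip` values. The helper names `is_installed` and `suggest_skips` and the mapping itself are assumptions based on the table above (using the import names `sklearn` for scikit-learn and `PIL` for Pillow); FPdataViewer itself may handle missing libraries differently.

```python
from importlib.util import find_spec

# Assumed mapping from --skip values to the import names of the
# optional libraries listed in the table above.
OPTIONAL_MODULES = {
    "rdf": ["numba"],
    "desc": ["sklearn", "dscribe"],
    "img": ["ovito", "PySide6", "PIL"],
}


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return find_spec(module_name) is not None


def suggest_skips(available=is_installed):
    """Return the --skip values whose libraries are not all available."""
    return [
        skip
        for skip, modules in OPTIONAL_MODULES.items()
        if not all(available(name) for name in modules)
    ]


if __name__ == "__main__":
    skips = suggest_skips()
    if skips:
        print("Suggested flags: --skip " + " ".join(skips))
    else:
        print("All optional libraries were found.")
```

Passing a custom `available` callable makes the helper easy to try out without changing the local environment.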
## Usage
### fpdataviewer plot
Main functionality. Graphs statistics to a PDF file or on screen (with `--interactive`).
```shell
# Basic PDF generation
fpdataviewer plot -i examples/ML_AB -o overview.pdf
# Interactive plots
fpdataviewer plot -i examples/ML_AB --interactive
# Specify custom config
fpdataviewer plot -i examples/ML_AB --config mlab_viewer.json
# Skip radial distribution functions and image rendering, rasterize remaining graphs
fpdataviewer plot -i examples/ML_AB --rasterize --skip rdf img
```
<details>
<summary>Options</summary>
##### `--interactive`, `-x`
Shows interactive matplotlib plots instead of saving to a PDF file (the default).
##### `--config <file>`, `-c`
See [Config file](#config-file).
##### `--skip <rdf/desc/img>`, `-s`
Skip calculations for radial distribution functions (`rdf`), descriptors (`desc`), or image rendering (`img`). Multiple can be selected. Useful when only certain statistics are needed.
##### `--strict`, `-t`
Validates the input file.
Some formats (like VASP's ML_AB) contain redundant or possibly self-contradictory information that can cause parsers to fail unpredictably.
This option will check the input file against specifications to minimize these errors and help the user repair the broken file.
##### `--rasterize`, `-r`
Disables the vector image format for plots and uses raster images instead. This can greatly reduce file size when many descriptors are drawn. Simply passes `rasterized=True` to matplotlib.
</details>
### fpdataviewer inspect
Summarizes the file contents to the console without performing any analysis. Recommended before plotting large files.
```shell
fpdataviewer inspect -i examples/ML_AB
```
<details>
<summary>Options</summary>
##### `--strict`, `-t`
Validates the input file. See `fpdataviewer validate`.
</details>
### fpdataviewer convert
Converts between file types using ASE. Useful for reading ML_AB files; for other formats, using the ASE CLI directly is recommended.
```shell
# Convert first structure in ML_AB file to a POSCAR file
fpdataviewer convert -i examples/ML_AB -o examples/POSCAR -f vasp-mlab -t vasp -x 0
```
<details>
<summary>Options</summary>
##### `--from`, `-f`
Source format; see [ASE documentation](https://wiki.fysik.dtu.dk/ase/ase/io/io.html) for options. Use `vasp-mlab` for ML_AB format.
##### `--to`, `-t`
Target format; see [ASE documentation](https://wiki.fysik.dtu.dk/ase/ase/io/io.html) for options.
##### `--index`, `-x`
Selects range of structures from source, in Python slice format (e.g. `0` for the first structure, `-1` for the last, `:4` for the first four, etc.).
##### `--append`, `-a`
Appends to end of the target file instead of overwriting.
</details>
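
The `--index` values follow Python's own indexing rules. The sketch below shows how such a string could be turned into an `int` or a `slice` and applied to a list. `parse_index` is a hypothetical helper written for illustration, not the parser FPdataViewer or ASE actually uses.

```python
def parse_index(text):
    """Convert '0', '-1', ':4' or '1:10:2' into an int or a slice."""
    if ":" not in text:
        return int(text)
    parts = [int(p) if p.strip() else None for p in text.split(":")]
    if len(parts) > 3:
        raise ValueError(f"too many ':' in index {text!r}")
    return slice(*parts)


structures = ["s0", "s1", "s2", "s3", "s4", "s5"]  # stand-ins for structures

print(structures[parse_index("0")])    # first structure
print(structures[parse_index("-1")])   # last structure
print(structures[parse_index(":4")])   # first four structures
```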
### fpdataviewer validate
Validates the input file and reports problems.
Some formats (like VASP's ML_AB) contain redundant or possibly self-contradictory information that can cause parsers to fail unpredictably.
This command checks the input file against the format specification to minimize these errors and help the user repair the broken file.
```shell
fpdataviewer validate -i examples/ML_AB
```
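
To illustrate what "redundant or possibly self-contradictory information" can look like, the sketch below checks a simplified, hypothetical record in which the atom count is stated three ways: explicitly, as per-type counts, and implicitly through the number of listed positions. The field names are invented for this example and do not reflect the real ML_AB layout or FPdataViewer's validator.

```python
def find_inconsistencies(record):
    """Return human-readable problems found in a simplified structure record."""
    problems = []
    declared = record["declared_atom_count"]
    per_type_total = sum(record["atoms_per_type"].values())
    listed = len(record["positions"])

    if per_type_total != declared:
        problems.append(
            f"per-type counts sum to {per_type_total}, header declares {declared}"
        )
    if listed != declared:
        problems.append(
            f"{listed} positions listed, header declares {declared}"
        )
    return problems


record = {
    "declared_atom_count": 3,
    "atoms_per_type": {"Bi": 1, "O": 2},
    "positions": [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)],  # one position missing
}

for problem in find_inconsistencies(record):
    print("problem:", problem)
```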
## Config file
Specifying a custom config will override settings from the default, which is located in [config.py](fpdataviewer/cli/config.py).
```json
{
"global": {
"bins": 100
},
"rdf": {
"bins": 1000,
"structures": 1.0,
"r_min": 0.0,
"r_max": "auto",
"skip_pairs": []
},
"descriptors": {
"structures": 1.0,
"soap": {
"r_cut": "auto",
"n_max": 8,
"l_max": 8
}
},
"rendering": {
"width": 1024,
"height": 1024
}
}
```
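
One plausible way to apply a custom config on top of the defaults is a recursive merge, sketched below. The function `merge_config` and the trimmed-down `DEFAULTS` dictionary are assumptions for illustration; the actual merging logic in FPdataViewer may behave differently (for example in how it treats the descriptor entry).

```python
import copy
import json

# Trimmed-down copy of the default settings shown above.
DEFAULTS = {
    "global": {"bins": 100},
    "rdf": {"bins": 1000, "structures": 1.0, "r_min": 0.0, "r_max": "auto"},
}


def merge_config(defaults, overrides):
    """Return a new dict where values in overrides replace those in defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A custom config only needs to list the settings it changes.
custom = json.loads('{"rdf": {"bins": 200, "r_max": 6.0}}')
config = merge_config(DEFAULTS, custom)
print(json.dumps(config, indent=2))
```

Under this scheme, settings that the custom file does not mention keep their default values.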
Most settings are self-explanatory, but more specifically:
- `"descriptors"`
  - See the DScribe [documentation](https://singroup.github.io/dscribe/latest/doc/dscribe.descriptors.html). The inner content is passed to the respective (local) descriptor object. It should specify one of
    - `"soap"`
    - `"acsf"`
    - `"lmbtr"`
- `"rdf"`
  - `"skip_pairs"` is an array of values of the form `"<atom 1>-<atom 2>"` (e.g. `"Bi-O"`). RDF calculations are usually fast enough for this to be unnecessary.
- Anywhere
  - `"structures"` specifies the number of structures to include in a calculation, chosen at random. It should be
    - `0.0 < x < 1.0` for a portion
    - `x > 1` for a specific number
  - `"auto"` is replaced with the maximum possible radius such that radii never overlap in a periodic structure (the `non periodic distance` in the overview panel).
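
The two "Anywhere" rules can be sketched as follows. `pick_structures` and `auto_radius` are hypothetical helpers. Treating `1.0` as "all structures" and computing `"auto"` as half the smallest distance between opposite cell faces (a common conservative bound for periodic cells) are both assumptions, not confirmed details of FPdataViewer.

```python
import math
import random


def pick_structures(structures, amount, seed=None):
    """Randomly pick a portion (amount <= 1.0) or a fixed number (amount > 1)."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount <= 1.0:
        # Assumption: values up to 1.0 are fractions, so 1.0 means "everything".
        count = max(1, round(amount * len(structures)))
    else:
        count = int(amount)
    count = min(count, len(structures))
    return random.Random(seed).sample(structures, count)


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def auto_radius(cell):
    """Half the smallest distance between opposite faces of a 3x3 cell."""
    a, b, c = cell
    volume = abs(sum(x * y for x, y in zip(a, cross(b, c))))
    face_normals = (cross(b, c), cross(c, a), cross(a, b))
    heights = [
        volume / math.sqrt(sum(x * x for x in normal)) for normal in face_normals
    ]
    return min(heights) / 2


frames = list(range(8))  # stand-ins for structures
print(pick_structures(frames, 0.25, seed=0))  # a quarter of the frames
print(pick_structures(frames, 3, seed=0))     # three frames
print(auto_radius([(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]))
```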
Raw data
{
"_id": null,
"home_page": "",
"name": "fpdataviewer",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "vasp,descriptor,machine learning,atomic structure,materials science",
"author": "",
"author_email": "Thijmen Kuipers <t.p.w.kuipers@student.utwente.nl>",
"download_url": "https://files.pythonhosted.org/packages/44/87/88e455844815687f40fbed6c1e35b69237c3d8920ea3ef0e33d990d39692/fpdataviewer-1.0.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "LGPL-3.0-only",
"summary": "Reads first-principle molecular simulation data and graphs various statistics",
"version": "1.0.1",
"project_urls": {
"Homepage": "https://github.com/dynamicsolids/FPdataViewer"
},
"split_keywords": [
"vasp",
"descriptor",
"machine learning",
"atomic structure",
"materials science"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "021074ddcf58d0d628e3070bd9ebb05da56b186746e3e4b30186fddb69fd4c83",
"md5": "b133f37d6aadbdd2cd1020bec3e44075",
"sha256": "31bb04118966e12491804983ea7b318cbec9f26ea1b7d3a10f676c931f09f501"
},
"downloads": -1,
"filename": "fpdataviewer-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b133f37d6aadbdd2cd1020bec3e44075",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 30188,
"upload_time": "2023-09-28T22:07:42",
"upload_time_iso_8601": "2023-09-28T22:07:42.408608Z",
"url": "https://files.pythonhosted.org/packages/02/10/74ddcf58d0d628e3070bd9ebb05da56b186746e3e4b30186fddb69fd4c83/fpdataviewer-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "448788e455844815687f40fbed6c1e35b69237c3d8920ea3ef0e33d990d39692",
"md5": "3d8fc32ffe43e232e8b825fad1b292ab",
"sha256": "e3ae3cdb373147e05010a17e00eb8d7bf69764e54f28e94c94abe88f68a2c01b"
},
"downloads": -1,
"filename": "fpdataviewer-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "3d8fc32ffe43e232e8b825fad1b292ab",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 25535,
"upload_time": "2023-09-28T22:07:44",
"upload_time_iso_8601": "2023-09-28T22:07:44.246934Z",
"url": "https://files.pythonhosted.org/packages/44/87/88e455844815687f40fbed6c1e35b69237c3d8920ea3ef0e33d990d39692/fpdataviewer-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-28 22:07:44",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dynamicsolids",
"github_project": "FPdataViewer",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "fpdataviewer"
}