# Table of Contents
* [Introduction](#introduction)
* [SCANN Framework](#scann-framework)
* [Installation](#installation)
* [Usage](#usage)
* [Datasets](#datasets)
* [References](#references)
<a name="introduction"></a>
# Introduction
This repository is the official implementation of [Towards understanding structure–property relations in materials with interpretable deep learning](https://www.nature.com/articles/s41524-023-01163-9).
Please cite us as
```
@article{Vu2023,
doi = {10.1038/s41524-023-01163-9},
issn = {2057-3960},
journal = {npj Computational Materials},
number = {1},
pages = {215},
title = {{Towards understanding structure–property relations in materials with interpretable deep learning}},
url = {https://doi.org/10.1038/s41524-023-01163-9},
volume = {9},
year = {2023}
}
```
We developed a `Self-Consistent Attention-based Neural Network (SCANN)` that takes advantage of a neural network to quantitatively capture
the contribution of each local structure to the material properties.
The model captures information on atomic sites
and their local environments by considering self-consistent long-range interactions to enrich the structural
representations of the materials. A comparative experiment was performed on the QM9 and Materials Project (2018.6.1) benchmark datasets to compare the performance of the proposed model with state-of-the-art representations in terms of prediction accuracy
for several target properties.
Furthermore,
the quantitative contribution of each local structure to the material properties helps in understanding
the structure–property relationships of the materials.
<a name="DeepAt-framework"></a>
# SCANN framework
The Self-Consistent Attention-based Neural Network (SCANN) is an implementation of a deep attention mechanism for materials science.
Figure 1 shows the overall schematic of the model.
![Model architecture](resources/model_semantic.jpg)
<div align='center'><strong>Figure 1. Schematic of SCANN.</strong></div>
<a name="usage"></a>
# Installation
First, create a conda environment for installing the package, for example:
```
conda create -n test python=3.9
conda activate test
```
### Optional GPU dependencies
For hardware with CUDA support, the <b>GPU-enabled TensorFlow build</b> should be installed. Please follow the installation guide at https://www.tensorflow.org/install for more details.
TensorFlow can also be installed from ```conda``` for a simpler setup:
```
conda install -c conda-forge tensorflow-gpu
```
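To confirm that the GPU build is working, a quick check (assuming TensorFlow 2.x) is:
```python
import tensorflow as tf

# An empty list means TensorFlow cannot see a CUDA-capable GPU,
# i.e. the GPU build or the CUDA drivers are not set up correctly.
print(tf.config.list_physical_devices('GPU'))
```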
#### Method 1 (install directly from git)
You can install the latest development version of SCANN from this repository using:
```
git clone https://github.com/sinhvt3421/scann-material
cd scann-material
python -m pip install -e .
```
#### Method 2 (using pypi)
The latest stable version of SCANN can be installed via pip:
```
pip install scann-model
```
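Either method installs the same package; a quick sanity check of the installation, using only names that appear in the usage example below, is:
```python
# Verify that the SCANN package imports correctly after installation;
# these imports raise ImportError if the installation failed.
from scann.models import SCANN
from scann.utils import load_file, prepare_input_pmt

print(SCANN.__name__)
```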
# Usage
Our current implementation supports a variety of use cases for users with
different requirements and experience with deep learning. Please also visit
the [notebooks directory](notebooks) for Jupyter notebooks with more detailed code examples.
Below is an example of predicting the HOMO property and the corresponding global attention (GA) scores:
```python
from scann.utils import load_file, prepare_input_pmt
from scann.models import SCANN
import yaml
# Load the config and the pretrained model from the trained-model folder
config = yaml.safe_load(open('trained_model/homo/config.yaml'))
scann = SCANN(config, pretrained='trained_model/homo/model_homo.h5', mode='infer')

# Load the structure file as a pymatgen Structure
struct = load_file('abc.xyz')  # pymatgen.core.Structure
inputs = prepare_input_pmt(struct, d_t=4.0, w_t=0.4, angle=False)  # distance and weight thresholds
# Predict the target property with the ga score for interpretation
pre_target, ga_score = scann.model.predict(inputs)
```
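As a quick follow-up, a minimal sketch for inspecting the outputs of the example above; the exact array shapes are an assumption here (a single scalar target and one GA score per atomic site):
```python
import numpy as np

# Scalar prediction for the requested property (HOMO in this example)
print('Predicted HOMO:', float(np.squeeze(pre_target)))

# One global-attention score per atomic site of the input structure
ga = np.squeeze(ga_score)
for site, score in zip(struct, ga):
    print(f'{site.species_string:>3s}  GA score: {float(score):.4f}')
```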
## Using pre-built models
In our work, we have already built models for the QM9 [1] and Materials Project 2018 [2] datasets. The models are provided as serialized HDF5 + YAML files.
* QM9 molecule data:
  * HOMO: Highest occupied molecular orbital energy
  * LUMO: Lowest unoccupied molecular orbital energy
  * Gap: energy gap
  * α: isotropic polarizability
  * Cv: heat capacity at 298 K
The MAEs on the various models are given below:
### Performance on QM9
| Property | Units | SCANN | SCANN<sup>+</sup> |
|----------|-------|-------|-------------------|
| HOMO | meV | 41 | 32 |
| LUMO | meV | 37 | 31 |
| Gap | meV | 61 | 52 |
| α | Bohr<sup>3</sup> | 0.141 | 0.115 |
| Cv | cal/(mol·K) | 0.050 | 0.041 |
### Performance on Materials Project 2018.6.1
| Property | Units | SCANN | SCANN<sup>+</sup> |
|----------|-------|-------|-------------------|
| E<sub>f</sub> (formation energy) | meV atom<sup>-1</sup> | 29 | 28 |
| E<sub>g</sub> (band gap) | meV | 260 | 225 |
<a name="dataset"></a>
# Datasets
## Experiments
The settings specific to each experiment are placed in the folder [configs](configs).
We provide implementations for the QM9, fullerene-MD, Pt/graphene-MD, Materials Project 2018.6.1, and SmFe12-CD [3] experiments.
# Basic usage
## Data preprocessing
To train a new model for each dataset, please follow the example scripts below. If the data is not yet available, please run ```preprocess_data.py``` to download it and create data formats suitable for the SCANN model. For example:
```
$ python preprocess_data.py qm9 processed_data --dt=4.0 --wt=0.4 --p=8
-----
$ python preprocess_data.py mp2018 processed_data --dt=6.0 --wt=0.4 --p=8
```
The data for <b>QM9</b> or <b>Materials Project 2018</b> will be automatically downloaded and processed into the folder [processed_data](processed_data). For all available datasets and the options for the cutoff distance/Voronoi angle, please run ```python preprocess_data.py -h``` to show all details.
## Model training
After that, edit the config file located in the folder [configs](configs) to customize the model hyperparameters or the data loading/saving paths.
```
$ python train.py homo configs/model_qm9.yaml --use_drop=True
```
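If you prefer to prepare a customized config programmatically instead of editing it by hand, a minimal sketch is shown below; the commented-out keys are hypothetical placeholders, so use the key names found in the shipped YAML files:
```python
import yaml

# Load a shipped config from the configs folder
with open('configs/model_qm9.yaml') as f:
    config = yaml.safe_load(f)

# Adjust settings before training. These keys are hypothetical placeholders;
# check the actual YAML file for the real key names.
# config['batch_size'] = 64
# config['save_path'] = 'trained/qm9_homo'

# Write the customized copy and point train.py at it
with open('configs/model_qm9_custom.yaml', 'w') as f:
    yaml.safe_dump(config, f)
```
Then run ```python train.py homo configs/model_qm9_custom.yaml``` as above.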
To train on the fullerene-MD dataset with pretrained weights from the QM9 dataset, please follow the example below. The pretrained model will be loaded from the path given by the ```--pretrained``` argument.
```
$ python train.py homo configs/model_fullerene.yaml --pretrained=.../qm9/homo/models/model.h5
```
To run evaluation with pretrained weights, please use:
```
$ python train.py homo ..../qm9/homo/configs.yaml --pretrained=.../qm9/homo/models/model.h5 --mode=eval
```
## Model inference
The script ```predict_files.py``` supports loading an ```xyz``` file and predicting its properties with the pretrained models. The global attention (GA) scores used to interpret the structure–property relationship are also computed and saved in ```xyz``` format. Please use a visualization tool such as OVITO [4] to display the results.
```
$ python predict_files.py ..../models.h5 save_path.../ experiments/molecules/Dimethyl_fumarate.xyz
```
![Visualization of GA scores](resources/ovito_visual.png)
<div align='center'><strong>Figure 2. Example of SCANN prediction for LUMO property.</strong></div>
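For a programmatic look at the saved file, one option is to read it back with ASE; this is only a sketch, since ASE is not a SCANN dependency and the exact name of the per-atom GA column in the output may differ:
```python
from ase.io import read  # requires the separate `ase` package

# Read the xyz file written by predict_files.py (hypothetical output path)
atoms = read('save_path/Dimethyl_fumarate.xyz')

# List the per-atom arrays stored in the file; the GA scores should appear
# under whatever column name predict_files.py used when saving.
print(atoms.arrays.keys())
```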
<a name="usage"></a>
<a name="references"></a>
# References
[1] Ramakrishnan, R., Dral, P., Rupp, M. et al. Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 1, 140022 (2014). https://doi.org/10.1038/sdata.2014.22
[2] Xie, T. & Grossman, J. C. Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett. 120, 145301 (2018). https://doi.org/10.1103/PhysRevLett.120.145301
[3] Nguyen, DN., Kino, H., Miyake, T. et al. Explainable active learning in investigating structure–stability of SmFe<sub>12-α-β</sub>X<sub>α</sub>Y<sub>β</sub> structures X, Y ∈ {Mo, Zn, Co, Cu, Ti, Al, Ga}. MRS Bulletin 48, 31–44 (2023). https://doi.org/10.1557/s43577-022-00372-9
[4] A. Stukowski, Visualization and Analysis of Atomistic Simulation Data with OVITO–the Open Visualization Tool, Model. Simul. Mater. Sci. Eng. 18, 15012 (2009). https://doi.org/10.1088/0965-0393/18/1/015012