| Field | Value |
| --- | --- |
| Name | molbar |
| Version | 1.1.3 |
| home_page | https://git.rwth-aachen.de/bannwarthlab/MolBar |
| Summary | Molecular Barcode (MolBar): Molecular Identifier for Organic and Inorganic Molecules |
| upload_time | 2024-10-03 11:05:44 |
| maintainer | None |
| docs_url | None |
| author | Nils van Staalduinen |
| requires_python | >=3.8 |
| license | MIT License, Copyright (c) 2022 Nils van Staalduinen, Christoph Bannwarth (full text under "License and Disclaimer") |
| keywords | molecular identifier, chemical data science, stereoisomerism |
| requirements | No requirements were recorded. |
# MolBar
<div align="center">
<img src="logo.png" alt="logo" width="400" />
</div>
This package offers an implementation of the Molecular Barcode (MolBar), a molecular identifier inspired by quantum chemistry, designed to ensure data uniqueness in chemical structure databases. The identifier is optimized for computational chemistry applications, such as black-box chemical space exploration, by generating the identifier directly from 3D Cartesian coordinates and encoding the 3D shape of the molecule, including information typically not found in a Molfile. This encompasses details about highly distorted structures that deviate significantly from VSEPR geometries. It supports both organic and inorganic molecules and aims to describe relative and absolute configurations, including centric, axial/helical, and planar chirality.
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI Downloads](https://img.shields.io/pypi/dm/molbar.svg?label=PyPI%20downloads)](https://pypi.org/project/molbar/)
[![ChemRxiv Paper](https://img.shields.io/badge/DOI-10.1038%2Fs41586--020--2649--2-blue)](https://chemrxiv.org/engage/chemrxiv/article-details/65e3cd80e9ebbb4db9c71da0)
- **ChemRxiv Paper:** https://chemrxiv.org/engage/chemrxiv/article-details/65e3cd80e9ebbb4db9c71da0
- **Documentation:** https://git.rwth-aachen.de/bannwarthlab/molbar/-/blob/main/README.md?ref_type=heads
- **Source code:** https://git.rwth-aachen.de/bannwarthlab/molbar
- **Bug reports:** https://git.rwth-aachen.de/bannwarthlab/molbar/-/issues
- **Email contact:** van.staalduinen@pc.rwth-aachen.de
MolBar does this by fragmenting a molecule into rigid parts, which are then idealized with a specialized non-physical force field. The molecule is then described by matrices encoding topology (connectivity), topography (3D positions of atoms after input unification), and absolute configuration (by calculating a chirality index). The final barcode is the concatenation of the spectra of these matrices.
## Current Limitations
**The input file must contain 3D coordinates and explicit hydrogens.**
MolBar should work well for organic and inorganic molecules with typical 2c2e bonding. It can describe molecules based on their relative and absolute configuration, including centric, axial/helical and planar chirality.
Given that 3D Cartesian coordinates are the usual starting point, problems may arise when determining which atoms are bonded, particularly in metal complexes with η-bonds. Additionally, challenges can occur if the geometry around a metal in a complex cannot be classified by one of the standard VSEPR models. Solutions to these issues are being developed and will be released in upcoming versions. If you encounter difficulties, use the `-d` option when running MolBar as a command-line tool, or set `write_trj=True` when using MolBar as a Python module, to examine the optimized trajectories of each fragment. If anything is unclear or if you encounter any unusual behavior, please report it by posting an issue or via email at van.staalduinen@pc.rwth-aachen.de.
For rigidity analysis, MolBar only considers double/triple bonds and rings to be rigid. Hindered rotation caused by bulky substituents, for example, is not taken into account. It can be added manually through the input file as an additional dihedral constraint, but this should be done carefully and only in exceptional cases.
## Getting started (tested on Linux and macOS, compiling works for Windows only in WSL)
### For Linux/macOS
Using a virtual environment is highly recommended because it allows you to create isolated environments with their own dependencies, without interfering with other Python projects or the system Python installation. This ensures that your Python environment remains consistent and reproducible across different machines and over time. To create one, type in the following command in your terminal:
```bash
python3 -m venv your_path
```
To activate the environment, type in:
```bash
source your_path/bin/activate
```
To install MolBar, enter the following command in your terminal:
```bash
pip install molbar
```
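To confirm that the installation landed in the active environment, a quick check along the following lines may help. This is a minimal sketch using only the standard library; the helper names `molbar_is_installed` and `python_is_supported` are made up for this example and are not part of MolBar.

```python
import importlib.util
import sys


def molbar_is_installed() -> bool:
    """Return True if a package named 'molbar' can be found in this environment."""
    return importlib.util.find_spec("molbar") is not None


def python_is_supported() -> bool:
    """The package metadata declares requires_python >= 3.8."""
    return sys.version_info >= (3, 8)


print("Python version supported:", python_is_supported())
print("molbar importable:", molbar_is_installed())
```

If the second line prints `False`, the virtual environment is probably not activated or the installation failed (for example because no Fortran compiler was available).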
### For Windows
Since compiling in a standard Windows environment does not work yet, it is highly recommended to use the WSL (Windows Subsystem for Linux) extension. Simply follow this installation guide: https://learn.microsoft.com/en-us/windows/wsl/install. Note that a Fortran compiler needs to be installed manually in the WSL environment. Otherwise, the installation of MolBar will result in an error.
For Python usage, it is highly recommended to use Visual Studio Code (VSC) as it provides specific extensions to code directly in WSL. A more detailed guide can be found here: https://code.visualstudio.com/docs/remote/wsl
# MolBar Structure
For L-alanine, the MolBar reads:
```text
MolBar | 1.1.2 | C3NO2H7 | 0 | -339 -140 -110 -32 13 20 20 20 160 237 432 528 850 | -209 -8 130 160 354 633 | -108 -79 -42 -24 11 11 11 47 75 140 293 433 891 | -11 0 0 0 31
```
The different parts of MolBar are defined as follows:
```text
Version: 1.1.2
Molecular Formula: C3H7NO2
Topology Spectrum: -339 -140 -110 -32 13 20 20 20 160 237 432 528 850 (Encoding atomic connectivity)
Heavy Atom Topology Spectrum: -209 -8 130 160 354 633 (Encoding atomic connectivity without hydrogen atoms. If the topology spectra of two molecules differ but their heavy atom topology spectra are the same, the two molecules are tautomeric structures)
Topography Spectrum: -108 -79 -42 -24 11 11 11 47 75 140 293 433 891 (3D arrangement of atoms in Cartesian space, also describes diastereomerism)
Chirality: -11 0 0 0 31 (Encoding absolute configuration for each fragment)
```
The chirality barcode can only be compared between two molecules if their topology and topography barcodes are identical. If the chirality barcodes differ only in their sign, this indicates that the two molecules are enantiomers.
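The layout described above can be split into its parts with a few lines of Python. The following is a sketch, not part of the MolBar package: `ParsedMolBar`, `parse_molbar`, and `are_enantiomers` are hypothetical helpers, the field between the formula and the spectra is kept under the neutral name `field_3` because it is not named in the list above, and "differ only in their sign" is interpreted here as "the sorted, negated chirality values match", which is an assumption about how the values are ordered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ParsedMolBar:
    version: str
    formula: str
    field_3: str  # field between formula and spectra; not named in the text above
    topology: List[int]
    heavy_atom_topology: List[int]
    topography: List[int]
    chirality: List[int]


def _to_ints(text: str) -> List[int]:
    return [int(token) for token in text.split()]


def parse_molbar(barcode: str) -> ParsedMolBar:
    """Split a MolBar string on '|' into its eight parts."""
    parts = [part.strip() for part in barcode.split("|")]
    if len(parts) != 8 or parts[0] != "MolBar":
        raise ValueError("unexpected MolBar layout")
    return ParsedMolBar(
        parts[1], parts[2], parts[3],
        _to_ints(parts[4]), _to_ints(parts[5]), _to_ints(parts[6]), _to_ints(parts[7]),
    )


def are_enantiomers(a: ParsedMolBar, b: ParsedMolBar) -> bool:
    """Chirality is only compared when topology and topography agree."""
    if a.topology != b.topology or a.topography != b.topography:
        return False
    negated = sorted(-value for value in b.chirality)
    return sorted(a.chirality) == negated and sorted(a.chirality) != sorted(b.chirality)


L_ALANINE = (
    "MolBar | 1.1.2 | C3NO2H7 | 0 | -339 -140 -110 -32 13 20 20 20 160 237 432 528 850"
    " | -209 -8 130 160 354 633 | -108 -79 -42 -24 11 11 11 47 75 140 293 433 891"
    " | -11 0 0 0 31"
)
# Hand-made mirror image for illustration only, not real MolBar output.
MIRRORED = " | ".join(L_ALANINE.split(" | ")[:-1] + ["-31 0 0 0 11"])

parsed = parse_molbar(L_ALANINE)
print(parsed.chirality)
print(are_enantiomers(parsed, parse_molbar(MIRRORED)))
```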
# Generating MolBar
MolBar can be generated using either of the following methods:
1. Python Interface: Refer to the Python Module Usage for detailed instructions on generating MolBar using Python.
2. Command Line Interface with the following command: molbar coord.xyz (see Command Line Usage for more information)
## Python Module Usage
MolBar can be generated by Python function calls:
1. for a single molecule with ```get_molbar_from_coordinates``` by specifying the Cartesian coordinates as a list,
2. for several molecules at once with ```get_molbars_from_coordinates``` by giving a list of lists with Cartesian coordinates,
3. for a single molecule with ```get_molbar_from_file``` by specifying a file path,
4. for several molecules at once with ```get_molbars_from_files``` by specifying a list of file paths.
### 1. get_molbar_from_coordinates
NOTE:
If you need to process multiple molecules at once, it is recommended to use ```get_molbars_from_coordinates``` instead of ```get_molbar_from_coordinates```.
```python
from molbar.barcode import get_molbar_from_coordinates
def get_molbar_from_coordinates(coordinates: list, elements: list, return_data=False, timing=False, input_constraint=None, mode="mb") -> Union[str, dict]:
```
#### Arguments:
```python
coordinates (list): Molecular geometry provided by atomic Cartesian coordinates with shape (n_atoms, 3).
```
```python
elements (list): A list of elements in that molecule. Either the element symbols or atomic numbers can be used.
```
```python
return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.
```
```python
timing (bool): Whether to print the duration of this calculation.
```
```python
input_constraint (dict, optional): A dict of extra constraints for the calculation. See below for more information. USED ONLY IN EXCEPTIONAL CASES.
```
```python
mode (str): Whether to calculate the molecular barcode ("mb") or only the topology part of the molecular barcode ("topo").
```
#### Returns:
```python
Union[str, dict]: Either MolBar or the MolBar and MolBar data.
```
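A call for a single molecule could look like the sketch below. The water geometry is illustrative, the coordinates are assumed to be in angstrom, and `check_inputs` and `water_molbar` are hypothetical helpers written for this example. The `molbar` import is placed inside the function so the snippet can be loaded even where the package is not installed.

```python
from typing import List, Sequence

# Illustrative water geometry; coordinates assumed to be in angstrom.
ELEMENTS: List[str] = ["O", "H", "H"]
COORDINATES: List[List[float]] = [
    [0.000, 0.000, 0.117],
    [0.000, 0.757, -0.469],
    [0.000, -0.757, -0.469],
]


def check_inputs(coordinates: Sequence[Sequence[float]], elements: Sequence) -> None:
    """Check the (n_atoms, 3) shape and that every atom has an element."""
    if len(coordinates) != len(elements):
        raise ValueError("coordinates and elements must have the same length")
    for row in coordinates:
        if len(row) != 3:
            raise ValueError("each atom needs exactly three Cartesian coordinates")


def water_molbar() -> str:
    # Imported lazily; requires an installed molbar package.
    from molbar.barcode import get_molbar_from_coordinates

    check_inputs(COORDINATES, ELEMENTS)
    return get_molbar_from_coordinates(COORDINATES, ELEMENTS, mode="mb")
```

Calling `water_molbar()` in an environment with MolBar installed should return the barcode string; passing `mode="topo"` instead would request only the topology part.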
Input constraints should be used only in exceptional cases. They can be useful for adding dihedral constraints to single bonds that are normally treated as freely rotatable but whose rotation is in fact hindered (e.g., 90° binol systems with bulky substituents).
```python
{
'constraints': {
'dihedrals': [{'atoms': [1,2,3,4], 'value':90.0},...]} #atoms: list of atoms that define the dihedral, value is the ideal dihedral angle in degrees, atom indexing starts with 1.
}
```
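Building this dictionary by hand is error-prone, mostly because atom indexing starts at 1. The sketch below assembles the same structure with a few sanity checks; `dihedral_constraint` and `make_input_constraint` are hypothetical helpers for this example, not MolBar functions.

```python
from typing import Dict, Iterable, List, Sequence


def dihedral_constraint(atoms: Sequence[int], value: float) -> Dict[str, object]:
    """One dihedral entry: four distinct 1-based atom indices and an angle in degrees."""
    if len(atoms) != 4:
        raise ValueError("a dihedral is defined by exactly four atoms")
    if any(index < 1 for index in atoms):
        raise ValueError("atom indexing starts with 1")
    if len(set(atoms)) != 4:
        raise ValueError("the four atoms must be distinct")
    return {"atoms": list(atoms), "value": float(value)}


def make_input_constraint(dihedrals: Iterable[Dict[str, object]]) -> Dict[str, Dict[str, List]]:
    """Wrap dihedral entries in the nested layout shown above."""
    return {"constraints": {"dihedrals": list(dihedrals)}}


constraint = make_input_constraint([dihedral_constraint([1, 2, 3, 4], 90.0)])
print(constraint)
```

The resulting dictionary has the shape expected by the `input_constraint` argument described above.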
### 2. get_molbars_from_coordinates
NOTE:
If you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.
```python
from molbar.barcode import get_molbars_from_coordinates
def get_molbars_from_coordinates(list_of_coordinates: list, list_of_elements: list, return_data=False, threads=1, timing=False, input_constraints=None, progress=False, mode="mb") -> Union[list, Union[str, dict]]:
```
#### Arguments:
```python
list_of_coordinates (list): A list of molecular geometries provided by atomic Cartesian coordinates with shape (n_molecules, n_atoms, 3).
```
```python
list_of_elements (list): A list of element lists for each molecule in the list_of_coordinates with shape (n_molecules, n_atoms). Either the element symbols or atomic numbers can be used.
```
```python
return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.
```
```python
threads (int): Number of threads to use for the calculation. If you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.
```
```python
timing (bool): Whether to print the duration of this calculation.
```
```python
input_constraints (list, optional): A list of constraints for the calculation. Each constraint in that list is a Python dict as shown above for get_molbar_from_coordinates.
```
```python
progress (bool): Whether to show a progress bar.
```
```python
mode (str): Whether to calculate the molecular barcode ("mb") or the topology part of the molecular barcode ("topo").
```
#### Returns:
```python
Union[list, Union[str, dict]]: Either MolBar or the MolBar and MolBar data.
```
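For the batch function, the per-molecule data has to be arranged as two parallel lists. The following sketch shows one way to do that; the geometries are illustrative (assumed to be in angstrom), and `split_molecules` and `batch_molbars` are hypothetical helpers. The `molbar` import is deferred so the snippet loads without the package.

```python
from typing import List, Sequence, Tuple

Molecule = Tuple[Sequence[str], Sequence[Sequence[float]]]

# Illustrative geometries (water and H2); values assumed to be in angstrom.
MOLECULES: List[Molecule] = [
    (["O", "H", "H"], [[0.0, 0.0, 0.117], [0.0, 0.757, -0.469], [0.0, -0.757, -0.469]]),
    (["H", "H"], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]]),
]


def split_molecules(molecules: Sequence[Molecule]):
    """Turn [(elements, coordinates), ...] into (list_of_coordinates, list_of_elements)."""
    list_of_coordinates, list_of_elements = [], []
    for elements, coordinates in molecules:
        if len(elements) != len(coordinates):
            raise ValueError("each atom needs one element and one coordinate triple")
        list_of_elements.append(list(elements))
        list_of_coordinates.append([list(xyz) for xyz in coordinates])
    return list_of_coordinates, list_of_elements


def batch_molbars(molecules: Sequence[Molecule], threads: int = 2):
    # Imported lazily; requires an installed molbar package.
    from molbar.barcode import get_molbars_from_coordinates

    list_of_coordinates, list_of_elements = split_molecules(molecules)
    return get_molbars_from_coordinates(
        list_of_coordinates, list_of_elements, threads=threads, progress=False
    )
```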
### 3. get_molbar_from_file
NOTE:
If you need to process multiple molecules provided by files at once, it is recommended to use ```get_molbars_from_files``` instead of ```get_molbar_from_file```.
```python
from molbar.barcode import get_molbar_from_file
def get_molbar_from_file(file: str, return_data=False, timing=False, input_constraint=None, mode="mb", write_trj=False) -> Union[str, dict]:
```
#### Arguments:
```python
file (str): The path to the file containing the molecule information. The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
```python
return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.
```
```python
timing (bool): Whether to print the duration of this calculation.
```
```python
input_constraint (dict, optional): A dict of extra constraints for the calculation. See below for more information. USED ONLY IN EXCEPTIONAL CASES.
```
```python
mode (str): Whether to calculate the molecular barcode ("mb") or only the topology part of the molecular barcode ("topo").
```
```python
write_trj (bool, optional): Whether to write a trajectory of the unification process. Defaults to False.
```
#### Returns:
```python
Union[str, dict]: Either MolBar or the MolBar and MolBar data.
```
Example of an input file in .yml format. Input constraints should be used only in exceptional cases. They can be useful for adding a dihedral constraint to bonds that are normally treated as rotatable single bonds but whose rotation is hindered (e.g., 90° binol systems with bulky substituents).
```yml
constraints:
dihedrals:
- atoms: [30, 18, 14, 13] # List of atoms involved in the dihedral
value: 90.0 # Actual values for the dihedral parameters
```
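A constraint file of this shape can also be generated from Python without extra dependencies. The sketch below writes the YAML text by hand with two-space nesting (an assumption about the intended indentation); `dihedral_constraints_yaml` is a hypothetical helper, not part of MolBar.

```python
from typing import Iterable, Sequence, Tuple


def dihedral_constraints_yaml(dihedrals: Iterable[Tuple[Sequence[int], float]]) -> str:
    """Render (atoms, value) pairs as a 'constraints: dihedrals:' YAML document."""
    lines = ["constraints:", "  dihedrals:"]
    for atoms, value in dihedrals:
        if len(atoms) != 4:
            raise ValueError("a dihedral is defined by exactly four atoms")
        atom_list = ", ".join(str(int(index)) for index in atoms)
        lines.append(f"    - atoms: [{atom_list}]")
        lines.append(f"      value: {float(value)}")
    return "\n".join(lines) + "\n"


text = dihedral_constraints_yaml([((30, 18, 14, 13), 90.0)])
print(text)
```

The string can then be saved with, for example, `pathlib.Path("constraints.yml").write_text(text)`.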
### 4. get_molbars_from_files
NOTE:
If you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.
```python
from molbar.barcode import get_molbars_from_files
def get_molbars_from_files(files: list, return_data=False, threads=1, timing=False, input_constraints=None, progress=False, mode="mb", write_trj=False) -> Union[list, Union[str, dict]]:
```
#### Arguments:
```python
files (list): The list of paths to the files containing the molecule information. The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
```python
return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.
```
```python
threads (int): Number of threads to use for the calculation. If you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.
```
```python
timing (bool): Whether to print the duration of this calculation.
```
```python
input_constraints (list, optional): A list of file paths to the input files for the calculation. Each constraint is specified by a file path to a .yml file, as shown above for get_molbar_from_file.
```
```python
progress (bool): Whether to show a progress bar.
```
```python
mode (str): Whether to calculate the molecular barcode ("mb") or the topology part of the molecular barcode ("topo").
```
```python
write_trj (bool, optional): Whether to write a trajectory of the unification process. Defaults to False.
```
#### Returns:
```python
Union[list, Union[str, dict]]: Either MolBar or the MolBar and MolBar data.
```
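When the structures live in one directory, the file list can be collected with `pathlib` and passed on in one call. This is a sketch under assumptions: `collect_structure_files` and `molbars_for_directory` are hypothetical helpers, and pairing files with results via `zip` assumes the returned list follows the order of the input list, which is not stated above.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def collect_structure_files(directory: str, pattern: str = "*.xyz") -> List[str]:
    """Return matching file paths in a stable, sorted order."""
    return sorted(str(path) for path in Path(directory).glob(pattern))


def molbars_for_directory(directory: str, threads: int = 4) -> Dict[str, object]:
    # Imported lazily; requires an installed molbar package.
    from molbar.barcode import get_molbars_from_files

    files = collect_structure_files(directory)
    barcodes = get_molbars_from_files(files, threads=threads, progress=True)
    # Assumes results come back in input order.
    return dict(zip(files, barcodes))


# Small demonstration of the file collection with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.xyz", "a.xyz", "notes.txt"):
        Path(tmp, name).touch()
    found = [Path(path).name for path in collect_structure_files(tmp)]
print(found)  # ['a.xyz', 'b.xyz']
```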
## Commandline Usage
MolBar can also be used as a command-line tool. Simply type:
```
molbar coord.xyz
```
and the MolBar is printed to stdout.
NOTE:
If you need to process several molecules at once, it is recommended to pass all molecules to the code at once (e.g. with *.xyz) while specifying the number of threads the code should use:
```bash
molbar *.xyz -T N_threads -s
```
The latter option (-s) is used to store the barcode to .mb files.
Further, the commandline tool provides several options:
```bash
# Basic usage example:
molbar coord.xyz
# Usage with multiple files:
molbar *.xyz -T 8
# Usage for topology barcode only:
molbar coord.mol -m topo
# Usage to return additional data:
molbar coord.pdb -d
# Usage to print timings:
molbar coord.com -t
#### Arguments and Options:
```bash
positional arguments:
file(s)
The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
```bash
-m {mb,topo,opt}, --mode {mb,topo,opt}
The mode to use for the calculations (either "mb" (default, calculates MolBar), "topo" (topology part only)
or "opt" (using stand-alone force field idealization, writes ".opt" with final structure))
```
```bash
-i path/to/input --inp path/to/input
Path to input file in .yml format to add further constraints. Example input can be found below.
```
```bash
-d, --data Whether to print MolBar data.
Writes a "filename/" directory containing a json file with
important information that defines MolBar. Writes idealization trajectories of each fragment to same directory.
```
```bash
-T number_of_threads, --threads number_of_threads
The number of threads to use for parallel processing of several files. MolBar generation for a single file is not parallelized. Should be used together with -s/--save (e.g. molbar *.xyz -T 8 -s)
```
```bash
-s, --save Whether to save the result to a file of type "filename.mb"
```
```bash
-t, --time Print out timings.
```
```bash
-p, --progress Use a progress bar when several files are handled.
```
Example of input file constraints in .yml format. Input constraints should be used only in exceptional cases. They can be useful for adding a dihedral constraint to bonds that are normally treated as rotatable single bonds but whose rotation is hindered (e.g., 90° binol systems with bulky substituents).
```yml
constraints:
dihedrals:
- atoms: [30, 18, 14, 13] # List of atoms involved in the dihedral
value: 90.0 # Actual values for the dihedral parameters
```
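When the command-line tool is driven from a script, it can help to assemble the argument list programmatically. The sketch below uses only the short flags documented above (`-m`, `-i`, `-d`, `-T`, `-s`, `-t`, `-p`); `build_molbar_command` is a hypothetical helper, and nothing is executed here.

```python
import shlex
from typing import List, Optional, Sequence


def build_molbar_command(
    files: Sequence[str],
    threads: Optional[int] = None,
    save: bool = False,
    mode: Optional[str] = None,
    data: bool = False,
    timing: bool = False,
    progress: bool = False,
    input_file: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for the molbar command-line tool."""
    command = ["molbar", *files]
    if mode is not None:
        if mode not in ("mb", "topo", "opt"):
            raise ValueError("mode must be 'mb', 'topo' or 'opt'")
        command += ["-m", mode]
    if input_file is not None:
        command += ["-i", input_file]
    if data:
        command.append("-d")
    if threads is not None:
        command += ["-T", str(threads)]
    if save:
        command.append("-s")
    if timing:
        command.append("-t")
    if progress:
        command.append("-p")
    return command


command = build_molbar_command(["a.xyz", "b.xyz"], threads=8, save=True)
print(shlex.join(command))  # molbar a.xyz b.xyz -T 8 -s
```

The list can be handed to `subprocess.run(command, check=True, capture_output=True, text=True)`. Note that shell wildcards such as `*.xyz` are not expanded in that case, so the files have to be listed explicitly.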
## Using the unification force field for the whole molecule
The force field can also be used to idealize the structure of a whole molecule, with the coordinates given either by a file or, in Python, as a list:
1. as a command line tool with the ```molbar coord.xyz -m opt``` option
2. in Python with ```idealize_structure_from_file``` by providing a file path
3. in Python with ```idealize_structure_from_coordinates``` by providing Cartesian coordinates as a list
### Commandline tool
```text
molbar coord.xyz -m opt
```
This writes a coord.opt file that contains the idealized coordinates.
### In Python from a file:
```python
from molbar.idealize import idealize_structure_from_file
def idealize_structure_from_file(file: str, return_data=False, timing=False, input_constraint=None, write_trj=False) -> Union[list, str]:
```
#### Arguments:
```python
file (str): The path to the input file to be processed.
The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
```python
return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.
```
```python
timing (bool): Whether to print the duration of this calculation.
```
```python
input_constraint (str): The path to the input file containing the constraint for the calculation. See below for more information.
```
```python
write_trj (bool, optional): Whether to write a trajectory of the unification process. Defaults to False.
```
#### Returns:
```python
n_atoms (int): Number of atoms in the molecule.
```
```python
energy (float): Final energy of the molecule after idealization.
```
```python
coordinates (list): Final coordinates of the molecule after idealization.
```
```python
elements (list): Elements of the molecule.
```
```python
data (dict): Molbar data.
```
This is an example input as a yml file:
```yml
bond_order_assignment: False # False if bond order assignment should be skipped; only reasonable in opt mode (standalone force-field optimization).
cycle_detection: True # False if cycle detection should be skipped; only reasonable in opt mode (standalone force-field optimization).
repulsion_charge: 100.0 # Charge used for the Coulomb term in the force field; every atom-atom pair uses the same charge. Only reasonable in opt mode (standalone force-field optimization). Defaults to 100.0.
set_edges: True # False if no bonds should be constrained automatically.
set_angles: True # False if no angles should be constrained automatically.
set_dihedrals: True # False if no dihedrals should be constrained automatically.
set_repulsion: True # False if no Coulomb term should be used automatically.
constraints:
bonds:
- atoms: [19, 23] # List of atoms involved in the bond
value: 1.5 # Ideal bond length.
angles:
- atoms: [19, 23, 35] # List of atoms involved in the angle
value: 45.0 # Angle to which the angle between the three atoms is to be constrained
- atoms: [35, 23, 19] # List of atoms involved in the angle
value: 45.0 # Angle to which the angle between the three atoms is to be constrained
dihedrals:
- atoms: [30, 18, 14, 13] # List of atoms involved in the dihedral
value: 90.0 # Actual values for the dihedral parameters
```
### In Python from a list of Cartesian coordinates:
```python
from molbar.idealize import idealize_structure_from_coordinates
def idealize_structure_from_coordinates(coordinates: list, elements: list, return_data=False, timing=False, input_constraint=None) -> Union[list, str]:
```
#### Arguments:
```python
coordinates (list): Cartesian coordinates of the molecule.
```
```python
elements (list): Elements of the molecule. Either the element symbols or atomic numbers can be used.
```
```python
return_data (bool, optional): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.
```
```python
timing (bool, optional): Whether to print the duration of this calculation.
```
```python
input_constraint (dict, optional): The constraint for the calculation. See documentation for more information.
```
#### Returns:
```python
n_atoms (int): Number of atoms in the molecule.
```
```python
energy (float): Final energy of the molecule after idealization.
```
```python
coordinates (list): Final coordinates of the molecule after idealization.
```
```python
elements (list): Elements of the molecule.
```
```python
data (dict): Molbar data.
```
This is an example input as a Python dict:
```python
{'bond_order_assignment': True, # False if bond order assignment should be skipped; only reasonable in opt mode (standalone force-field optimization).
'cycle_detection': True, # False if cycle detection should be skipped; only reasonable in opt mode (standalone force-field optimization).
'set_edges': True, # False if no bonds should be constrained automatically.
'set_angles': True, # False if no angles should be constrained automatically.
'set_dihedrals': True, # False if no dihedrals should be constrained automatically.
'set_repulsion': True, # False if no Coulomb term should be used automatically.
'repulsion_charge': 100.0, # Charge used for the Coulomb term in the force field; every atom-atom pair uses the same charge. Only reasonable in opt mode (standalone force-field optimization). Defaults to 100.0.
'constraints': {'bonds': [{'atoms': [1,2], 'value':1.5},...], # atoms: list of atoms that define the bond, value is the ideal bond length in angstrom, atom indexing starts with 1.
'angles': [{'atoms': [1,2,3], 'value':90.0},...], # atoms: list of atoms that define the angle, value is the ideal angle in degrees, atom indexing starts with 1.
'dihedrals': [{'atoms': [1,2,3,4], 'value':180.0},...]} # atoms: list of atoms that define the dihedral, value is the ideal dihedral angle in degrees, atom indexing starts with 1.
}
```
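Before passing such a dictionary on, it can be checked locally for the most common mistakes: a wrong number of atoms per constraint, or indices that are not 1-based. The following is a sketch of such a check; `validate_constraints` is a hypothetical helper and does not reflect any validation MolBar performs internally.

```python
from typing import Dict, List

# Number of atoms that define each constraint type, as described above.
EXPECTED_ATOMS: Dict[str, int] = {"bonds": 2, "angles": 3, "dihedrals": 4}


def validate_constraints(input_constraint: dict, n_atoms: int) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []
    for kind, entries in input_constraint.get("constraints", {}).items():
        expected = EXPECTED_ATOMS.get(kind)
        if expected is None:
            problems.append(f"unknown constraint type: {kind}")
            continue
        for position, entry in enumerate(entries):
            atoms = entry.get("atoms", [])
            label = f"{kind}[{position}]"
            if len(atoms) != expected:
                problems.append(f"{label}: expected {expected} atoms, got {len(atoms)}")
            elif any(not 1 <= index <= n_atoms for index in atoms):
                problems.append(f"{label}: atom indices must lie between 1 and {n_atoms}")
            if "value" not in entry:
                problems.append(f"{label}: missing 'value'")
    return problems


example = {
    "set_edges": True,
    "constraints": {
        "bonds": [{"atoms": [1, 2], "value": 1.5}],
        "angles": [{"atoms": [1, 2, 3], "value": 90.0}],
        "dihedrals": [{"atoms": [0, 2, 3, 4], "value": 180.0}],  # 0 is not a valid 1-based index
    },
}
problems = validate_constraints(example, n_atoms=4)
print(problems)
```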
# Additional Command-Line Scripts Provided by the MolBar Package
## Ensplit
Helper script to split an ensemble file into individual ensembles, each with a unique molecular barcode:
```bash
ensplit ensemble.trj -T 8 -topo
```
```bash
file Input file
The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
```bash
-T number_of_threads, --threads number_of_threads Number of threads for processing multiple files.
```
```bash
-p, --progress Show progress bar
```
```bash
-topo, --topo Only evaluate topology
```
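The idea behind splitting an ensemble can be pictured as grouping structures whose barcodes are identical. The sketch below shows only that grouping step on placeholder strings; it is not how `ensplit` is implemented, and `group_by_barcode` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def group_by_barcode(barcodes: Sequence[str]) -> Dict[str, List[int]]:
    """Map each distinct barcode to the indices of the structures that share it."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for index, barcode in enumerate(barcodes):
        groups[barcode].append(index)
    return dict(groups)


# Placeholder strings standing in for barcodes; not real MolBar output.
barcodes = ["A", "B", "A", "C", "B"]
groups = group_by_barcode(barcodes)
print(groups)  # {'A': [0, 2], 'B': [1, 4], 'C': [3]}
```

With real data, `barcodes` would hold one MolBar (or topology-only barcode) per structure in the ensemble.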
## Princax
Helper script to align the molecule to the principal axes.
```bash
princax coord.xyz -r
```
```bash
file Input file
The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
```bash
-r, --replace Overwrite the input file with the aligned coordinates. If not specified, the aligned coordinates are written to a new file.
```
## Invstruc
Helper script to invert structures to yield enantiomers.
```bash
invstruc coord.xyz
```
```bash
file Input file
The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.
```
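Geometrically, one way to obtain the mirror image of a structure is a point inversion through the origin, which maps every atom from (x, y, z) to (-x, -y, -z). The sketch below shows that operation on a plain coordinate list. Whether `invstruc` uses exactly this operation or a mirror reflection is an assumption not confirmed here, and `invert_structure` is a hypothetical helper.

```python
from typing import List, Sequence


def invert_structure(coordinates: Sequence[Sequence[float]]) -> List[List[float]]:
    """Point inversion through the origin: (x, y, z) -> (-x, -y, -z)."""
    return [[-x, -y, -z] for x, y, z in coordinates]


original = [[1.0, -2.0, 0.5], [0.0, 1.5, -1.0]]
inverted = invert_structure(original)
print(inverted)
```

Applying the inversion twice returns the original coordinates, and for a chiral molecule the inverted structure corresponds to the enantiomer.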
## Acknowledgments
MolBar relies on the following libraries and packages:
* [ase](https://wiki.fysik.dtu.dk/ase/)
* [dscribe](https://singroup.github.io/dscribe/latest/)
* [networkx](https://networkx.org/)
* [NumPy](https://numpy.org)
* [Numba](https://numba.pydata.org)
* [SciPy](https://scipy.org)
* [tqdm](https://github.com/tqdm/tqdm)
* [joblib](https://joblib.readthedocs.io/en/latest/)
Thank you!
## License and Disclaimer
MIT License
Copyright (c) 2022 Nils van Staalduinen, Christoph Bannwarth
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Raw data
{
"_id": null,
"home_page": "https://git.rwth-aachen.de/bannwarthlab/MolBar",
"name": "molbar",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "Nils van Staalduinen <van.staalduinen@pc.rwth-aachen.de>",
"keywords": "molecular identifier, chemical data science, stereoisomerism",
"author": "Nils van Staalduinen",
"author_email": "Nils van Staalduinen <van.staalduinen@pc.rwth-aachen.de>, Christoph Bannwarth <bannwarth@pc.rwth-aachen.de>",
"download_url": "https://files.pythonhosted.org/packages/14/30/62d59086aef1a1fca7a50454cd2ef2a564d30e9f897149bc58210bf88cc6/molbar-1.1.3.tar.gz",
"platform": null,
"description": "\n\n# MolBar\n\n<div align=\"center\">\n<img src=\"logo.png\" alt=\"logo\" width=\"400\" />\n</div>\n\nThis package offers an implementation of the Molecular Barcode (MolBar), a molecular identifier inspired by quantum chemistry, designed to ensure data uniqueness in chemical structure databases. The identifier is optimized for computational chemistry applications, such as black-box chemical space exploration, by generating the identifier directly from 3D Cartesian coordinates and encoding the 3D shape of the molecule, including information typically not found in a Molfile. This encompasses details about highly distorted structures that deviate significantly from VSEPR geometries. It supports both organic and inorganic molecules and aims to describe relative and absolute configurations, including centric, axial/helical, and planar chirality.\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![PyPI Downloads](https://img.shields.io/pypi/dm/molbar.svg?label=PyPI%20downloads)](https://pypi.org/project/molbar/)\n[![ChemRxiv Paper](https://img.shields.io/badge/DOI-10.1038%2Fs41586--020--2649--2-blue)](\nhttps://chemrxiv.org/engage/chemrxiv/article-details/65e3cd80e9ebbb4db9c71da0)\n\n- **ChemRxiv Paper:** https://chemrxiv.org/engage/chemrxiv/article-details/65e3cd80e9ebbb4db9c71da0\n- **Documentation:** https://git.rwth-aachen.de/bannwarthlab/molbar/-/blob/main/README.md?ref_type=heads\n- **Source code:** https://git.rwth-aachen.de/bannwarthlab/molbar\n- **Bug reports:** https://git.rwth-aachen.de/bannwarthlab/molbar/-/issues\n- **Email contact:** van.staalduinen@pc.rwth-aachen.de\n\nIt does this by fragmentating a molecule into rigid parts which are then idealized with a specialized non-physical force field. 
The molecule is then described by different matrices encoding topology (connectivity), topography (3D positions of atoms after input unification), and absolute configuration (by calculating a chirality index). The final barcode is the concatenated spectra of these matrices.\n\n## Current Limitations \n\n**The input file must contain 3D coordinates and explicit hydrogens.**\n\nMolBar should work well for organic and inorganic molecules with typical 2c2e bonding. It can describe molecules based on their relative and absolute configuration, including centric, axial/helical and planar chirality.\n\nGiven that 3D Cartesian coordinates are the usual starting point, problems may arise when determining which atoms are bonded, particularly in metal complexes with \u03b7-bonds. Additionally, challenges can occur if the geometry around a metal in a complex cannot be classified by one of the standard VSEPR models. Solutions to these issues are being developed and are released with the next versions. If you encounter difficulties, you can use the -d option when using MolBar as a command-line tool or set write_trj=True when using MolBar as a Python module to examine the optimized trajectories of each fragment. If anything is unclear or if you encounter any unusual behavior, please report it by posting issues or via email at van.staalduinen@pc.rwth-aachen.de.\n\nFor rigidity analysis, MolBar only considers double/triple bonds and rings to be rigid. 
For example, hindered rotation caused by bulky substituents is not taken into account, but it can be added manually through the input file as an additional dihedral constraint. This should be done carefully and only in exceptional cases.\n\n\n## Getting started (tested on Linux and macOS, compiling works for Windows only in WSL)\n\n### For Linux/macOS\n\nUsing a virtual environment is highly recommended because it allows you to create isolated environments with their own dependencies, without interfering with other Python projects or the system Python installation. This ensures that your Python environment remains consistent and reproducible across different machines and over time. To create one, enter the following command in your terminal:\n\n```bash\n python3 -m venv your_path\n```\nTo activate the environment, enter:\n```bash\n source your_path/bin/activate\n```\nTo install MolBar, enter the following command in your terminal:\n\n```bash\npip install molbar\n```\n\n### For Windows\n\nSince compiling in a standard Windows environment does not work yet, it is highly recommended to use the Windows Subsystem for Linux (WSL). Simply follow this installation guide: https://learn.microsoft.com/en-us/windows/wsl/install. Note that a Fortran compiler needs to be installed manually in the WSL environment. Otherwise, the installation of MolBar will result in an error.\n\nFor Python usage, it is highly recommended to use Visual Studio Code (VS Code) as it provides specific extensions to code directly in WSL. 
A more detailed guide can be found here: https://code.visualstudio.com/docs/remote/wsl\n\n\n# MolBar Structure\n\nFor L-alanine, the MolBar reads:\n\n```text\n\nMolBar | 1.1.2 | C3NO2H7 | 0 | -339 -140 -110 -32 13 20 20 20 160 237 432 528 850 | -209 -8 130 160 354 633 | -108 -79 -42 -24 11 11 11 47 75 140 293 433 891 | -11 0 0 0 31\n\n```\n\nThe different parts of MolBar are defined as follows:\n\n```text\nVersion: 1.1.2\nMolecular Formula: C3H7NO2\nTopology Spectrum: -339 -140 -110 -32 13 20 20 20 160 237 432 528 850 (Encoding atomic connectivity)\nHeavy Atom Topology Spectrum: -209 -8 130 160 354 633 (Encoding atomic connectivity without hydrogen atoms. If two molecules have different topology spectra but identical heavy atom topology spectra, they are tautomers)\nTopography Spectrum: -108 -79 -42 -24 11 11 11 47 75 140 293 433 891 (3D arrangement of atoms in Cartesian space, also describes diastereomerism)\nChirality: -11 0 0 0 31 (Encoding absolute configuration for each fragment)\n```\n\nThe chirality barcodes of two molecules can only be compared if their topology and topography barcodes are identical. If the chirality barcodes differ only in their sign, the two molecules are enantiomers.\n\n# Generating MolBar\n\nMolBar can be generated using either of the following methods:\n\n\t1.\tPython Interface: Refer to the Python Module Usage section for detailed instructions on generating MolBar using Python.\n\n\t2.\tCommand Line Interface with the following command: molbar coord.xyz (see the Commandline Usage section for more information)\n\n\n## Python Module Usage\n\nMolBar can be generated by Python function calls:\n\n1. for a single molecule with ```get_molbar_from_coordinates``` by specifying the Cartesian coordinates as a list,\n2. for several molecules at once with ```get_molbars_from_coordinates``` by giving a list of lists with Cartesian coordinates,\n3. 
for a single molecule with ```get_molbar_from_file``` by specifying a file path,\n4. for several molecules at once with ```get_molbars_from_files``` by specifying a list of file paths.\n\n### 1. get_molbar_from_coordinates\nNOTE:\nIf you need to process multiple molecules at once, it is recommended to use ```get_molbars_from_coordinates``` instead of ```get_molbar_from_coordinates```.\n\n```python\n from molbar.barcode import get_molbar_from_coordinates\n\n def get_molbar_from_coordinates(coordinates: list, elements: list, return_data=False, timing=False, input_constraint=None, mode=\"mb\") -> Union[str, dict]\n\n```\n#### Arguments:\n\n```python\n coordinates (list): Molecular geometry provided by atomic Cartesian coordinates with shape (n_atoms, 3).\n```\n```python\n elements (list):A list of elements in that molecule. Either the element symbols or atomic numbers can be used.\n```\n```python\n return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.\n```\n```python\n timing (bool): Whether to print the duration of this calculation.\n```\n```python\n input_constraint (dict, optional): A dict of extra constraints for the calculation. See below for more information. USED ONLY IN EXCEPTIONAL CASES.\n```\n```python\n mode (str): Whether to calculate the molecular barcode (\"mb\") or only the topology part of the molecular barcode (\"topo\").\n```\n\n#### Returns:\n\n```python\n Union[str, dict]: Either MolBar or the MolBar and MolBar data.\n```\n\nInput constraints should be used only in exceptional cases. 
However, they may be useful to constrain the molecule with additional dihedrals for rotations that are normally considered around single bonds but whose rotation is hindered (e.g., 90\u00b0 binol systems with bulky substituents).\n\n ```python\n {\n 'constraints': {\n 'dihedrals': [{'atoms': [1,2,3,4], 'value':90.0},...]} #atoms: list of atoms that define the dihedral, value is the ideal dihedral angle in degrees, atom indexing starts with 1.\n }\n ```\n\n\n### 2. get_molbars_from_coordinates\nNOTE:\nIf you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.\n\n ```python\n from molbar.barcode import get_molbars_from_coordinates\n\n def get_molbars_from_coordinates(list_of_coordinates: list, list_of_elements: list, return_data=False, threads=1, timing=False, input_constraints=None, progress=False, mode=\"mb\") -> Union[list, Union[str, dict]]:\n```\n\n#### Arguments:\n\n```python\n list_of_coordinates (list): A list of molecular geometries provided by atomic Cartesian coordinates with shape (n_molecules, n_atoms, 3).\n```\n```python\n list_of_elements (list): A list of element lists for each molecule in the list_of_coordinates with shape (n_molecules, n_atoms). Either the element symbols or atomic numbers can be used.\n```\n```python\n return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.\n```\n```python\n threads (int): Number of threads to use for the calculation. 
If you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.\n```\n```python\n timing (bool): Whether to print the duration of this calculation.\n```\n```python\n input_constraints (list, optional): A list of constraints for the calculation. Each constraint in that list is a Python dict as shown above for get_molbar_from_coordinates.\n```\n```python\n progress (bool): Whether to show a progress bar.\n```\n```python\n mode (str): Whether to calculate the molecular barcode (\"mb\") or the topology part of the molecular barcode (\"topo\").\n```\n\n#### Returns:\n\n```python\n Union[list, Union[str, dict]]: Either MolBar or the MolBar and MolBar data.\n```\n\n### 3. get_molbar_from_file\nNOTE:\nIf you need to process multiple molecules provided by files at once, it is recommended to use ```get_molbars_from_files``` instead of ```get_molbar_from_file```.\n\n ```python\n from molbar.barcode import get_molbar_from_file\n\n def get_molbar_from_file(file: str, return_data=False, timing=False, input_constraint=None, mode=\"mb\", write_trj=False) -> Union[str, dict]:\n```\n#### Arguments:\n\n```python\n file (str): The path to the file containing the molecule information. The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation, ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.\n```\n```python\n return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. 
This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.\n```\n```python\n timing (bool): Whether to print the duration of this calculation.\n```\n```python\n input_constraint (dict, optional): A dict of extra constraints for the calculation. See below for more information. USED ONLY IN EXCEPTIONAL CASES.\n```\n```python\n mode (str): Whether to calculate the molecular barcode (\"mb\") or only the topology part of the molecular barcode (\"topo\").\n```\n```python\n write_trj (bool, optional): Whether to write a trajectory of the unification process. Defaults to False.\n```\n \n#### Returns:\n\n```python\n Union[str, dict]: Either MolBar or the MolBar and MolBar data.\n ```\n\nExample input file in .yml format. Input constraints should be used only in exceptional cases. However, they can be useful for constraining, with an additional dihedral, bonds that are normally treated as freely rotatable single bonds but whose rotation is hindered (e.g., 90\u00b0 binol systems with bulky substituents).\n```yml\nconstraints:\n  dihedrals:\n    - atoms: [30, 18, 14, 13] # List of atoms involved in the dihedral\n      value: 90.0 # Ideal dihedral angle in degrees\n```\n\n\n### 4. get_molbars_from_files\n\nNOTE:\nIf you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.\n\n ```python\n from molbar.barcode import get_molbars_from_files\n\n def get_molbars_from_files(files: list, return_data=False, threads=1, timing=False, input_constraints=None, progress=False, mode=\"mb\", write_trj=False) -> Union[list, Union[str, dict]]:\n ```\n\n#### Arguments:\n```python\n files (list): The list of paths to the files containing the molecule information. The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. 
For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.\n```\n```python\n return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.\n```\n```python\n threads (int): Number of threads to use for the calculation. If you need to process multiple molecules at once, it is recommended to use this function and specify the number of threads that can be used to process multiple molecules simultaneously.\n```\n```python\n timing (bool): Whether to print the duration of this calculation.\n```\n```python\n input_constraints (list, optional): A list of file paths to the input files for the calculation. Each constraint is specified by a file path to a .yml file, as shown above for get_molbar_from_file.\n```\n```python\n progress (bool): Whether to show a progress bar.\n```\n```python\n mode (str): Whether to calculate the molecular barcode (\"mb\") or the topology part of the molecular barcode (\"topo\").\n```\n```python\n write_trj (bool, optional): Whether to write a trajectory of the unification process. Defaults to False.\n```\n\n#### Returns:\n```python\n Union[list, Union[str, dict]]: Either MolBar or the MolBar and MolBar data.\n\n ```\n\n\n## Commandline Usage\n\nMolBar can also be used as a command-line tool. Simply type:\n\n```\nmolbar coord.xyz\n```\nand the MolBar is printed to stdout.\n\nNOTE:\nIf you need to process several molecules at once, it is recommended to pass all molecules to the code at once (e.g. with *.xyz) while specifying the number of threads the code should use:\n```bash\nmolbar *.xyz -T N_threads -s\n```\nThe latter option (-s) is used to save the barcodes to .mb files. 
\n\n\nFurther, the command-line tool provides several options:\n\n```bash\n# Basic usage example:\nmolbar coord.xyz\n\n# Usage with multiple files:\nmolbar *.xyz -T 8\n\n# Usage for topology barcode only:\nmolbar coord.mol -m topo\n\n# Usage to return additional data:\nmolbar coord.pdb -d\n\n# Usage to print timings:\nmolbar coord.com -t\n```\n#### Arguments and Options:\n\n```bash\npositional arguments:\n file(s) \n\n The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.\n```\n```bash\n -m {mb,topo,opt}, --mode {mb,topo,opt}\n\n The mode to use for the calculations (either \"mb\" (default, calculates MolBar), \"topo\" (topology part only)\n or \"opt\" (using stand-alone force field idealization, writes \".opt\" with final structure))\n```\n```bash\n -i path/to/input, --inp path/to/input\n\n Path to input file in .yml format to add further constraints. Example input can be found below.\n```\n```bash\n -d, --data Whether to print MolBar data. \n Writes a \"filename/\" directory containing a json file with\n important information that defines MolBar. Writes idealization trajectories of each fragment to the same directory.\n```\n```bash\n -T number_of_threads, --threads number_of_threads\n The number of threads to use for parallel processing of several files. MolBar generation for a single file is not parallelized. Should be used together with -s/--save (e.g. molbar *.xyz -T 8 -s)\n```\n```bash\n -s, --save Whether to save the result to a file of type \"filename.mb\"\n```\n```bash\n -t, --time Print out timings.\n```\n```bash\n -p, --progress Use a progress bar when several files are handled.\n```\n\nExample input file with constraints in yml format. An input constraint should be used only in exceptional cases. 
However, it may be useful for constraining, with an additional dihedral, bonds that are normally treated as freely rotatable single bonds but whose rotation is hindered (e.g., 90\u00b0 binol systems with bulky substituents).\n\n```yml\nconstraints:\n  dihedrals:\n    - atoms: [30, 18, 14, 13] # List of atoms involved in the dihedral\n      value: 90.0 # Ideal dihedral angle in degrees\n```\n\n\n\n## Using the unification force field for the whole molecule\n\nThe force field can be used to idealize the structure of a whole molecule in one of three ways:\n\n1. as a command line tool with the ```molbar coord.xyz -m opt``` option\n2. in Python with ```idealize_structure_from_file``` by providing a file path\n3. in Python with ```idealize_structure_from_coordinates``` by providing Cartesian coordinates as a list\n \n\n### Commandline tool\n```text\nmolbar coord.xyz -m opt\n```\nThis writes a coord.opt file that contains the idealized coordinates.\n\n### In Python from a file:\n```python\n from molbar.idealize import idealize_structure_from_file\n\n def idealize_structure_from_file(file: str, return_data=False, timing=False, input_constraint=None, write_trj=False) -> Union[list, str]\n```\n#### Arguments:\n```python\n file (str): The path to the input file to be processed. \n \n The file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation (ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.\n```\n```python\n return_data (bool): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. 
This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.\n```\n```python\n timing (bool): Whether to print the duration of this calculation.\n```\n```python\n input_constraint (str): The path to the input file containing the constraint for the calculation. See below for more information.\n```\n```python\n write_trj (bool, optional): Whether to write a trajectory of the unification process. Defaults to False.\n```\n#### Returns:\n```python\n n_atoms (int): Number of atoms in the molecule.\n```\n```python\n energy (float): Final energy of the molecule after idealization.\n```\n```python\n coordinates (list): Final coordinates of the molecule after idealization.\n```\n```python\n elements (list): Elements of the molecule.\n```\n```python\n data (dict): MolBar data.\n```\n\nThis is an example input as a yml file:\n```yml\nbond_order_assignment: False # False if bond order assignment should be skipped, only reasonable in opt mode (standalone force-field optimization).\ncycle_detection: True # False if cycle detection should be skipped, only reasonable in opt mode (standalone force-field optimization).\nrepulsion_charge: 100.0 # Charge used for the Coulomb term in the force field, every atom-atom pair uses the same charge, only reasonable in opt mode (standalone force-field optimization). Defaults to 100.0\nset_edges: True # False if no bonds should be constrained automatically.\nset_angles: True # False if no angles should be constrained automatically.\nset_dihedrals: True # False if no dihedrals should be constrained automatically.\nset_repulsion: True # False if no Coulomb term should be used automatically.\n\nconstraints:\n bonds:\n - atoms: [19, 23] # List of atoms involved in the bond\n value: 1.5 # Ideal bond length. 
\n angles:\n - atoms: [19, 23, 35] # List of atoms involved in the angle\n value: 45.0 # Angle to which the angle between the three atoms is to be constrained\n - atoms: [35, 23, 19] # List of atoms involved in the angle\n value: 45.0 # Angle to which the angle between the three atoms is to be constrained\n\n dihedrals:\n - atoms: [30, 18, 14, 13] # List of atoms involved in the dihedral\n value: 90.0 # Actual values for the dihedral parameters\n```\n\n### In Python from a list of Cartesian coordinates:\n```python\nfrom molbar.idealize import idealize_structure_from_coordinates\n\ndef idealize_structure_from_coordinates(coordinates: list, elements: list, return_data=False, timing=False, input_constraint=None) -> Union[list, str]:\n```\n#### Arguments:\n\n```python\n coordinates (list): Cartesian coordinates of the molecule.\n```\n```python\n elements (list): Elements of the molecule. Either the element symbols or atomic numbers can be used.\n```\n```python\n return_data (bool, optional): Whether to return MolBar data. MolBar can return detailed data, including information about bonding, VSEPR geometries, fragments, and more. This data is useful for understanding what MolBar recognizes and can be leveraged for other projects.\n```\n```python\n timing (bool, optional): Whether to print the duration of this calculation.\n```\n```python\n input_constraint (dict, optional): The constraint for the calculation. 
See documentation for more information.\n```\n \n#### Returns:\n```python\n n_atoms (int): Number of atoms in the molecule.\n```\n```python\n energy (float): Final energy of the molecule after idealization.\n```\n```python\n coordinates (list): Final coordinates of the molecule after idealization.\n```\n```python\n elements (list): Elements of the molecule.\n```\n```python\n data (dict): MolBar data.\n```\n\nThis is an example input as a Python dict:\n\n```python\n {'bond_order_assignment': True, # False if bond order assignment should be skipped, only reasonable in opt mode (standalone force-field optimization).\n 'cycle_detection': True, # False if cycle detection should be skipped, only reasonable in opt mode (standalone force-field optimization).\n 'set_edges': True, # False if no bonds should be constrained automatically.\n 'set_angles': True, # False if no angles should be constrained automatically.\n 'set_dihedrals': True, # False if no dihedrals should be constrained automatically.\n 'set_repulsion': True, # False if no Coulomb term should be used automatically.\n 'repulsion_charge': 100.0, # Charge used for the Coulomb term in the force field, every atom-atom pair uses the same charge, only reasonable in opt mode (standalone force-field optimization). 
Defaults to 100.0\n 'constraints': {'bonds': [{'atoms': [1,2], 'value':1.5},...], #atoms: list of atoms that define the bond, value is the ideal bond length in angstrom, atom indexing starts with 1.\n 'angles': [{'atoms': [1,2,3], 'value':90.0},...], #atoms: list of atoms that define the angle, value is the ideal angle in degrees, atom indexing starts with 1.\n 'dihedrals': [{'atoms': [1,2,3,4], 'value':180.0},...]} #atoms: list of atoms that define the dihedral, value is the ideal dihedral angle in degrees, atom indexing starts with 1.\n }\n```\n\n# Additional Command line Scripts provided by the MolBar package.\n\n## Ensplit\nHelper script to split an ensemble file into individual ensembles, each with a unique molecular barcode:\n\n```bash\nensplit ensemble.trj -T 8 -topo\n```\n\n```bash\nfile Input file \n\nThe file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation, ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.\n\n```\n```bash\n -T THREADS, --threads number_of_threads. Number of threads for processing multiple files.\n```\n```bash\n -p, --progress Show progress bar\n```\n```bash\n -topo, --topo Only evaluate topology\n```\n\n## Princax\n\nHelper script to align the molecule to the principal axes.\n\n```bash\nprincax coord.xyz -r\n```\n\n```bash\nfile Input file \n\nThe file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation, ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). 
The file must contain 3D coordinates and explicit hydrogen atoms.\n\n```\n```bash\n-r, --replace Overwrite the input file with the aligned coordinates, or print to a new file if not specified.\n```\n\n## Invstruc\n\nHelper script to invert structures to yield enantiomers.\n\n```bash\ninvstruc coord.xyz\n```\n\n```bash\nfile Input file \n\nThe file can be in formats such as XYZ, Turbomole coord, SDF/MOL, CIF, PDB, Gaussian (.com, .gjf, .log), and many others. For a complete list, refer to the ASE documentation, ASE I/O Formats: https://wiki.fysik.dtu.dk/ase/ase/io/io.html). The file must contain 3D coordinates and explicit hydrogen atoms.\n\n```\n\n## Acknowledgments\n\nMolBar relies on the following libraries\nand packages:\n\n* [ase](https://wiki.fysik.dtu.dk/ase/)\n* [dscribe](https://singroup.github.io/dscribe/latest/)\n* [networkx](https://networkx.org/)\n* [NumPy](https://numpy.org)\n* [Numba](https://numba.pydata.org)\n* [SciPy](https://scipy.org)\n* [tqdm](https://github.com/tqdm/tqdm)\n* [joblib](https://joblib.readthedocs.io/en/latest/)\n\nThank you!\n\n\n## License and Disclaimer\n\nMIT License\n\nCopyright (c) 2022 Nils van Staalduinen, Christoph Bannwarth\n\nPermission is hereby granted, free of charge, to any person obtaining a copy\nof this software and associated documentation files (the \"Software\"), to deal\nin the Software without restriction, including without limitation the rights\nto use, copy, modify, merge, publish, distribute, sublicense, and/or sell\ncopies of the Software, and to permit persons to whom the Software is\nfurnished to do so, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n",
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2022 Nils van Staalduinen, Christoph Bannwarth Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
"summary": "Molecular Barcode (MolBar): Molecular Identifier for Organic and Inorganic Molecules",
"version": "1.1.3",
"project_urls": {
"Changelog": "https://git.rwth-aachen.de/bannwarthlab/molbar/-/issues",
"Documentation": "https://git.rwth-aachen.de/bannwarthlab/molbar/",
"Homepage": "https://git.rwth-aachen.de/bannwarthlab/molbar/",
"Issues": "https://git.rwth-aachen.de/bannwarthlab/molbar/-/issues",
"Repository": "https://git.rwth-aachen.de:bannwarthlab/molbar.git"
},
"split_keywords": [
"molecular identifier",
" chemical data science",
" stereoisomerism"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b73e510323770e4bd4ee87ff40c0af0c06558a2221e4d72034eff130c737b4ea",
"md5": "db3e039402f469c4d0c40b1eb7eca6b4",
"sha256": "0d0d490609c92a10c6a4883b624e0d44446bc6266a4ae78a7355bfc7b0c0346a"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp310-cp310-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "db3e039402f469c4d0c40b1eb7eca6b4",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 1067470,
"upload_time": "2024-10-03T11:05:02",
"upload_time_iso_8601": "2024-10-03T11:05:02.829214Z",
"url": "https://files.pythonhosted.org/packages/b7/3e/510323770e4bd4ee87ff40c0af0c06558a2221e4d72034eff130c737b4ea/molbar-1.1.3-cp310-cp310-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "021fcf643f3b2439b85e097bcf38877fff4833022cf90fbbab7814938a780547",
"md5": "a4f7f42494c857e87b7eb4a0929a445e",
"sha256": "0c691744a450983e978d40f425e57778fae845bab9093cc5d4c86f032bd32a53"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "a4f7f42494c857e87b7eb4a0929a445e",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 1310368,
"upload_time": "2024-10-03T11:05:09",
"upload_time_iso_8601": "2024-10-03T11:05:09.097809Z",
"url": "https://files.pythonhosted.org/packages/02/1f/cf643f3b2439b85e097bcf38877fff4833022cf90fbbab7814938a780547/molbar-1.1.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "694cdb5a41b0ef04a986f47f42319b739eb1182bbe80ca4f98ba3edc50445567",
"md5": "88d5e53db85ba7758abab014e8cac93a",
"sha256": "10af7d6c03aac58752c01ffb0e3c4364ce9c6483adab951ffe1f250e4377ae0f"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp311-cp311-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "88d5e53db85ba7758abab014e8cac93a",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 1067474,
"upload_time": "2024-10-03T11:05:13",
"upload_time_iso_8601": "2024-10-03T11:05:13.791360Z",
"url": "https://files.pythonhosted.org/packages/69/4c/db5a41b0ef04a986f47f42319b739eb1182bbe80ca4f98ba3edc50445567/molbar-1.1.3-cp311-cp311-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4223a86c2f712cb1600f894c91044254df177bcdf643152d8b59e85746f38385",
"md5": "6e2656d1443b7913d7fa46f1932acdf2",
"sha256": "11f59a30521e145c4dbf9f297985ec246a1227e0e3c5a9661ab9d6a2cd13ebb4"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "6e2656d1443b7913d7fa46f1932acdf2",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 1310407,
"upload_time": "2024-10-03T11:05:18",
"upload_time_iso_8601": "2024-10-03T11:05:18.060804Z",
"url": "https://files.pythonhosted.org/packages/42/23/a86c2f712cb1600f894c91044254df177bcdf643152d8b59e85746f38385/molbar-1.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e7ef1b729f2e4e8623a1d8b29c3b5a0456249120faf8bc4aeea566bf3baa0baa",
"md5": "85589c9f53d3065a923658c7a968b82b",
"sha256": "58b462779e1a1d1ea6c04edf92a0307c4e4d46d0097a2bd0ebeca8e2171db009"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp312-cp312-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "85589c9f53d3065a923658c7a968b82b",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.8",
"size": 1068002,
"upload_time": "2024-10-03T11:05:21",
"upload_time_iso_8601": "2024-10-03T11:05:21.464101Z",
"url": "https://files.pythonhosted.org/packages/e7/ef/1b729f2e4e8623a1d8b29c3b5a0456249120faf8bc4aeea566bf3baa0baa/molbar-1.1.3-cp312-cp312-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f2ad65bee87aa40bd181f2e0971eb869355c43deef91b64bc86c822c4366b3e9",
"md5": "f3f36842af5dd4377af0e1ace92e14a3",
"sha256": "0b68e7c45d21ec6ac5f97387dd0084667d70b8411eeeba45d5106246673bbc03"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "f3f36842af5dd4377af0e1ace92e14a3",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.8",
"size": 1310705,
"upload_time": "2024-10-03T11:05:25",
"upload_time_iso_8601": "2024-10-03T11:05:25.923823Z",
"url": "https://files.pythonhosted.org/packages/f2/ad/65bee87aa40bd181f2e0971eb869355c43deef91b64bc86c822c4366b3e9/molbar-1.1.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "c0cd80e04155077d19000c634cd1d7935b6a163813e29b87bfd85be5502e31c6",
"md5": "860c8395d0564d8b51efbad5cf5bf834",
"sha256": "251d66c0a0cfb185e7ab441a7e8597e0977ffd90dbc518b461bf66ad9b22739c"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp38-cp38-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "860c8395d0564d8b51efbad5cf5bf834",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 1065710,
"upload_time": "2024-10-03T11:05:29",
"upload_time_iso_8601": "2024-10-03T11:05:29.786846Z",
"url": "https://files.pythonhosted.org/packages/c0/cd/80e04155077d19000c634cd1d7935b6a163813e29b87bfd85be5502e31c6/molbar-1.1.3-cp38-cp38-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bf09c183b8c67f58cf11552d229a51db075f9d7eb4af26ad51d8275c771c1b38",
"md5": "633ecc3761e60adaf7286867152e0f57",
"sha256": "5bd965e8967e98a648ed8707bfa542e8fbb4161b280bfd02af0c38b850f8c87d"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "633ecc3761e60adaf7286867152e0f57",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 1308352,
"upload_time": "2024-10-03T11:05:33",
"upload_time_iso_8601": "2024-10-03T11:05:33.731076Z",
"url": "https://files.pythonhosted.org/packages/bf/09/c183b8c67f58cf11552d229a51db075f9d7eb4af26ad51d8275c771c1b38/molbar-1.1.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3f8dfb95cdf1ac2b20383acac689189627bd592d801b821be0da40916eff9c34",
"md5": "230f648bf2b16fd7d493638958f0eacb",
"sha256": "5deb2cb2c8e793a4e0499a68b8c47f1bbaf62fc20f89330ca6d6d14683ae7a8c"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp39-cp39-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "230f648bf2b16fd7d493638958f0eacb",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.8",
"size": 1067710,
"upload_time": "2024-10-03T11:05:37",
"upload_time_iso_8601": "2024-10-03T11:05:37.550402Z",
"url": "https://files.pythonhosted.org/packages/3f/8d/fb95cdf1ac2b20383acac689189627bd592d801b821be0da40916eff9c34/molbar-1.1.3-cp39-cp39-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b81daf0674f91cbda868799d100ba6a576b7dc2223cef39ee8b7a7a13207a72f",
"md5": "051ee8a711cef9844bdf73bfa4322070",
"sha256": "e27abdb9d606551d182045d168c582027436035375b56d6a3e84a09984379502"
},
"downloads": -1,
"filename": "molbar-1.1.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "051ee8a711cef9844bdf73bfa4322070",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.8",
"size": 1310641,
"upload_time": "2024-10-03T11:05:42",
"upload_time_iso_8601": "2024-10-03T11:05:42.686339Z",
"url": "https://files.pythonhosted.org/packages/b8/1d/af0674f91cbda868799d100ba6a576b7dc2223cef39ee8b7a7a13207a72f/molbar-1.1.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "143062d59086aef1a1fca7a50454cd2ef2a564d30e9f897149bc58210bf88cc6",
"md5": "7dabb9b650616db8db2a1fb85dae5f7d",
"sha256": "893ff8bfd885009904f0957a56c890f932dbb52ed3483e5247282c7c7e793da3"
},
"downloads": -1,
"filename": "molbar-1.1.3.tar.gz",
"has_sig": false,
"md5_digest": "7dabb9b650616db8db2a1fb85dae5f7d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 419288,
"upload_time": "2024-10-03T11:05:44",
"upload_time_iso_8601": "2024-10-03T11:05:44.955086Z",
"url": "https://files.pythonhosted.org/packages/14/30/62d59086aef1a1fca7a50454cd2ef2a564d30e9f897149bc58210bf88cc6/molbar-1.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-03 11:05:44",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "molbar"
}