navicat-marc


- Name: navicat-marc
- Version: 0.1.10
- Home page: https://github.com/lcmd-epfl/marc/
- Summary: Modular Analysis of Representative Conformers
- Upload time: 2023-04-13 08:27:16
- Author: rlaplaza, lcmd-epfl
- Requires Python: >=3.8
- Keywords: compchem

navicat-marc: modular Analysis of Representative Conformers
==============================================
<!-- zenodo badge will go here -->

![marc logo](./images/marc_logo.png)
[![PyPI version](https://badge.fury.io/py/navicat-marc.svg)](https://badge.fury.io/py/navicat_marc)

## Contents
* [About](#about-)
* [Install](#install-)
* [Concept](#concept-)
* [Examples](#examples-)
* [Citation](#citation-)

## About [↑](#about)

The code runs on pure Python with the following dependencies:
- `numpy`
- `scipy`
- `matplotlib`
- `scikit-learn`
- `networkx`


## Install [↑](#install)

You can install marc using pip:

```bash
pip install navicat_marc
```

Afterwards, you can call marc as:

```bash
python -m navicat_marc [-h] [-version] -i INPUT [INPUT ...] [-c C] [-m M] [-n N] [-ewin EWIN] [-sf SF] [-mine] [-yesh] [-s] [-as] [-efile EFILE] [-v VERB] [-pm PLOTMODE]
```
or simply

```bash
navicat_marc [-h] [-version] -i INPUT [INPUT ...] [-c C] [-m M] [-n N] [-ewin EWIN] [-sf SF] [-mine] [-yesh] [-s] [-as] [-efile EFILE] [-v VERB] [-pm PLOTMODE]
```

Alternatively, you can download the package and execute:

```bash
python setup.py install
```

Afterwards, marc can be called with the same two commands shown for the pip installation.

Options can be consulted using the `-h` flag in either case. The help menu is quite detailed. 

Note that the main functions are all exposed and called directly in sequential order from `marc.py`, in case you want to incorporate them in your own code.
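
If you would rather drive marc from a script without importing its internals, one option is to assemble the command line shown above and run it with `subprocess`. The sketch below is only an illustration: `build_marc_command` is a hypothetical helper that is not part of navicat_marc, and it uses only flags that appear in the usage string (`-i`, `-m`, `-ewin`, `-efile`, `-mine`, `-yesh`).

```python
import sys
from typing import List, Optional, Sequence


def build_marc_command(
    inputs: Sequence[str],
    metric: Optional[str] = None,
    ewin: Optional[float] = None,
    efile: Optional[str] = None,
    mine: bool = False,
    yesh: bool = False,
) -> List[str]:
    """Assemble an argument list for `python -m navicat_marc` (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one input xyz file is required")
    # Use the running interpreter so the call matches the active environment.
    cmd = [sys.executable, "-m", "navicat_marc", "-i", *inputs]
    if metric is not None:
        cmd += ["-m", metric]
    if ewin is not None:
        cmd += ["-ewin", str(ewin)]
    if efile is not None:
        cmd += ["-efile", efile]
    if mine:
        cmd.append("-mine")
    if yesh:
        cmd.append("-yesh")
    return cmd


cmd = build_marc_command(["conformers.xyz"], metric="ewrmsd", ewin=5.0, mine=True)
print(" ".join(cmd[1:]))
# To actually run it (requires navicat_marc to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Here `conformers.xyz` is a placeholder file name. Building the argument list as a Python list, rather than a single shell string, avoids quoting problems with file names that contain spaces.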

## Concept [↑](#concept)

Several strategies are available for generating conformational ensembles. Typically, one then sorts the ensemble and studies the most energetically favored conformers, which are the most thermodynamically accessible according to a Boltzmann distribution.
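
To see why the low-energy conformers dominate, it can help to compute Boltzmann populations explicitly. The following is a minimal sketch that is independent of marc: the function name, the example energies and the default temperature of 298.15 K are assumptions chosen for illustration.

```python
import numpy as np

R_KCAL = 1.987204259e-3  # gas constant in kcal/(mol K)


def boltzmann_populations(rel_energies_kcal, temperature=298.15):
    """Return normalized Boltzmann populations for energies given in kcal/mol."""
    e = np.asarray(rel_energies_kcal, dtype=float)
    e = e - e.min()  # shift so the lowest conformer sits at zero
    weights = np.exp(-e / (R_KCAL * temperature))
    return weights / weights.sum()


# Four hypothetical conformers at 0.0, 0.5, 1.0 and 3.0 kcal/mol.
pops = boltzmann_populations([0.0, 0.5, 1.0, 3.0])
print(np.round(pops, 3))
```

Populations fall off exponentially with relative energy, which is why an error of a few kcal/mol in the ranking can change which conformers matter.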

However, sorting conformers accurately requires high-quality energy computations, and determining the energy of every structure at that level may be too computationally demanding. Hence, marc provides a convenient way of accomplishing three goals:

- Select a handful of conformers that are representative of the diversity of the conformational ensemble using combined metrics.
- Apply energy cutoffs based on the available energies to remove entire clusters from the space using the `-ewin` flag and inputting a threshold in kcal/mol.
- Proceed iteratively, helping the user select non-redundant conformers that can then be refined at a higher level of theory and fed back to marc.

The default clustering metric used in marc is the `"mix"` distance, which scores each pair of conformers by the product of the heavy-atom rmsd, the energy difference, and a kernel of the heavy-atom dihedral angles of the system.

The logic behind this choice is that rmsd ought to be good except in cases where trivial single bond rotations increase the rmsd without affecting the energy, while the dihedral metric smooths systems that only differ by a few torsions. The possible metrics (to be fed to the `-m` flag) are:

- `"rmsd"`
- `"erel"` (based on the available energies)
- `"da"` (based on the most relevant dihedral angle of the molecule)
- `"ewrmsd"` (combining geometry and energy)
- `"mix"` (combining geometry, dihedrals and energy)
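
The idea of multiplying several pairwise measures can be illustrated with a short sketch. This is not marc's implementation: the min-max scaling, the function names and the example matrices are all assumptions, and the dihedral matrix simply stands in for whatever kernel marc applies to the dihedral angles.

```python
import numpy as np


def scale01(matrix):
    """Min-max scale a matrix to [0, 1]; a constant matrix maps to zeros."""
    m = np.asarray(matrix, dtype=float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)


def pairwise_abs_diff(values):
    """Absolute difference between every pair of scalar values."""
    v = np.asarray(values, dtype=float)
    return np.abs(v[:, None] - v[None, :])


def mix_like_distance(rmsd_matrix, energies, dihedral_matrix):
    """Element-wise product of scaled rmsd, energy and dihedral distances."""
    return (
        scale01(rmsd_matrix)
        * scale01(pairwise_abs_diff(energies))
        * scale01(dihedral_matrix)
    )


# Three hypothetical conformers: 0 and 1 differ by a trivial bond rotation
# (sizeable rmsd, identical energy); 2 is a genuinely different structure.
rmsd = np.array([[0.0, 1.2, 2.0], [1.2, 0.0, 1.8], [2.0, 1.8, 0.0]])
energies = np.array([0.0, 0.0, 2.5])
dihedral = np.array([[0.0, 0.9, 0.6], [0.9, 0.0, 0.7], [0.6, 0.7, 0.0]])

d = mix_like_distance(rmsd, energies, dihedral)
print(np.round(d, 3))
```

In this toy case the product vanishes for the pair (0, 1) because their energies are identical, so the two conformers would be grouped together despite their rmsd, while conformer 2 stays well separated.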

## Examples [↑](#examples)

The examples subdirectory contains some examples obtained by running [CREST](https://xtb-docs.readthedocs.io/en/latest/crest.html). Any of the xyz files can be run as:

```bash
navicat_marc -i [FILENAME]
```

Options can be consulted with the `-h` flag.

The input of marc is either a series of xyz files or a single trajectory-like xyz file with many conformers. All structures are expected to be analogous in terms of atom ordering and molecular topology. Energies per conformer, at any level of theory of your liking, can be provided in atomic units in the title line of each xyz block or file. Alternatively, energies can be provided in a plaintext file whose filename is passed to the `-efile` command line argument. This file must contain the same number of lines as there are conformers, with two numbers per line separated by blank spaces: an index and an energy in atomic units. The energy window specified with the `-ewin` command line argument should be in kcal/mol.
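
The two-column energy file and the unit conversion can be sketched as follows. This is a hedged illustration rather than marc's own parser: `read_energy_file` and `relative_kcal` are hypothetical names, the example energies are made up, and ordering the energies by their index is an assumption.

```python
import io

import numpy as np

HARTREE_TO_KCAL = 627.509474  # kcal/mol per hartree


def read_energy_file(handle, n_conformers):
    """Read 'index energy' lines and return energies (hartree) ordered by index."""
    energies = {}
    for line in handle:
        if not line.strip():
            continue  # tolerate trailing blank lines
        index, energy = line.split()[:2]
        energies[int(index)] = float(energy)
    if len(energies) != n_conformers:
        raise ValueError("expected one energy per conformer")
    return np.array([energies[k] for k in sorted(energies)])


def relative_kcal(energies_hartree):
    """Convert absolute energies in hartree to relative energies in kcal/mol."""
    e = np.asarray(energies_hartree, dtype=float)
    return (e - e.min()) * HARTREE_TO_KCAL


# In-memory stand-in for a plaintext energy file with three conformers.
example = io.StringIO("1 -154.1023\n2 -154.1001\n3 -154.0950\n")
rel = relative_kcal(read_energy_file(example, n_conformers=3))
print(np.round(rel, 2))
```

Relative energies in kcal/mol are the quantities that an energy window expressed in kcal/mol would be compared against.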

Note that, by default, marc will select the most representative conformer out of every cluster. If you can provide energy values that you trust strongly, the `-mine` flag will ensure that the lowest energy conformer of every cluster is selected instead.
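
The difference between the two selection modes can be sketched as follows. This is an assumption-laden illustration, not marc's code: "most representative" is taken here to mean the medoid (the member with the smallest summed distance to the rest of its cluster), and `select_per_cluster` is a hypothetical function.

```python
import numpy as np


def select_per_cluster(distance_matrix, labels, energies=None):
    """Pick one conformer index per cluster.

    Without energies, pick the medoid; with energies, pick the lowest-energy
    member of each cluster.
    """
    d = np.asarray(distance_matrix, dtype=float)
    labels = np.asarray(labels)
    selected = {}
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        if energies is not None:
            best = members[np.argmin(np.asarray(energies)[members])]
        else:
            within = d[np.ix_(members, members)]
            best = members[np.argmin(within.sum(axis=1))]
        selected[int(label)] = int(best)
    return selected


# Five hypothetical conformers placed on a line to get a simple distance matrix.
positions = np.array([0.0, 1.0, 2.5, 10.0, 11.0])
dist = np.abs(positions[:, None] - positions[None, :])
labels = [0, 0, 0, 1, 1]
energies = [0.3, 0.9, 0.0, 1.2, 0.7]

medoids = select_per_cluster(dist, labels)
lowest = select_per_cluster(dist, labels, energies)
print(medoids, lowest)
```

The two modes can disagree, as in this toy case, which is why energy-based selection is only advisable when the energies are reliable.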

The output of marc is a set of `n` selected xyz files, named `INPUT_selected_n.xyz`, written to the runtime directory. Conformers discarded by the `ewin` threshold are written with the `rejected` suffix instead. A cluster is discarded only if it meets two criteria: its average energy is more than `ewin` kcal/mol above the lowest conformer (plus half a standard deviation), and its lowest energy member is also above the threshold.
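
One possible reading of the two-criteria rule is sketched below. The exact form is an assumption: the cutoff for the cluster mean is taken as the energy window plus half the standard deviation of that cluster's energies, and `rejected_clusters` is a hypothetical function, not marc's API.

```python
import numpy as np


def rejected_clusters(rel_energies_kcal, labels, ewin):
    """Return labels of clusters that fail both energy criteria."""
    e = np.asarray(rel_energies_kcal, dtype=float)
    e = e - e.min()  # energies relative to the lowest conformer overall
    labels = np.asarray(labels)
    rejected = []
    for label in np.unique(labels):
        cluster = e[labels == label]
        # Criterion 1 (assumed form): mean above the window plus half a std.
        mean_too_high = cluster.mean() > ewin + 0.5 * cluster.std()
        # Criterion 2: even the best member lies above the window.
        lowest_too_high = cluster.min() > ewin
        if mean_too_high and lowest_too_high:
            rejected.append(int(label))
    return rejected


# Six hypothetical conformers in three clusters, with a 5 kcal/mol window.
energies = [0.0, 0.4, 7.0, 7.5, 2.0, 9.0]
labels = [0, 0, 1, 1, 2, 2]
out = rejected_clusters(energies, labels, ewin=5.0)
print(out)
```

In this example only cluster 1 is rejected. Cluster 2 contains a high-energy member but survives because its lowest member lies inside the window, which is the point of checking both criteria.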

Higher verbosity levels (`-v 1`, `-v 2`, etc.) print significantly more information while marc runs. To keep runs as automated as possible, reasonable default values are set for most choices.

As a final note, marc does not consider hydrogen atoms for geometry analysis by default. You can force marc to include them by using the `-yesh` flag.
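
The effect of ignoring hydrogens can be illustrated with a heavy-atom rmsd computed after optimal superposition (the Kabsch algorithm). This is a generic sketch, not marc's routine: the function names, the toy coordinates and the `include_h` switch are assumptions.

```python
import numpy as np


def heavy_atom_mask(symbols, include_h=False):
    """Boolean mask selecting non-hydrogen atoms (or all atoms if include_h)."""
    return np.array([include_h or s.upper() != "H" for s in symbols])


def kabsch_rmsd(p, q):
    """rmsd between two (N, 3) coordinate sets after optimal rigid superposition."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p - p.mean(axis=0)
    q = q - q.mean(axis=0)
    u, _, vt = np.linalg.svd(p.T @ q)
    # Correct for a possible reflection so the result is a proper rotation.
    sign = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    return float(np.sqrt(((p @ rotation.T - q) ** 2).sum() / len(p)))


# Toy molecule: three heavy atoms and two hydrogens.
symbols = ["C", "O", "N", "H", "H"]
a = np.array([[0.0, 0.0, 0.0], [1.4, 0.0, 0.0], [0.0, 1.3, 0.0],
              [-0.5, -0.5, 0.8], [1.9, 0.5, -0.8]])

# Second conformer: rigidly rotated and shifted, with one hydrogen moved.
theta = np.deg2rad(40.0)
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
b = a @ rz.T + np.array([1.0, -2.0, 0.5])
b[3] += np.array([0.3, -0.2, 0.5])

heavy = heavy_atom_mask(symbols)
rmsd_heavy = kabsch_rmsd(a[heavy], b[heavy])
rmsd_all = kabsch_rmsd(a, b)
print(rmsd_heavy, rmsd_all)
```

The heavy-atom value is essentially zero here because only a hydrogen moved, whereas the all-atom value is not. This is the kind of difference that including hydrogens would introduce.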

## Citation [↑](#citation)

Please cite our work with the repository DOI.

---



            
