# Python scripts to automate molecular docking
### Installation
```
pip install moldock
```
or install the latest version from GitHub:
```
pip install git+https://github.com/ci-lab-cz/docking-scripts.git
```
### Dependencies
From conda:
```
conda install -c conda-forge python=3.9 numpy=1.20 rdkit scipy dask distributed vina
```
From PyPI:
```
pip install meeko
```
Installation of gnina is described at https://github.com/gnina/gnina
### Description
Fully automatic pipeline for molecular docking.
- Two main scripts, `vina_dock` and `gnina_dock`, support docking with `vina` and `gnina` (`gnina` also supports `smina` and its custom scoring functions).
- Both can be used as command-line scripts or imported as a Python module.
- Distributed computing is supported via the `dask` library.
- An interrupted calculation can be continued by invoking the same command. All arguments except the output DB may be omitted, because all data is stored in the DB at the first call; if the DB already exists, the other arguments are ignored anyway.
- `get_sdf_from_db` extracts data from the output DB.
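The resume behaviour described above can be pictured with a minimal sketch. It assumes, purely for illustration, that the output DB is an SQLite file; the `setup` table and the `load_or_store_setup` helper are hypothetical names and do not describe moldock's actual schema or internals.

```python
import json
import sqlite3


def load_or_store_setup(conn, args):
    """Return the settings saved in the DB; save `args` only on the first call.

    Hypothetical helper: the table name and layout are illustrative only.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS setup (settings TEXT)")
    row = conn.execute("SELECT settings FROM setup").fetchone()
    if row is not None:
        # The DB already exists: stored settings win, new arguments are ignored.
        return json.loads(row[0])
    conn.execute("INSERT INTO setup VALUES (?)", (json.dumps(args),))
    conn.commit()
    return args


conn = sqlite3.connect(":memory:")
first = load_or_store_setup(conn, {"protein": "protein.pdbqt", "ncpu": 4})
# A later call with different arguments gets the originally stored settings back.
resumed = load_or_store_setup(conn, {"ncpu": 16})
```

Storing the settings once and reading them back on every later call is what makes it safe to omit them when a run is restarted.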
Pipeline:
- Input SMILES are converted to 3D by RDKit embedding. If the input is 3D structures in SDF, their conformations are used as starting points without changes.
- Ligands are protonated by Chemaxon at pH 7.4 and the most stable tautomers are generated (optional, requires a Chemaxon license).
- Molecules are converted to PDBQT format.
- Docking is performed with `vina`/`gnina`.
- Output poses are converted to MOL format and stored in the output DB along with docking scores.
### Example
Both scripts, `vina_dock` and `gnina_dock`, share a similar set of common arguments.
Docking with input SMILES, a prepared protein and a config file. Ligands are not protonated with Chemaxon, so their supplied charge states are used. Four CPU cores are used. When docking finishes, an SDF file is created with the top docking pose for each ligand.
```
vina_dock -i input.smi -o output.db -p protein.pdbqt -s vina_config --no_protonation -c 4 --sdf
```
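When the command is launched from another Python program, it can be convenient to assemble the argument list in code. The sketch below uses only the flags shown in the command above; the `build_vina_dock_command` helper and its parameter names are hypothetical and not part of moldock.

```python
import shlex


def build_vina_dock_command(smiles, db, protein, setup,
                            ncpu=4, protonate=False, sdf=True):
    """Assemble the vina_dock argument list (hypothetical helper)."""
    cmd = ["vina_dock", "-i", smiles, "-o", db, "-p", protein, "-s", setup]
    if not protonate:
        # Keep the charge states supplied in the input file.
        cmd.append("--no_protonation")
    cmd += ["-c", str(ncpu)]
    if sdf:
        # Ask for an SDF file with top poses once docking has finished.
        cmd.append("--sdf")
    return cmd


cmd = build_vina_dock_command("input.smi", "output.db",
                              "protein.pdbqt", "vina_config")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with file names.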
Retrieve the second poses of compounds `mol_id_1` and `mol_id_4`, together with their docking scores, in SDF format:
```
get_sdf_from_db -i output.db -o out.sdf -d mol_id_1,mol_id_4 --fields docking_score --poses 2
```
Instead of a comma-separated list of ids, a text file can be supplied to the `-d` argument.
Retrieve top poses for compounds with a docking score less than -10:
```
get_sdf_from_db -i output.db -o out.sdf --fields docking_score --poses 1 --add_sql 'docking_score < -10'
```
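The effect of a condition such as `docking_score < -10` can be tried out on a small in-memory SQLite table. This is only an illustration: the `mols` table, its columns and the scores are made up, and how `--add_sql` is combined with the query internally is an assumption.

```python
import sqlite3

# Toy table with made-up ids and scores; not moldock's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mols (id TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO mols VALUES (?, ?)",
    [("mol_id_1", -11.2), ("mol_id_2", -8.5), ("mol_id_4", -10.4)],
)

# The same condition string that is passed to --add_sql above.
condition = "docking_score < -10"
rows = conn.execute(
    f"SELECT id, docking_score FROM mols WHERE {condition} "
    "ORDER BY docking_score"
).fetchall()
print(rows)
```

Only the two compounds scoring below -10 are returned; lower (more negative) scores sort first.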
Use as a Python module:
```python
from moldock import vina_dock
vina_dock.iter_docking(dbname='output.db', receptor_pdbqt_fname='protein.pdbqt', protein_setup='vina_config.txt', protonation=False, exhaustiveness=8, seed=-1, n_poses=10, ncpu=4)
```
### Changelog
**0.1.2**
- (bugfix) docking of macrocycles is rigid (this may change in the future)
### Licence
CC BY-NC-SA 4.0
Raw data
{
"_id": null,
"home_page": "https://github.com/ci-lab-cz/docking-scripts",
"name": "moldock",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Pavel Polishchuk",
"author_email": "pavel_polishchuk@ukr.net",
"download_url": "https://files.pythonhosted.org/packages/ef/c2/6878db411975255ab4415b8340a560c74c4c3454a3f34f3236632f29fda3/moldock-0.1.3.tar.gz",
"platform": null,
"description": "# Python scripts to automate molecular docking\n\n### Installation\n\n```\npip install moldock\n```\nor the latest version\n```\npip install git+https://github.com/ci-lab-cz/docking-scripts.git\n```\n\n### Dependencies\n\nfrom conda\n```\nconda install -c conda-forge python=3.9 numpy=1.20 rdkit scipy dask distributed vina\n```\n\nfrom pypi\n```\npip install meeko\n```\n\nInstallation of gnina is described at https://github.com/gnina/gnina\n\n### Description\n\nFully automatic pipeline for molecular docking.\n- two major scripts `vina_dock` and `gnina_dock` which support docking with `vina` and `gnina` (`gnina` also supports `smina` and its custom scoring functions)\n- can be used as command line scripts or imported as a python module\n- support distributed computing using `dask` library\n- if calculation was interrupted it can be continued by invoking the same command, but everything except output DB may be omitted (these will be ignored nevertheless if DB exists) because all data is stored in DB at the first call\n- `get_sdf_from_db` is used to extract data from output DB \n\nPipeline:\n- input SMILES are converted in 3D by RDKit embedding, if input is 3D structures in SDF their conformations wil be taken as starting without changes.\n- ligands are protonated by chemaxon at pH 7.4 and the most stable tautomers are generated (optional, requires a Chemaxon license)\n- molecules are converted in PDBQT format\n- docking with `vina`/`gnina`\n- output poses are converted in MOL format and stored into output DB along with docking scores\n\n### Example\n\nBoth scripts `vina_dock` and `gnina_dock` have similar common arguments.\n\nDocking using input SMILES, prepared protein and config files. Ligands will not be protonated with Chemaxon, so their supplied charged states will be used. 4 CPU cores will be used. When docking will finish an SDF will be created with top docking poses for each ligand. 
\n```\nvina_dock -i input.smi -o output.db -p protein.pdbqt -s vina_config --no_protonation -c 4 --sdf \n``` \n\nRetrieve second poses for compounds `mol_id_1` and `mol_id_2` with their docking scores in SDF format:\n```\nget_sdf_from_db -i output.db -o out.sdf -d mol_id_1,mol_id_4 --fields docking_score --poses 2 \n```\nInstead of a comma-separated list of ids a text file can be supplied as an argument `-d`.\n\nRetrieve top poses for compounds with docking score less then -10:\n```\nget_sdf_from_db -i output.db -o out.sdf --fields docking_score --poses 1 --add_sql 'docking_score < -10' \n```\n\nUse as a Python module\n\n```python\nfrom moldock import vina_dock\n\nvina_dock.iter_docking(dbname='output.db', receptor_pdbqt_fname='protein.pdbqt', protein_setup='vina_config.txt', protonation=False, exhaustiveness=8, seed=-1, n_poses=10, ncpu=4)\n```\n\n### Changelog\n\n**0.1.2**\n- (bugfix) docking of macrocycles is rigid (in future may be changed)\n\n### Licence\nCC BY-NC-SA 4.0\n\n\n",
"bugtrack_url": null,
"license": "",
"summary": "Python moldock to facilitate molecular docking",
"version": "0.1.3",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5f556247f188516da1c086ed6dda72f99fc9af72ec75febf8ca087e4a487b910",
"md5": "cc989074f35f125e37275a414e75ae6a",
"sha256": "f7fb9c89ba834dafe8a55476186726480d63728d223474c5e15e2e3f9f9d9fc5"
},
"downloads": -1,
"filename": "moldock-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "cc989074f35f125e37275a414e75ae6a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 22126,
"upload_time": "2023-03-28T08:23:55",
"upload_time_iso_8601": "2023-03-28T08:23:55.281633Z",
"url": "https://files.pythonhosted.org/packages/5f/55/6247f188516da1c086ed6dda72f99fc9af72ec75febf8ca087e4a487b910/moldock-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "efc26878db411975255ab4415b8340a560c74c4c3454a3f34f3236632f29fda3",
"md5": "fe03e98f9708acdca8753fac2e039b68",
"sha256": "42c525e85fa97cdb824cfbed3827819c6ae43a97ad15fcf60ac67abe12062458"
},
"downloads": -1,
"filename": "moldock-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "fe03e98f9708acdca8753fac2e039b68",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 17034,
"upload_time": "2023-03-28T08:23:56",
"upload_time_iso_8601": "2023-03-28T08:23:56.976956Z",
"url": "https://files.pythonhosted.org/packages/ef/c2/6878db411975255ab4415b8340a560c74c4c3454a3f34f3236632f29fda3/moldock-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-28 08:23:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "ci-lab-cz",
"github_project": "docking-scripts",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "moldock"
}