[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
# CDK Python wrapper
Python wrapper to ease the calculation of [CDK](https://cdk.github.io/) molecular descriptors and fingerprints.
## Installation
From source:

    git clone https://github.com/OlivierBeq/CDK_pywrapper.git
    pip install ./CDK_pywrapper

With pip:
```bash
pip install CDK-pywrapper
```
## Get started
```python
from CDK_pywrapper import CDK
from rdkit import Chem
smiles_list = [
    # erlotinib
    "n1cnc(c2cc(c(cc12)OCCOC)OCCOC)Nc1cc(ccc1)C#C",
    # midecamycin
    "CCC(=O)O[C@@H]1CC(=O)O[C@@H](C/C=C/C=C/[C@@H]([C@@H](C[C@@H]([C@@H]([C@H]1OC)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)C)O[C@H]3C[C@@]([C@H]([C@@H](O3)C)OC(=O)CC)(C)O)N(C)C)O)CC=O)C)O)C",
    # selenofolate
    "C1=CC(=CC=C1C(=O)NC(CCC(=O)OCC[Se]C#N)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N",
    # cisplatin
    "N.N.Cl[Pt]Cl"
]
mols = [Chem.AddHs(Chem.MolFromSmiles(smiles)) for smiles in smiles_list]
cdk = CDK()
print(cdk.calculate(mols))
```
The above calculates 222 molecular descriptors (23 1D and 200 2D).<br/>
An additional 65 three-dimensional (3D) descriptors may be obtained with the code below.<br/>
:warning: Molecules must have conformers for 3D descriptors to be calculated.<br/>
```python
from rdkit.Chem import AllChem
for mol in mols:
    # Generate a 3D conformer for each molecule
    _ = AllChem.EmbedMolecule(mol)
cdk = CDK(ignore_3D=False)
print(cdk.calculate(mols))
```
To obtain molecular fingerprints, one can use the following:
```python
from CDK_pywrapper import CDK, FPType
cdk = CDK(fingerprint=FPType.PubchemFP)
print(cdk.calculate(mols))
```
The following fingerprints can be calculated:
| FPType | Fingerprint name |
|-----------|------------------------------------------------------------------------------------|
| FP | CDK fingerprint |
| ExtFP | Extended CDK fingerprint (includes 25 bits for ring features and isotopic masses) |
| EStateFP | Electrotopological state fingerprint (79 bits) |
| GraphFP | CDK fingerprinter ignoring bond orders |
| MACCSFP | Public MACCS fingerprint |
| PubchemFP | PubChem substructure fingerprint |
| SubFP | Fingerprint describing 307 substructures |
| KRFP | Klekota-Roth fingerprint |
| AP2DFP | Atom pair 2D fingerprint as implemented in PaDEL |
| HybridFP | CDK fingerprint ignoring aromaticity |
| LingoFP | LINGO fingerprint |
| SPFP | Fingerprint based on the shortest paths between two atoms |
| SigFP | Signature fingerprint |
| CircFP | Circular fingerprint |
## Documentation
```python
class CDK(ignore_3D=True, fingerprint=None, nbits=1024, depth=6):
```
Constructor of a CDK calculator for molecular descriptors or fingerprints.
Parameters:
- ***ignore_3D : bool***
  Whether 3D molecular descriptors should be skipped (default: True). Has no effect if a fingerprint is set.
- ***fingerprint : FPType***
  Type of fingerprint to calculate (default: None). If None, descriptors are calculated instead.
- ***nbits : int***
  Number of bits in the fingerprint (default: 1024).
- ***depth : int***
  Depth of the fingerprint (default: 6).
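
The interaction between `ignore_3D` and `fingerprint` described above can be sketched as follows. This is only an illustration of the documented precedence, not the library's code: `calculation_mode` is a hypothetical helper that does not exist in CDK_pywrapper, and the fingerprint is represented by a plain string rather than an `FPType` member.

```python
from typing import Optional


def calculation_mode(ignore_3D: bool = True, fingerprint: Optional[str] = None) -> str:
    """Hypothetical helper (not part of CDK_pywrapper): name what would be calculated."""
    if fingerprint is not None:
        # A fingerprint takes precedence, so ignore_3D has no effect here
        return f"fingerprint: {fingerprint}"
    if ignore_3D:
        return "1D and 2D descriptors"
    return "1D, 2D and 3D descriptors"


# Same defaults as the documented constructor signature
modes = [
    calculation_mode(),
    calculation_mode(ignore_3D=False),
    calculation_mode(ignore_3D=False, fingerprint="PubchemFP"),
]
for mode in modes:
    print(mode)
```
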
<br/>
<br/>
```python
def calculate(mols, show_banner=True, njobs=1, chunksize=1000):
```
Default method to calculate CDK molecular descriptors and fingerprints.
Parameters:
- ***mols : Iterable[Chem.Mol]***
  RDKit molecule objects for which to obtain CDK descriptors or fingerprints.
- ***show_banner : bool***
  Whether to display the default notice about CDK (default: True).
- ***njobs : int***
  Maximum number of simultaneous processes (default: 1).
- ***chunksize : int***
  Maximum number of molecules each process is in charge of (default: 1000).
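
To give an idea of what `chunksize` means in practice, the sketch below splits an iterable into batches of at most `chunksize` items, which is the kind of unit a single process would be handed. It is an assumption-laden illustration using only the standard library: `chunked` is a hypothetical helper, integers stand in for RDKit molecules, and the wrapper's actual batching code may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], chunksize: int) -> Iterator[List[T]]:
    """Hypothetical helper: yield successive lists of at most `chunksize` items."""
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# Integers stand in for 2500 molecules
molecule_ids = list(range(2500))
chunk_sizes = [len(chunk) for chunk in chunked(molecule_ids, chunksize=1000)]
print(chunk_sizes)  # [1000, 1000, 500]
```

With `njobs=1` such batches would simply be handled one after another; a larger `njobs` allows several of them to be processed at the same time.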
Raw data
{
"_id": null,
"home_page": "https://github.com/OlivierBeq/CDK_pywrapper",
"name": "CDK-pywrapper",
"maintainer": "Olivier J. M. B\u00e9quignon",
"docs_url": null,
"requires_python": "",
"maintainer_email": "\"olivier.bequignon.maintainer@gmail.com\"",
"keywords": "Chemistry Development Kit,molecular descriptors,molecular fingerprints,cheminformatics,QSAR",
"author": "Olivier J. M. B\u00e9quignon",
"author_email": "\"olivier.bequignon.maintainer@gmail.com\"",
"download_url": "https://files.pythonhosted.org/packages/6a/8b/f9041378dc54f402bd586eae572b62815998829990e529fa965831a0ecee/CDK_pywrapper-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Python wrapper for CDK molecular descriptors and fingerprints",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/OlivierBeq/CDK_pywrapper"
},
"split_keywords": [
"chemistry development kit",
"molecular descriptors",
"molecular fingerprints",
"cheminformatics",
"qsar"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "cad086bc2fe825521183af0652039cd6b621675bc306b0a9ab7531d876e2f1f1",
"md5": "e8a4672637f4ba744e2eaafa21b0ad7d",
"sha256": "b5faa179aec7be01094eb5a090276c1a66a1ee84117d303ee2bc0c70658b4fa6"
},
"downloads": -1,
"filename": "CDK_pywrapper-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e8a4672637f4ba744e2eaafa21b0ad7d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 43162902,
"upload_time": "2024-01-15T21:13:19",
"upload_time_iso_8601": "2024-01-15T21:13:19.204235Z",
"url": "https://files.pythonhosted.org/packages/ca/d0/86bc2fe825521183af0652039cd6b621675bc306b0a9ab7531d876e2f1f1/CDK_pywrapper-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6a8bf9041378dc54f402bd586eae572b62815998829990e529fa965831a0ecee",
"md5": "463586e81891efb7b82bc00c921c563d",
"sha256": "20b1acc5107404b2c333a1386ac4e02166fc49dbbe67fef38a634aeacd1f7676"
},
"downloads": -1,
"filename": "CDK_pywrapper-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "463586e81891efb7b82bc00c921c563d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 43151199,
"upload_time": "2024-01-15T21:13:43",
"upload_time_iso_8601": "2024-01-15T21:13:43.080948Z",
"url": "https://files.pythonhosted.org/packages/6a/8b/f9041378dc54f402bd586eae572b62815998829990e529fa965831a0ecee/CDK_pywrapper-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-15 21:13:43",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "OlivierBeq",
"github_project": "CDK_pywrapper",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"tox": true,
"lcname": "cdk-pywrapper"
}