CDK-pywrapper


| Field        | Value                                                                                           |
|--------------|-------------------------------------------------------------------------------------------------|
| Name         | CDK-pywrapper                                                                                   |
| Version      | 0.1.0                                                                                           |
| Home page    | https://github.com/OlivierBeq/CDK_pywrapper                                                     |
| Summary      | Python wrapper for CDK molecular descriptors and fingerprints                                   |
| Upload time  | 2024-01-15 21:13:43                                                                             |
| Maintainer   | Olivier J. M. Béquignon                                                                         |
| Author       | Olivier J. M. Béquignon                                                                         |
| Keywords     | chemistry development kit, molecular descriptors, molecular fingerprints, cheminformatics, qsar |
| Requirements | No requirements were recorded.                                                                  |

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

# CDK Python wrapper

Python wrapper to ease the calculation of [CDK](https://cdk.github.io/) molecular descriptors and fingerprints.

## Installation

From source:

    git clone https://github.com/OlivierBeq/CDK_pywrapper.git
    pip install ./CDK_pywrapper

With pip:

```bash
pip install CDK-pywrapper
```

## Get started

```python
from CDK_pywrapper import CDK
from rdkit import Chem

smiles_list = [
  # erlotinib
  "n1cnc(c2cc(c(cc12)OCCOC)OCCOC)Nc1cc(ccc1)C#C",
  # midecamycin
  "CCC(=O)O[C@@H]1CC(=O)O[C@@H](C/C=C/C=C/[C@@H]([C@@H](C[C@@H]([C@@H]([C@H]1OC)O[C@H]2[C@@H]([C@H]([C@@H]([C@H](O2)C)O[C@H]3C[C@@]([C@H]([C@@H](O3)C)OC(=O)CC)(C)O)N(C)C)O)CC=O)C)O)C",
  # selenofolate
  "C1=CC(=CC=C1C(=O)NC(CCC(=O)OCC[Se]C#N)C(=O)O)NCC2=CN=C3C(=N2)C(=O)NC(=N3)N",
  # cisplatin
  "N.N.Cl[Pt]Cl"
]
mols = [Chem.AddHs(Chem.MolFromSmiles(smiles)) for smiles in smiles_list]

cdk = CDK()
print(cdk.calculate(mols))
```

The above calculates 222 molecular descriptors (1D and 2D).<br/>

The additional 65 three-dimensional (3D) descriptors may be obtained with the following.<br/>
:warning: Molecules must have conformers for 3D descriptors to be calculated.<br/>

```python
from rdkit.Chem import AllChem

for mol in mols:
    _ = AllChem.EmbedMolecule(mol)

cdk = CDK(ignore_3D=False)
print(cdk.calculate(mols))
```


To obtain molecular fingerprints, one can use the following:

```python
from CDK_pywrapper import CDK, FPType
cdk = CDK(fingerprint=FPType.PubchemFP)
print(cdk.calculate(mols))
```

The following fingerprints can be calculated:

| FPType    | Fingerprint name                                                                   |
|-----------|------------------------------------------------------------------------------------|
| FP        | CDK fingerprint                                                                    |
| ExtFP     | Extended CDK fingerprint (includes 25 bits for ring features and isotopic masses)  |
| EStateFP  | Electrotopological state fingerprint (79 bits)                                     |
| GraphFP   | CDK fingerprinter ignoring bond orders                                             |
| MACCSFP   | Public MACCS fingerprint                                                           |
| PubchemFP | PubChem substructure fingerprint                                                   |
| SubFP     | Fingerprint describing 307 substructures                                           |
| KRFP      | Klekota-Roth fingerprint                                                           |
| AP2DFP    | Atom pair 2D fingerprint as implemented in PaDEL                                   |
| HybridFP  | CDK fingerprint ignoring aromaticity                                               |
| LingoFP   | LINGO fingerprint                                                                  |
| SPFP      | Fingerprint based on the shortest paths between two atoms                          |
| SigFP     | Signature fingerprint                                                              |
| CircFP    | Circular fingerprint                                                               |

## Documentation

```python
class CDK(ignore_3D=True, fingerprint=None, nbits=1024, depth=6):
```

Constructor of a CDK calculator for molecular descriptors or fingerprints.

Parameters:

- ***ignore_3D  : bool***  
  Whether to skip the calculation of 3D molecular descriptors (default: True). Ignored if a fingerprint is set.
- ***fingerprint  : FPType***  
  Type of fingerprint to calculate (default: None). If None, calculate descriptors.
- ***nbits  : int***  
  Number of bits in the fingerprint (default: 1024).
- ***depth  : int***  
  Depth of the fingerprint (default: 6).
<br/>
<br/>
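
As documented above, `ignore_3D` has no effect once a fingerprint is set. The standalone sketch below restates that precedence in plain Python. It does not call CDK_pywrapper: `describe_mode` is a hypothetical helper written only for illustration, and the fingerprint is represented as a plain string rather than an `FPType` member.

```python
from typing import Optional


def describe_mode(ignore_3D: bool = True, fingerprint: Optional[str] = None) -> str:
    """Return which kind of output the documented parameters select.

    Hypothetical helper for illustration; not part of CDK_pywrapper.
    """
    if fingerprint is not None:
        # A fingerprint takes precedence, so ignore_3D is not consulted.
        return f"fingerprint:{fingerprint}"
    if ignore_3D:
        # Default: descriptors without the 3D ones.
        return "descriptors:1D+2D"
    return "descriptors:1D+2D+3D"


print(describe_mode())                                          # descriptors:1D+2D
print(describe_mode(ignore_3D=False))                           # descriptors:1D+2D+3D
print(describe_mode(ignore_3D=False, fingerprint="PubchemFP"))  # fingerprint:PubchemFP
```
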
```python
def calculate(mols, show_banner=True, njobs=1, chunksize=1000):
```

Default method to calculate CDK molecular descriptors and fingerprints.

Parameters:

- ***mols  : Iterable[Chem.Mol]***  
  RDKit molecule objects for which to obtain CDK descriptors or fingerprints.
- ***show_banner  : bool***  
  Whether to display the default notice about CDK (default: True).
- ***njobs  : int***  
  Maximum number of simultaneous processes (default: 1).
- ***chunksize  : int***  
  Maximum number of molecules handled by each process (default: 1000).
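
To give a feel for what `chunksize` means, the sketch below splits a collection into batches of at most `chunksize` items using only the standard library. It is an illustration under assumptions, not the package's actual implementation: `iter_chunks` is a hypothetical helper, and a `range` stands in for the list of molecules.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_chunks(items: Iterable[T], chunksize: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `chunksize` items.

    Hypothetical helper for illustration; not part of CDK_pywrapper.
    """
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    iterator = iter(items)
    while True:
        # Take up to `chunksize` items; an empty list means we are done.
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return
        yield chunk


# Stand-in for a collection of 2500 molecules.
molecules = range(2500)
chunks = list(iter_chunks(molecules, chunksize=1000))
chunk_sizes = [len(chunk) for chunk in chunks]
print(chunk_sizes)  # [1000, 1000, 500]
```

Under this reading, `njobs=2` would mean that at most two of these three batches are being processed at any one time.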

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/OlivierBeq/CDK_pywrapper",
    "name": "CDK-pywrapper",
    "maintainer": "Olivier J. M. B\u00e9quignon",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "\"olivier.bequignon.maintainer@gmail.com\"",
    "keywords": "Chemistry Development Kit,molecular descriptors,molecular fingerprints,cheminformatics,QSAR",
    "author": "Olivier J. M. B\u00e9quignon",
    "author_email": "\"olivier.bequignon.maintainer@gmail.com\"",
    "download_url": "https://files.pythonhosted.org/packages/6a/8b/f9041378dc54f402bd586eae572b62815998829990e529fa965831a0ecee/CDK_pywrapper-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Python wrapper for CDK molecular descriptors and fingerprints",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/OlivierBeq/CDK_pywrapper"
    },
    "split_keywords": [
        "chemistry development kit",
        "molecular descriptors",
        "molecular fingerprints",
        "cheminformatics",
        "qsar"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cad086bc2fe825521183af0652039cd6b621675bc306b0a9ab7531d876e2f1f1",
                "md5": "e8a4672637f4ba744e2eaafa21b0ad7d",
                "sha256": "b5faa179aec7be01094eb5a090276c1a66a1ee84117d303ee2bc0c70658b4fa6"
            },
            "downloads": -1,
            "filename": "CDK_pywrapper-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e8a4672637f4ba744e2eaafa21b0ad7d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 43162902,
            "upload_time": "2024-01-15T21:13:19",
            "upload_time_iso_8601": "2024-01-15T21:13:19.204235Z",
            "url": "https://files.pythonhosted.org/packages/ca/d0/86bc2fe825521183af0652039cd6b621675bc306b0a9ab7531d876e2f1f1/CDK_pywrapper-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6a8bf9041378dc54f402bd586eae572b62815998829990e529fa965831a0ecee",
                "md5": "463586e81891efb7b82bc00c921c563d",
                "sha256": "20b1acc5107404b2c333a1386ac4e02166fc49dbbe67fef38a634aeacd1f7676"
            },
            "downloads": -1,
            "filename": "CDK_pywrapper-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "463586e81891efb7b82bc00c921c563d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 43151199,
            "upload_time": "2024-01-15T21:13:43",
            "upload_time_iso_8601": "2024-01-15T21:13:43.080948Z",
            "url": "https://files.pythonhosted.org/packages/6a/8b/f9041378dc54f402bd586eae572b62815998829990e529fa965831a0ecee/CDK_pywrapper-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-15 21:13:43",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "OlivierBeq",
    "github_project": "CDK_pywrapper",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "cdk-pywrapper"
}
        