conplex-dti


 - Name: conplex-dti
 - Version: 0.1.12
 - Home page: https://github.com/samsledje/ConPLex
 - Summary: Adapting protein language models and contrastive learning for DTI prediction.
 - Upload time: 2024-02-24 19:25:16
 - Author: samsledje
 - Requires Python: >=3.9,<4.0
 - License: MIT
 - Keywords: protein language models, contrastive learning, drug target interaction, dti

# ConPLex

![ConPLex Schematic](assets/images/Fig2_Schematic.png)

[![ConPLex Releases](https://img.shields.io/github/v/release/samsledje/ConPLex?include_prereleases)](https://github.com/samsledje/ConPLex/releases)
[![PyPI](https://img.shields.io/pypi/v/conplex-dti)](https://pypi.org/project/conplex-dti/)
[![Build](https://github.com/samsledje/ConPLex/actions/workflows/build.yml/badge.svg)](https://github.com/samsledje/ConPLex/actions/workflows/build.yml)
[![Documentation Status](https://readthedocs.org/projects/conplex/badge/?version=latest)](https://conplex.readthedocs.io/en/main/?badge=main)
[![License](https://img.shields.io/github/license/samsledje/ConPLex)](https://github.com/samsledje/ConPLex/blob/main/LICENSE)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

🚧🚧 Please note that ConPLex is currently a pre-release and is actively being developed. For the code used to generate our PNAS results, see the [manuscript code](https://github.com/samsledje/ConPLex_dev) 🚧🚧

 - [Homepage](http://conplex.csail.mit.edu)
 - [Documentation](https://conplex.readthedocs.io/en/main/)

## Abstract

Sequence-based prediction of drug-target interactions has the potential to accelerate drug discovery by complementing experimental screens. Such computational prediction needs to be generalizable and scalable while remaining sensitive to subtle variations in the inputs. However, current computational techniques fail to simultaneously meet these goals, often sacrificing performance on one to achieve the others. We develop a deep learning model, ConPLex, successfully leveraging the advances in pre-trained protein language models ("PLex") and employing a novel protein-anchored contrastive co-embedding ("Con") to outperform state-of-the-art approaches. ConPLex achieves high accuracy, broad adaptivity to unseen data, and specificity against decoy compounds. It makes predictions of binding based on the distance between learned representations, enabling predictions at the scale of massive compound libraries and the human proteome. Experimental testing of 19 kinase-drug interaction predictions validated 12 interactions, including four with sub-nanomolar affinity, plus a novel strongly-binding EPHB1 inhibitor ($K_D = 1.3\,\mathrm{nM}$). Furthermore, ConPLex embeddings are interpretable, which enables us to visualize the drug-target embedding space and use embeddings to characterize the function of human cell-surface proteins. We anticipate ConPLex will facilitate novel drug discovery by making highly sensitive in-silico drug screening feasible at genome scale.

## Installation

### Install from PyPI

First install a version of [`cudatoolkit`](https://anaconda.org/nvidia/cudatoolkit) that is compatible with your system, then run:

```bash
pip install conplex-dti
conplex-dti --help
```

### Compile from Source

```bash
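# Get the source
git clone https://github.com/samsledje/ConPLex.git
cd ConPLex
# Create and activate an isolated Python 3.9 environment
conda create -n conplex-dti python=3.9
conda activate conplex-dti
# Install Poetry, then put it on PATH (substitute the directory Poetry was installed to)
make poetry-download
export PATH="[poetry install location]:$PATH"
# Use the null keyring backend so Poetry does not wait on a system keyring
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
# Install ConPLex and check that the command-line tool is available
make install
conplex-dti --help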
```

## Usage

### Download benchmark data sets and pre-trained models

```bash
conplex-dti download --to datasets --benchmarks davis bindingdb biosnap biosnap_prot biosnap_mol dude
```

```bash
conplex-dti download --to . --models ConPLex_v1_BindingDB
```

### Run benchmark training

```bash
conplex-dti train --run-id TestRun --config config/default_config.yaml
```

### Make predictions with a trained model

```bash
conplex-dti predict --data-file [pair predict file].tsv --model-path ./models/ConPLex_v1_BindingDB.pt --outfile ./results.tsv
```

The input file `[pair predict file].tsv` must be tab-separated, with one protein-molecule pair per line in the format `[protein ID]\t[molecule ID]\t[protein Sequence]\t[molecule SMILES]`.
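
As an illustration, the following sketch writes a two-row input file in the column order given above using Python's standard `csv` module. The file name `pairs.tsv`, the IDs, and the short protein sequence are made up for the example, and it assumes no header row is expected, which the format description does not specify.

```python
import csv

# Hypothetical example rows: (protein ID, molecule ID, protein sequence, molecule SMILES).
# The sequence is an arbitrary short peptide, not a real drug target.
rows = [
    ("PROT_1", "MOL_aspirin", "MKTAYIAKQRQISFVKSHFSRQ", "CC(=O)OC1=CC=CC=C1C(=O)O"),
    ("PROT_1", "MOL_caffeine", "MKTAYIAKQRQISFVKSHFSRQ", "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"),
]


def write_pair_file(path, rows):
    """Write one tab-separated line per protein-molecule pair."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


write_pair_file("pairs.tsv", rows)
```

The resulting file could then be passed to the `predict` command as `--data-file pairs.tsv`.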

### Visualize co-embedding space

```bash
...
```

## Reference

If you use ConPLex, please cite [Contrastive learning in protein language space predicts interactions between drugs and protein targets](https://www.pnas.org/doi/10.1073/pnas.2220778120) by Rohit Singh*, Samuel Sledzieski*, Bryan Bryson, Lenore Cowen and Bonnie Berger.

```bibtex
@article{singh2023contrastive,
  title={Contrastive learning in protein language space predicts interactions between drugs and protein targets},
  author={Singh, Rohit and Sledzieski, Samuel and Bryson, Bryan and Cowen, Lenore and Berger, Bonnie},
  journal={Proceedings of the National Academy of Sciences},
  volume={120},
  number={24},
  pages={e2220778120},
  year={2023},
  publisher={National Acad Sciences}
}
```

Thanks to Ava Amini, Kevin Yang, and Sevahn Vorperian from MSR New England for suggesting the use of the triplet distance contrastive loss function without the sigmoid activation. This formulation is now the default. To use the original formulation with the sigmoid activation, set the `--use-sigmoid-cosine` flag during training.
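
To make the difference between the two formulations concrete, here is a minimal pure-Python sketch of a triplet margin loss over cosine distance, with and without a sigmoid applied to the cosine similarity. It is an illustration under assumptions, not the ConPLex implementation: the function names, the toy 2-D embeddings, and the margin of 0.25 are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def distance(u, v, use_sigmoid=False):
    """1 - similarity, optionally squashing the similarity with a sigmoid first."""
    sim = cosine_similarity(u, v)
    if use_sigmoid:
        sim = 1.0 / (1.0 + math.exp(-sim))
    return 1.0 - sim


def triplet_loss(anchor, positive, negative, margin=0.25, use_sigmoid=False):
    """Hinge on (distance to positive) - (distance to negative) + margin."""
    d_pos = distance(anchor, positive, use_sigmoid)
    d_neg = distance(anchor, negative, use_sigmoid)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (assumed values): the protein is the anchor,
# a binder is the positive, and a decoy is the negative.
protein = [1.0, 0.0]
binder = [0.9, 0.1]
decoy = [0.0, 1.0]

print(triplet_loss(protein, binder, decoy))
print(triplet_loss(protein, binder, decoy, use_sigmoid=True))
```

Because cosine similarity lies in [-1, 1], the sigmoid maps it into roughly [0.27, 0.73]. Distances computed with the sigmoid therefore span a narrower range, so the same margin is harder to satisfy than with plain cosine distance.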

### Manuscript Code

The code used to generate the results in the manuscript can be found in the [development repository](https://github.com/samsledje/ConPLex_dev).

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/samsledje/ConPLex",
    "name": "conplex-dti",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9,<4.0",
    "maintainer_email": "",
    "keywords": "protein language models,contrastive learning,drug target interaction,DTI",
    "author": "samsledje",
    "author_email": "samsl@mit.edu",
    "download_url": "https://files.pythonhosted.org/packages/72/49/80936e8366c26b9631be7f7c5b4106bc7c8d507a09e4c127a4b5df27b34f/conplex_dti-0.1.12.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Adapting protein language models and contrastive learning for DTI prediction.",
    "version": "0.1.12",
    "project_urls": {
        "Homepage": "https://github.com/samsledje/ConPLex",
        "Repository": "https://github.com/samsledje/ConPLex"
    },
    "split_keywords": [
        "protein language models",
        "contrastive learning",
        "drug target interaction",
        "dti"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "490e3377791cbba20ee03bb5304a0a3b54e983c7890445664765035f99babaa6",
                "md5": "43d51673a0ef2ea0dda55142eac979b7",
                "sha256": "fe42bcd9dfe3c77428ae000a4ba04939fbb53a3e8426445874a9ec9e68663703"
            },
            "downloads": -1,
            "filename": "conplex_dti-0.1.12-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "43d51673a0ef2ea0dda55142eac979b7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9,<4.0",
            "size": 37010,
            "upload_time": "2024-02-24T19:25:14",
            "upload_time_iso_8601": "2024-02-24T19:25:14.948252Z",
            "url": "https://files.pythonhosted.org/packages/49/0e/3377791cbba20ee03bb5304a0a3b54e983c7890445664765035f99babaa6/conplex_dti-0.1.12-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "724980936e8366c26b9631be7f7c5b4106bc7c8d507a09e4c127a4b5df27b34f",
                "md5": "d1464c18fb8f472619fc8efc4ef7677f",
                "sha256": "afa5f7fe0a33af2d4588de6496258daf8d759ec63ec645afc85b764c2b38e8fd"
            },
            "downloads": -1,
            "filename": "conplex_dti-0.1.12.tar.gz",
            "has_sig": false,
            "md5_digest": "d1464c18fb8f472619fc8efc4ef7677f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9,<4.0",
            "size": 34582,
            "upload_time": "2024-02-24T19:25:16",
            "upload_time_iso_8601": "2024-02-24T19:25:16.948244Z",
            "url": "https://files.pythonhosted.org/packages/72/49/80936e8366c26b9631be7f7c5b4106bc7c8d507a09e4c127a4b5df27b34f/conplex_dti-0.1.12.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-24 19:25:16",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "samsledje",
    "github_project": "ConPLex",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "conplex-dti"
}
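
The `sha256` values under `digests` can be used to check a downloaded archive. Below is a minimal sketch using Python's standard `hashlib`. The function names are illustrative and the path to the downloaded file is left to the caller; only the expected hash is taken from the metadata above.

```python
import hashlib

# sha256 of conplex_dti-0.1.12.tar.gz, copied from the "digests" entry above.
EXPECTED_SDIST_SHA256 = "afa5f7fe0a33af2d4588de6496258daf8d759ec63ec645afc85b764c2b38e8fd"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SDIST_SHA256):
    """Return True if the file's sha256 equals the expected hex digest."""
    return sha256_of_file(path) == expected
```

For example, `matches_expected("conplex_dti-0.1.12.tar.gz")` would return `True` only if a file at that (assumed) local path has the listed hash.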
        