# PDBe Arpeggio
This repository contains a refactored and expanded version of the [arpeggio](https://github.com/harryjubb/arpeggio) tool, modified and maintained by the [PDBe team](https://pdbe.org) to use as part of their weekly PDB release process.
`pdbe-arpeggio` is a Python library that calculates interatomic contacts in a protein structure based on the rules defined in [CREDO](https://pubmed.ncbi.nlm.nih.gov/19207418/). It only supports protein structures in mmCIF format.
If you make use of `pdbe-arpeggio`, please cite the following article:
Harry C Jubb, Alicia P Higueruelo, Bernardo Ochoa-Montaño, Will R Pitt, David B Ascher, Tom L Blundell,
[Arpeggio: A Web Server for Calculating and Visualising Interatomic Interactions in Protein Structures](https://doi.org/10.1016/j.jmb.2016.12.004).
Journal of Molecular Biology,
Volume 429, Issue 3,
2017,
Pages 365-371,
ISSN 0022-2836,
# Disclaimer
This is a refactored version of the original [arpeggio](https://github.com/harryjubb/arpeggio) by Harry Jubb. Main changes are:
* Python 3 support
* Modular architecture that makes `arpeggio` pip-installable.
* Support for the mmCIF format, which also allows larger structures to be processed.
* Results in JSON format.
`pdbe-arpeggio` doesn't do any processing of the input protein structure other than what BioPython does by default. Alternate locations and missing density are not explicitly accounted for and may produce anomalous results. Please use with caution.
# Getting Started
## Web Interface
If you would like to run the original version of arpeggio on a small number of individual structures, the easiest way to get started is to use the [web interface](http://biosig.unimelb.edu.au/arpeggioweb/).
## Installation
The easiest way to set up arpeggio is using [Conda](https://docs.conda.io/en/latest/). Create a conda environment and install arpeggio dependencies:
```bash
conda create -n arpeggio-env python=3.9 -c conda-forge gemmi openbabel biopython
conda activate arpeggio-env
pip install pdbe-arpeggio
```
## Dependencies
* [Open Babel](https://openbabel.org) >= 3.0.0
* [Biopython](https://biopython.org) >= 1.80
* [gemmi](https://gemmi.readthedocs.io/en/latest/index.html) >= 0.5.8
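The minimum versions above can be checked from within the activated environment. The following is a minimal sketch, not part of `pdbe-arpeggio`: the helper names `parse_version` and `check_dependencies` are hypothetical, and the distribution names passed to `importlib.metadata` (in particular `openbabel`) are assumptions that may differ depending on how the packages were installed.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list above.
# The distribution names are assumptions and may differ per installer.
MINIMUMS = {"openbabel": "3.0.0", "biopython": "1.80", "gemmi": "0.5.8"}


def parse_version(text: str) -> tuple:
    """Turn the leading numeric part of a version string into a 3-tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    parts = [int(p) for p in match.group(0).split(".")] if match else []
    parts = parts[:3]
    # Pad so that "1.80" and "1.80.0" compare as equal.
    return tuple(parts + [0] * (3 - len(parts)))


def check_dependencies(minimums=MINIMUMS) -> dict:
    """Return a short status string for every required distribution."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[name] = installed + " (ok)"
        else:
            report[name] = installed + " (older than " + minimum + ")"
    return report


for package, status in check_dependencies().items():
    print(package, status)
```

A package reported as "not installed" may still be importable if it was registered under a different distribution name.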
## Running arpeggio
`pdbe-arpeggio 1tqn_h.cif [options]`
e.g.
`pdbe-arpeggio -s /A/508/ -o out/ 1tqn_h.cif`
Use `pdbe-arpeggio -h` for available options.
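When many structures need to be processed, the command line above can be assembled from Python. This is a minimal sketch that uses only the `-s` and `-o` options shown in the example; `build_arpeggio_command` is a hypothetical helper, not part of the `pdbe-arpeggio` API, and the call that would actually run the tool is left as a comment.

```python
import shlex


def build_arpeggio_command(cif_path, selection=None, output_dir=None):
    """Assemble the argument list for one pdbe-arpeggio run."""
    command = ["pdbe-arpeggio"]
    if selection is not None:
        command += ["-s", selection]   # e.g. "/A/508/" as in the example above
    if output_dir is not None:
        command += ["-o", output_dir]  # directory that receives the JSON output
    command.append(cif_path)
    return command


cmd = build_arpeggio_command("1tqn_h.cif", selection="/A/508/", output_dir="out/")
print(shlex.join(cmd))

# To execute the command (requires pdbe-arpeggio to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with unusual file names.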
## Output
All identified interactions are written to a JSON file.
### Interactions
#### atom-atom interactions
| Key | Interaction | Description |
| --- | ----------- | ----------- |
| clash | Clash | Denotes if the atom is involved in a steric clash. |
| covalent | Covalent | Denotes if the atom appears to be covalently bonded. |
| vdw_clash| VdW Clash | Denotes if the van der Waals radius of the atom is clashing with one or more other atoms. |
| vdw | VdW | Denotes if the van der Waals radius of the atom is interacting with one or more other atoms. |
| proximal | Proximal | Denotes if the atom is beyond the VdW interaction distance, but within 5 Angstroms of other atom(s). |
| hbond | Hydrogen Bond | Denotes if the atom forms a hydrogen bond. |
| weak_hbond | Weak Hydrogen Bond | Denotes if the atom forms a weak hydrogen bond. |
| xbond | Halogen Bond | Denotes if the atom forms a halogen bond. |
| ionic | Ionic | Denotes if the atom may interact via charges. |
| metal | Metal Complex | Denotes if the atom is part of a metal complex. |
| aromatic | Aromatic | Denotes an aromatic ring atom interacting with another aromatic ring atom. |
| hydrophobic | Hydrophobic | Denotes hydrophobic interaction. |
| carbonyl | Carbonyl | Denotes a carbonyl-carbon:carbonyl-carbon interaction. |
| polar | Polar | Less strict hydrogen bonding (without angle terms). |
| weak_polar | Weak Polar | Less strict weak hydrogen bonding (without angle terms). |
#### atom-plane interactions
| Key | Interaction | Description |
| --- | ----------- | ----------- |
| CARBONPI | Carbon-PI | Weakly electropositive carbon atom - $\Pi$ interactions [[ref]](https://doi.org/10.1016/j.bmc.2007.09.023) |
| CATIONPI | Cation-PI | Cation - $\Pi$ interactions [[ref]](https://doi.org/10.1002/prot.20417) |
| DONORPI | Donor-PI | Hydrogen Bond donor - $\Pi$ interactions [[ref]](https://doi.org/10.1016/j.bmc.2007.09.023) |
| HALOGENPI | Halogen-PI | Halogen bond donors - $\Pi$ [[ref]](https://doi.org/10.1073/pnas.0407607101) |
| METSULPHURPI | Sulphur-PI | Methionine sulphur - $\Pi$ ring interactions [[ref]](https://dx.doi.org/10.1074/jbc.M112.374504) |
#### plane-plane interactions
Plane-plane interaction keys follow the nomenclature established by [Chakrabarti and Bhattacharyya (2007)](https://doi.org/10.1016/j.pbiomolbio.2007.03.016).
#### group-group/plane interactions
| Key | Interaction | Description |
| --- | ----------- | ----------- |
| AMIDEAMIDE | amide - amide | [[ref]](https://doi.org/10.1002/jcc.21212) |
| AMIDERING | amide - ring | [[ref]](https://doi.org/10.1016/0014-5793(86)80730-X) |
### Interacting entities
| Key | Meaning |
| ---- |---------|
| INTER | Between an atom from the user's selection and a non-selected atom |
| INTRA_SELECTION | Between two atoms both in the user's selection |
| INTRA_NON_SELECTION | Between two atoms that are both not in the user's selection |
| SELECTION_WATER | Between an atom in the user's selection and a water molecule |
| NON_SELECTION_WATER | Between an atom that is not in the user's selection and a water molecule |
| WATER_WATER | Between two water molecules |
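The `interacting_entities` key makes it easy to separate contacts between the selection and its surroundings from everything else. The sketch below assumes that each record carries the `interacting_entities`, `type` and `contact` fields shown in the examples in this README; the sample records themselves are made up for illustration.

```python
from collections import Counter

# Illustrative records; real ones also carry "bgn", "end" and "distance".
contacts = [
    {"type": "atom-atom", "contact": ["proximal", "hydrophobic"], "interacting_entities": "INTER"},
    {"type": "atom-plane", "contact": ["DONORPI"], "interacting_entities": "INTER"},
    {"type": "atom-atom", "contact": ["hbond", "polar"], "interacting_entities": "SELECTION_WATER"},
    {"type": "atom-atom", "contact": ["vdw"], "interacting_entities": "INTRA_SELECTION"},
]

# How many contacts fall into each category of the table above.
counts = Counter(record["interacting_entities"] for record in contacts)

# Keep only contacts between the selection and non-selected atoms.
inter_only = [r for r in contacts if r["interacting_entities"] == "INTER"]

for category, number in counts.most_common():
    print(category, number)
print(len(inter_only), "INTER contacts")
```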
### Examples
#### atom-atom interaction
```json
{
    "bgn": {
        "auth_asym_id": "A",
        "auth_atom_id": "CB",
        "auth_seq_id": 313,
        "label_comp_id": "VAL",
        "label_comp_type": "P",
        "pdbx_PDB_ins_code": " "
    },
    "contact": [
        "proximal",
        "hydrophobic"
    ],
    "distance": 4.02,
    "end": {
        "auth_asym_id": "A",
        "auth_atom_id": "CBB",
        "auth_seq_id": 508,
        "label_comp_id": "HEM",
        "label_comp_type": "B",
        "pdbx_PDB_ins_code": " "
    },
    "interacting_entities": "INTER",
    "type": "atom-atom"
}
```
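A record like the one above can be filtered by contact key and turned into a compact one-line description. This is a minimal sketch: `atom_label`, `describe` and `with_contact` are hypothetical helpers, and the label layout (`chain/residue/atom`) is just one possible convention.

```python
# The atom-atom record from the example above.
record = {
    "bgn": {"auth_asym_id": "A", "auth_atom_id": "CB", "auth_seq_id": 313,
            "label_comp_id": "VAL", "label_comp_type": "P", "pdbx_PDB_ins_code": " "},
    "contact": ["proximal", "hydrophobic"],
    "distance": 4.02,
    "end": {"auth_asym_id": "A", "auth_atom_id": "CBB", "auth_seq_id": 508,
            "label_comp_id": "HEM", "label_comp_type": "B", "pdbx_PDB_ins_code": " "},
    "interacting_entities": "INTER",
    "type": "atom-atom",
}


def atom_label(atom):
    """Format one end of a contact as chain/residue+number+insertion/atom."""
    ins_code = atom["pdbx_PDB_ins_code"].strip()  # a blank insertion code is " "
    residue = f"{atom['label_comp_id']}{atom['auth_seq_id']}{ins_code}"
    return f"{atom['auth_asym_id']}/{residue}/{atom['auth_atom_id']}"


def describe(contact):
    """Summarise a contact record on a single line."""
    kinds = ", ".join(contact["contact"])
    return (f"{atom_label(contact['bgn'])} -> {atom_label(contact['end'])} "
            f"({contact['distance']:.2f}): {kinds}")


def with_contact(contacts, key):
    """Return the records whose contact list contains the given key."""
    return [c for c in contacts if key in c["contact"]]


for hit in with_contact([record], "hydrophobic"):
    print(describe(hit))
```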
#### atom-plane interaction
```json
{
    "bgn": {
        "auth_asym_id": "A",
        "auth_atom_id": "O",
        "auth_seq_id": 523,
        "label_comp_id": "HOH",
        "label_comp_type": "W",
        "pdbx_PDB_ins_code": " "
    },
    "contact": [
        "DONORPI"
    ],
    "distance": 3.9,
    "end": {
        "auth_asym_id": "A",
        "auth_atom_id": "C1A,C2A,C3A,C4A,NA",
        "auth_seq_id": 508,
        "label_comp_id": "HEM",
        "label_comp_type": "B",
        "pdbx_PDB_ins_code": " "
    },
    "interacting_entities": "INTER",
    "type": "atom-plane"
}
```
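In the example above the plane end of the contact lists its ring atoms as one comma-separated string in `auth_atom_id`, while the atom end holds a single name. A small sketch for normalising both cases into a list follows; `atom_names` is a hypothetical helper, and the assumption that commas never occur inside an atom name is taken from the examples in this README only.

```python
def atom_names(entity):
    """Split the auth_atom_id field into individual atom names."""
    return [name.strip() for name in entity["auth_atom_id"].split(",") if name.strip()]


# Both ends of the atom-plane example above, reduced to the relevant field.
water_oxygen = {"auth_atom_id": "O"}
heme_ring = {"auth_atom_id": "C1A,C2A,C3A,C4A,NA"}

print(atom_names(water_oxygen))
print(atom_names(heme_ring))
print(len(atom_names(heme_ring)), "atoms define the plane")
```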
#### plane-plane interaction
```json
{
    "bgn": {
        "auth_asym_id": "A",
        "auth_atom_id": "C1B,C2B,C3B,C4B,NB",
        "auth_seq_id": 508,
        "label_comp_id": "HEM",
        "label_comp_type": "B",
        "pdbx_PDB_ins_code": " "
    },
    "contact": [
        "FT",
        "ET"
    ],
    "distance": 4.72,
    "end": {
        "auth_asym_id": "A",
        "auth_atom_id": "CD1,CD2,CE1,CE2,CG,CZ",
        "auth_seq_id": 435,
        "label_comp_id": "PHE",
        "label_comp_type": "P",
        "pdbx_PDB_ins_code": " "
    },
    "interacting_entities": "INTER",
    "type": "plane-plane"
}
```
#### group-group interaction
```json
{
    "bgn": {
        "auth_asym_id": "A",
        "auth_atom_id": "C,CA,N,O",
        "auth_seq_id": 308,
        "label_comp_id": "GLU",
        "label_comp_type": "P",
        "pdbx_PDB_ins_code": " "
    },
    "contact": [
        "AMIDEAMIDE"
    ],
    "distance": 4.29,
    "end": {
        "auth_asym_id": "A",
        "auth_atom_id": "C,CA,N,O",
        "auth_seq_id": 310,
        "label_comp_id": "THR",
        "label_comp_type": "P",
        "pdbx_PDB_ins_code": " "
    },
    "interacting_entities": "INTRA_BINDING_SITE",
    "type": "group-group"
}
```
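The four record types above can be read back and grouped by their `type` field. The sketch below assumes that the output file holds a single JSON array of such records; the file name `1tqn_h.json` is hypothetical, and a temporary file with shortened sample records stands in for real output so that the example runs on its own.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def load_contacts(path):
    """Read a contacts file, assuming a top-level JSON array of records."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of contact records")
    return data


def group_by_type(contacts):
    """Group records by their "type" field (atom-atom, atom-plane, ...)."""
    grouped = defaultdict(list)
    for record in contacts:
        grouped[record["type"]].append(record)
    return dict(grouped)


# Shortened sample records standing in for real output.
sample = [
    {"type": "atom-atom", "distance": 4.02, "contact": ["proximal", "hydrophobic"]},
    {"type": "plane-plane", "distance": 4.72, "contact": ["FT", "ET"]},
    {"type": "atom-atom", "distance": 3.10, "contact": ["vdw"]},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "1tqn_h.json"  # hypothetical output file name
    path.write_text(json.dumps(sample), encoding="utf-8")
    grouped = group_by_type(load_contacts(path))

for kind, records in grouped.items():
    print(kind, len(records))
```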
Raw data
{
"_id": null,
"home_page": "",
"name": "pdbe-arpeggio",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "arpeggio PDB contacts bioinformatics mmCIF protein ligand interactions CREDO",
"author": "Harry Jubb",
"author_email": "harry.jubb@sanger.ac.uk",
"download_url": "https://files.pythonhosted.org/packages/01/bc/c29cdcebe961f90a716083251125d3b1d378561410fa941730006f458764/pdbe-arpeggio-1.4.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GNU General Public License v3.0",
"summary": "Arpeggio calculates interatomic contacts based on the rules defined in CREDO.",
"version": "1.4.4",
"project_urls": {
"Author's repository": "https://github.com/harryjubb/arpeggio",
"Documentation": "https://github.com/PDBeurope/arpeggio",
"Paper": "https://doi.org/10.1016/j.jmb.2016.12.004",
"Source code": "https://github.com/PDBeurope/arpeggio",
"Web server": "http://biosig.unimelb.edu.au/arpeggioweb/"
},
"split_keywords": [
"arpeggio",
"pdb",
"contacts",
"bioinformatics",
"mmcif",
"protein",
"ligand",
"interactions",
"credo"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "714c19de17e0d100ff64278fe2aaacd846d8077072c19ba6d3472fd89afabb42",
"md5": "7e8c48837f3b7fd13d83be63a7420f51",
"sha256": "0fb2dd2fda71464dcee7de3aac456a282d859df6094301634b82ebdd2bfd2ada"
},
"downloads": -1,
"filename": "pdbe_arpeggio-1.4.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7e8c48837f3b7fd13d83be63a7420f51",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 63048,
"upload_time": "2024-01-24T13:16:15",
"upload_time_iso_8601": "2024-01-24T13:16:15.816581Z",
"url": "https://files.pythonhosted.org/packages/71/4c/19de17e0d100ff64278fe2aaacd846d8077072c19ba6d3472fd89afabb42/pdbe_arpeggio-1.4.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "01bcc29cdcebe961f90a716083251125d3b1d378561410fa941730006f458764",
"md5": "691aa866839d96cc9cfd94e6da36df12",
"sha256": "a0b250a0ffdd010f908d09f5330ebb6a99fee64c7675bd15e67ce2d7cf74850d"
},
"downloads": -1,
"filename": "pdbe-arpeggio-1.4.4.tar.gz",
"has_sig": false,
"md5_digest": "691aa866839d96cc9cfd94e6da36df12",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 62010,
"upload_time": "2024-01-24T13:16:16",
"upload_time_iso_8601": "2024-01-24T13:16:16.890784Z",
"url": "https://files.pythonhosted.org/packages/01/bc/c29cdcebe961f90a716083251125d3b1d378561410fa941730006f458764/pdbe-arpeggio-1.4.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-24 13:16:16",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "harryjubb",
"github_project": "arpeggio",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "pdbe-arpeggio"
}