| Field | Value |
| --- | --- |
| Name | evcouplings |
| Version | 0.2.1 |
| home_page | None |
| Summary | A Framework for evolutionary couplings analysis |
| upload_time | 2024-11-05 11:18:47 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | None |
| license | None |
| keywords | analysis, couplings, evolutionary |
| requirements | No requirements were recorded. |
[![build_and_test Actions Status](https://github.com/debbiemarkslab/EVcouplings/workflows/build_and_test/badge.svg?branch=master)](https://github.com/debbiemarkslab/EVcouplings/actions) [![PyPI version](https://badge.fury.io/py/evcouplings.svg)](https://badge.fury.io/py/evcouplings)
# EVcouplings
Predict protein structure, function and mutations using evolutionary sequence covariation.
## Installation and setup
### Installing the Python package
* If you are simply interested in using EVcouplings as a library, installing the Python package is all you need to do (unless you use functions that depend on external tools).
* If you want to run the *evcouplings* application (alignment generation, model parameter inference, structure prediction, etc.), you will also need to follow the sections on installing external tools and databases.
#### Requirements
EVcouplings actively supports Python 3.10 and later.
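If you want to check the interpreter before installing, a minimal sketch along the following lines may help. The helper name `python_is_supported` is illustrative and not part of EVcouplings; the minimum version is taken from the requirement stated above.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # taken from the requirement stated above


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor, e.g. (3, 11) >= (3, 10).
    return tuple(version_info[:2]) >= tuple(minimum)


if not python_is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is older than 3.10")
```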
#### Installation
To install the latest release of EVcouplings from PyPI, run

    pip install evcouplings

To obtain the latest development version of EVcouplings from the GitHub repository, run

    pip install https://github.com/debbiemarkslab/EVcouplings/archive/develop.zip

and to update to the latest development version after previously installing EVcouplings from the repository, run

    pip install -U --no-deps https://github.com/debbiemarkslab/EVcouplings/archive/develop.zip
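After installing, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is illustrative and not part of EVcouplings.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="evcouplings"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
print(found if found is not None else "evcouplings is not installed")
```

Note that an install from the development archive may report a version string that differs from the latest PyPI release.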
### External software tools
*After installation and before running compute jobs, the paths to the respective binaries of the following external tools have to be set in your EVcouplings job configuration file(s).*
#### plmc (required)
Tool for inferring undirected statistical models from sequence variation. Download and install plmc to a directory of your choice from the [plmc GitHub repository](https://github.com/debbiemarkslab/plmc) according to the included documentation.
For compatibility with evcouplings, please compile using

    make all-openmp32

#### jackhmmer (required)
Download and install HMMER from the [HMMER webpage](http://hmmer.org/download.html) to a directory of your choice.
#### HHsuite (optional)
evcouplings uses the hhfilter tool to filter sequence alignments. Installation is only required if you need this functionality.
Download and install HHsuite from the [HHsuite GitHub repository](https://github.com/soedinglab/hh-suite) to a directory of your choice.
#### CNSsolve 1.21 (optional)
evcouplings uses CNSsolve for computing 3D structure models from coupled residue pairs. Installation is only required if you want to run the *fold* stage of the computational pipeline.
Download and unpack a compiled version of [CNSsolve 1.21](http://cns-online.org/v1.21/) to a directory of your choice. No further setup is necessary, since evcouplings sets the required environment variables internally without relying on the included shell script cns_solve_env. You will, however, have to put the path to the cns binary in your job config file (e.g. cns_solve_1.21/intel-x86_64bit-linux/bin/cns).
#### PSIPRED (optional)
evcouplings uses PSIPRED for secondary structure prediction, to generate secondary structure distance and dihedral angle restraints for 3D structure computation.
Installation is only required if you want to run the *fold* stage of the computational pipeline, and do not supply your own secondary structure predictions.
Download and install [PSIPRED](http://bioinfadmin.cs.ucl.ac.uk/downloads/psipred/) according to the instructions in the included README file.
#### maxcluster (optional)
evcouplings uses maxcluster to compare predicted 3D structure models to experimental protein structures, if there are any for the target protein or one
of its homologs. Installation is only required if you want to run the *fold* stage of the computational pipeline.
Download [maxcluster](http://www.sbg.bio.ic.ac.uk/~maxcluster/) and place it in a directory of your choice.
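Because the paths to these binaries have to be entered in the job configuration by hand, a quick check that each path resolves to an executable can catch typos before a job is submitted. The sketch below is only an illustration: the function name, the tool labels and the example paths are hypothetical and do not reflect the EVcouplings configuration schema.

```python
import shutil


def find_missing_tools(tool_paths):
    """Return the labels of tools whose path does not resolve to an executable.

    tool_paths maps an arbitrary label to either an explicit path or a bare
    command name that should be resolvable via PATH.
    """
    missing = []
    for name, path in tool_paths.items():
        # shutil.which handles explicit paths as well as PATH lookups and
        # returns None if no executable file is found.
        if shutil.which(path) is None:
            missing.append(name)
    return missing


# Hypothetical locations; replace with the paths used in your own job config.
tools = {
    "plmc": "/opt/plmc/bin/plmc",
    "jackhmmer": "jackhmmer",
    "hhfilter": "hhfilter",
}
for name in find_missing_tools(tools):
    print(f"{name}: binary not found or not executable")
```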
### Databases
*After download and before running compute jobs, the paths to the respective databases have to be set in your EVcouplings job configuration file(s).*
#### Automatic database setup
The *evcouplings* application minimally needs a sequence database for alignment generation, and structure mapping information for comparison of evolutionary couplings to 3D structures.
Sequence and structure mapping databases for EVcouplings can be automatically downloaded using the included command line tool *evcouplings_dbupdate*.
This tool will fetch the UniProt (SwissProt/TrEMBL), UniRef100 and UniRef90 databases, and generate SIFTS-based structure mapping tables.
Please see

    evcouplings_dbupdate --help

for how to download the respective databases. Note that this may take a while, especially the generation of post-processed SIFTS mapping files.
#### Sequence databases for EVcomplex
Running the EVcouplings pipeline for protein complexes (aka EVcomplex) requires two pre-computed databases. You can download these databases here:
* ena_genome_location_table: https://marks.hms.harvard.edu/evcomplex_databases/cds_pro_2017_02.txt
* uniprot_to_embl_table: https://marks.hms.harvard.edu/evcomplex_databases/idmapping_uniprot_embl_2017_02.txt
Save these databases in your local environment, and then add the paths to the local copies of these databases to your config file for the complex pipeline.
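As a rough sketch of the download step, the two files can be fetched with the Python standard library. The function names and the target directory are illustrative assumptions, not part of EVcouplings; only the two URLs come from this README.

```python
import os
import urllib.request
from urllib.parse import urlparse

# URLs as listed above; the keys mirror the labels used in this README.
EVCOMPLEX_DATABASES = {
    "ena_genome_location_table": "https://marks.hms.harvard.edu/evcomplex_databases/cds_pro_2017_02.txt",
    "uniprot_to_embl_table": "https://marks.hms.harvard.edu/evcomplex_databases/idmapping_uniprot_embl_2017_02.txt",
}


def local_path_for(url, dest_dir):
    """Build the local file path for a URL by reusing its final path component."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(dest_dir, filename)


def download_databases(dest_dir, databases=EVCOMPLEX_DATABASES):
    """Download each database into dest_dir, skipping files that already exist."""
    os.makedirs(dest_dir, exist_ok=True)
    paths = {}
    for name, url in databases.items():
        target = local_path_for(url, dest_dir)
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
        paths[name] = target
    return paths
```

Calling `download_databases("evcomplex_databases")` (an arbitrary example directory) returns a mapping from each label to its local file path, which can then be copied into the config file for the complex pipeline.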
In future releases these databases will be generated automatically.
#### Other sequence databases
Alternatively, you can use any sequence database of your choice in FASTA format. The database for a particular job needs to be defined in the job configuration file ("databases" section) and set as the input database in the "alignment" section.
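The relationship between the two sections can be illustrated with a small consistency check. This is only a sketch: the configuration is written as a plain Python dict, and every key other than the two section names mentioned above (`database`, `my_fasta_db`, the file path) is a hypothetical placeholder rather than the actual EVcouplings schema.

```python
def check_alignment_database(config):
    """Return the path of the database selected for alignment.

    Raises ValueError if the selected database name has no entry in the
    "databases" section.
    """
    databases = config.get("databases", {})
    selected = config.get("alignment", {}).get("database")
    if selected not in databases:
        raise ValueError(f"database {selected!r} is not defined in the 'databases' section")
    return databases[selected]


# Hypothetical configuration fragment, written as a dict for illustration.
example_config = {
    "databases": {"my_fasta_db": "/data/sequences/my_database.fasta"},
    "alignment": {"database": "my_fasta_db"},
}

print(check_alignment_database(example_config))
```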
#### Structure and mapping databases
Relevant PDB structures for comparison of ECs and 3D structure predictions will be automatically fetched from the web in the compressed MMTF format on a per-job basis. You can also pre-download the entire PDB and place the structures in a directory (and set pdb_mmtf_dir in your job configuration accordingly).
UniProt to PDB index mapping files will be automatically generated by EVcouplings based on the SIFTS database.
You can either generate the files by running *evcouplings_dbupdate* (see above, preferred), or by pointing the sifts_mapping_table and sifts_sequence_db configuration parameters to file paths inside an already existing directory. If these files do not yet exist, they will be created when the pipeline is first run, by fetching and integrating data from the web (this may take a while), and saved under the given file paths.
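Since the target directory for the SIFTS files has to exist before the pipeline creates them, a small pre-flight check can be sketched as follows. The function name and the example paths are hypothetical; only the parameter names sifts_mapping_table and sifts_sequence_db come from the text above.

```python
import os


def missing_parent_dirs(paths):
    """Return the given file paths whose parent directory does not exist yet."""
    return [
        path for path in paths
        if not os.path.isdir(os.path.dirname(os.path.abspath(path)))
    ]


# Hypothetical values for sifts_mapping_table and sifts_sequence_db.
sifts_paths = [
    "/data/sifts/mapping_table.csv",
    "/data/sifts/sequences.fa",
]
for path in missing_parent_dirs(sifts_paths):
    print(f"create the directory for {path} before running the pipeline")
```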
## Documentation and tutorials
Please refer to the Jupyter notebooks in the [notebooks subdirectory](https://github.com/debbiemarkslab/EVcouplings/tree/master/notebooks) on how to
* edit configuration files
* run jobs
* use EVcouplings as a Python library
Documentation for the source code is available at [readthedocs](http://evcouplings.readthedocs.io/en/latest/).
## License
EVcouplings is available under the MIT license, with the exception of the included CNS input scripts (please see [LICENSE](https://github.com/debbiemarkslab/EVcouplings/tree/master/LICENSE) for details).
## References
Please cite the following reference for the EVcouplings Python package:
Hopf, T. A., Green, A. G., Schubert, B., et al. The EVcouplings Python framework for coevolutionary sequence analysis. *Bioinformatics* **35**, 1582–1584 (2019)
Also consider citing the following references, which introduced the methods integrated by the EVcouplings Python package:
Marks, D. S., Colwell, L. J., Sheridan, R., Hopf, T. A., Pagnani, A., Zecchina, R., Sander, C. Protein 3D structure computed from evolutionary sequence variation. *PLOS ONE* **6**(12), e28766 (2011)
Hopf, T. A., Colwell, L. J., Sheridan, R., Rost, B., Sander, C., Marks, D. S. Three-dimensional structures of membrane proteins from genomic sequencing. *Cell* **149**, 1607–1621 (2012)
Marks, D. S., Hopf, T. A., Sander, C. Protein structure prediction from sequence variation. *Nature Biotechnology* **30**, 1072–1080 (2012)
Hopf, T. A., Schärfe, C. P. I., Rodrigues, J. P. G. L. M., Green, A. G., Kohlbacher, O., Sander, C., Bonvin, A. M. J. J., Marks, D. S. Sequence co-evolution gives 3D contacts and structures of protein complexes. *eLife* **3**, e03430 (2014)
Hopf, T. A., Ingraham, J. B., Poelwijk, F. J., Schärfe, C. P. I., Springer, M., Sander, C., Marks, D. S. Mutation effects predicted from sequence co-variation. *Nature Biotechnology* **35**, 128–135 (2017). doi:10.1038/nbt.3769
Green, A. G., Elhabashy, H., Brock, K. P., Maddamsetti, R., Kohlbacher, O., Marks, D. S. Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences. *Nature Communications* **12**, 1396 (2021). https://doi.org/10.1038/s41467-021-21636-z
## Contributors
EVcouplings is developed in the labs of [Debora Marks](http://marks.hms.harvard.edu) and [Chris Sander](http://sanderlab.org/) at Harvard Medical School.
* [Thomas Hopf](mailto:thomas.hopf@gmail.com) (development lead)
* Anna G. Green
* Benjamin Schubert
* Sophia Mersmann
* Charlotta Schärfe
* Agnes Toth-Petroczy
* John Ingraham
* Rob Sheridan
* Christian Dallago
* Joe Min
Raw data
{
"_id": null,
"home_page": null,
"name": "evcouplings",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "analysis, couplings, evolutionary",
"author": null,
"author_email": "Thomas Hopf <thomas.hopf@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/b7/e2/6cf4236003689e0e8624064334049c92177f8cedc55e9010f05f4c64cb08/evcouplings-0.2.1.tar.gz",
"platform": null,
"description": "[![build_and_test Actions Status](https://github.com/debbiemarkslab/EVcouplings/workflows/build_and_test/badge.svg?branch=master)](https://github.com/debbiemarkslab/EVcouplings/actions) [![PyPI version](https://badge.fury.io/py/evcouplings.svg)](https://badge.fury.io/py/evcouplings)\n# EVcouplings\n\nPredict protein structure, function and mutations using evolutionary sequence covariation.\n\n## Installation and setup\n\n### Installing the Python package\n\n* If you are simply interested in using EVcouplings as a library, installing the Python package is all you need to do (unless you use functions that depend on external tools). \n* If you want to run the *evcouplings* application (alignment generation, model parameter inference, structure prediction, etc.) you will also need to follow the sections on installing external tools and databases.\n\n#### Requirements\n\nEVcouplings actively supports Python >= 3.10 installations. \n\n#### Installation\n\nTo install the latest version of EVcouplings on PyPI,\n\n pip install evcouplings\n\nTo obtain the latest development version of EVcouplings from the github repository, run\n\n pip install https://github.com/debbiemarkslab/EVcouplings/archive/develop.zip\n\nand to update to the latest version after previously installing EVcouplings from the repository, run\n\n pip install -U --no-deps https://github.com/debbiemarkslab/EVcouplings/archive/develop.zip\n\n### External software tools\n\n*After installation and before running compute jobs, the paths to the respective binaries of the following external tools have to be set in your EVcouplings job configuration file(s).*\n\n#### plmc (required)\n\nTool for inferring undirected statistical models from sequence variation. 
Download and install plmc to a directory of your choice from the [plmc github repository](https://github.com/debbiemarkslab/plmc) according to the included documentation.\n\nFor compatibility with evcouplings, please compile using\n\n make all-openmp32\n\n\n#### jackhmmer (required)\n\nDownload and install HMMER from the [HMMER webpage](http://hmmer.org/download.html) to a directory of your choice.\n\n#### HHsuite (optional)\n\nevcouplings uses the hhfilter tool to filter sequence alignments. Installation is only required if you need this functionality.\n\nDownload and install HHsuite from the [HHsuite github repository](https://github.com/soedinglab/hh-suite) to a directory of your choice.\n\n#### CNSsolve 1.21 (optional)\n\nevcouplings uses CNSsolve for computing 3D structure models from coupled residue pairs. Installation is only required if you want to run the *fold* stage of the computational pipeline.\n\nDownload and unpack a compiled version of [CNSsolve 1.21](http://cns-online.org/v1.21/) to a directory of your choice. No further setup is necessary, since evcouplings takes care of setting the right environment variables internally without relying on the included shell script cns_solve_env\n(you will have to put the path to the cns binary in your job config file however, e.g. 
cns_solve_1.21/intel-x86_64bit-linux/bin/cns).\n\n#### PSIPRED (optional)\n\nevcouplings uses PSIPRED for secondary structure prediction, to generate secondary structure distance and dihedral angle restraints for 3D structure computation.\nInstallation is only required if you want to run the *fold* stage of the computational pipeline, and do not supply your own secondary structure predictions.\n\nDownload and install [PSIPRED](http://bioinfadmin.cs.ucl.ac.uk/downloads/psipred/) according to the instructions in the included README file.\n\n#### maxcluster (optional)\n\nevcouplings uses maxcluster to compare predicted 3D structure models to experimental protein structures, if there are any for the target protein or one\nof its homologs. Installation is only required if you want to run the *fold* stage of the computational pipeline.\n\nDownload [maxcluster](http://www.sbg.bio.ic.ac.uk/~maxcluster/) and place it in a directory of your choice.\n\n### Databases\n\n*After download and before running compute jobs, the paths to the respective databases have to be set in your EVcouplings job configuration file(s).*\n\n#### Automatic database setup\nThe *evcouplings* application minimally needs a sequence database for alignment generation, and structure mapping information for comparison of evolutionary couplings to 3D structures.\n\nSequence and structure mapping databases for EVcouplings can be automatically downloaded using the included command line tool *evcouplings_dbupdate*.\n This tool will fetch the UniProt (SwissProt/TrEMBL), UniRef100 and UniRef90 databases, and generate SIFTS-based structure mapping tables.\n\nPlease see\n\n evcouplings_dbupdate --help\n\nfor how to download the respective databases. Note that this may take a while, especially the generation of post-processed SIFTS mapping files.\n\n#### Sequence databases for EVcomplex\nRunning the EVcouplings pipeline for protein complexes (aka EVcomplex) requires two pre-computed databases. 
You can download these databases here:\n\nena_genome_location_table: https://marks.hms.harvard.edu/evcomplex_databases/cds_pro_2017_02.txt\nuniprot_to_embl_table: https://marks.hms.harvard.edu/evcomplex_databases/idmapping_uniprot_embl_2017_02.txt\n\nSave these databases in your local environment, and then add the paths to the local copies of these databases to your config file for the complex pipeline.\n\nIn future releases these databases will be generated automatically.\n\n#### Other sequence databases\n\nYou can however use any sequence database of your choice in FASTA format if you prefer to. The database for any particular job needs to be defined in the job configuration file (\"databases\" section) and set as the input database in the \"alignment\" section.\n\n#### Structure and mapping databases\n\nRelevant PDB structures for comparison of ECs and 3D structure predictions will be automatically fetched from the web in the new compressed MMTF format on a per-job basis. You can however also pre-download the entire PDB and place the structures in a directory if you want to (and set pdb_mmtf_dir in your job configuration).\n\nUniprot to PDB index mapping files will be automatically generated by EVcouplings based on the SIFTS database.\nYou can either generate the files by running *evcouplings_dbupdate* (see above, preferred), or by pointing the sifts_mapping_table and sifts_sequence_db configuration parameters to file paths inside an already existing directory. 
If these files do not yet exist, they will be created by fetching and integrating data from the web (this may take a while) when the pipeline is first run and saved under the given file paths.\n\n## Documentation and tutorials\n\nPlease refer to the Jupyter notebooks in the [notebooks subdirectory](https://github.com/debbiemarkslab/EVcouplings/tree/master/notebooks) on how to\n* edit configuration files\n* run jobs\n* use EVcouplings as a Python library\n\nDocumentation for the source code is available at [readthedocs](http://evcouplings.readthedocs.io/en/latest/).\n\n## License\n\nEVcouplings is available under the MIT license, with the exception of the included CNS input scripts (please see [LICENSE](https://github.com/debbiemarkslab/EVcouplings/tree/master/LICENSE) for details).\n\n## References\n\nPlease cite the following reference for the EVcouplings Python package;\n\nHopf T. A., Green A. G., Schubert B., et al. The EVcouplings Python framework for coevolutionary sequence analysis. *Bioinformatics* **35**, 1582\u20131584 (2019)\n\nAlso consider citing the following references, which introduced the methods integrated by the EVcouplings Python package:\n\nMarks D. S., Colwell, L. J., Sheridan, R., Hopf, T.A., Pagnani, A., Zecchina, R., Sander, C. Protein 3D structure computed from evolutionary sequence variation. *PLOS ONE* **6**(12), e28766 (2011)\n\nHopf T. A., Colwell, L. J., Sheridan, R., Rost, B., Sander, C., Marks, D. S. Three-dimensional structures of membrane proteins from genomic sequencing. *Cell* **149**, 1607-1621 (2012)\n\nMarks, D. S., Hopf, T. A., Sander, C. Protein structure prediction from sequence variation. *Nature Biotechnology* **30**, 1072\u20131080 (2012)\n\nHopf, T. A., Sch\u00e4rfe, C. P. I., Rodrigues, J. P. G. L. M., Green, A. G., Kohlbacher, O., Sander, C., Bonvin, A. M. J. J., Marks, D. S. Sequence co-evolution gives 3D contacts and structures of protein complexes. *eLife* Sep 25;3 (2014)\n\nHopf, T. A., Ingraham, J. 
B., Poelwijk, F.J., Sch\u00e4rfe, C.P.I., Springer, M., Sander, C., & Marks, D. S. (2017). Mutation effects predicted from sequence co-variation. *Nature Biotechnology* **35**, 128\u2013135 doi:10.1038/nbt.3769\n\nGreen, A. G. and Elhabashy, H., Brock, K. P., Maddamsetti, R., Kohlbacher, O., Marks, D. S. (2021) Large-scale discovery of protein interactions at residue resolution using co-evolution calculated from genomic sequences. *Nature Communications* **12**, 1396. https://doi.org/10.1038/s41467-021-21636-z\n\n## Contributors\n\nEVcouplings is developed in the labs of [Debora Marks](http://marks.hms.harvard.edu) and [Chris Sander](http://sanderlab.org/) at Harvard Medical School.\n\n* [Thomas Hopf](mailto:thomas.hopf@gmail.com) (development lead)\n* Anna G. Green\n* Benjamin Schubert\n* Sophia Mersmann\n* Charlotta Sch\u00e4rfe\n* Agnes Toth-Petroczy\n* John Ingraham\n* Rob Sheridan\n* Christian Dallago\n* Joe Min\n",
"bugtrack_url": null,
"license": null,
"summary": "A Framework for evolutionary couplings analysis",
"version": "0.2.1",
"project_urls": {
"Homepage": "https://github.com/debbiemarkslab/EVcouplings"
},
"split_keywords": [
"analysis",
" couplings",
" evolutionary"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c801f57cd1c1481daa17937aed967e391b5aea0f0d4f4b76e87c7a9f143e46f3",
"md5": "aec8433f0e9bf875a78a2149e76120af",
"sha256": "9017f95cd0730d858b6782be48291423125f49793aa52875d23c63995ef8ff45"
},
"downloads": -1,
"filename": "evcouplings-0.2.1-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "aec8433f0e9bf875a78a2149e76120af",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 263266,
"upload_time": "2024-11-05T11:18:46",
"upload_time_iso_8601": "2024-11-05T11:18:46.060430Z",
"url": "https://files.pythonhosted.org/packages/c8/01/f57cd1c1481daa17937aed967e391b5aea0f0d4f4b76e87c7a9f143e46f3/evcouplings-0.2.1-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b7e26cf4236003689e0e8624064334049c92177f8cedc55e9010f05f4c64cb08",
"md5": "7ef339987133d060b57ef759ae0d3b07",
"sha256": "bd7485569184d6392a493c85e28ddd8f5dfebc0a892952a749462f975181e8ce"
},
"downloads": -1,
"filename": "evcouplings-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "7ef339987133d060b57ef759ae0d3b07",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 225127,
"upload_time": "2024-11-05T11:18:47",
"upload_time_iso_8601": "2024-11-05T11:18:47.281245Z",
"url": "https://files.pythonhosted.org/packages/b7/e2/6cf4236003689e0e8624064334049c92177f8cedc55e9010f05f4c64cb08/evcouplings-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-05 11:18:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "debbiemarkslab",
"github_project": "EVcouplings",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "evcouplings"
}
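The `digests` entries in the raw data above can be used to check a downloaded archive against the recorded SHA-256 value. The following is a minimal sketch using the Python standard library; the function names are illustrative, and the file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_sha256):
    """Return True if the file's SHA-256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

For example, `matches_digest("evcouplings-0.2.1.tar.gz", "bd7485569184d6392a493c85e28ddd8f5dfebc0a892952a749462f975181e8ce")` should return `True` for an intact copy of the source archive listed above.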