# `peptides.py` [![Stars](https://img.shields.io/github/stars/althonos/peptides.py.svg?style=social&maxAge=3600&label=Star)](https://github.com/althonos/peptides.py/stargazers)
*Physicochemical properties, indices and descriptors for amino-acid sequences.*
[![Actions](https://img.shields.io/github/actions/workflow/status/althonos/peptides.py/test.yml?branch=main&logo=github&style=flat-square&maxAge=300)](https://github.com/althonos/peptides.py/actions)
[![Coverage](https://img.shields.io/codecov/c/gh/althonos/peptides.py?style=flat-square&maxAge=3600)](https://codecov.io/gh/althonos/peptides.py/)
[![License](https://img.shields.io/badge/license-GPLv3-blue.svg?style=flat-square&maxAge=2678400)](https://choosealicense.com/licenses/gpl-3.0/)
[![PyPI](https://img.shields.io/pypi/v/peptides.svg?style=flat-square&maxAge=3600)](https://pypi.org/project/peptides)
[![Bioconda](https://img.shields.io/conda/vn/bioconda/peptides?style=flat-square&maxAge=3600&logo=anaconda)](https://anaconda.org/bioconda/peptides)
[![Wheel](https://img.shields.io/pypi/wheel/peptides.svg?style=flat-square&maxAge=3600)](https://pypi.org/project/peptides/#files)
[![Python Versions](https://img.shields.io/pypi/pyversions/peptides.svg?style=flat-square&maxAge=3600)](https://pypi.org/project/peptides/#files)
[![Python Implementations](https://img.shields.io/badge/impl-universal-success.svg?style=flat-square&maxAge=3600&label=impl)](https://pypi.org/project/peptides/#files)
[![Source](https://img.shields.io/badge/source-GitHub-303030.svg?maxAge=2678400&style=flat-square)](https://github.com/althonos/peptides.py/)
[![Mirror](https://img.shields.io/badge/mirror-EMBL-009f4d?style=flat-square&maxAge=2678400)](https://git.embl.de/larralde/peptides.py/)
[![GitHub issues](https://img.shields.io/github/issues/althonos/peptides.py.svg?style=flat-square&maxAge=600)](https://github.com/althonos/peptides.py/issues)
[![Docs](https://img.shields.io/readthedocs/peptides/latest?style=flat-square&maxAge=600)](https://peptides.readthedocs.io)
[![Changelog](https://img.shields.io/badge/keep%20a-changelog-8A0707.svg?maxAge=2678400&style=flat-square)](https://github.com/althonos/peptides.py/blob/master/CHANGELOG.md)
[![Downloads](https://img.shields.io/pypi/dm/peptides?style=flat-square&color=303f9f&maxAge=86400&label=downloads)](https://pepy.tech/project/peptides)
## 🗺️ Overview
`peptides.py` is a pure-Python package to compute common descriptors for
protein sequences. It started as a port of [`Peptides`](https://cran.r-project.org/web/packages/Peptides/index.html), the R package written by
[Daniel Osorio](https://orcid.org/0000-0003-4424-8422), but now also provides
some additional features from [EMBOSS](http://emboss.bioinformatics.nl/cgi-bin/emboss/),
[ExPASy Protein Identification and Analysis Tools](https://web.expasy.org/protparam/), and [Rcpi](https://bioconductor.org/packages/release/bioc/html/Rcpi.html).
This library has no external dependency and is available for all modern Python
versions (3.6+).
### 📋 Features
A non-exhaustive list of available features:
- Peptide statistics: amino acid counts and frequencies.
- **QSAR** descriptors: BLOSUM indices, Cruciani properties, FASGAI vectors, Kidera factors, MS-WHIM scores, PCP descriptors, ProtFP descriptors, Sneath vectors, ST-scales, T-scales, VHSE-scales, Z-scales.
- Sequence profiles: hydrophobicity, hydrophobic moment, membrane position.
- Physicochemical properties: aliphatic index, instability index, theoretical net charge, isoelectric point, molecular weight (with isotope labelling support).
- Biological properties: structural class prediction.
*If this library is missing a useful statistic or descriptor, feel free to
reach out and open a feature request on the [issue tracker](https://github.com/althonos/peptides.py/issues)
of the [project repository](https://github.com/althonos/peptides.py).*
### 🧊 Vectorization
Most of the descriptors for a protein sequence are simple averages of values
taken in a lookup table, so computing them can be done in a vectorized manner.
If [`numpy`](https://numpy.org/) can be imported, relevant functions
(like `numpy.sum` or `numpy.take`) will be used, otherwise a fallback
implementation using [`array.array`](https://docs.python.org/3/library/array.html#array.array)
from the standard library is available.
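
The following is a minimal sketch of that idea, not the library's actual implementation. It averages values from a lookup table, using `numpy.take` when `numpy` can be imported and falling back to `array.array` otherwise. The table holds the Kyte-Doolittle hydropathy values, and the names `encode` and `mean_lookup` are hypothetical helpers introduced only for this illustration:

```python
import array

try:
    import numpy
except ImportError:  # numpy is optional: fall back to the standard library
    numpy = None

# Kyte-Doolittle hydropathy values, used here only as an example lookup table.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
KYTE_DOOLITTLE = [1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8,
                  1.9, -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3]
TABLE = array.array("d", KYTE_DOOLITTLE)
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def encode(sequence):
    """Turn a sequence into an array of indices into the lookup table."""
    return array.array("B", (INDEX[aa] for aa in sequence))


def mean_lookup(sequence):
    """Average the table values of all residues (hypothetical helper)."""
    if not sequence:
        raise ValueError("empty sequence")
    codes = encode(sequence)
    if numpy is not None:
        # Vectorized path: gather all values at once, then sum them.
        table = numpy.array(TABLE, dtype=float)
        indices = numpy.array(codes, dtype=numpy.intp)
        return float(numpy.take(table, indices).sum() / len(codes))
    # Fallback path: plain Python loop over the encoded sequence.
    return sum(TABLE[i] for i in codes) / len(codes)


print(mean_lookup("AR"))  # mean of 1.8 (Ala) and -4.5 (Arg)
```

Both branches return the same value up to floating-point rounding, so callers do not need to know whether `numpy` is installed.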
## 🔧 Installing
Install the `peptides` package directly from [PyPI](https://pypi.org/project/peptides),
which hosts universal wheels that can be installed with `pip`:
```console
$ pip install peptides
```
Otherwise, `peptides.py` is also available as a [Bioconda](https://bioconda.github.io/)
package:
```console
$ conda install -c bioconda peptides
```
## 📖 Documentation
A complete [API reference](https://peptides.readthedocs.io/en/stable/api.html)
can be found in the [online documentation](https://peptides.readthedocs.io/),
or directly from the command line using
[`pydoc`](https://docs.python.org/3/library/pydoc.html):
```console
$ pydoc peptides.Peptide
```
## 💡 Example
Start by creating a `Peptide` object from a protein sequence:
```python
>>> import peptides
>>> peptide = peptides.Peptide("MLKKRFLGALAVATLLTLSFGTPVMAQSGSAVFTNEGVTPFAISYPGGGT")
```
Then use the appropriate methods to compute the descriptors you want:
```python
>>> peptide.aliphatic_index()
89.8...
>>> peptide.boman()
-0.2097...
>>> peptide.charge(pH=7.4)
1.99199...
>>> peptide.isoelectric_point()
10.2436...
```
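
As a cross-check of the first value above, the aliphatic index can be recomputed by hand. The sketch below assumes the standard definition by Ikai (1980), `AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))`, where each `X` is a mole percentage. It is a standalone illustration and does not claim to mirror how the library computes the value internally:

```python
from collections import Counter


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    if not sequence:
        raise ValueError("empty sequence")
    counts = Counter(sequence)
    # Mole percent of each aliphatic residue in the sequence.
    x = {aa: 100.0 * counts[aa] / len(sequence) for aa in "AVIL"}
    return x["A"] + 2.9 * x["V"] + 3.9 * (x["I"] + x["L"])


seq = "MLKKRFLGALAVATLLTLSFGTPVMAQSGSAVFTNEGVTPFAISYPGGGT"
# 50 residues with 6 Ala, 4 Val, 1 Ile and 6 Leu:
# 12 + 2.9 * 8 + 3.9 * (2 + 12) = 89.8
print(round(aliphatic_index(seq), 1))
```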
Methods that return more than one scalar value (for instance, `Peptide.blosum_indices`
or `Peptide.ms_whim_scores`) return a dedicated named tuple:
```python
>>> peptide.ms_whim_scores()
MSWHIMScores(mswhim1=-0.436399..., mswhim2=0.4916..., mswhim3=-0.49200...)
```
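
Because the result is a named tuple, it can be read by field name, unpacked, or converted to a dictionary. The sketch below uses a hypothetical stand-in class, `Scores`, with placeholder values (not real MS-WHIM scores), so that it runs without the library installed:

```python
from typing import NamedTuple


class Scores(NamedTuple):
    """Hypothetical stand-in shaped like the result shown above."""
    mswhim1: float
    mswhim2: float
    mswhim3: float


# Placeholder values chosen for illustration only.
scores = Scores(-0.5, 0.25, 0.75)

first = scores.mswhim1       # access by field name
a, b, c = scores             # unpack like a regular tuple
as_dict = scores._asdict()   # convert to a dictionary keyed by field name

print(first, (a, b, c), as_dict)
```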
Use the `Peptide.descriptors` method to get a dictionary with every available
descriptor. This makes it very easy to create a `pandas.DataFrame` with
descriptors for several protein sequences:
```python
>>> import pandas
>>> seqs = ["SDKEVDEVDAALSDLEITLE", "ARQQNLFINFCLILIFLLLI", "EGVNDNECEGFFSAR"]
>>> df = pandas.DataFrame([ peptides.Peptide(s).descriptors() for s in seqs ])
>>> df
    BLOSUM1   BLOSUM2  BLOSUM3   BLOSUM4  ...        Z2        Z3        Z4        Z5
0  0.367000 -0.436000   -0.239  0.014500  ... -0.711000 -0.104500 -1.486500  0.429500
1 -0.697500 -0.372500   -0.493  0.157000  ... -0.307500 -0.627500 -0.450500  0.362000
2  0.479333 -0.001333    0.138  0.228667  ... -0.299333  0.465333 -0.976667  0.023333

[3 rows x 66 columns]
```
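
The one-dictionary-per-sequence shape also works without `pandas`. As a rough sketch, the same list of dictionaries can be written to CSV with the standard library. Here `toy_descriptors` is a hypothetical stand-in for `Peptide.descriptors` that only reports the sequence and its length, so the example runs without the library installed:

```python
import csv
import io


def toy_descriptors(sequence):
    """Hypothetical stand-in for Peptide.descriptors(): one dict per sequence."""
    return {"sequence": sequence, "length": len(sequence)}


def to_csv(rows):
    """Serialize a list of same-keyed dictionaries as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


seqs = ["SDKEVDEVDAALSDLEITLE", "ARQQNLFINFCLILIFLLLI", "EGVNDNECEGFFSAR"]
rows = [toy_descriptors(s) for s in seqs]
print(to_csv(rows))
```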
## 💭 Feedback
### ⚠️ Issue Tracker
Found a bug? Have an enhancement request? Head over to the [GitHub issue
tracker](https://github.com/althonos/peptides.py/issues) if you need to report
or ask something. If you are filing a bug report, please include as much
information as you can about the issue, and try to recreate the same bug
in a simple, easily reproducible situation.
### 🏗️ Contributing
Contributions are more than welcome! See
[`CONTRIBUTING.md`](https://github.com/althonos/peptides.py/blob/main/CONTRIBUTING.md)
for more details.
## ⚖️ License
This library is provided under the [GNU General Public License v3.0](https://choosealicense.com/licenses/gpl-3.0/).
The original R `Peptides` package was written by [Daniel Osorio](https://orcid.org/0000-0003-4424-8422),
[Paola Rondón-Villarreal](https://orcid.org/0000-0001-8209-3885) and
[Rodrigo Torres](https://orcid.org/0000-0003-1113-3020), and is licensed under
the terms of the [GNU General Public License v2.0](https://choosealicense.com/licenses/gpl-2.0/).
The [EMBOSS](http://emboss.bioinformatics.nl/cgi-bin/emboss/) applications are
released under the [GNU General Public License v1.0](https://www.gnu.org/licenses/old-licenses/gpl-1.0.html).
*This project is in no way affiliated with, sponsored, or otherwise endorsed
by the [original `Peptides` authors](https://github.com/dosorio). It was developed
by [Martin Larralde](https://github.com/althonos/) during his PhD project
at the [European Molecular Biology Laboratory](https://www.embl.de/) in
the [Zeller team](https://github.com/zellerlab).*
Raw data
{
"_id": null,
"home_page": "https://github.com/althonos/peptides.py",
"name": "peptides",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": "bioinformatics, protein, sequence, peptide, qsar",
"author": "Martin Larralde",
"author_email": "martin.larralde@embl.de",
"download_url": "https://files.pythonhosted.org/packages/db/6e/11df5f10241d2b7d4b6981a126c35a8ed269f5d5008ec4e04255acd5eea7/peptides-0.3.3.post1.tar.gz",
"platform": "posix",
"bugtrack_url": null,
"license": "MIT",
"summary": "Physicochemical properties, indices and descriptors for amino-acid sequences.",
"version": "0.3.3.post1",
"project_urls": {
"Bug Tracker": "https://github.com/althonos/peptides.py/issues",
"Builds": "https://github.com/althonos/peptides.py/actions",
"Changelog": "https://github.com/althonos/peptides.py/blob/master/CHANGELOG.md",
"Coverage": "https://codecov.io/gh/althonos/peptides.py/",
"Homepage": "https://github.com/althonos/peptides.py",
"PyPI": "https://pypi.org/project/peptides"
},
"split_keywords": [
"bioinformatics",
" protein",
" sequence",
" peptide",
" qsar"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "82d6c9bc4f19cee59eff0e8e8a3a10192f4d2ae44500b78a4d2f58e32a9c2f39",
"md5": "829af0d5866fe01248a38c7ee55c3627",
"sha256": "e6312d8b18daed17a3ea18a5f41d363042afb3116125730cd3b013ce217984cc"
},
"downloads": -1,
"filename": "peptides-0.3.3.post1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "829af0d5866fe01248a38c7ee55c3627",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 119229,
"upload_time": "2024-08-26T14:23:20",
"upload_time_iso_8601": "2024-08-26T14:23:20.209671Z",
"url": "https://files.pythonhosted.org/packages/82/d6/c9bc4f19cee59eff0e8e8a3a10192f4d2ae44500b78a4d2f58e32a9c2f39/peptides-0.3.3.post1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "db6e11df5f10241d2b7d4b6981a126c35a8ed269f5d5008ec4e04255acd5eea7",
"md5": "19e70938321c2b5c27a150b504292c91",
"sha256": "4d84f888b70d38e115c6b9c02231162113c30cb96f7ffccfa307ca4f6f455eb4"
},
"downloads": -1,
"filename": "peptides-0.3.3.post1.tar.gz",
"has_sig": false,
"md5_digest": "19e70938321c2b5c27a150b504292c91",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 72139,
"upload_time": "2024-08-26T14:23:21",
"upload_time_iso_8601": "2024-08-26T14:23:21.263859Z",
"url": "https://files.pythonhosted.org/packages/db/6e/11df5f10241d2b7d4b6981a126c35a8ed269f5d5008ec4e04255acd5eea7/peptides-0.3.3.post1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-26 14:23:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "althonos",
"github_project": "peptides.py",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "peptides"
}