ecpick


- **Name**: ecpick
- **Version**: 0.0.1
- **Home page**: https://github.com/datax-lab/ECPICK
- **Summary**: Enzyme Commission Number Prediction
- **Upload time**: 2023-05-01 00:50:20
- **Author**: MINGYU PARK@Sunmoon University
- **Keywords**: enzyme, ec number, prediction
- **Requirements**: No requirements were recorded.

# ECPICK

[![Powered by ](https://img.shields.io/badge/Powered%20by-DataX%20Lab-orange.svg?style=flat&colorA=555&colorB=b42b2c)](https://www.dataxlab.org)
[![Powered by ](https://img.shields.io/badge/Powered%20by-CPS%20Lab-orange.svg?style=flat&colorA=555&colorB=007580)](https://www.sunmoon.ac.kr)

[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ecpick)](https://pypi.org/project/ecpick/)
[![PyPI](https://img.shields.io/pypi/v/ecpick?style=flat&colorB=0679BA)](https://pypi.org/project/ecpick/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/ecpick?label=pypi%20downloads)](https://pypi.org/project/ecpick/)
[![PyPI - License](https://img.shields.io/pypi/l/ecpick)](https://pypi.org/project/ecpick/)

## Biologically interpretable deep learning enhances trustworthy enzyme commission number prediction and discovers potential motif sites

The rapid growth in the number of uncharacterized enzymes and their functional diversity
calls for accurate and trustworthy computational tools for functional annotation. However,
current approaches offer little basis for trusting their predictions or interpreting their models,
which limits their reliability on a multi-label classification problem with thousands
of classes. Here, we demonstrate that our novel biologically interpretable deep
learning model (ECPICK) provides a robust solution for trustworthy prediction
of enzyme commission (EC) numbers, with significantly enhanced predictive power
and the capability to discover potential motif sites. ECPICK learns complex
sequential patterns of amino acids and their hierarchical structures from twenty
million proteins to predict EC numbers. Furthermore, ECPICK identifies
the amino acids in a given protein sequence that contribute significantly to the prediction,
without multiple sequence alignment; these may match known motif sites, supporting a
trustworthy prediction, or indicate potential motif sites. Our intensive assessment showed not
only a marked improvement in predictive performance on the largest databases (UniProt, PDB, and KEGG)
but also a capability to discover new motif sites in microorganisms. ECPICK will be a
reliable EC number prediction tool for identifying the protein functions of an increasing number
of uncharacterized enzymes.

- **Website**: http://ecpick.dataxlab.org
- **Documentation**: https://readthedocs.org/projects/ecpick
- **Source code**: https://github.com/datax-lab/ECPICK

## Installation

**ECPICK** supports Python 3.6+. Additionally, you will need
```biopython```, ```numpy```, ```scikit-learn```, ```torch```, and ```tqdm```.
These packages should be installed automatically when you install ECPICK.

### Dependencies

[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ecpick)](https://pypi.org/project/ecpick/)
![PyPI](https://img.shields.io/pypi/v/torch?label=torch)
![PyPI](https://img.shields.io/pypi/v/biopython?label=biopython)
![PyPI](https://img.shields.io/pypi/v/numpy?label=numpy)
![PyPI](https://img.shields.io/pypi/v/scikit-learn?label=scikit-learn)
![PyPI](https://img.shields.io/pypi/v/tqdm?label=tqdm)

```ECPICK``` is available on PyPI and can be installed with pip:

```shell
$ pip install ecpick
```
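
After installing, it can help to confirm that the packages listed above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` mapping are illustrative and not part of ECPICK. The mapping pairs each PyPI package name with its import name (for example, ```biopython``` is imported as `Bio` and ```scikit-learn``` as `sklearn`).

```python
import importlib.util

# PyPI package name -> top-level module name used in `import` statements.
REQUIRED = {
    "biopython": "Bio",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "tqdm": "tqdm",
}


def missing_packages(required=REQUIRED):
    """Return the PyPI names of packages whose modules cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


# Usage: an empty list means every dependency is importable.
# print(missing_packages())
```

If the list is not empty, installing the reported packages with pip should resolve the problem.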

## Documentation

The documentation on Read the Docs is in preparation.

## Quick Start

```python
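from ecpick import ECPICK

# Create the predictor.
ecpick = ECPICK()
# Predict EC numbers for the sequences in sample.fasta; results are written to the 'output' path.
ecpick.predict_fasta(fasta_path='sample.fasta', output_path='output')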
```
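
The Quick Start assumes that a `sample.fasta` file already exists. As a minimal sketch, the input file could be prepared as follows. The helper names `write_fasta` and `run_prediction` and the sequences are hypothetical placeholders, not part of ECPICK; only the `ECPICK()` constructor and the `predict_fasta(fasta_path=..., output_path=...)` call are taken from the Quick Start.

```python
from pathlib import Path

# Placeholder amino-acid strings for illustration only; replace with real protein sequences.
SEQUENCES = {
    "example_protein_1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV",
    "example_protein_2": "MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHGK",
}


def write_fasta(records, path, width=60):
    """Write {identifier: sequence} pairs to `path` in FASTA format."""
    lines = []
    for identifier, sequence in records.items():
        lines.append(f">{identifier}")
        # Wrap each sequence at a fixed width, as is conventional for FASTA files.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    target = Path(path)
    target.write_text("\n".join(lines) + "\n")
    return target


def run_prediction(fasta_path, output_path="output"):
    """Run ECPICK on a FASTA file using the call shown in the Quick Start."""
    # Imported here so that write_fasta can be used without ecpick installed.
    from ecpick import ECPICK

    model = ECPICK()
    model.predict_fasta(fasta_path=str(fasta_path), output_path=output_path)


# Usage (requires `pip install ecpick`):
# fasta = write_fasta(SEQUENCES, "sample.fasta")
# run_prediction(fasta)
```

Keeping file preparation separate from prediction makes it possible to inspect the FASTA input before running the model.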

## Usage

## Links

- ECPICK Web server: http://ecpick.dataxlab.org

## References

Not available yet.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/datax-lab/ECPICK",
    "name": "ecpick",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "enzyme,ec number,prediction",
    "author": "MINGYU PARK@Sunmoon University",
    "author_email": "duveen@duveen.me",
    "download_url": "",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Enzyme Commission Number Prediction",
    "version": "0.0.1",
    "project_urls": {
        "Bug Reports": "https://github.com/datax-lab/ECPICK/issue",
        "Homepage": "https://github.com/datax-lab/ECPICK",
        "Source": "https://github.com/datax-lab/ECPICK"
    },
    "split_keywords": [
        "enzyme",
        "ec number",
        "prediction"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0664d5f7378172021d2f2ff75cdfda0a8e5465903712c78abe1791e29ef8bca8",
                "md5": "18d65626a44149ac75a8f21ac101ef4b",
                "sha256": "7f86ab681a7255d8bc27c784953680888de96788f8ba600c3a6066d60db21e56"
            },
            "downloads": -1,
            "filename": "ecpick-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "18d65626a44149ac75a8f21ac101ef4b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 12144,
            "upload_time": "2023-05-01T00:50:20",
            "upload_time_iso_8601": "2023-05-01T00:50:20.546791Z",
            "url": "https://files.pythonhosted.org/packages/06/64/d5f7378172021d2f2ff75cdfda0a8e5465903712c78abe1791e29ef8bca8/ecpick-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-01 00:50:20",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "datax-lab",
    "github_project": "ECPICK",
    "lcname": "ecpick"
}
        