<p align="center">
  <img src="https://phage.ai/assets/images/phageai_logo2.svg">
</p>
**PhageAI** is an application that simultaneously represents **a repository of knowledge of bacteriophages** and a bioinformatics pipeline to analyse genomes with **Artificial Intelligence support**. This package supports the most critical programmable features from our platform.
Machine Learning algorithms can process enormous amounts of data in a relatively short time to find connections and dependencies that are not obvious to human beings. Correctly designed AI-based applications can vastly improve and speed up the work of domain experts.
Models based on contextual vectorization of DNA or proteins and on Deep Neural Networks are particularly effective for the analysis of genomic data. The system that we propose uses the phage sequences uploaded to the database to build a model that can predict with high probability whether a bacteriophage is **chronic**, **temperate** or **virulent**. Furthermore, our system offers additional prediction methods for phage taxonomy, phage similarity and annotation, extended by classification of protein structural classes.
One of the key system modules is the bacteriophage repository, with a clean web interface that allows users to browse, upload and share data with other users. The gathered knowledge about bacteriophages is valuable not only on its own but also because it can be used to train ever-improving Machine Learning models.
Detection of virulent or temperate features is only one of the first tasks that can be solved with Artificial Intelligence. The combination of Biology, Natural Language Processing and Machine Learning allows us to create algorithms for genomic data processing that could eventually prove effective in a wide range of problems focused on classification and on information extracted from DNA.
## Table of Contents
[Available methods](https://github.com/phageaisa/phageai#available-methods) | [Installation](https://github.com/phageaisa/phageai#installation-and-usage) | [Benchmark](https://github.com/phageaisa/phageai#benchmark) | [Community and Contributions](https://github.com/phageaisa/phageai#community-and-contributions) | [Have a question?](https://github.com/phageaisa/phageai#have-a-question) | [Found a bug?](https://github.com/phageaisa/phageai#found-a-bug) | [Team](https://github.com/phageaisa/phageai#team) | [Change log](https://github.com/phageaisa/phageai#change-log) | [License](https://github.com/phageaisa/phageai#license) | [Cite](https://github.com/phageaisa/phageai#cite)
## Available methods
* `upload(fasta_path, access)` - upload a FASTA file with a phage genome as a "public", "private" or "" (temporary) sample in the PhageAI repository. The upload starts the bioinformatics pipeline that characterises the phage;
* `processing_status(job_id)` - get the current processing status of the phage sample associated with the job ID;
* `get_lifecycle_classification(job_id)` - get the phage lifecycle classification result;
* `get_taxonomy_classification(job_id)` - get the phage taxonomy classification results for order, family and genus;
* `get_proteins_classification(job_id)` - get the classification results for the structural classes of the phage proteins;
* `get_top10_similarities(job_id)` - get the TOP-10 phages from the repository that are most similar to your sample;
* `get_full_report(job_id)` - get the full phage characteristics report (all meta-data and predictions);
* `get_phage_characteristic(accession_number)` - get meta-data about a publicly available phage with a specific accession number (with version).
## Installation and usage
#### PhageAI user account (1/3)
Create a free user account on [the PhageAI web platform](https://app.phage.ai/) or use an existing one. If you created a new account, activate it with the activation link sent to your email inbox. After that, log into the platform and click "My profile" in the top-right menu. In the "API access" section, create a new access token (string) and copy it for the steps below.
<p align="center">
  <img src="https://raw.githubusercontent.com/phageaisa/phageai/main/media/phageai-access-token.png">
</p>
#### PhageAI package (2/3)
_PhageAI_ requires Python 3.8.0+ to run and can be installed by running:
```
pip install phageai
```
If you can't wait for the latest hotness from the develop branch, then install it directly from the repository:
```
pip install git+https://github.com/phageaisa/phageai.git@develop
```
#### PhageAI execution (3/3)
* `PASTE_YOUR_ACCESS_TOKEN_HERE` - PhageAI web user's access token;
* `PASTE_YOUR_FASTA_PATH_HERE` - path to a FASTA file with a `.fasta` or `.fa` extension.
### Example - how to upload phage to repository with private access
```python
from phageai.platform import PhageAIAccounts
phageai_api = PhageAIAccounts(access_token='PASTE_YOUR_ACCESS_TOKEN_HERE')
phage_example_jobid = phageai_api.upload(fasta_path="PASTE_YOUR_FASTA_PATH_HERE", access="private")
```
The expected output is a dictionary with the job ID:
```python
{
  'job_id': '0a71e61a-ec58-447b-859e-d9ba15e103a9'
}
```
or, if you have reached the daily API request limit (100 by default), you can expect:
```json
{
    "author": ["Your daily API limit (100 requests) has been exceeded"]
}
```
If you reach your daily request limit and still need more, feel free to contact us at contact@phage.ai.
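Because `upload` can return either a job ID or an error payload, it can help to check the response before using it. The helper below is a minimal sketch, not part of the package: `extract_job_id` is a hypothetical name, and it assumes that any response without a `job_id` key (such as the daily-limit message shown above) should be treated as a failure.

```python
from typing import Any, Dict


def extract_job_id(response: Dict[str, Any]) -> str:
    """Return the job ID from an upload response, or raise if it is missing."""
    if "job_id" in response:
        return response["job_id"]
    # Assumption: a response without "job_id" is an error payload, for example
    # the {"author": [...]} message returned when the daily limit is exceeded.
    raise RuntimeError(f"Upload did not return a job ID: {response}")
```

With the upload example above, this would be called as `job_id = extract_job_id(phage_example_jobid)`.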
### Example - how to track the processing progress
Tracking the progress of phage processing is useful if you upload several samples at the same time or when you integrate your service or pipeline with PhageAI.
```python
(...)
phage_example_jobid = phageai_api.upload(fasta_path="PASTE_YOUR_FASTA_PATH_HERE", access="private")
job_id = phage_example_jobid["job_id"]
phageai_api.processing_status(job_id=job_id)
```
Depending on the current stage of your sample processing ("Not Started", "In progress", "Done", "Failed"), the expected output looks like:
```python
{
  'taxonomy_stage': 'Done',
  'proteins_stage': 'Not Started',
  'top10_stage': 'Not Started',
  'lifecycle_stage': 'Done',
  'final_report': 'In Progress'
}
```
Each of the above stages runs separately in PhageAI, so you can expect a different status for each of them.
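Since the stages finish independently, a pipeline usually needs to poll until all of them are done. The function below is a minimal sketch of such a loop, not part of the package: `wait_for_job`, `poll_interval` and `timeout` are hypothetical names, and it assumes that `processing_status` returns a flat dictionary of stage names to status strings as shown above. Statuses are compared case-insensitively, and only "Done" and "Failed" are treated as final.

```python
import time
from typing import Any, Dict


def wait_for_job(
    client: Any,
    job_id: str,
    poll_interval: float = 60.0,
    timeout: float = 3600.0,
) -> Dict[str, Any]:
    """Poll processing_status until every stage is done, one fails, or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = client.processing_status(job_id=job_id)
        # Normalise values so that "In progress" and "In Progress" are treated alike.
        stages = [str(value).lower() for value in status.values()]
        if "failed" in stages:
            raise RuntimeError(f"Processing failed for job {job_id}: {status}")
        if stages and all(stage == "done" for stage in stages):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} not finished after {timeout} s: {status}")
        time.sleep(poll_interval)
```

Each poll is presumably counted as an API request, so a long `poll_interval` helps to stay within the daily limit.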
### Example - get lifecycle classification
```python
(...)
phageai_api.get_lifecycle_classification(job_id=job_id)
```
The expected output looks like:
```python
{'value': 'Chronic', 'probability': 99.85}
```
In the same way you can execute other [available methods](https://github.com/phageaisa/phageai#available-methods) from the package. 
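As an illustration, the per-job result methods from the list of available methods can be called in one pass. This is a minimal sketch, not part of the package: `collect_results` and `RESULT_METHODS` are hypothetical names, and it assumes that every listed method accepts a `job_id` keyword argument in the same way as `get_lifecycle_classification` does above.

```python
from typing import Any, Dict

# Method names taken from the "Available methods" list.
RESULT_METHODS = (
    "get_lifecycle_classification",
    "get_taxonomy_classification",
    "get_proteins_classification",
    "get_top10_similarities",
)


def collect_results(client: Any, job_id: str) -> Dict[str, Any]:
    """Call each result method for one job and return the outputs keyed by method name."""
    return {name: getattr(client, name)(job_id=job_id) for name in RESULT_METHODS}
```

If a single combined response is enough, `get_full_report(job_id)` is described as returning all meta-data and predictions, which would take one request instead of four.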
## Benchmark
The PhageAI lifecycle classifier was benchmarked against the [DeePhage](https://github.com/shufangwu/DeePhage), [bacphlip](https://github.com/adamhockenberry/bacphlip), [VIBRANT](https://github.com/AnantharamanLab/VIBRANT) and [PHACTS](https://github.com/deprekate/PHACTS) tools using 104 Virulent and Temperate bacteriophages from our paper (testing set). Results:
Tool | Version   | Chronic support | Phage sequences used in research | Test set accuracy (%) | DOI |
--- |-----------| --- |----------------------------------|-----------------------| --- |
**PhageAI** | **0.4.1** | **Yes** | **15,235**                       | **93.27**             | **This research** |
DeePhage | 1.0       | No | 1,640                            | 84.62                 | 10.1093/gigascience/giab056 |
bacphlip | 0.9.6     | No | 1,057                            | 100                   | 10.7717/peerj.11396 |
VIBRANT | 1.2.1     | No | 350,626                          | 85.58                 | 10.1186/s40168-020-00867-0 |
PHACTS | 1.8       | No | 227                              | 75.00                 | 10.1093/bioinformatics/bts014 |
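To relate the percentages to the 104-genome test set, the accuracy values can be converted back to a number of correct predictions. The snippet below is a small worked calculation, assuming that accuracy here means the fraction of the 104 test genomes classified correctly; the names `TEST_SET_SIZE` and `correct_predictions` are illustrative only.

```python
TEST_SET_SIZE = 104  # Virulent and Temperate phages in the testing set

# Test set accuracy (%) copied from the table above.
accuracy_percent = {
    "PhageAI": 93.27,
    "DeePhage": 84.62,
    "bacphlip": 100.0,
    "VIBRANT": 85.58,
    "PHACTS": 75.00,
}


def correct_predictions(accuracy: float, total: int = TEST_SET_SIZE) -> int:
    """Convert an accuracy percentage into a whole number of correct predictions."""
    return round(accuracy / 100 * total)


counts = {tool: correct_predictions(acc) for tool, acc in accuracy_percent.items()}
print(counts)
```

Under that assumption, the PhageAI figure of 93.27% corresponds to 97 of 104 genomes.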
## Community and Contributions
We are happy to see you willing to make PhageAI better. Development on the latest stable version of Python 3+ is preferred. As of this writing it's 3.8. You can use any operating system.
If you're fixing a bug or adding a new feature, add a test with *[pytest](https://github.com/pytest-dev/pytest)* and check the code with *[Black](https://github.com/psf/black/)* and *[mypy](https://github.com/python/mypy)*. Before adding any large feature, first open an issue for us to discuss the idea with the core devs and community.
## Have a question?
If you have a private question or want to cooperate with us, you can always reach out to us directly by email.
## Found a bug?
Feel free to add a new issue with a descriptive title and description on [the PhageAI repository](https://github.com/phageaisa/phageai/issues). If you have already found a solution to your problem, we would be happy to review your pull request.
## Team
Core Developers and Domain Experts who contribute to PhageAI:
* Piotr Tynecki
* Łukasz Wałejko
* Krzysztof Owsieniuk
* Joanna Kazimierczak
* Arkadiusz Guziński
* Bogumił Zimoń
* Żaneta Szulc
* Maria Urbanowicz
## Change log
The change log will become rather long, so it has moved to its own file.
See [CHANGELOG.md](https://github.com/phageaisa/phageai/blob/master/CHANGELOG.md).
## License
The PhageAI package is released under the terms of [the MIT License](https://github.com/phageaisa/phageai/blob/master/LICENSE).
## Cite
> **PhageAI - Bacteriophage Life Cycle Recognition with Machine Learning and Natural Language Processing**
>
> Tynecki, P.; Guziński, A.; Kazimierczak, J.; Zimoń, B.; Szulc, Ż.; Jadczuk, M.; Dastych, J.; Onisko, A.
>
> Viruses, Special Issue "Bacteriophage Bioinformatics"
(ISSN 1999-4915), DOI: [10.1101/2020.07.11.198606](https://doi.org/10.1101/2020.07.11.198606)
            
         