[![Documentation Status](https://readthedocs.org/projects/af-analysis/badge/?version=latest)](https://af-analysis.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis)
[![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main)
[![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/)
[![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)
[![status](https://joss.theoj.org/papers/0c359e32dc2f159688848361530239f5/status.svg)](https://joss.theoj.org/papers/0c359e32dc2f159688848361530239f5)
[![License: GPL v2](https://img.shields.io/badge/License-GPL%20v2-blue.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.html)
[![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/samuelmurail/af_analysis/blob/main/basic_example_colab.ipynb)
# About AlphaFold Analysis
<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="300" style="display: block; margin: auto;"/>
`af-analysis` is a Python package for the analysis of AlphaFold protein structure predictions.
This package is designed to simplify and streamline the process of working with protein structures
generated by:
* [AlphaFold 2][AF2]
* [AlphaFold 3][AF3]
* [ColabFold][ColabFold]
* [AlphaFold-Multimer][AF2-M]
* [AlphaPulldown][AlphaPulldown]
* [Boltz1][Boltz1]
* [Chai-1][Chai1]
Source code repository:
[https://github.com/samuelmurail/af_analysis](https://github.com/samuelmurail/af_analysis)
## Statement of Need
AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy.
Analyzing the abundance of resulting structural models can be challenging and time-consuming.
Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity.
`af-analysis` addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.
## Main features
* Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
* Calculate and add additional structural quality metrics to the DataFrame, including:
    * pDockQ
    * pDockQ2
    * LIS score
* Visualize predicted protein models.
* Cluster generated models to identify diverse conformations.
* Select the best models based on defined criteria.
* Add your custom metrics to the DataFrame for further analysis.
## Installation
* `af-analysis` is available on PyPI and can be installed using `pip`:
```bash
pip install af_analysis
```
* You can install the latest version directly from the GitHub repository:
```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```
* `af-analysis` can also be installed from a local clone of the GitHub repository:
```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```
## Documentation
The complete documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).
* A notebook showing the basic usage of the `af_analysis` library can be found [here](https://af-analysis.readthedocs.io/en/latest/notebooks/basic_example.html).
* Alternatively, you can test it directly on Google Colab:
[![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/samuelmurail/af_analysis/blob/main/basic_example_colab.ipynb)
## Usage
### Importing data
Create the `Data` object by passing the path of the directory that contains the results of the AlphaFold 2/ColabFold run.
```python
import af_analysis
my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```
Extracted data are available in the `df` attribute of the `Data` object.
```python
my_data.df
```
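Because `df` is a regular pandas DataFrame, the usual pandas operations apply to it. The following is a minimal sketch of ranking models and adding a custom metric, using a small stand-in DataFrame rather than real prediction results. Only the `ranking_confidence` column name comes from this README; the `query` and `model` columns, the `top_models` helper, and the 0.75 threshold are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for `my_data.df`: the rows and most column names here are
# hypothetical; only `ranking_confidence` appears in this README.
df = pd.DataFrame(
    {
        "query": ["complex_A"] * 4,
        "model": [1, 2, 3, 4],
        "ranking_confidence": [0.62, 0.81, 0.47, 0.78],
    }
)


def top_models(frame: pd.DataFrame, column: str, n: int = 2) -> pd.DataFrame:
    """Return the n rows with the highest value in `column`."""
    return frame.sort_values(column, ascending=False).head(n)


# Add a custom metric as a plain new column (hypothetical threshold).
df["is_confident"] = df["ranking_confidence"] >= 0.75

best = top_models(df, "ranking_confidence")
print(best[["model", "ranking_confidence", "is_confident"]])
```

With a real `Data` object, the same calls would presumably be applied to `my_data.df` instead of the stand-in `df`.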
### Analysis
* The `analysis` package contains several functions to add metrics like [pDockQ][pdockq] and [pDockQ2][pdockq2]:
```python
from af_analysis import analysis
analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
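For intuition about what the pDockQ value represents, here is a minimal standalone sketch of the sigmoid published by Bryant et al. (2022), which maps the mean interface pLDDT and the number of interface contacts to a score. It is not taken from the `af_analysis` source: whether `analysis.pdockq` uses exactly these constants or the same contact definition is an assumption, and the function name `pdockq_from_interface` is hypothetical.

```python
import math

# Sigmoid parameters reported by Bryant et al. (2022); assumed here,
# not read from the af_analysis implementation.
L_MAX = 0.724
X0 = 152.611
K = 0.052
B = 0.018


def pdockq_from_interface(mean_interface_plddt: float, n_contacts: int) -> float:
    """Sketch of pDockQ from interface pLDDT and interface contact count."""
    if n_contacts < 1:
        # No interface contacts: no interface to score.
        return 0.0
    x = mean_interface_plddt * math.log10(n_contacts)
    return L_MAX / (1.0 + math.exp(-K * (x - X0))) + B


# Hypothetical interface: mean pLDDT of 80 over 100 contacts.
score = pdockq_from_interface(80.0, 100)
print(f"pDockQ = {score:.3f}")
```

The score grows with both the interface confidence and the interface size, and by construction stays between `B` and `L_MAX + B` whenever at least one contact exists.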
### Docking Analysis
* The `docking` package contains several functions to add metrics like the [LIS score][LIS]:
```python
from af_analysis import docking
docking.LIS_pep(my_data)
```
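As a rough illustration of the idea behind the LIS score, the sketch below follows the description in Kim et al. (2024): inter-chain PAE values below a 12 Å cutoff are rescaled to the range 0 to 1 and averaged. It is a simplified, single-block version written for this README, not the `af_analysis` implementation; the function name `local_interaction_score`, the example matrix, and the handling of an empty selection are assumptions, and `docking.LIS_pep` may differ in its details.

```python
from typing import List, Sequence


def local_interaction_score(
    pae_block: Sequence[Sequence[float]], cutoff: float = 12.0
) -> float:
    """Average rescaled PAE over the entries of an inter-chain block below `cutoff`."""
    rescaled: List[float] = [
        1.0 - pae / cutoff  # low PAE maps close to 1, PAE near cutoff maps close to 0
        for row in pae_block
        for pae in row
        if pae < cutoff
    ]
    if not rescaled:
        # No confident inter-chain pair: report zero (assumed convention).
        return 0.0
    return sum(rescaled) / len(rescaled)


# Hypothetical 2 x 2 inter-chain PAE block (in angstroms).
example_block = [[3.0, 15.0], [6.0, 12.0]]
lis = local_interaction_score(example_block)
print(f"LIS = {lis:.3f}")
```

Only the entries under the cutoff contribute, so a small but confidently predicted interface can still obtain a high score.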
### Plots
* As a first step, you can visualize the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores, together with the pLDDT plot and the PAE matrix, in an interactive way.
<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>
* Plot the MSA, pLDDT and PAE:
```python
my_data.plot_msa()
my_data.plot_plddt([0,1])
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```
* Show the 3D structure (`nglview` package required):
```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```
## Dependencies
`af_analysis` requires the following dependencies:
* `pdb_numpy`
* `pandas`
* `numpy`
* `tqdm`
* `seaborn`
* `cmcrameri`
* `nglview`
* `ipywidgets`
* `mdanalysis`
## Contributing
`af-analysis` is an open-source project and contributions are welcome. If
you find a bug or have a feature request, please open an issue on the GitHub
repository at https://github.com/samuelmurail/af_analysis. If you would like
to contribute code, please fork the repository and submit a pull request.
## Authors
* Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
* [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/), [RPBS platform](https://bioserv.rpbs.univ-paris-diderot.fr/).
See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.
## License
This project is licensed under the GNU General Public License version 2 - see the `LICENSE` file for details.
# References
* Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
* Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
* Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
* Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
* Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
* Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
* Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
* Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]
* Wohlwend et al. bioRxiv (2024) doi: [10.1101/2024.11.19.624167][Boltz1]
* Chai Discovery et al. bioRxiv (2024) doi: [10.1101/2024.10.10.615955v2][Chai1]
[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"
[Boltz1]: https://doi.org/10.1101/2024.11.19.624167 "Wohlwend et al. bioRxiv (2024) doi: 10.1101/2024.11.19.624167"
[Chai1]: https://doi.org/10.1101/2024.10.10.615955v2 "Chai Discovery et al. bioRxiv (2024) doi: 10.1101/2024.10.10.615955v2"