[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis)
[![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main)
[![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/)
[![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)
# About AlphaFold Analysis
<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="200" style="display: block; margin: auto;"/>
`af-analysis` is a Python package for the analysis of AlphaFold protein structure predictions.
This package is designed to simplify and streamline the process of working with protein structures
generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3] and their derivatives such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M]
and [AlphaPulldown][AlphaPulldown].
* Source code repository:
[https://github.com/samuelmurail/af_analysis](https://github.com/samuelmurail/af_analysis)
## Statement of Need
AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy.
Analyzing the abundance of resulting structural models can be challenging and time-consuming.
Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity.
`af-analysis` addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.
## Main features:
* Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
* Calculate and add additional structural quality metrics to the DataFrame, including:
    * pDockQ
    * pDockQ2
    * LIS score
* Visualize predicted protein models.
* Cluster generated models to identify diverse conformations.
* Select the best models based on defined criteria.
* Add your custom metrics to the DataFrame for further analysis.
## Installation
- `af-analysis` is available on PyPI and can be installed using `pip`:
```bash
pip install af_analysis
```
- You can install the latest version from the GitHub repository:
```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```
- `af-analysis` can also be installed from a local clone of the GitHub repository:
```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```
## Documentation
The full documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).
## Usage
### Importing data
Create a `Data` object by giving the path of the directory that contains the results of the AlphaFold 2/ColabFold run.
```python
import af_analysis
my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```
Extracted data are available in the `df` attribute of the `Data` object.
```python
my_data.df
```
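
Because `df` is described as a pandas DataFrame, standard pandas operations can be used to rank and filter models. The sketch below uses a small stand-in DataFrame rather than real `af-analysis` output: only the `ranking_confidence` column name comes from this README, while the `query` and `model` columns and all values are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for `my_data.df`. Only `ranking_confidence` is a column name taken
# from the README; `query`, `model` and every value here are hypothetical.
df = pd.DataFrame(
    {
        "query": ["complex_A"] * 3 + ["complex_B"] * 2,
        "model": [1, 2, 3, 1, 2],
        "ranking_confidence": [0.71, 0.84, 0.65, 0.52, 0.58],
    }
)

# Index label of the single best-scoring row.
best_index = df["ranking_confidence"].idxmax()

# Best-scoring row for each query.
best_per_query = df.loc[df.groupby("query")["ranking_confidence"].idxmax()]

# Rows above an arbitrary confidence threshold, best first.
confident = df[df["ranking_confidence"] > 0.6].sort_values(
    "ranking_confidence", ascending=False
)

print(best_index)
print(best_per_query)
print(confident)
```

The same pattern should apply to any other numeric column present in the real DataFrame, provided the column name is adapted.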
### Analysis
- The `analysis` module contains several functions to add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:
```python
from af_analysis import analysis
analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
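
For orientation, pDockQ as published by Bryant et al. is a sigmoid of the mean interface pLDDT multiplied by the base-10 logarithm of the number of interface contacts. The sketch below illustrates that published formula only; it is not the `af-analysis` implementation, and the function name `pdockq_from_interface` and its arguments are hypothetical. How interface residues and contacts are detected is left out.

```python
import math

# Sigmoid parameters as reported for pDockQ by Bryant et al. (2022).
L = 0.724
X0 = 152.611
K = 0.052
B = 0.018


def pdockq_from_interface(mean_interface_plddt: float, n_interface_contacts: int) -> float:
    """Illustrative pDockQ from pre-computed interface statistics (hypothetical helper)."""
    # Without any interface contact the score is set to zero.
    if n_interface_contacts <= 0:
        return 0.0
    x = mean_interface_plddt * math.log10(n_interface_contacts)
    return L / (1.0 + math.exp(-K * (x - X0))) + B


# Example with made-up interface statistics.
score = pdockq_from_interface(mean_interface_plddt=80.0, n_interface_contacts=100)
print(round(score, 3))
```

The score is bounded between `B` and `L + B`, so values from this formula never exceed roughly 0.74.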
### Docking Analysis
- The `docking` module contains several functions to add metrics such as the [LIS score][LIS]:
```python
from af_analysis import docking
docking.LIS_pep(my_data)
```
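
As background, the LIS score of Kim et al. is derived from the inter-chain part of the PAE matrix: PAE values within a cutoff of 12 Å are rescaled to the range 0 to 1 and averaged. The sketch below illustrates that idea on a plain nested list; it is not the code behind `docking.LIS_pep`, the function name `local_interaction_score` is hypothetical, and treating the cutoff as inclusive is an assumption.

```python
from statistics import mean


def local_interaction_score(inter_chain_pae, cutoff=12.0):
    """Average rescaled PAE over inter-chain residue pairs within the cutoff.

    `inter_chain_pae` is a list of rows holding PAE values (in Angstrom)
    between residues of one chain and residues of another chain.
    """
    # Keep only the residue pairs whose PAE is within the cutoff (assumed inclusive).
    kept = [value for row in inter_chain_pae for value in row if value <= cutoff]
    if not kept:
        return 0.0
    # Rescale so that PAE = 0 maps to 1 and PAE = cutoff maps to 0.
    return mean(1.0 - value / cutoff for value in kept)


# Made-up 2 x 2 inter-chain PAE block: two confident pairs, two uncertain ones.
example_block = [[3.0, 6.0], [20.0, 30.0]]
print(local_interaction_score(example_block))
```

Pairs above the cutoff are ignored rather than counted as zero, so the score reflects how confident the confident contacts are, not how many there are.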
### Plots
- As a first step, you can visualize the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores, together with the pLDDT plot and the PAE matrix, in an interactive way.
<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>
- Plot the MSA, pLDDT and PAE:
```python
my_data.plot_msa()
my_data.plot_plddt([0,1])
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```
- Show the 3D structure (`nglview` package required):
```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```
## Dependencies
`af_analysis` requires the following dependencies:
- `pdb_numpy`
- `pandas`
- `numpy`
- `tqdm`
- `seaborn`
- `cmcrameri`
- `nglview`
- `ipywidgets`
- `mdanalysis`
- `scikit-learn`
## Contributing
`af-analysis` is an open-source project and contributions are welcome. If
you find a bug or have a feature request, please open an issue on the GitHub
repository at https://github.com/samuelmurail/af_analysis. If you would like
to contribute code, please fork the repository and submit a pull request.
## Authors
* Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
* [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/).
See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.
## License
This project is licensed under the GNU General Public License version 2 - see the `LICENSE` file for details.
# References
- Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
- Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
- Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
- Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
- Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
- Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
- Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
- Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]
[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970 "
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"
Raw data
{
"_id": null,
"home_page": null,
"name": "af-analysis",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "AlphaFold2, ColabFold, Python, af_analysis",
"author": null,
"author_email": "Samuel Murail <samuel.murail@u-paris.fr>",
"download_url": "https://files.pythonhosted.org/packages/a8/45/2c5a70803ce276330b1c4fdb4e99b58429cb95c95f4d29b109dea8b8f4a7/af_analysis-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-2.0",
"summary": "`AF analysis` is a python library allowing analysis of Alphafold results.",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/samuelmurail/af_analysis"
},
"split_keywords": [
"alphafold2",
" colabfold",
" python",
" af_analysis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "27ef07df890fcba35fee4ba7133349aeb0f55f411057944c4d271b2422cba793",
"md5": "a3a677e02f1be7ac9698597f006ff7e2",
"sha256": "d220e18408ff2e1e23f010b698745da4074d4de701efb5b7992b89e612fbec1a"
},
"downloads": -1,
"filename": "af_analysis-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a3a677e02f1be7ac9698597f006ff7e2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 44323,
"upload_time": "2024-11-08T14:33:07",
"upload_time_iso_8601": "2024-11-08T14:33:07.452325Z",
"url": "https://files.pythonhosted.org/packages/27/ef/07df890fcba35fee4ba7133349aeb0f55f411057944c4d271b2422cba793/af_analysis-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "a8452c5a70803ce276330b1c4fdb4e99b58429cb95c95f4d29b109dea8b8f4a7",
"md5": "640ece0cd56697a0a9b877e20782774a",
"sha256": "920ae408fa7e839edb3e7be0d32cf35cce3f8ede459122804a2b9d2dbfe3449c"
},
"downloads": -1,
"filename": "af_analysis-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "640ece0cd56697a0a9b877e20782774a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 38379,
"upload_time": "2024-11-08T14:33:09",
"upload_time_iso_8601": "2024-11-08T14:33:09.645913Z",
"url": "https://files.pythonhosted.org/packages/a8/45/2c5a70803ce276330b1c4fdb4e99b58429cb95c95f4d29b109dea8b8f4a7/af_analysis-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-08 14:33:09",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "samuelmurail",
"github_project": "af_analysis",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pdb_numpy",
"specs": [
[
">=",
"0.0.11"
]
]
},
{
"name": "pandas",
"specs": [
[
">=",
"1.3.4"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.21"
]
]
},
{
"name": "tqdm",
"specs": [
[
">=",
"4.0"
]
]
},
{
"name": "seaborn",
"specs": [
[
">=",
"0.11"
]
]
},
{
"name": "cmcrameri",
"specs": [
[
">=",
"1.7"
]
]
},
{
"name": "nglview",
"specs": [
[
">=",
"3.0"
]
]
},
{
"name": "ipywidgets",
"specs": [
[
">=",
"7.6"
]
]
},
{
"name": "mdanalysis",
"specs": [
[
">=",
"2.4"
]
]
},
{
"name": "scikit-learn",
"specs": []
}
],
"lcname": "af-analysis"
}