af-analysis


* Name: af-analysis
* Version: 0.1.3
* Summary: `AF analysis` is a python library allowing analysis of Alphafold results.
* Upload time: 2025-01-13 10:53:38
* Requires Python: >=3.8
* License: GPL-2.0
* Keywords: alphafold2, colabfold, python, af_analysis
* Requirements: pdb_numpy (>=0.0.12), pandas (>=1.3.4), numpy (>=1.21), tqdm (>=4.0), seaborn (>=0.11), cmcrameri (>=1.7), nglview (>=3.0), ipywidgets (>=7.6), mdanalysis (>=2.4), scikit-learn

[![Documentation Status](https://readthedocs.org/projects/af-analysis/badge/?version=latest)](https://af-analysis.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis)
[![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main)
[![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/)
[![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)
[![status](https://joss.theoj.org/papers/0c359e32dc2f159688848361530239f5/status.svg)](https://joss.theoj.org/papers/0c359e32dc2f159688848361530239f5)
[![License: GPL v2](https://img.shields.io/badge/License-GPL%20v2-blue.svg)](https://www.gnu.org/licenses/old-licenses/gpl-2.0.html)
[![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/samuelmurail/af_analysis/blob/main/basic_example_colab.ipynb)

# About Alphafold Analysis

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="300" style="display: block; margin: auto;"/>

`af-analysis` is a Python package for the analysis of AlphaFold protein structure predictions.
It is designed to simplify and streamline work with protein structures
generated by:

* [AlphaFold 2][AF2]
* [AlphaFold 3][AF3]
* [ColabFold][ColabFold]
* [AlphaFold-Multimer][AF2-M]
* [AlphaPulldown][AlphaPulldown]
* [Boltz1][Boltz1]
* [Chai-1][Chai1]


Source code repository:
   [https://github.com/samuelmurail/af_analysis](https://github.com/samuelmurail/af_analysis)

## Statement of Need

AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy.
Analyzing the abundance of resulting structural models can be challenging and time-consuming.
Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity.
`af-analysis` addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.

## Main features

* Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
* Calculate and add additional structural quality metrics to the DataFrame, including:
  * pDockQ
  * pDockQ2
  * LIS score
* Visualize predicted protein models.
* Cluster generated models to identify diverse conformations.
* Select the best models based on defined criteria.
* Add your custom metrics to the DataFrame for further analysis.

## Installation

* `af-analysis` is available on PyPI and can be installed using ``pip``:

```bash
pip install af_analysis
```

* You can install the latest version from the GitHub repository:

```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```

* `af-analysis` can also be installed from a local clone of the GitHub repository:

```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```

## Documentation

The complete documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).

* A notebook showing the basic usage of the `af_analysis` library can be found [here](https://af-analysis.readthedocs.io/en/latest/notebooks/basic_example.html).

* Alternatively, you can test it directly on Google Colab:

    [![Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/samuelmurail/af_analysis/blob/main/basic_example_colab.ipynb)

## Usage

### Importing data

Create a `Data` object by passing the path of the directory that contains the results of the AlphaFold 2/ColabFold run.

```python
import af_analysis
my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```

Extracted data are available in the `df` attribute of the `Data` object. 

```python
my_data.df
```
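Since the feature list above states that predictions are imported as pandas DataFrames, the usual pandas idioms should apply to `df`. The following is a minimal sketch using a stand-in DataFrame rather than a real `Data` object; apart from `ranking_confidence`, which appears in the plotting examples of this README, the column names (`query`, `model`, `confident`) and the 0.5 threshold are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Stand-in for `my_data.df`. Only `ranking_confidence` is taken from this
# README; `query` and `model` are hypothetical column names.
df = pd.DataFrame(
    {
        "query": ["complex_A", "complex_A", "complex_B", "complex_B"],
        "model": [1, 2, 1, 2],
        "ranking_confidence": [0.62, 0.81, 0.47, 0.35],
    }
)

# Select the highest-confidence model for each query.
best_rows = df.loc[df.groupby("query")["ranking_confidence"].idxmax()]

# Add a custom metric as a new column (arbitrary 0.5 threshold).
df["confident"] = df["ranking_confidence"] >= 0.5

print(best_rows[["query", "model", "ranking_confidence"]])
print(df[["query", "model", "confident"]])
```

With a real `Data` object, the same operations would presumably be applied to `my_data.df` directly, using whatever columns it actually contains.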

### Analysis

* The `analysis` module contains several functions to add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:

```python
from af_analysis import analysis
analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
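For readers unfamiliar with pDockQ, the sketch below illustrates the sigmoid published by Bryant et al. (see References), which maps two interface features to a score. It is not the `af_analysis` implementation: the function name and arguments are hypothetical, the constants are quoted from the paper as I recall them and should be checked against it, and returning 0 when there are no interface contacts is an assumed convention. Extracting the interface features from a structure is left out.

```python
import math

# Sigmoid parameters as reported by Bryant et al. (2022); treat them as
# assumptions and verify them against the paper before relying on them.
L_MAX, X0, K, B = 0.724, 152.611, 0.052, 0.018


def pdockq_from_features(mean_interface_plddt, n_interface_contacts):
    """Hypothetical helper: pDockQ from precomputed interface features."""
    if n_interface_contacts < 1:
        # Assumed convention: no interface contacts means no docking score.
        return 0.0
    # x combines interface confidence with (log-scaled) interface size.
    x = mean_interface_plddt * math.log10(n_interface_contacts)
    return L_MAX / (1.0 + math.exp(-K * (x - X0))) + B


# Example call with made-up feature values.
print(round(pdockq_from_features(85.0, 120), 3))
```

In practice, `analysis.pdockq(my_data)` computes the metric for every model, so a helper like this is only useful for understanding how the score behaves.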

### Docking Analysis

* The `docking` module contains several functions to add metrics such as the [LIS score][LIS]:

```python
from af_analysis import docking
docking.LIS_pep(my_data)
```
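As background on what a LIS-style score measures, here is a simplified, pure-Python sketch based on my reading of Kim et al. (see References): inter-chain PAE values below a cutoff are rescaled to the range 0 to 1 and averaged. The function name, its arguments, the 12 Å cutoff and the single-direction treatment are assumptions made for illustration; this is not the code behind `docking.LIS_pep`.

```python
def local_interaction_score(pae, rows, cols, cutoff=12.0):
    """Hypothetical helper: mean rescaled PAE over one inter-chain block.

    pae    : square matrix (list of lists) of predicted aligned errors
    rows   : residue indices of the first chain
    cols   : residue indices of the second chain
    cutoff : PAE values at or above this threshold are ignored (assumed 12)
    """
    scores = []
    for i in rows:
        for j in cols:
            value = pae[i][j]
            if value < cutoff:
                # Rescale so that PAE 0 maps to 1 and the cutoff maps to 0.
                scores.append((cutoff - value) / cutoff)
    # Assumed convention: no confident pair gives a score of 0.
    return sum(scores) / len(scores) if scores else 0.0


# Toy 4-residue PAE matrix: residues 0-1 form chain A, residues 2-3 chain B.
toy_pae = [
    [0.0, 1.0, 6.0, 20.0],
    [1.0, 0.0, 12.0, 3.0],
    [7.0, 15.0, 0.0, 1.0],
    [25.0, 4.0, 1.0, 0.0],
]
score_ab = local_interaction_score(toy_pae, rows=[0, 1], cols=[2, 3])
print(score_ab)
```

Because the PAE matrix is not symmetric, the block with `rows` and `cols` swapped generally gives a different value; how the two directions are combined is a design choice this sketch leaves open.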

### Plots

* As a first step, you can visualize the pLDDT, the PAE matrix and the model scores. The ``show_info()`` function displays the model scores together with the pLDDT plot and the PAE matrix in an interactive way.

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>

* Plot the MSA, pLDDT and PAE:

```python
my_data.plot_msa()
my_data.plot_plddt([0,1])
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```

* Show the 3D structure (`nglview` package required):

```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```

## Dependencies

`af_analysis` requires the following dependencies:

* `pdb_numpy`
* `pandas`
* `numpy`
* `tqdm`
* `seaborn`
* `cmcrameri`
* `nglview`
* `ipywidgets`
* `mdanalysis`
* `scikit-learn`

## Contributing

`af-analysis` is an open-source project and contributions are welcome. If
you find a bug or have a feature request, please open an issue on the GitHub
repository at https://github.com/samuelmurail/af_analysis. If you would like
to contribute code, please fork the repository and submit a pull request.

## Authors

* Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
* [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/), [RPBS platform](https://bioserv.rpbs.univ-paris-diderot.fr/).

See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.

## License

This project is licensed under the GNU General Public License version 2 - see the `LICENSE` file for details.

# References

* Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
* Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
* Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
* Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
* Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
* Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
* Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
* Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]
* Wohlwend et al. bioRxiv (2024) doi: [10.1101/2024.11.19.624167][Boltz1]
* Chai Discovery et al. bioRxiv (2024) doi: [10.1101/2024.10.10.615955v2][Chai1]

[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970 "
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"
[Boltz1]: https://doi.org/10.1101/2024.11.19.624167 "Wohlwend et al. bioRxiv (2024) doi: 10.1101/2024.11.19.624167"
[Chai1]: https://doi.org/10.1101/2024.10.10.615955v2 "Chai Discovery et al. bioRxiv (2024) doi: 10.1101/2024.10.10.615955v2"

            
