af2-analysis


- Name: af2-analysis
- Version: 0.1.0
- Summary: `AF2 analysis` is a python library allowing analysis of Alphafold results.
- Upload time: 2024-10-20 20:56:00
- Requires Python: >=3.8
- License: GNUv2.0
- Keywords: alphafold2, colabfold, python, af2_analysis

[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/samuelmurail/af2_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af2_analysis)
[![Build Status](https://dev.azure.com/samuelmurailRPBS/af2_analysis/_apis/build/status%2Fsamuelmurail.af2_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af2_analysis/_build/latest?definitionId=2&branchName=main)
[![PyPI - Version](https://img.shields.io/pypi/v/af2-analysis)](https://pypi.org/project/af2-analysis/)
[![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)

# About AlphaFold2 Analysis

<img src="https://raw.githubusercontent.com/samuelmurail/af2_analysis/master/docs/source/logo.jpeg" alt="AF2 Analysis Logo" width="200" style="display: block; margin: auto;"/>


`af2-analysis` is a Python package for the analysis of AlphaFold protein structure predictions.
It is designed to simplify and streamline work with protein structures
generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3] and their derivatives, such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M]
and [AlphaPulldown][AlphaPulldown].

* Source code repository:
   [https://github.com/samuelmurail/af2_analysis](https://github.com/samuelmurail/af2_analysis)

## Statement of Need

AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy.
Analyzing the abundance of resulting structural models can be challenging and time-consuming.
Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity.
`af2-analysis` addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.

## Main features:

* Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
* Calculate and add additional structural quality metrics to the DataFrame, including:
    * pDockQ
    * pDockQ2
    * LIS score
* Visualize predicted protein models.
* Cluster generated models to identify diverse conformations.
* Select the best models based on defined criteria.
* Add your custom metrics to the DataFrame for further analysis.

## Installation

- `af2-analysis` is available on PyPI and can be installed using `pip`:

```bash
pip install af2_analysis
```

- You can install the latest development version directly from the GitHub repository:

```bash
pip install git+https://github.com/samuelmurail/af2_analysis.git@main
```

- `af2-analysis` can also be installed from a local clone of the GitHub repository:

```bash
git clone https://github.com/samuelmurail/af2_analysis
cd af2_analysis
pip install .
```
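To confirm which version ended up in the active environment, the standard-library `importlib.metadata` module can be queried. This is a small sketch and not part of `af2-analysis`; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "af2-analysis" is the distribution name used on PyPI.
    print(installed_version("af2-analysis"))
```

A return value of `None` means the package is not visible from the interpreter that ran the check, which usually points to a different virtual environment than the one `pip` installed into.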

## Documentation

The full documentation is available at [ReadTheDocs](https://af2-analysis.readthedocs.io/en/latest/).


## Usage


### Importing data

Create a `Data` object by passing the path of the directory that contains the results of the AlphaFold 2 or ColabFold run.

```python
import af2_analysis
my_data = af2_analysis.Data('MY_AF2_RESULTS_DIR')
```

The extracted data are available as a pandas DataFrame in the `df` attribute of the `Data` object.

```python
my_data.df
```
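For orientation, the per-model score files written by ColabFold are JSON documents. The sketch below assumes the usual ColabFold keys (`plddt`, `pae`, `ptm`, and `iptm` for multimers) and summarizes one such file using only the standard library. It illustrates the kind of values that end up in the table and is not the parser that `Data` uses; the function name `summarize_scores` and the toy values are hypothetical.

```python
import json
from statistics import mean


def summarize_scores(text: str) -> dict:
    """Reduce one score file (given as JSON text) to a few per-model numbers."""
    scores = json.loads(text)
    summary = {
        "mean_plddt": mean(scores["plddt"]),
        "ptm": scores.get("ptm"),
        "iptm": scores.get("iptm"),  # None when the key is absent (monomers)
    }
    if "pae" in scores:
        # Largest predicted aligned error over all residue pairs.
        summary["max_pae"] = max(max(row) for row in scores["pae"])
    return summary


# Toy content standing in for a real score file (assumed key names).
example = json.dumps({
    "plddt": [90.0, 80.0, 70.0],
    "pae": [[0.5, 4.0, 9.0], [4.0, 0.5, 6.0], [9.0, 6.0, 0.5]],
    "ptm": 0.81,
})
print(summarize_scores(example))
```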

### Analysis

- The `analysis` module contains several functions that add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:

```python
from af2_analysis import analysis
analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
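To give an idea of what such a metric computes, here is a minimal sketch of the pDockQ sigmoid described by [Bryant et al.][pdockq]. It assumes the mean interface pLDDT and the number of interface contacts have already been extracted from the model. The constants are quoted from memory of the paper and should be checked against it before use, and `pdockq_sketch` is a hypothetical name, not the function exposed by `af2_analysis`.

```python
import math


def pdockq_sketch(mean_interface_plddt: float, n_interface_contacts: int) -> float:
    """Sigmoid mapping of interface pLDDT and contact count to a DockQ-like score."""
    # Fit parameters as recalled from Bryant et al. (2022); verify before relying on them.
    L, X0, K, B = 0.724, 152.611, 0.052, 0.018
    if n_interface_contacts <= 0:
        return 0.0  # convention chosen for this sketch: no interface, no score
    x = mean_interface_plddt * math.log10(n_interface_contacts)
    return L / (1.0 + math.exp(-K * (x - X0))) + B


# Toy inputs: mean interface pLDDT of 80 with 100 interface contacts.
print(round(pdockq_sketch(80.0, 100), 3))
```

With these parameters the score stays between `B` and `L + B`, and it grows with both the interface confidence and the size of the interface.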

### Docking Analysis

- The `docking` module contains several functions that add metrics such as the [LIS score][LIS]:

```python
from af2_analysis import docking
docking.LIS_pep(my_data)
```
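The idea behind the LIS score can be sketched in a few lines. The example assumes the definition from [Kim et al.][LIS] as commonly summarized: inter-chain PAE values below a 12 Å cutoff are rescaled to the range 0 to 1 and averaged. It is a simplified illustration on nested lists, not the implementation behind `docking.LIS_pep`; the name `lis_sketch`, the cutoff default and the toy matrix are assumptions.

```python
from statistics import mean
from typing import List, Sequence


def lis_sketch(pae: Sequence[Sequence[float]], rows: range, cols: range,
               cutoff: float = 12.0) -> float:
    """Mean of rescaled PAE values below `cutoff` in one inter-chain block."""
    scores: List[float] = [
        (cutoff - pae[i][j]) / cutoff  # 0 Å maps to 1, the cutoff maps to 0
        for i in rows
        for j in cols
        if pae[i][j] < cutoff
    ]
    return mean(scores) if scores else 0.0


# Toy 4x4 PAE matrix: residues 0-1 form chain A, residues 2-3 form chain B.
pae = [
    [0.5, 1.0, 3.0, 15.0],
    [1.0, 0.5, 6.0, 20.0],
    [4.0, 18.0, 0.5, 1.0],
    [25.0, 9.0, 1.0, 0.5],
]
# Score of the A -> B block (rows of chain A, columns of chain B).
print(lis_sketch(pae, range(0, 2), range(2, 4)))
```

Because the PAE matrix is not symmetric, the A to B and B to A blocks generally give different values, so both directions are worth inspecting.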

### Plots


- As a first step, you can inspect the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores together with the pLDDT plot and the PAE matrix in an interactive view.

<img src="https://raw.githubusercontent.com/samuelmurail/af2_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>


- Plot the MSA, the pLDDT and the PAE matrix:

```python
my_data.plot_msa()
my_data.plot_plddt([0,1])
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)
```

- Show the 3D structure (requires the `nglview` package):

```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```

## Dependencies

`af2_analysis` requires the following dependencies:

- `pdb_numpy`
- `pandas`
- `numpy`
- `tqdm`
- `seaborn`
- `cmcrameri`
- `nglview`
- `ipywidgets`
- `mdanalysis`


## Contributing

`af2-analysis` is an open-source project and contributions are welcome. If
you find a bug or have a feature request, please open an issue on the GitHub
repository at https://github.com/samuelmurail/af2_analysis. If you would like
to contribute code, please fork the repository and submit a pull request.


## Authors

* Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
* [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/).

See also the list of [contributors](https://github.com/samuelmurail/af2_analysis/contributors) who participated in this project.

## License

This project is licensed under the GNU General Public License v2.0 - see the `LICENSE` file for details.


# References

- Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
- Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
- Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
- Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
- Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
- Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
- Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
- Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]


[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"


            
