af-analysis


* Name: af-analysis
* Version: 0.1.1
* Summary: `AF analysis` is a python library allowing analysis of Alphafold results.
* Upload time: 2024-11-08 14:33:09
* Requires Python: >=3.8
* License: GPL-2.0
* Keywords: alphafold2, colabfold, python, af_analysis
* Requirements: pdb_numpy, pandas, numpy, tqdm, seaborn, cmcrameri, nglview, ipywidgets, mdanalysis, scikit-learn

[![Documentation Status](https://readthedocs.org/projects/af2-analysis/badge/?version=latest)](https://af2-analysis.readthedocs.io/en/latest/?badge=latest)
[![codecov](https://codecov.io/gh/samuelmurail/af_analysis/graph/badge.svg?token=WOJYQKKOP7)](https://codecov.io/gh/samuelmurail/af_analysis)
[![Build Status](https://dev.azure.com/samuelmurailRPBS/af_analysis/_apis/build/status%2Fsamuelmurail.af_analysis?branchName=main)](https://dev.azure.com/samuelmurailRPBS/af_analysis/_build/latest?definitionId=2&branchName=main)
[![PyPI - Version](https://img.shields.io/pypi/v/af-analysis)](https://pypi.org/project/af-analysis/)
[![Downloads](https://static.pepy.tech/badge/af2-analysis)](https://pepy.tech/project/af2-analysis)

# About AlphaFold Analysis

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/logo.jpeg" alt="AF Analysis Logo" width="200" style="display: block; margin: auto;"/>


`af-analysis` is a Python package for the analysis of AlphaFold protein structure predictions.
It is designed to simplify and streamline work with protein structures
generated by [AlphaFold 2][AF2], [AlphaFold 3][AF3] and derivatives such as [ColabFold][ColabFold], [AlphaFold-Multimer][AF2-M]
and [AlphaPulldown][AlphaPulldown].

* Source code repository:
   [https://github.com/samuelmurail/af_analysis](https://github.com/samuelmurail/af_analysis)

## Statement of Need

AlphaFold 2 and its derivatives have revolutionized protein structure prediction, achieving remarkable accuracy.
However, analyzing the large number of resulting structural models can be challenging and time-consuming.
Existing tools often require separate scripts for calculating various quality metrics (pDockQ, pDockQ2, LIS score) and assessing model diversity.
`af-analysis` addresses these challenges by providing a unified and user-friendly framework for in-depth analysis of AlphaFold 2 results.

## Main features:

* Import AlphaFold or ColabFold prediction directories as pandas DataFrames for efficient data handling.
* Calculate and add additional structural quality metrics to the DataFrame, including:
    * pDockQ
    * pDockQ2
    * LIS score
* Visualize predicted protein models.
* Cluster generated models to identify diverse conformations.
* Select the best models based on defined criteria.
* Add your custom metrics to the DataFrame for further analysis.
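
Because the results are held in a pandas DataFrame, the last two features reduce to ordinary pandas operations. The sketch below uses a small stand-in DataFrame rather than a real `Data` object. Only `ranking_confidence` appears elsewhere in this README; the `pLDDT` and `ipTM` column names, the values, the `custom_score` metric and the thresholds are assumptions chosen for illustration.

```python
import pandas as pd

# Stand-in for `my_data.df`; all values are invented for illustration.
df = pd.DataFrame(
    {
        "ranking_confidence": [0.82, 0.41, 0.77, 0.30],
        "pLDDT": [88.0, 62.5, 85.1, 55.0],  # assumed column name
        "ipTM": [0.80, 0.35, 0.74, 0.22],  # assumed column name
    }
)

# Custom metric: any per-model value can be stored as a new column.
df["custom_score"] = 0.5 * df["ipTM"] + 0.5 * df["pLDDT"] / 100.0

# Selection: keep models passing both (arbitrary) thresholds, best first.
selected = df[(df["ranking_confidence"] > 0.7) & (df["pLDDT"] > 80.0)]
selected = selected.sort_values("custom_score", ascending=False)
print(selected)
```

The same pattern should apply to `my_data.df` once the column names have been checked against the actual DataFrame.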

## Installation

- `af-analysis` is available on PyPI and can be installed using `pip`:

```bash
pip install af_analysis
```

- You can install the latest version from the GitHub repository:

```bash
pip install git+https://github.com/samuelmurail/af_analysis.git@main
```

- `af-analysis` can also be installed from a local clone of the GitHub repository:

```bash
git clone https://github.com/samuelmurail/af_analysis
cd af_analysis
pip install .
```
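
To confirm that the installation succeeded, the installed version can be queried through the standard library. This is a minimal sketch: the helper name `installed_version` is hypothetical, and `af-analysis` is the distribution name as listed on PyPI.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "af-analysis") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(f"af-analysis: {version or 'not installed'}")
```

`importlib.metadata` is available from Python 3.8, which matches the minimum Python version required by the package.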

## Documentation

The full documentation is available at [ReadTheDocs](https://af-analysis.readthedocs.io/en/latest/).


## Usage


### Importing data

Create the `Data` object by passing the path of the directory that contains the results of the AlphaFold 2 or ColabFold run.

```python
import af_analysis
my_data = af_analysis.Data('MY_AF_RESULTS_DIR')
```

The extracted data are available as a pandas DataFrame in the `df` attribute of the `Data` object.

```python
my_data.df
```
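
Since `df` is a regular pandas DataFrame, it can be grouped and ranked with the usual pandas tools. The sketch below picks the top-ranked model for each prediction job from a stand-in DataFrame; the `query` column name and all values are assumptions for illustration, while `ranking_confidence` is the column used in the plotting examples of this README.

```python
import pandas as pd

# Stand-in for `my_data.df`; all values are invented for illustration.
df = pd.DataFrame(
    {
        "query": ["complex_A", "complex_A", "complex_B", "complex_B"],  # assumed column
        "ranking_confidence": [0.61, 0.83, 0.47, 0.52],
    }
)

# Row label of the highest-confidence model within each query.
best_idx = df.groupby("query")["ranking_confidence"].idxmax()

# One row per query, holding its best model.
best_models = df.loc[best_idx].reset_index(drop=True)
print(best_models)
```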

### Analysis

- The `analysis` module contains several functions that add metrics such as [pDockQ][pdockq] and [pDockQ2][pdockq2]:

```python
from af_analysis import analysis
analysis.pdockq(my_data)
analysis.pdockq2(my_data)
```
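
For orientation, pDockQ as described by Bryant et al. is a sigmoid of the mean interface pLDDT multiplied by the base-10 logarithm of the number of interface contacts. The sketch below evaluates that published formula for values you already have. It is not the code behind `analysis.pdockq`, which works on a `Data` object and may differ in detail. The function name `pdockq_from_interface` and the value returned for an empty interface are assumptions of this sketch.

```python
import math

# Sigmoid parameters as reported by Bryant et al. (2022).
L, X0, K, B = 0.724, 152.611, 0.052, 0.018


def pdockq_from_interface(mean_interface_plddt: float, n_contacts: int) -> float:
    """pDockQ from the mean interface pLDDT and the number of interface contacts."""
    if n_contacts <= 0:
        # Assumption of this sketch: no contacts means no predicted interface.
        return 0.0
    x = mean_interface_plddt * math.log10(n_contacts)
    return L / (1.0 + math.exp(-K * (x - X0))) + B


# Example: a confident interface (pLDDT 80) with 100 contacts.
print(round(pdockq_from_interface(80.0, 100), 3))
```

Because of the sigmoid form, the score rises with both interface confidence and interface size and saturates at `L + B`.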

### Docking Analysis

- The `docking` module contains several functions that add metrics such as the [LIS score][LIS]:

```python
from af_analysis import docking
docking.LIS_pep(my_data)
```
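
As background, the Local Interaction Score of Kim et al. is derived from the inter-chain part of the PAE matrix: PAE values below a cutoff (12 Å in the paper) are rescaled to the range 0 to 1 and averaged. The sketch below illustrates that idea on a toy matrix for a single inter-chain block. It is not the implementation of `docking.LIS_pep`; the function name `local_interaction_score`, its arguments and the toy values are assumptions of this sketch.

```python
import numpy as np


def local_interaction_score(pae, rows, cols, cutoff=12.0):
    """Mean of (cutoff - PAE) / cutoff over block entries with PAE < cutoff."""
    block = np.asarray(pae, dtype=float)[np.ix_(rows, cols)]
    confident = block[block < cutoff]
    if confident.size == 0:
        # No residue pair below the cutoff: no confident local interaction.
        return 0.0
    return float(np.mean((cutoff - confident) / cutoff))


# Toy 4x4 PAE matrix: residues 0-1 form chain A, residues 2-3 form chain B.
pae = [
    [0.5, 1.0, 6.0, 20.0],
    [1.0, 0.5, 3.0, 15.0],
    [9.0, 25.0, 0.5, 1.0],
    [30.0, 18.0, 1.0, 0.5],
]

# Score for the block with chain A as rows and chain B as columns.
lis_ab = local_interaction_score(pae, rows=[0, 1], cols=[2, 3])
print(lis_ab)
```

Restricting the average to low-PAE pairs is what lets the score pick up small, confident interfaces that a whole-interface average would dilute.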

### Plots


- As a first step, you can visualize the pLDDT, the PAE matrix and the model scores. The `show_info()` function displays the model scores together with the pLDDT plot and the PAE matrix interactively.

<img src="https://raw.githubusercontent.com/samuelmurail/af_analysis/master/docs/source/_static/show_info.gif" alt="Interactive Visualization" width="100%" style="display: block; margin: auto;"/>


- Plot the MSA, pLDDT and PAE:

```python
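# Plot the multiple sequence alignment (MSA).
my_data.plot_msa()
# Plot the pLDDT for entries 0 and 1.
my_data.plot_plddt([0,1])
# Plot the PAE matrix of the entry with the highest ranking confidence.
best_model_index = my_data.df['ranking_confidence'].idxmax()
my_data.plot_pae(best_model_index)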
```

- Show the 3D structure (requires the `nglview` package):

```python
my_data.show_3d(my_data.df['ranking_confidence'].idxmax())
```

## Dependencies

`af_analysis` requires the following dependencies:

- `pdb_numpy`
- `pandas`
- `numpy`
- `tqdm`
- `seaborn`
- `cmcrameri`
- `nglview`
- `ipywidgets`
- `mdanalysis`
- `scikit-learn`


## Contributing

`af-analysis` is an open-source project and contributions are welcome. If
you find a bug or have a feature request, please open an issue on the GitHub
repository at https://github.com/samuelmurail/af_analysis. If you would like
to contribute code, please fork the repository and submit a pull request.


## Authors

* Alaa Regei, Graduate Student - [Université Paris Cité](https://u-paris.fr).
* [Samuel Murail](https://samuelmurail.github.io/PersonalPage/), Associate Professor - [Université Paris Cité](https://u-paris.fr), [CMPLI](http://bfa.univ-paris-diderot.fr/equipe-8/).

See also the list of [contributors](https://github.com/samuelmurail/af_analysis/contributors) who participated in this project.

## License

This project is licensed under the GNU General Public License version 2 - see the `LICENSE` file for details.


# References

- Jumper et al. Nature (2021) doi: [10.1038/s41586-021-03819-2][AF2]
- Abramson et al. Nature (2024) doi: [10.1038/s41586-024-07487-w][AF3]
- Mirdita et al. Nature Methods (2022) doi: [10.1038/s41592-022-01488-1][ColabFold]
- Evans et al. bioRxiv (2021) doi: [10.1101/2021.10.04.463034][AF2-M]
- Bryant et al. Nat. Commun. (2022) doi: [10.1038/s41467-022-28865-w][pdockq]
- Zhu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btad424][pdockq2]
- Kim et al. bioRxiv (2024) doi: [10.1101/2024.02.19.580970][LIS]
- Yu et al. Bioinformatics (2023) doi: [10.1093/bioinformatics/btac749][AlphaPulldown]


[AF2]: https://www.nature.com/articles/s41586-021-03819-2 "Jumper et al. Nature (2021) doi: 10.1038/s41586-021-03819-2"
[AF3]: https://www.nature.com/articles/s41586-024-07487-w "Abramson et al. Nature (2024) doi: 10.1038/s41586-024-07487-w"
[ColabFold]: https://www.nature.com/articles/s41592-022-01488-1 "Mirdita et al. Nat Methods (2022) doi: 10.1038/s41592-022-01488-1"
[AF2-M]: https://www.biorxiv.org/content/10.1101/2021.10.04.463034v2 "Evans et al. bioRxiv (2021) doi: 10.1101/2021.10.04.463034"
[pdockq]: https://www.nature.com/articles/s41467-022-28865-w "Bryant et al. Nat Commun (2022) doi: 10.1038/s41467-022-28865-w"
[pdockq2]: https://academic.oup.com/bioinformatics/article/39/7/btad424/7219714 "Zhu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btad424"
[LIS]: https://www.biorxiv.org/content/10.1101/2024.02.19.580970v1 "Kim et al. bioRxiv (2024) doi: 10.1101/2024.02.19.580970"
[AlphaPulldown]: https://doi.org/10.1093/bioinformatics/btac749 "Yu et al. Bioinformatics (2023) doi: 10.1093/bioinformatics/btac749"


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "af-analysis",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "AlphaFold2, ColabFold, Python, af_analysis",
    "author": null,
    "author_email": "Samuel Murail <samuel.murail@u-paris.fr>",
    "download_url": "https://files.pythonhosted.org/packages/a8/45/2c5a70803ce276330b1c4fdb4e99b58429cb95c95f4d29b109dea8b8f4a7/af_analysis-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL-2.0",
    "summary": "`AF analysis` is a python library allowing analysis of Alphafold results.",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/samuelmurail/af_analysis"
    },
    "split_keywords": [
        "alphafold2",
        " colabfold",
        " python",
        " af_analysis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "27ef07df890fcba35fee4ba7133349aeb0f55f411057944c4d271b2422cba793",
                "md5": "a3a677e02f1be7ac9698597f006ff7e2",
                "sha256": "d220e18408ff2e1e23f010b698745da4074d4de701efb5b7992b89e612fbec1a"
            },
            "downloads": -1,
            "filename": "af_analysis-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a3a677e02f1be7ac9698597f006ff7e2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 44323,
            "upload_time": "2024-11-08T14:33:07",
            "upload_time_iso_8601": "2024-11-08T14:33:07.452325Z",
            "url": "https://files.pythonhosted.org/packages/27/ef/07df890fcba35fee4ba7133349aeb0f55f411057944c4d271b2422cba793/af_analysis-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a8452c5a70803ce276330b1c4fdb4e99b58429cb95c95f4d29b109dea8b8f4a7",
                "md5": "640ece0cd56697a0a9b877e20782774a",
                "sha256": "920ae408fa7e839edb3e7be0d32cf35cce3f8ede459122804a2b9d2dbfe3449c"
            },
            "downloads": -1,
            "filename": "af_analysis-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "640ece0cd56697a0a9b877e20782774a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 38379,
            "upload_time": "2024-11-08T14:33:09",
            "upload_time_iso_8601": "2024-11-08T14:33:09.645913Z",
            "url": "https://files.pythonhosted.org/packages/a8/45/2c5a70803ce276330b1c4fdb4e99b58429cb95c95f4d29b109dea8b8f4a7/af_analysis-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-08 14:33:09",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "samuelmurail",
    "github_project": "af_analysis",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pdb_numpy",
            "specs": [
                [
                    ">=",
                    "0.0.11"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.3.4"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.0"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    ">=",
                    "0.11"
                ]
            ]
        },
        {
            "name": "cmcrameri",
            "specs": [
                [
                    ">=",
                    "1.7"
                ]
            ]
        },
        {
            "name": "nglview",
            "specs": [
                [
                    ">=",
                    "3.0"
                ]
            ]
        },
        {
            "name": "ipywidgets",
            "specs": [
                [
                    ">=",
                    "7.6"
                ]
            ]
        },
        {
            "name": "mdanalysis",
            "specs": [
                [
                    ">=",
                    "2.4"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": []
        }
    ],
    "lcname": "af-analysis"
}
        