Name | Angular-Deviation-Diffuser |
Version | 1.0.9 |
home_page | None |
Summary | This is a Transformer-Based diffusion model for Efficient Protein Conformational Ensemble Generation |
upload_time | 2024-11-25 22:51:55 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.8 |
license | MIT License Copyright (c) [2024] [YI YANG] Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | diffusion model, protein conformation, angular deviation |
requirements | No requirements were recorded. |
# Angular Deviation Diffuser
## Overview
**Angular Deviation Diffuser** is a transformer-based diffusion model that generates conformational ensembles of protein backbones, using angular deviations as its data representation. It aims to overcome the limitations of traditional molecular dynamics (MD) simulations by providing a fast, computationally efficient way to sample protein conformational landscapes. The model combines SE(3) symmetry, angular deviations, and diffusion processes to produce dynamic ensembles that closely match those generated by MD simulations, offering a new way to study protein structure and function.
### Overview of this work
![Angular Deviation Diffuser Workflow](https://github.com/AlanYangYi/angular_deviation_diffuser/blob/main/Pictures/overview.png?raw=true)
## Computational results
### Generated Conformations Example (Dark State and Light State)
![Generated Conformations Example](https://github.com/AlanYangYi/angular_deviation_diffuser/blob/main/Pictures/Dark_and_light_generated_by_our_model.gif?raw=true)
### Absolute angles vs. angle deviations in the denoising (sampling) process
![Using angles V.S. Using angle deviation](https://raw.githubusercontent.com/AlanYangYi/angular_deviation_diffuser/refs/heads/main/Pictures/angleVSanglechange%20(2).gif)
## Background
Protein dynamics are essential for understanding biological function, because proteins exist not as a single static structure but as multiple dynamic conformational states. MD simulations are the gold standard for studying these dynamics, but they are resource-intensive and limited in their ability to explore all possible conformational states. The Angular Deviation Diffuser addresses these limitations with a deep learning approach, specifically a diffusion model with SE(3) invariance, to efficiently generate accurate protein conformations.
## Features
- **Angular Deviation-Based Diffusion**: Uses angular deviations rather than absolute angles for data representation, improving stability and efficiency.
- **Transformer Backbone**: Utilizes a transformer architecture for learning protein dynamics from training data, capturing the conformational space effectively.
- **SE(3) Symmetry Integration**: Ensures the generated conformations respect the inherent rotational and translational symmetry of molecular systems.
- **Efficient Ensemble Generation**: Capable of generating diverse conformational ensembles in significantly less time compared to traditional MD simulations.
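The first feature can be illustrated with a minimal sketch of what an angular deviation looks like numerically. It assumes that a deviation is the wrapped difference between a conformation's angles and those of a reference structure; the README does not spell out the exact definition, so `wrap_angle` and `angular_deviation` are hypothetical helpers, not part of the package.

```python
import numpy as np

def wrap_angle(x):
    """Map angles (radians) into the interval [-pi, pi)."""
    return (np.asarray(x, dtype=float) + np.pi) % (2.0 * np.pi) - np.pi

def angular_deviation(angles, reference):
    """Wrapped element-wise difference between two angle arrays of equal shape.

    Assumption: 'deviation' means the difference to a reference structure.
    """
    return wrap_angle(np.asarray(angles, dtype=float) - np.asarray(reference, dtype=float))

# Two torsions that sit on either side of the +/-180 degree boundary.
reference = np.deg2rad(np.array([179.0]))
angles = np.deg2rad(np.array([-179.0]))

naive = np.rad2deg(angles - reference)                      # a jump of about -358 degrees
wrapped = np.rad2deg(angular_deviation(angles, reference))  # a step of about +2 degrees
print(naive, wrapped)
```

Wrapping keeps physically small changes numerically small, which is one plausible reason a deviation-based representation could be easier to model than raw periodic angles.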
## Installation
To install and use Angular Deviation Diffuser, follow these steps:
### Prerequisites
- **Conda**: Ensure that [Conda](https://docs.conda.io/en/latest/miniconda.html) is installed to manage the environment and dependencies.
### Steps
1. **Create and Activate Conda Environment**:
```bash
conda create -n angular_deviation_diffusion python=3.8
conda activate angular_deviation_diffusion
```
2. **Install Angular Deviation Diffuser**:
```bash
pip install Angular_Deviation_Diffuser
python -c 'import pyrosetta_installer; pyrosetta_installer.install_pyrosetta()'
```
## Usage
The following sections provide detailed guidance on how to use the package for generating protein conformations:
### 1. Extract Six Types of Angles
Extract the backbone angles (ϕ, ψ, ω, θ₁, θ₂, θ₃) from a given PDB file.
```python
from Angular_Deviation_Diffuser import extract_six_angles
angles = extract_six_angles.get_angle_from_pdb("your_pdb_file.pdb")
```
Replace `"your_pdb_file.pdb"` with the path to your PDB file. The function returns an angle matrix with one column for each of the six backbone angle types.
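For readers unfamiliar with backbone torsions such as ϕ, ψ and ω, the following sketch shows how a single signed dihedral angle can be computed from four atom positions with NumPy. It is an independent illustration and not the code behind `get_angle_from_pdb`; the function name `dihedral` and the example coordinates are made up for this purpose.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in radians defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)          # unit vector along the central bond
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.arctan2(y, x)

# Four points arranged so that the outer bonds are perpendicular to each other.
quarter_turn = dihedral([1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 1])
print(np.rad2deg(quarter_turn))
```

Using `arctan2` rather than `arccos` preserves the sign of the angle, which matters because a torsion of +60° and one of -60° correspond to different conformations.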
### 2. Reconstruct 3D Coordinates and Generate a PDB File
Reconstruct the 3D atomic coordinates using the six angle types and generate the corresponding PDB file.
```python
from Angular_Deviation_Diffuser import reconstruct_coordinate
# Given an L x 6 angle matrix, reconstruct the Cartesian coordinates of the atoms.
# Replace 'angles' with your own angle data, supplied as a NumPy array.
coor = reconstruct_coordinate.angles2coord(angles)
# Save the reconstructed coordinates to a PDB file
reconstruct_coordinate.coor_to_pdb(coor, "reconstructed_structure.pdb")
```
Replace `"reconstructed_structure.pdb"` with the desired output PDB file name.
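To give an intuition for how internal coordinates turn back into Cartesian positions, here is a minimal sketch of placing one atom from the three preceding atoms, a bond length, a bond angle and a torsion. This is a generic construction and not necessarily how `angles2coord` is implemented; `place_atom`, the coordinates and the numeric values are illustrative assumptions.

```python
import numpy as np

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Place atom d so that |c-d| = bond_length, angle(b, c, d) = bond_angle
    and dihedral(a, b, c, d) = torsion. Angles are in radians."""
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    bc = c - b
    bc /= np.linalg.norm(bc)              # unit vector along the b->c bond
    n = np.cross(b - a, bc)               # normal of the a-b-c plane
    n /= np.linalg.norm(n)
    m = np.cross(n, bc)                   # completes the right-handed frame (bc, m, n)
    # Coordinates of d expressed in the local (bc, m, n) frame.
    local = np.array([
        -bond_length * np.cos(bond_angle),
        bond_length * np.sin(bond_angle) * np.cos(torsion),
        bond_length * np.sin(bond_angle) * np.sin(torsion),
    ])
    return c + local[0] * bc + local[1] * m + local[2] * n

# Illustrative values only; they are not taken from the package.
a = np.array([0.0, 0.0, 0.0])
b = np.array([1.46, 0.0, 0.0])
c = np.array([2.0, 1.4, 0.0])
d = place_atom(a, b, c, bond_length=1.33,
               bond_angle=np.deg2rad(116.0), torsion=np.deg2rad(-60.0))
print(d)
```

Repeating this placement along the chain, one atom at a time, is the basic idea behind rebuilding a backbone from an angle matrix.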
### 3. Training the Model
Train the transformer-based diffusion model using the angular deviation data obtained from the previous step.
```bash
python training.py --data_dir 'path/to/training_data'
```
Replace `'path/to/training_data'` with the path to your training dataset. The training set can be downloaded from [here](https://github.com/AlanYangYi/angular_deviation_diffuser/blob/2eaf4d98dacb188eeeb56005ff526e1130f02dc3/Training_Set.zip).
### 4. Generating Conformations
Generate a diverse ensemble of protein backbone conformations using the trained model.
```python
from Angular_Deviation_Diffuser import sampling
# Generate conformations with refinement
sampling.generate_conformations_with_refinement(batch_size=10, total_samples=10)
```
The pre-trained model weights can be downloaded from [here](https://drive.usercontent.google.com/download?id=1ld2lZgJFoZJZrwbKdzHcDAZU7t9jmBkH&export=download&authuser=0&confirm=t&uuid=17381f09-3b32-4d7f-976d-1d49de944a7f&at=AENtkXZcdxy-fTKXegNBrIdxcF-T:1731452378968).
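As background on what a diffusion model learns to reverse during sampling, the sketch below shows the forward noising step of a standard denoising diffusion model applied to a matrix of angular deviations. It is a generic textbook formulation: the linear schedule, its endpoints, the number of steps and the 8 x 6 input are assumptions chosen for illustration, not settings read from this package.

```python
import numpy as np

def cumulative_alphas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from x_0 in closed form; also return the noise that was added."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
alpha_bar = cumulative_alphas()

# Hypothetical input: zero deviations for 8 residues and 6 angle types.
x0 = np.zeros((8, 6))
x_early, _ = forward_noise(x0, 0, alpha_bar, rng)      # almost unchanged
x_late, _ = forward_noise(x0, 999, alpha_bar, rng)     # close to pure Gaussian noise
print(x_early.std(), x_late.std())
```

Sampling runs this process in reverse: starting from noise, a trained network removes a little noise at each step until a plausible set of deviations remains.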
### 5. Adding Side Chains with Refinement
Use PyRosetta to add side chains to the generated backbone structures and refine them.
```python
from Angular_Deviation_Diffuser import refine
# Refine the generated backbone structure
refine.refine_conformations('reconstructed_structure.pdb', "refined_conformation.pdb")
```
Replace `'reconstructed_structure.pdb'` and `"refined_conformation.pdb"` with the appropriate file names for input and output.
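When an ensemble contains many backbone files, it can help to pair input and output paths before refining them one by one. The helper below is a small sketch using only the standard library; `plan_refinement_jobs`, the `_refined` suffix and the directory names are hypothetical and not part of the package.

```python
from pathlib import Path

def plan_refinement_jobs(input_dir, output_dir, suffix="_refined"):
    """Pair every .pdb file in input_dir with an output path in output_dir.

    Returns a list of (input_path, output_path) string tuples, sorted by name.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for pdb in sorted(input_dir.glob("*.pdb")):
        jobs.append((str(pdb), str(output_dir / f"{pdb.stem}{suffix}.pdb")))
    return jobs

# Hypothetical directory names; a missing directory simply yields no jobs.
for src, dst in plan_refinement_jobs("generated_backbones", "refined"):
    print(src, "->", dst)
```

Each pair can then be passed to `refine.refine_conformations(src, dst)`, following the two-argument call shown in the example above. The output directory is not created by this helper, so it should exist before refinement starts.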
## Online Training and Sampling
A Google Colab notebook for training and sampling is available [here](https://colab.research.google.com/drive/1paTyFVRMzD4b75DeFjYyXlMV38d1eE1I#scrollTo=Dg41j6Feaj6f).
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more information.
## Acknowledgements
We are grateful to our research team for their invaluable contributions and support throughout the development of this model.
---
If you have any issues or questions, please feel free to open an issue in the repository or contact us directly.
Raw data
{
"_id": null,
"home_page": null,
"name": "Angular-Deviation-Diffuser",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "diffusion model, protein conformation, angular deviation",
"author": null,
"author_email": "yi yang <yiyangalan@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/71/ea/fe73eda4dc32291f9b46b347cb3dce8c881cc92e339b6c516dcf9d4342fc/angular_deviation_diffuser-1.0.9.tar.gz",
"platform": null,
"description": "# Angular Deviation Diffuser\n\n\n\n## Overview\n\n**Angular Deviation Diffuser** is a transformer-based diffusion model designed for efficiently generating conformational ensembles of protein backbones by using angular deviations as data flow. It aims to overcome the limitations of traditional molecular dynamics (MD) simulations by providing a fast and computationally efficient approach for sampling protein conformational landscapes. This model leverages the concepts of SE(3) symmetry, angular deviations, and diffusion processes to produce dynamic ensembles that closely match those generated through MD simulations, thereby offering a new way to study protein structure and function.\n\n\n### Overview of this work\n![Angular Deviation Diffuser Workflow](https://github.com/AlanYangYi/angular_deviation_diffuser/blob/main/Pictures/overview.png?raw=true)\n\n## Computational results\n\n### Generated Conformations Example (Dark State and Light State)\n![Generated Conformations Example](https://github.com/AlanYangYi/angular_deviation_diffuser/blob/main/Pictures/Dark_and_light_generated_by_our_model.gif?raw=true)\n\n### Using absolute angles vs. Using angle deviations for the denoising process(Sampling Process)\n![Using angles V.S. Using angle deviation](https://raw.githubusercontent.com/AlanYangYi/angular_deviation_diffuser/refs/heads/main/Pictures/angleVSanglechange%20(2).gif)\n\n\n## Background\n\nProtein dynamics are essential for understanding biological functionality, as proteins exist not only in a single static structure but also in multiple dynamic conformational states. MD simulations are the gold standard for studying these dynamics, but they are resource-intensive and limited in their ability to fully explore all possible conformational states. 
The Angular Deviation Diffuser addresses these limitations by utilizing advanced deep learning techniques, specifically a diffusion model integrated with SE(3) invariance, to efficiently generate accurate protein conformations.\n\n## Features\n\n- **Angular Deviation-Based Diffusion**: Uses angular deviations rather than absolute angles for data representation, improving stability and efficiency.\n- **Transformer Backbone**: Utilizes a transformer architecture for learning protein dynamics from training data, capturing the conformational space effectively.\n- **SE(3) Symmetry Integration**: Ensures the generated conformations respect the inherent rotational and translational symmetry of molecular systems.\n- **Efficient Ensemble Generation**: Capable of generating diverse conformational ensembles in significantly less time compared to traditional MD simulations.\n\n## Installation\n\nTo install and use Angular Deviation Diffuser, follow these steps:\n\n### Prerequisites\n\n- **Conda**: Ensure that [Conda](https://docs.conda.io/en/latest/miniconda.html) is installed to manage the environment and dependencies.\n\n### Steps\n\n1. **Create and Activate Conda Environment**:\n \n ```bash\n conda create -n angular_deviation_diffusion python==3.8\n conda activate angular_deviation_diffusion\n ```\n\n2. **Install Angular Deviation Diffuser**:\n \n ```bash\n pip install Angular_Deviation_Diffuser\n python -c 'import pyrosetta_installer; pyrosetta_installer.install_pyrosetta()'\n ```\n\n## Usage\n\nThe following sections provide detailed guidance on how to use the package for generating protein conformations:\n\n### 1. 
Extract Six Types of Angles\n\nExtract the backbone angles (\u03d5, \u03c8, \u03c9, \u03b8\u2081, \u03b8\u2082, \u03b8\u2083) from a given PDB file.\n\n```python\nfrom Angular_Deviation_Diffuser import extract_six_angles\n\nangles = extract_six_angles.get_angle_from_pdb(\"your_pdb_file.pdb\")\n```\nReplace `\"your_pdb_file.pdb\"` with the appropriate PDB file name. This function returns an angle matrix containing all six types of backbone angles.\n\n### 2. Reconstruct 3D Coordinates and Generate a PDB File\n\nReconstruct the 3D atomic coordinates using the six angle types and generate the corresponding PDB file.\n\n```python\nfrom Angular_Deviation_Diffuser import reconstruct_coordinate\n\n# Given an L x 6 angle matrix, reconstruct the Cartesian coordinates of the atoms.\n# Replace 'angles' with the actual angle data in a numpy array type.\ncoor = reconstruct_coordinate.angles2coord(angles)\n\n# Save the reconstructed coordinates to a PDB file\nreconstruct_coordinate.coor_to_pdb(coor, \"reconstructed_structure.pdb\")\n```\nReplace `\"reconstructed_structure.pdb\"` with the desired output PDB file name.\n\n### 3. Training the Model\n\nTrain the transformer-based diffusion model using the angular deviation data obtained from the previous step.\n\n```bash\npython training.py --data_dir 'path/to/training_data'\n```\nReplace `'path/to/training_data'` with the path to your training dataset. The training set can be downloaded from [here](https://github.com/AlanYangYi/angular_deviation_diffuser/blob/2eaf4d98dacb188eeeb56005ff526e1130f02dc3/Training_Set.zip).\n\n### 4. 
Generating Conformations\n\nGenerate a diverse ensemble of protein backbone conformations using the trained model.\n\n```python\nfrom Angular_Deviation_Diffuser import sampling\n\n# Generate conformations with refinement\nsampling.generate_conformations_with_refinement(batch_size=10, total_samples=10)\n```\nThe pre-trained model weights can be downloaded from [here](https://drive.usercontent.google.com/download?id=1ld2lZgJFoZJZrwbKdzHcDAZU7t9jmBkH&export=download&authuser=0&confirm=t&uuid=17381f09-3b32-4d7f-976d-1d49de944a7f&at=AENtkXZcdxy-fTKXegNBrIdxcF-T:1731452378968).\n\n### 5. Adding Side Chains with Refinement\n\nUtilize PyRosetta to add side chains to the generated backbone structures and refine them.\n\n```python\nfrom Angular_Deviation_Diffuser import refine\n\n# Refine the generated backbone structure\nrefine.refine_conformations('reconstructed_structure.pdb', \"refined_conformation.pdb\")\n```\nReplace `'reconstructed_structure.pdb'` and `\"refined_conformation.pdb\"` with the appropriate file names for input and output.\n\n\n## Online Training and Sampling\nGoogle Colab:https://colab.research.google.com/drive/1paTyFVRMzD4b75DeFjYyXlMV38d1eE1I#scrollTo=Dg41j6Feaj6f. \n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more information.\n\n## Acknowledgements\n\nWe are grateful to our research team for their invaluable contributions and support throughout the development of this model.\n\n\n---\n\nIf you have any issues or questions, please feel free to open an issue in the repository or contact us directly.\n\n",
"bugtrack_url": null,
"license": "MIT License Copyright (c) [2024] [YI YANG] Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
"summary": "This is a Transformer-Based diffusion model for Efficient Protein Conformational Ensemble Generation",
"version": "1.0.9",
"project_urls": {
"Homepage": "https://github.com/AlanYangYi/angular_deviation_diffuser"
},
"split_keywords": [
"diffusion model",
" protein conformation",
" angular deviation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4ac09c19f7df460de499a9682c3f726769dacfcf1c4df516b9b453b6898a2d39",
"md5": "656d31a3762bb8407fed169740b86319",
"sha256": "421ce8a75d2c3bbc02a450476ed721546d3fddf1fb5aa0b6a40fd76f880f8e47"
},
"downloads": -1,
"filename": "Angular_Deviation_Diffuser-1.0.9-py3-none-any.whl",
"has_sig": false,
"md5_digest": "656d31a3762bb8407fed169740b86319",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 15145,
"upload_time": "2024-11-25T22:51:53",
"upload_time_iso_8601": "2024-11-25T22:51:53.992229Z",
"url": "https://files.pythonhosted.org/packages/4a/c0/9c19f7df460de499a9682c3f726769dacfcf1c4df516b9b453b6898a2d39/Angular_Deviation_Diffuser-1.0.9-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "71eafe73eda4dc32291f9b46b347cb3dce8c881cc92e339b6c516dcf9d4342fc",
"md5": "cc3fc976e321aa19222358d1e5a731fd",
"sha256": "4761c298bc2b1f3ee7f9c85f096d6b3c288ed84ea49fa3c1e5424987b6aa1eb5"
},
"downloads": -1,
"filename": "angular_deviation_diffuser-1.0.9.tar.gz",
"has_sig": false,
"md5_digest": "cc3fc976e321aa19222358d1e5a731fd",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 16083,
"upload_time": "2024-11-25T22:51:55",
"upload_time_iso_8601": "2024-11-25T22:51:55.700668Z",
"url": "https://files.pythonhosted.org/packages/71/ea/fe73eda4dc32291f9b46b347cb3dce8c881cc92e339b6c516dcf9d4342fc/angular_deviation_diffuser-1.0.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-25 22:51:55",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AlanYangYi",
"github_project": "angular_deviation_diffuser",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "angular-deviation-diffuser"
}