# AlphaFetcher
`AlphaFetcher` fetches protein metadata and downloads the related files from the
[AlphaFold Protein Structure Database](https://alphafold.ebi.ac.uk/)
using UniProt accession codes.

---
## 🌟 Features
- **Batch Import**: Add one or several UniProt accession codes in a single call.
- **Parallel Processing**: Fetch metadata with multiple threads.
- **Flexible Downloads**: Choose among PDB, CIF, BCIF, PAE image, and PAE data files.
- **Optimal Performance**: Adjust the number of workers used for threaded tasks.
---
## 🔧 Installation
We recommend PyPI installation:
```bash
pip install alphafetcher
```
---
## 💡 Usage
```python
from alphafetcher import AlphaFetcher
# Instantiate the fetcher
# The base_savedir parameter allows you to set a base directory where files will be saved.
# Inside this directory, two separate directories for pdb and cif files will be created.
fetcher = AlphaFetcher(base_savedir="my_savedir")
# Add the desired UniProt accession codes
fetcher.add_proteins(["A1KXE4", "H0YL14", "B2RXH2", "A8MVW5"])
# Retrieve metadata
fetcher.fetch_metadata(multithread=True, workers=4)
# Commence download of specified files
fetcher.download_all_files(pdb=True, cif=True, multithread=True, workers=4)
```
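Because `add_proteins` takes either a single string or a list, it can help to normalize and sanity-check the codes before adding them. The sketch below is not part of AlphaFetcher: `normalize_accessions` is a hypothetical helper, and the regular expression is the accession pattern published in the UniProt help pages.

```python
import re
from typing import List, Union

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def normalize_accessions(proteins: Union[str, List[str]]) -> List[str]:
    """Return a de-duplicated list of well-formed accession codes.

    Hypothetical helper for illustration, not part of the AlphaFetcher API.
    """
    if isinstance(proteins, str):
        proteins = [proteins]  # accept a single code as well as a list
    cleaned: List[str] = []
    for code in proteins:
        code = code.strip().upper()
        if not UNIPROT_ACCESSION.fullmatch(code):
            raise ValueError(f"Not a valid UniProt accession: {code!r}")
        if code not in cleaned:  # keep first occurrence, preserve order
            cleaned.append(code)
    return cleaned


codes = normalize_accessions(["A1KXE4", " h0yl14 ", "A1KXE4"])
print(codes)  # ['A1KXE4', 'H0YL14']
```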
---
## 📜 Documentation
### Initialization
- **`AlphaFetcher(base_savedir: str)`**
  - *Description*: Initialize the fetcher with a base save directory. Two subdirectories are created automatically inside it: one for pdb files and one for cif files.
  - *Parameters*:
    - `base_savedir`: The base directory where the downloaded pdb and cif files are saved.
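
The layout described above can be pictured with a short `pathlib` sketch. It only illustrates the idea of a base directory with one subdirectory per file type. The helper `make_save_dirs` and the subdirectory names `pdb_files` and `cif_files` are assumptions made for illustration, not the names AlphaFetcher actually uses.

```python
import tempfile
from pathlib import Path
from typing import Dict


def make_save_dirs(base_savedir: str) -> Dict[str, Path]:
    """Create a base directory with one subdirectory per file type.

    Hypothetical helper; the subdirectory names are assumed.
    """
    base = Path(base_savedir)
    dirs = {"pdb": base / "pdb_files", "cif": base / "cif_files"}
    for path in dirs.values():
        # parents=True also creates the base directory if it is missing
        path.mkdir(parents=True, exist_ok=True)
    return dirs


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    dirs = make_save_dirs(str(Path(tmp) / "my_savedir"))
    print(sorted(p.name for p in dirs.values()))  # ['cif_files', 'pdb_files']
```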
### Methods
- **`add_proteins(proteins: Union[str, List[str]])`**
  - *Description*: Add the provided UniProt accession codes for fetching. Either a single string or a list of strings
    is accepted.
- **`fetch_metadata(multithread: bool = False, workers: int = 10)`**
  - *Description*: Fetches the metadata for the supplied UniProt accession codes. This metadata is used to
    download the relevant files and is stored in `fetcher.metadata_dict` (with `fetcher` named as in the example
    above).
- **`download_all_files(uniprot_access: str, pdb: bool = False, cif: bool = False, bcif: bool = False, pae_image: bool = False, pae_data: bool = False)`**
  - *Description*: Downloads the selected file types for the given UniProt accession codes.
  - Select the file types to download by setting the corresponding parameters to `True`.
  - The usage example above calls this method without `uniprot_access` and with `multithread` and `workers`; check the docstring for the exact signature.
*For a comprehensive guide, users are encouraged to view the docstrings incorporated within the source code.*
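
To illustrate how boolean flags such as `pdb=True` can be turned into a set of files to download, here is a minimal sketch. It assumes that each metadata record is a dictionary holding one URL per file type under the keys `pdbUrl`, `cifUrl`, `bcifUrl`, `paeImageUrl` and `paeDocUrl`. Those key names, the helper `select_urls`, and the placeholder URLs are assumptions for illustration, not a description of AlphaFetcher internals.

```python
from typing import Dict

# Assumed mapping from download flags to keys in a metadata record.
FLAG_TO_KEY = {
    "pdb": "pdbUrl",
    "cif": "cifUrl",
    "bcif": "bcifUrl",
    "pae_image": "paeImageUrl",
    "pae_data": "paeDocUrl",
}


def select_urls(record: Dict[str, str], **flags: bool) -> Dict[str, str]:
    """Return {flag: url} for every flag set to True and present in the record."""
    unknown = set(flags) - set(FLAG_TO_KEY)
    if unknown:
        raise TypeError(f"Unknown file type(s): {sorted(unknown)}")
    return {
        flag: record[FLAG_TO_KEY[flag]]
        for flag, wanted in flags.items()
        if wanted and FLAG_TO_KEY[flag] in record
    }


# Placeholder record; the URLs are not real download locations.
record = {
    "pdbUrl": "https://example.org/AF-A1KXE4-F1-model.pdb",
    "cifUrl": "https://example.org/AF-A1KXE4-F1-model.cif",
}
print(select_urls(record, pdb=True, cif=False))
# {'pdb': 'https://example.org/AF-A1KXE4-F1-model.pdb'}
```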
---
## ⚠️ Limitations
Respect the AlphaFold Protein Structure Database terms of service and avoid flooding it with excessive
concurrent requests. If in doubt, lower the number of workers to reduce the request density.
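
One way to keep the request density low is to cap the number of worker threads and pause between batches. The sketch below shows the pattern with the standard library only. `fetch_one` is a stand-in that performs no network access, and `batch_size` and `pause_s` are hypothetical tuning knobs, not AlphaFetcher parameters.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_one(accession: str) -> str:
    """Stand-in for a real request; returns a dummy payload."""
    return f"metadata for {accession}"


def fetch_in_batches(
    accessions: Iterable[str],
    fetch: Callable[[str], str] = fetch_one,
    workers: int = 4,
    batch_size: int = 2,
    pause_s: float = 0.1,
) -> Dict[str, str]:
    """Fetch in small batches with a bounded thread pool and a pause between batches."""
    codes = list(accessions)
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(codes), batch_size):
            batch = codes[start:start + batch_size]
            # pool.map yields results in input order, so zip pairs them correctly
            results.update(zip(batch, pool.map(fetch, batch)))
            if start + batch_size < len(codes):
                time.sleep(pause_s)  # give the server a break between batches
    return results


results = fetch_in_batches(["A1KXE4", "H0YL14", "B2RXH2", "A8MVW5"])
print(len(results))  # 4
```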
---
## 🙌 Contributing
We welcome your contributions! To collaborate:
1. Fork this repository.
2. Commit your changes.
3. Open a pull request with your updates.
---
## 📖 Authors and Acknowledgment
- **Jose Gavalda-Garcia** - *Author* - [jose.gavalda.garcia@vub.be](mailto:jose.gavalda.garcia@vub.be)
- **Wim Vranken** - *Supervisor* - [wim.vranken@vub.be](mailto:wim.vranken@vub.be)
---
## 📄 License
This project is licensed under the [GNU General Public License v3 (GPLv3)](https://www.gnu.org/licenses/gpl-3.0.en.html).
Raw data
{
"_id": null,
"home_page": "https://bitbucket.org/bio2byte/alphafetcher/",
"name": "AlphaFetcher",
"maintainer": "Jose Gavalda-Garcia",
"docs_url": null,
"requires_python": ">=3.6, <3.12",
"maintainer_email": "jose.gavalda.garcia@vub.be",
"keywords": "",
"author": "Jose Gavalda-Garcia",
"author_email": "jose.gavalda.garcia@vub.be",
"download_url": "https://files.pythonhosted.org/packages/13/21/555eceb2b07b0d2b099c40d1fc664fb9dbef21a6dd3fb11766c533a4884f/AlphaFetcher-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "OSI Approved :: GNU General Public License v3 (GPLv3)",
"summary": "This package allows interface with the AlphaFold Protein Structure Database. This package allows the download of entries' metadata an AlphaFold files (e.g. mmCIF, PAE, PDB...)",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://bitbucket.org/bio2byte/alphafetcher/"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "1b61ad454e47cba3fda9310178bb509a1615cbc8409f5e1cff0e33b838018ce4",
"md5": "5521676f51ec6ccbd6a6706e92826654",
"sha256": "e6aeecc194eb6f2516f6a6703db2e6a2d6f4daf6f58ecde37385e03a1d980392"
},
"downloads": -1,
"filename": "AlphaFetcher-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5521676f51ec6ccbd6a6706e92826654",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6, <3.12",
"size": 6593,
"upload_time": "2023-10-02T11:12:24",
"upload_time_iso_8601": "2023-10-02T11:12:24.566016Z",
"url": "https://files.pythonhosted.org/packages/1b/61/ad454e47cba3fda9310178bb509a1615cbc8409f5e1cff0e33b838018ce4/AlphaFetcher-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1321555eceb2b07b0d2b099c40d1fc664fb9dbef21a6dd3fb11766c533a4884f",
"md5": "c1748f90530f6f3b96882536b1e72e0a",
"sha256": "6cc81734c803609e036a62d8e41cdfc5eb5a1fcffa910f17fc46bf87bec6169f"
},
"downloads": -1,
"filename": "AlphaFetcher-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "c1748f90530f6f3b96882536b1e72e0a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6, <3.12",
"size": 5952,
"upload_time": "2023-10-02T11:12:26",
"upload_time_iso_8601": "2023-10-02T11:12:26.166604Z",
"url": "https://files.pythonhosted.org/packages/13/21/555eceb2b07b0d2b099c40d1fc664fb9dbef21a6dd3fb11766c533a4884f/AlphaFetcher-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-10-02 11:12:26",
"github": false,
"gitlab": false,
"bitbucket": true,
"codeberg": false,
"bitbucket_user": "bio2byte",
"bitbucket_project": "alphafetcher",
"lcname": "alphafetcher"
}