# AsperaSRAgetter
AsperaSRAgetter provides an easy way to download sequencing data (fastq.gz format) from the European Nucleotide Archive (ENA) using Aspera.
## Installation
AsperaSRAgetter is distributed on [PyPI](https://pypi.org/project/AsperaSRAgetter/) and can be installed with pip. It depends on Aspera-CLI to retrieve sequencing data from ENA; installing Aspera-CLI [with Conda](https://anaconda.org/hcc/aspera-cli) is recommended.
```shell
# You may create a new environment for AsperaSRAgetter, but this is optional
conda create -n AsperaSRAgetter python=3.10
conda activate AsperaSRAgetter
# Install AsperaSRAgetter using pip
pip install AsperaSRAgetter
# Install Aspera-CLI using conda
conda install -c hcc aspera-cli
```
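
After installation, it can be useful to confirm that both commands are reachable. The snippet below is a minimal sketch, assuming the pip package exposes `sragetter` and the Conda package exposes `ascp` on your `PATH`; the helper name `missing_tools` is hypothetical and not part of AsperaSRAgetter.

```python
import shutil


def missing_tools(tools=("sragetter", "ascp")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("sragetter and ascp are both available.")
```
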
## Workflow
AsperaSRAgetter first queries the [ENA filereport API](https://www.ebi.ac.uk/ena/portal/api/) for the file report of the corresponding fastq.gz files. Second, it extracts the MD5 hash value and FTP URL of each fastq.gz file from the report. Lastly, it passes each FTP URL to the Aspera transfer command `ascp` to download the fastq.gz file.

The file reports are stored as a `.tsv` table that serves as a record of the download process.
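
The three steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not AsperaSRAgetter's actual code: the query fields (`fastq_ftp`, `fastq_md5`), the `era-fasp@fasp.sra.ebi.ac.uk` source address, and the `ascp` options shown are the ones commonly used for ENA downloads and may differ from what the tool uses internally. All function names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Step 1: build the filereport query URL (field names are assumptions)."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_md5",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_report(tsv_text):
    """Step 2: yield (ftp_url, md5) pairs from a filereport TSV.

    Paired-end runs list several files in one cell, separated by ';'.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        urls = [u for u in row["fastq_ftp"].split(";") if u]
        md5s = [m for m in row["fastq_md5"].split(";") if m]
        yield from zip(urls, md5s)


def ascp_command(ftp_url, ssh_key, outdir):
    """Step 3: turn an FTP URL into an ascp argument list.

    Assumption: the path after the FTP host is served unchanged by the
    Aspera endpoint era-fasp@fasp.sra.ebi.ac.uk.
    """
    _host, _, path = ftp_url.partition("/")
    source = f"era-fasp@fasp.sra.ebi.ac.uk:{path}"
    return ["ascp", "-QT", "-l", "300m", "-P", "33001",
            "-i", str(ssh_key), source, str(outdir)]


if __name__ == "__main__":
    # Placeholder report; the MD5 values are dummies, not real checksums.
    sample = (
        "run_accession\tfastq_ftp\tfastq_md5\n"
        "SRR000001\t"
        "ftp.sra.ebi.ac.uk/vol1/fastq/SRR000/SRR000001/SRR000001_1.fastq.gz;"
        "ftp.sra.ebi.ac.uk/vol1/fastq/SRR000/SRR000001/SRR000001_2.fastq.gz\t"
        "aaa;bbb\n"
    )
    print(filereport_url("SRR000001"))
    for url, md5 in parse_report(sample):
        print(md5, " ".join(ascp_command(url, "sshkey_path.openssh", "outdir_path")))
```

The sketch only builds the URL and the command lists; fetching the report and running `ascp` (for example with `urllib.request` and `subprocess.run`) are left out so that it can be tried without network access.
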
The MD5 hash values of all files are saved in a `.md5` file, which users can use to verify the integrity of the downloaded files.
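
One way to perform that integrity check is sketched below. It assumes the `.md5` file uses the `md5sum` layout (one `<hash>  <filename>` pair per line); the README does not specify the exact format, so adjust the parsing if your file differs. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_file(md5_path, data_dir):
    """Return {filename: True/False} for every entry in the .md5 file.

    Assumes md5sum-style lines: '<hash>  <filename>'.
    """
    results = {}
    for line in Path(md5_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        results[name] = file_md5(Path(data_dir) / name) == expected.lower()
    return results
```

If the file does follow the `md5sum` layout, running `md5sum -c` on it from the download directory should give an equivalent check.
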

## Usage
The command name of AsperaSRAgetter is **sragetter**. It accepts either one SRA accession or one TXT file containing multiple accessions (see the usage example below).
Note that you must provide the path to the public key authentication file of Aspera-CLI (normally `ENVIRONMENT_PATH/etc/asperaweb_id_dsa.openssh`).
```bash
usage: sragetter [-h] [-v] [-acc ACCESSION | -f FILE] -ssh SSH_KEY -o OUTDIR

options:
  -h, --help            show this help message and exit
  -v, --version         Show SRAdownloader version number and exit
  -acc ACCESSION, --accession ACCESSION
                        SRA data accession
  -f FILE, --file FILE  TXT file with multiple SRA accessions
  -ssh SSH_KEY, --ssh-key SSH_KEY
                        Public key authentication file provided by Aspera command line client download package as the 'asperaweb_id_dsa.openssh' file
  -o OUTDIR, --outdir OUTDIR
                        Path to store the downloaded SRA data

Usage
-----------------
Download with one accession:
    $ sragetter --accession sra_accession --ssh-key sshkey_path.openssh --outdir outdir_path

Download with TXT file containing multiple accessions:
    $ sragetter --file sra_accessions.txt --ssh-key sshkey_path.openssh --outdir outdir_path
```
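
To call `sragetter` from a script, the command line can be assembled as in the sketch below. It uses only the options listed in the help text above. Reading the environment path from the `CONDA_PREFIX` variable is an assumption that holds only inside an activated Conda environment, and the helper names are hypothetical.

```python
import os
from pathlib import Path


def default_key_path(env_prefix=None):
    """Guess the key location as ENVIRONMENT_PATH/etc/asperaweb_id_dsa.openssh.

    Assumption: ENVIRONMENT_PATH is the active Conda environment (CONDA_PREFIX).
    """
    prefix = env_prefix or os.environ.get("CONDA_PREFIX", "")
    return Path(prefix) / "etc" / "asperaweb_id_dsa.openssh"


def sragetter_command(outdir, ssh_key, accession=None, accession_file=None):
    """Build the sragetter argument list for one accession or one TXT file."""
    if (accession is None) == (accession_file is None):
        raise ValueError("provide exactly one of accession or accession_file")
    cmd = ["sragetter"]
    if accession is not None:
        cmd += ["--accession", accession]
    else:
        cmd += ["--file", str(accession_file)]
    cmd += ["--ssh-key", str(ssh_key), "--outdir", str(outdir)]
    return cmd


if __name__ == "__main__":
    key = default_key_path()
    print(" ".join(sragetter_command("outdir_path", key, accession="sra_accession")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to start the download.
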
## Contact
If you have any questions about using AsperaSRAgetter, feel free to open an issue or contact me at jirunjia@gmail.com.