AsperaSRAgetter


- Name: AsperaSRAgetter
- Version: 2.2
- Home page: https://github.com/RunJiaJi/AsperaSRAgetter
- Summary: AsperaSRAgetter provides an easy way to download sequencing data from ENA by using Aspera.
- Upload time: 2024-08-22 02:02:45
- Author: Runjia Ji
- Maintainer, docs URL, required Python version, license: not recorded
- Requirements: none recorded

# AsperaSRAgetter
AsperaSRAgetter provides an easy way to download sequencing data (fastq.gz format) from the European Nucleotide Archive (ENA) using Aspera.

## Installation
AsperaSRAgetter is distributed on [PyPI](https://pypi.org/project/AsperaSRAgetter/) and can be installed with pip. It depends on Aspera-CLI to retrieve sequencing data from ENA. Installing Aspera-CLI [with Conda](https://anaconda.org/hcc/aspera-cli) is recommended.

```shell
# You may create a new environment for AsperaSRAgetter, but this is optional
conda create -n AsperaSRAgetter python=3.10
conda activate AsperaSRAgetter

# Install AsperaSRAgetter using pip
pip install AsperaSRAgetter

# Install Aspera-CLI using conda
conda install -c hcc aspera-cli
```
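
After installation, it can help to confirm that both executables are reachable before starting a long download. The following is a minimal sketch, assuming that the pip and conda installs place `sragetter` and `ascp` on the `PATH` of the active environment; `missing_tools` is a hypothetical helper name and is not part of AsperaSRAgetter.

```python
import shutil


def missing_tools(tools=("sragetter", "ascp")):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("sragetter and ascp are both available.")
```

If either name is reported as missing, activating the environment created above (or reinstalling the corresponding package) is the first thing to check.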

## Workflow

AsperaSRAgetter first queries the [ENA filereport API](https://www.ebi.ac.uk/ena/portal/api/) for the file report of the corresponding fastq.gz files. Second, it resolves the MD5 hash value and FTP URL of each fastq.gz file from the report. Finally, it passes each FTP URL to the Aspera transfer command `ascp` to download the fastq.gz file.
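
The three steps can be illustrated with a short Python sketch. This is not the code AsperaSRAgetter itself runs; it only shows one plausible way to carry out the same steps. The field names (`fastq_ftp`, `fastq_md5`), the `era-fasp` user, the `fasp.` host prefix, and the `ascp` options are assumptions based on common ENA usage, and the function names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# File report endpoint of the ENA Portal API.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Step 1: build the query URL for the file report of one accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_md5",  # assumed field selection
        "format": "tsv",
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def parse_filereport(tsv_text):
    """Step 2: return (ftp_url, md5) pairs from a file report in TSV form.

    Assumes multiple files of one run are separated by ';' in both columns.
    """
    pairs = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        urls = [u for u in (row.get("fastq_ftp") or "").split(";") if u]
        md5s = [m for m in (row.get("fastq_md5") or "").split(";") if m]
        pairs.extend(zip(urls, md5s))
    return pairs


def ascp_command(ftp_url, ssh_key, outdir):
    """Step 3: build an ascp argument list for one FTP URL.

    Assumption: the same path is served over Aspera by the 'era-fasp' user
    on the host obtained by replacing the leading 'ftp.' with 'fasp.'.
    """
    host, _, path = ftp_url.removeprefix("ftp://").partition("/")
    source = f"era-fasp@{host.replace('ftp.', 'fasp.', 1)}:{path}"
    # -QT, -l and -P are typical ascp options; the values are illustrative.
    return ["ascp", "-QT", "-l", "300m", "-P", "33001",
            "-i", str(ssh_key), source, str(outdir)]
```

The report text could be fetched with `urllib.request.urlopen(filereport_url(accession))`, and each resulting argument list could be run with `subprocess.run(cmd, check=True)`. Both are left out of the sketch so that it has no network side effects.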

The file reports are stored as a `.tsv` table that serves as a record of the download process.

The MD5 hash values of all files are saved in a `.md5` file, which users can use to verify the integrity of the downloaded files.
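
As an illustration of that verification step, the sketch below recomputes the MD5 of each downloaded file and compares it with the recorded value. It assumes the `.md5` file follows the usual `md5sum` layout (one `<hash>  <filename>` pair per line); the actual layout written by AsperaSRAgetter may differ, and `verify_md5_file` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5_file(md5_file, data_dir):
    """Map each file name listed in md5_file to True (match) or False.

    Assumes md5sum-style lines: '<hash>  <filename>'. Missing files
    are reported as False.
    """
    results = {}
    for line in Path(md5_file).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        target = Path(data_dir) / name
        results[name] = target.is_file() and md5_of(target) == expected.lower()
    return results
```

On systems with GNU coreutils, `md5sum -c` performs the same check for files in that layout.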

![workflow](AsperaSRAgetter/static/workflow.png) 

## Usage

The command name of AsperaSRAgetter is **sragetter**. It accepts either a single SRA accession or a TXT file containing multiple accessions (see the usage example below).
Note that users need to provide the path to the Aspera-CLI public key authentication file (normally `ENVIRONMENT_PATH/etc/asperaweb_id_dsa.openssh`).

```bash
usage: sragetter [-h] [-v] [-acc ACCESSION | -f FILE] -ssh SSH_KEY -o OUTDIR

options:
  -h, --help            show this help message and exit
  -v, --version         Show SRAdownloader version number and exit
  -acc ACCESSION, --accession ACCESSION
                        SRA data accession
  -f FILE, --file FILE  TXT file with multiple SRA accessions
  -ssh SSH_KEY, --ssh-key SSH_KEY
                        Public key authentication file provided by Aspera command line client download package as the 'asperaweb_id_dsa.openssh' file
  -o OUTDIR, --outdir OUTDIR
                        Path to store the downloaded SRA data

Usage
-----------------
Download with one accession:
    $ sragetter --accession sra_accession --ssh-key sshkey_path.openssh --outdir outdir_path

Download with TXT file containing multiple accessions:
    $ sragetter --file sra_accessions.txt --ssh-key sshkey_path.openssh --outdir outdir_path
```
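
For scripted use, the same invocations can be assembled from Python. The sketch below is only an illustration built on the options listed above. It assumes that the active Conda environment is the one holding the key file (so `CONDA_PREFIX` stands in for `ENVIRONMENT_PATH`), that the TXT file lists one accession per line, and that exactly one of `--accession` or `--file` is given. All function names are hypothetical.

```python
import os
import sys
from pathlib import Path


def default_key_path():
    """Guess the key location under the active environment (assumption)."""
    prefix = os.environ.get("CONDA_PREFIX", sys.prefix)
    return Path(prefix) / "etc" / "asperaweb_id_dsa.openssh"


def write_accessions(accessions, txt_path):
    """Write accessions to a TXT file, one per line (assumed file layout)."""
    path = Path(txt_path)
    path.write_text("\n".join(accessions) + "\n")
    return path


def sragetter_command(ssh_key, outdir, accession=None, file=None):
    """Build the sragetter argument list from the documented options."""
    if (accession is None) == (file is None):
        raise ValueError("provide exactly one of 'accession' or 'file'")
    if accession is not None:
        source = ["--accession", accession]
    else:
        source = ["--file", str(file)]
    return ["sragetter", *source,
            "--ssh-key", str(ssh_key), "--outdir", str(outdir)]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Checking that `default_key_path()` exists beforehand gives a clearer error than a failed transfer.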

## Contact
If you have any questions about using AsperaSRAgetter, feel free to open an issue or contact me at jirunjia@gmail.com.

            
