ncbi-acc-download


Name: ncbi-acc-download
Version: 0.2.9
Home page: https://github.com/kblin/ncbi-acc-download/
Summary: Download genome files from NCBI by accession.
Upload time: 2024-08-01 13:51:27
Author: Kai Blin
License: Apache Software License
Requirements: none recorded

# NCBI accession download script

A partner script to the popular [ncbi-genome-download](https://github.com/kblin/ncbi-genome-download)
script, `ncbi-acc-download` allows you to download sequences from GenBank/RefSeq by accession through
the NCBI [ENTREZ API](https://www.ncbi.nlm.nih.gov/books/NBK184582/).

## Installation

```
pip install ncbi-acc-download
```

Alternatively, clone this repository from GitHub, then run the following in a Python virtual environment:
```
pip install .
```
If this fails on older versions of Python, try updating your `pip` tool first:
```
pip install --upgrade pip
```
and then rerun the `ncbi-acc-download` install.

`ncbi-acc-download` is only developed and tested on Python releases still under active
support by the Python project. At the moment, this means versions 3.8, 3.9, 3.10, 3.11 and 3.12.
Specifically, no attempt at testing under Python versions older than 3.8 is being made.

`ncbi-acc-download` 0.2.6 was the last version to support Python 2.7.

If your system is stuck on an older version of Python, consider using a tool like
[Homebrew](http://brew.sh) or [Linuxbrew](http://linuxbrew.sh) to obtain a more up-to-date
version.


## Usage

To download a nucleotide record AB_12345 in GenBank format, run
```
ncbi-acc-download AB_12345
```

To download a nucleotide record AB_12345 in FASTA format, run
```
ncbi-acc-download --format fasta AB_12345
```

To download a protein record WP_12345 in FASTA format, run
```
ncbi-acc-download --molecule protein WP_12345
```

To just generate a list of download URLs to run the actual download elsewhere, run
```
ncbi-acc-download --url AB_12345
```
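
To give a feel for what such a download URL can look like, here is a minimal Python sketch that builds a request URL for the public Entrez E-utilities `efetch` endpoint. It is an illustration only: `build_efetch_url` is a hypothetical helper, and the URLs that `ncbi-acc-download --url` actually prints may use a different endpoint or different parameters.

```python
from urllib.parse import urlencode

# Public E-utilities efetch endpoint. Whether ncbi-acc-download uses exactly
# this URL is an assumption made for this sketch.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="gb"):
    """Return an efetch URL for a single accession (hypothetical helper).

    db:      Entrez database, e.g. "nuccore" for nucleotide records
             or "protein" for protein records.
    rettype: record format, e.g. "gb" for GenBank or "fasta" for FASTA.
    """
    params = {
        "db": db,
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


print(build_efetch_url("AB_12345", rettype="fasta"))
```

A URL built this way can be handed to any HTTP client, which is the same idea as generating the URL list on one machine and running the download elsewhere.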

If you want to concatenate multiple sequences into a single file, run
```
ncbi-acc-download --out two_genomes.gbk AB_12345 AB_23456
```
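
If the accession list comes from another program, the same call can be assembled from Python. The sketch below only uses the `--format`, `--molecule` and `--out` options shown in this README; `build_command` and `download` are hypothetical helper names, and the script assumes `ncbi-acc-download` is installed and on your `PATH`.

```python
import subprocess


def build_command(accessions, out=None, fmt=None, molecule=None):
    """Assemble an ncbi-acc-download argument list (hypothetical helper)."""
    cmd = ["ncbi-acc-download"]
    if fmt:
        cmd += ["--format", fmt]
    if molecule:
        cmd += ["--molecule", molecule]
    if out:
        cmd += ["--out", out]
    # Accessions go last, as in the command line examples above.
    cmd += list(accessions)
    return cmd


def download(accessions, **options):
    """Run the download and raise CalledProcessError if the tool fails."""
    subprocess.run(build_command(accessions, **options), check=True)


# Example call (needs network access and an installed ncbi-acc-download):
# download(["AB_12345", "AB_23456"], out="two_genomes.gbk")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names.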

To chain `ncbi-acc-download` with other command line tools, pass `/dev/stdout` as the
output filename. The downloaded data is then printed to standard output instead of being
written to a file, like so:
```
ncbi-acc-download --out /dev/stdout --format fasta AB_12345 AB_23456 | gzip > two_genomes.fa.gz
```
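
The same pipeline can be driven from a Python script instead of a shell. This is a rough sketch using only the standard library; `run_and_gzip` is a hypothetical helper that compresses the standard output of any command, so it works with the `--out /dev/stdout` invocation above on systems that provide `/dev/stdout`.

```python
import gzip
import shutil
import subprocess


def run_and_gzip(cmd, gz_path):
    """Run cmd and write its gzip-compressed standard output to gz_path."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc, \
            gzip.open(gz_path, "wb") as gz:
        # Stream the output in chunks so large records never sit in memory.
        shutil.copyfileobj(proc.stdout, gz)
    # Leaving the with-block waits for the process, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)


# Example call (needs network access and an installed ncbi-acc-download):
# run_and_gzip(
#     ["ncbi-acc-download", "--out", "/dev/stdout", "--format", "fasta",
#      "AB_12345", "AB_23456"],
#     "two_genomes.fa.gz",
# )
```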

If you want to download all records covered by a WGS master record instead of the master record itself,
run
```
ncbi-acc-download --recursive NZ_EXMP01000000
```

You can restrict the download to a genomic range of the accession using `--range`:
```
ncbi-acc-download NC_007194 --range 1001:9000
```
Cutting a record with a range like this can leave partial features at both ends of the
record. To remove them, combine the range download with the new `correct` extended validator:
```
ncbi-acc-download NC_007194 --range 1001:9000 --extended-validation correct
```
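
When range values are generated by a script, it can help to sanity-check them before calling the tool. The sketch below assumes the `start:end` form used above and assumes that coordinates are 1-based and inclusive at both ends, as with the Entrez `seq_start`/`seq_stop` parameters; this README does not state how `ncbi-acc-download` interprets the values, so treat both `parse_range` and `range_length` as hypothetical helpers.

```python
def parse_range(text):
    """Parse a 'start:end' string into a (start, end) tuple of integers.

    Assumes 1-based coordinates with start <= end. Rejecting reversed
    ranges is a conservative choice of this sketch, not documented
    behaviour of the tool.
    """
    start_text, sep, end_text = text.partition(":")
    if not sep:
        raise ValueError(f"expected 'start:end', got {text!r}")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid range {text!r}")
    return start, end


def range_length(start, end):
    """Number of bases covered, assuming both ends are inclusive."""
    return end - start + 1


start, end = parse_range("1001:9000")
print(range_length(start, end))
```

Under these assumptions, the range `1001:9000` covers 8000 bases.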

You can get more detailed information on the download progress by using the `--verbose` or `-v` flag.

To get an overview of all options, run
```
ncbi-acc-download --help
```

## License
All code is available under the Apache License version 2; see the
[`LICENSE`](LICENSE) file for details.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/kblin/ncbi-acc-download/",
    "name": "ncbi-acc-download",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Kai Blin",
    "author_email": "kblin@biosustain.dtu.dk",
    "download_url": "https://files.pythonhosted.org/packages/fb/5e/8e6f8f3eeb63a0f70ea89e9866609739852308b2d8276f41bdb9d3484cd2/ncbi_acc_download-0.2.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License",
    "summary": "Download genome files from NCBI by accession.",
    "version": "0.2.9",
    "project_urls": {
        "Homepage": "https://github.com/kblin/ncbi-acc-download/"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6d87a30eaf86bd06173d3ef5543fba15a3f85e629b59ac149296426873adcfba",
                "md5": "a3115a9d3190b2ed350367f145148fb9",
                "sha256": "cd76dc543118ea2a6d9516c781dca5c6742a6fbc846f6d71ce508df67629cf0e"
            },
            "downloads": -1,
            "filename": "ncbi_acc_download-0.2.9-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a3115a9d3190b2ed350367f145148fb9",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 16978,
            "upload_time": "2024-08-01T13:51:25",
            "upload_time_iso_8601": "2024-08-01T13:51:25.720932Z",
            "url": "https://files.pythonhosted.org/packages/6d/87/a30eaf86bd06173d3ef5543fba15a3f85e629b59ac149296426873adcfba/ncbi_acc_download-0.2.9-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fb5e8e6f8f3eeb63a0f70ea89e9866609739852308b2d8276f41bdb9d3484cd2",
                "md5": "ec6d782d3bc07d2d07ba851dc1b87c1c",
                "sha256": "1d53d3875a26cd1d7c89d1ebae3db96b9da3fc524dbffcce9ba3f1ba3f46f9da"
            },
            "downloads": -1,
            "filename": "ncbi_acc_download-0.2.9.tar.gz",
            "has_sig": false,
            "md5_digest": "ec6d782d3bc07d2d07ba851dc1b87c1c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 18365,
            "upload_time": "2024-08-01T13:51:27",
            "upload_time_iso_8601": "2024-08-01T13:51:27.214253Z",
            "url": "https://files.pythonhosted.org/packages/fb/5e/8e6f8f3eeb63a0f70ea89e9866609739852308b2d8276f41bdb9d3484cd2/ncbi_acc_download-0.2.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-01 13:51:27",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kblin",
    "github_project": "ncbi-acc-download",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [],
    "lcname": "ncbi-acc-download"
}
        