Name | taxadb2 |
Version | 0.12.3 |
home_page | None |
Summary | Locally query the NCBI taxonomy |
upload_time | 2025-02-19 12:10:11 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.10 |
license | MIT |
keywords | ncbi, taxonomy |
# Taxadb2
Taxadb2 is an application for querying the NCBI taxonomy locally. It is written in Python and accesses its database through the [peewee](http://peewee.readthedocs.io) library.
Taxadb2 is a fork of [taxadb](https://github.com/HadrienG/taxadb) that also handles the NCBI `merged.dmp` taxonomy file, so that outdated taxIDs are mapped to their current replacements. Compared with taxadb:
* the built-in support for [MySQL](https://www.mysql.com) and [PostgreSQL](https://www.postgresql.org) is unchanged
* support for `merged.dmp` was added
In brief, Taxadb2:
* is a small tool to query the [NCBI](https://ncbi.nlm.nih.gov/taxonomy) taxonomy.
* is written in Python >= 3.10.
* has built-in support for [SQLite](https://www.sqlite.org), [MySQL](https://www.mysql.com) and [PostgreSQL](https://www.postgresql.org).
* offers pre-built SQLite databases.
* has comprehensive API documentation.
## Installation
Taxadb2 requires Python >= 3.10. To install taxadb2 with SQLite support, run the following in your terminal:

    pip3 install taxadb2

To use MySQL or PostgreSQL, refer to the full [documentation](http://taxadb2.readthedocs.io/en/latest/).
## Usage
### Querying the Database
First, make sure you have [built](#creating-the-database) the database.
The examples below cover the basics. For more complete examples, refer to the [API documentation](http://taxadb2.readthedocs.io/en/latest/).
```python
>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
...     'taxid': TaxID(dbtype='sqlite', dbname=dbname),
...     'names': SciName(dbtype='sqlite', dbname=dbname),
...     'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
... }
>>> taxid2name = ncbi['taxid'].sci_name(2)
>>> print(taxid2name)
Bacteria
>>> lineage = ncbi['taxid'].lineage_name(17)
>>> print(lineage[:5])
['Methylophilus methylotrophus', 'Methylophilus', 'Methylophilaceae', 'Nitrosomonadales', 'Betaproteobacteria']
>>> lineage = ncbi['taxid'].lineage_name(17, reverse=True)
>>> print(lineage[:5])
['cellular organisms', 'Bacteria', 'Pseudomonadati', 'Pseudomonadota', 'Betaproteobacteria']
>>> ncbi['taxid'].has_parent(17, 'Bacteria')
True
```
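To make the lineage calls above more concrete, here is a minimal, self-contained sketch of the idea behind them: following parent links until the root is reached. It is not taxadb2's implementation. The `PARENT` table and the helper names `lineage` and `has_parent` are hypothetical, and the table contains only the names printed in the output above.

```python
# Toy parent table built from the names printed in the example above.
# taxadb2 keeps this relationship in its database; this dict is a stand-in.
PARENT = {
    "Methylophilus methylotrophus": "Methylophilus",
    "Methylophilus": "Methylophilaceae",
    "Methylophilaceae": "Nitrosomonadales",
    "Nitrosomonadales": "Betaproteobacteria",
    "Betaproteobacteria": "Pseudomonadota",
    "Pseudomonadota": "Pseudomonadati",
    "Pseudomonadati": "Bacteria",
    "Bacteria": "cellular organisms",
}


def lineage(name, reverse=False):
    """Return the chain of names from `name` up to the root (root first if reverse)."""
    chain = [name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain[::-1] if reverse else chain


def has_parent(name, ancestor):
    """True if `ancestor` appears above `name` in the toy tree."""
    return ancestor in lineage(name)[1:]


print(lineage("Methylophilus methylotrophus")[:5])
print(lineage("Methylophilus methylotrophus", reverse=True)[:5])
print(has_parent("Methylophilus methylotrophus", "Bacteria"))
```

Walking the chain one parent at a time keeps the sketch short; a real database-backed version would fetch each parent from the database instead of a dict.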
Get the taxid from a scientific name.
```python
>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
...     'taxid': TaxID(dbtype='sqlite', dbname=dbname),
...     'names': SciName(dbtype='sqlite', dbname=dbname),
...     'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
... }
>>> name2taxid = ncbi['names'].taxid('Pseudomonadota')
>>> print(name2taxid)
1224
```
Old taxIDs listed in `merged.dmp` are detected automatically and replaced by their current taxID.
```python
>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
...     'taxid': TaxID(dbtype='sqlite', dbname=dbname),
...     'names': SciName(dbtype='sqlite', dbname=dbname),
...     'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
... }
>>> taxid2name = ncbi['taxid'].sci_name(30)
TaxID 30 is deprecated, using 29 instead.
>>> print(taxid2name)
Myxococcales
```
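The redirect above relies on `merged.dmp`, which lists old taxIDs next to the taxIDs they were merged into. As a rough illustration (not taxadb2's code), the sketch below parses rows in the NCBI dump layout, where fields are separated by `\t|\t` and each row ends with `\t|`, and then resolves an old taxID. The helper names `parse_merged` and `resolve_taxid` are hypothetical, and the sample contains only the `30` to `29` pair shown above.

```python
import io

# One sample row; it mirrors the 30 -> 29 redirect shown above.
SAMPLE_MERGED = "30\t|\t29\t|\n"


def parse_merged(handle):
    """Map old taxIDs to their replacement taxIDs."""
    merged = {}
    for line in handle:
        fields = [field.strip() for field in line.rstrip("\n").split("|")]
        # Skip blank or malformed rows instead of failing on them.
        if len(fields) >= 2 and fields[0] and fields[1]:
            merged[int(fields[0])] = int(fields[1])
    return merged


def resolve_taxid(taxid, merged):
    """Return the replacement taxID if `taxid` was merged, else `taxid` itself."""
    return merged.get(taxid, taxid)


merged = parse_merged(io.StringIO(SAMPLE_MERGED))
print(resolve_taxid(30, merged))  # 29
print(resolve_taxid(2, merged))   # 2, because it is not listed in the sample
```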
Get the taxonomic information for accession number(s).
```python
>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> dbname = "taxadb2/test/test_db.sqlite"
>>> ncbi = {
...     'taxid': TaxID(dbtype='sqlite', dbname=dbname),
...     'names': SciName(dbtype='sqlite', dbname=dbname),
...     'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)
... }
>>> my_accessions = ['A01460']
>>> taxids = ncbi['accessionid'].taxid(my_accessions)
>>> taxids
<generator object AccessionID.taxid at 0x103e21bd0>
>>> for ti in taxids:
...     print(ti)
...
('A01460', 17)
```
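Because the lookup above returns a generator, its results can be read only once. The sketch below uses a stand-in generator (`fake_taxid_lookup`, a hypothetical name that only mimics the `(accession, taxid)` tuples shown above) to illustrate one way of collecting such results into a dictionary.

```python
def fake_taxid_lookup(accessions):
    """Stand-in for a lookup that yields (accession, taxid) tuples."""
    known = {"A01460": 17}  # the single pair shown in the example above
    for accession in accessions:
        # This stand-in skips accessions it does not know; how taxadb2
        # treats unknown accessions is not covered here.
        if accession in known:
            yield accession, known[accession]


results = fake_taxid_lookup(["A01460", "UNKNOWN"])
acc2taxid = dict(results)  # consumes the generator
print(acc2taxid)           # {'A01460': 17}
print(list(results))       # [] because the generator is already exhausted
```

Converting to a `dict` or `list` right away is convenient when the results are needed more than once.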
You can also use a configuration file to set the database connection parameters automatically when the objects are created. Either pass the `config` parameter to the constructor:
```python
>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> config_path = "taxadb2/test/taxadb2.cfg"
>>> ncbi = {
...     'taxid': TaxID(config=config_path),
...     'names': SciName(config=config_path),
...     'accessionid': AccessionID(config=config_path)
... }
>>> ncbi['taxid'].sci_name(2)
Bacteria
>>> ...
```
or set the environment variable `TAXADB2_CONFIG` to point to the configuration file:
```bash
$ export TAXADB2_CONFIG='taxadb2/test/taxadb2.cfg'
```
```python
>>> from taxadb2.taxid import TaxID
>>> from taxadb2.names import SciName
>>> from taxadb2.accessionid import AccessionID
>>> ncbi = {
...     'taxid': TaxID(),
...     'names': SciName(),
...     'accessionid': AccessionID()
... }
>>> ncbi['taxid'].sci_name(2)
Bacteria
>>> ...
```
See the [documentation](http://taxadb2.readthedocs.io/en/latest/) for more information.
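The two options above can be pictured as a simple precedence rule: an explicit `config` argument first, then the environment variable. The sketch below illustrates that idea with the standard library only. It is not taxadb2's code: the helper name `find_config` is hypothetical, the precedence order is an assumption, and the variable name is copied from the `export` line above.

```python
import os


def find_config(config=None, env_var="TAXADB2_CONFIG", environ=None):
    """Return the config path to use, or None if nothing is set."""
    environ = os.environ if environ is None else environ
    if config:                   # an explicit argument wins
        return config
    return environ.get(env_var)  # otherwise fall back to the environment


# A plain dict stands in for the real environment to keep the example repeatable.
env = {"TAXADB2_CONFIG": "taxadb2/test/taxadb2.cfg"}
print(find_config("other.cfg", environ=env))  # other.cfg
print(find_config(environ=env))               # taxadb2/test/taxadb2.cfg
print(find_config(environ={}))                # None
```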
### Creating the Database
#### Download data
The following command downloads the necessary files from the [NCBI FTP site](https://ftp.ncbi.nlm.nih.gov/) into the directory `taxadb`.
```
$ taxadb2 download --outdir taxadb --type taxa
```
#### Insert data
##### SQLite
```
$ taxadb2 create --division taxa --input taxadb --dbname taxadb.sqlite
```
You can then safely remove the downloaded files:
```
$ rm -r taxadb
```
You can safely rerun the same command: `taxadb2` skips `taxid` and `accession` entries that have already been inserted.
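Skipping rows that already exist is what makes a rerun safe. One common way to get this behaviour in SQLite is a primary key combined with `INSERT OR IGNORE`, sketched below with the standard `sqlite3` module. This is an illustration only: the table layout and the `insert_taxa` helper are hypothetical and may differ from how taxadb2 actually does it.

```python
import sqlite3


def insert_taxa(conn, rows):
    """Insert (taxid, name) rows, silently skipping taxids already present."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO taxa (taxid, name) VALUES (?, ?)", rows
        )


# In-memory database with a hypothetical two-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxa (taxid INTEGER PRIMARY KEY, name TEXT)")

rows = [(2, "Bacteria"), (1224, "Pseudomonadota")]
insert_taxa(conn, rows)
insert_taxa(conn, rows)  # second run adds nothing

count = conn.execute("SELECT COUNT(*) FROM taxa").fetchone()[0]
print(count)  # 2
```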
## Tests
**Note:** The tests rely on the `pytest` module (`pip install pytest`).
To run the tests, go to the root directory of the project (`cd /path/to/taxadb2`) and run
`pytest -v`.
This runs the tests against an SQLite test database called `test_db.sqlite`, located in the `taxadb2/test`
directory.
It is also possible to run only the tests related to accessionid or taxid, as follows:
```
$ pytest -m 'taxid'
$ pytest -m 'accessionid'
```
You can also use the configuration file `taxadb2.ini`, located in the root of the distribution, as follows. This file should contain the
database connection settings:
```
$ pytest taxadb2/test --config='taxadb2.ini'
```
## License
Code is under the [MIT](LICENSE) license.
## Issues
Found a bug or have a question? Please open an [issue](https://github.com/kullrich/taxadb2/issues).
## Contributing
Thought of a new feature that you'd like us to implement? Open an [issue](https://github.com/kullrich/taxadb2/issues) or fork the repository and submit a [pull request](https://github.com/kullrich/taxadb2/pulls).
## Code of Conduct - Participation guidelines
This repository adheres to the [Contributor Covenant](http://contributor-covenant.org) code of conduct for all interactions within this project (see [Code of Conduct](https://github.com/kullrich/taxadb2/blob/devel/CODE_OF_CONDUCT.md)).
See also the Max Planck Society policy against sexualized discrimination, harassment and violence: [Code-of-Conduct](https://www.mpg.de/11961177/code-of-conduct-en.pdf).
By contributing to this project, you agree to abide by its terms.
## References
https://github.com/HadrienG/taxadb
Raw data
{
"_id": null,
"home_page": null,
"name": "taxadb2",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "Kristian K Ullrich <ullrich@evolbio.mpg.de>",
"keywords": "ncbi, taxonomy",
"author": null,
"author_email": "Kristian K Ullrich <ullrich@evolbio.mpg.de>, Hadrien Gourl\u00e9 <hadrien.gourle@slu.se>, Juliette Hayer <juliette.hayer@slu.se>, Emmanuel Quevillon <tuco@pasteur.fr>",
"download_url": "https://files.pythonhosted.org/packages/90/d7/d0bbe21dc4f559c9eb5c381350c98864d69c720e18346718c8d765718422/taxadb2-0.12.3.tar.gz",
"platform": null,
"description": "# Taxadb2\n\n[](http://taxadb.readthedocs.io/en/latest/?badge=latest)\n[](https://www.python.org/)\n[](https://pypi.org/project/taxadb2/)\n[](https://github.com/kullrich/taxadb2)\n\nTaxadb2 is an application to locally query the ncbi taxonomy. Taxadb2 is written in python, and access its database using the [peewee](http://peewee.readthedocs.io) library.\n\nTaxadb2 is a fork from [https://github.com/HadrienG/taxadb](https://github.com/HadrienG/taxadb) and handles the `merged.dmp` ncbi taxonomy file to deal with updated taxIDs.\n\n* the built-in support for [MySQL](https://www.mysql.com) and [PostgreSQL](https://www.postgresql.org) was not touched and kept as it is\n* `merged.dmp` support was added\n\nIn brief Taxadb2:\n\n* is a small tool to query the [ncbi](https://ncbi.nlm.nih.gov/taxonomy) taxonomy.\n* is written in python >= 3.10.\n* has built-in support for [SQLite](https://www.sqlite.org), [MySQL](https://www.mysql.com) and [PostgreSQL](https://www.postgresql.org).\n* has available pre-built SQLite databases.\n* has a comprehensive API documentation.\n\n\n## Installation\n\nTaxadb2 requires python >= 3.10 to work. To install taxadb2 with sqlite support, simply type the following in your terminal:\n\n pip3 install taxadb2\n\nIf you wish to use MySQL or PostgreSQL, please refer to the full [documentation](http://taxadb2.readthedocs.io/en/latest/)\n\n## Usage\n\n### Querying the Database\n\nFirstly, make sure you have [built](#creating-the-database) the database\n\nBelow you can find basic examples. 
For more complete examples, please refer to the complete [API documentation](http://taxadb2.readthedocs.io/en/latest/)\n\n```python\n >>> from taxadb2.taxid import TaxID\n >>> from taxadb2.names import SciName\n >>> from taxadb2.accessionid import AccessionID\n >>> dbname = \"taxadb2/test/test_db.sqlite\"\n >>> ncbi = {\n >>> 'taxid': TaxID(dbtype='sqlite', dbname=dbname),\n >>> 'names': SciName(dbtype='sqlite', dbname=dbname),\n >>> 'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)\n >>> }\n\n >>> taxid2name = ncbi['taxid'].sci_name(2)\n >>> print(taxid2name)\n Bacteria\n >>> lineage = ncbi['taxid'].lineage_name(17)\n >>> print(lineage[:5])\n ['Methylophilus methylotrophus', 'Methylophilus', 'Methylophilaceae', 'Nitrosomonadales', 'Betaproteobacteria']\n >>> lineage = ncbi['taxid'].lineage_name(17, reverse=True)\n >>> print(lineage[:5])\n ['cellular organisms', 'Bacteria', 'Pseudomonadati', 'Pseudomonadota', 'Betaproteobacteria']\n\n >>> ncbi['taxid'].has_parent(17, 'Bacteria')\n True\n```\n\nGet the taxid from a scientific name.\n\n```python\n >>> from taxadb2.taxid import TaxID\n >>> from taxadb2.names import SciName\n >>> from taxadb2.accessionid import AccessionID\n >>> dbname = \"taxadb2/test/test_db.sqlite\"\n >>> ncbi = {\n >>> 'taxid': TaxID(dbtype='sqlite', dbname=dbname),\n >>> 'names': SciName(dbtype='sqlite', dbname=dbname),\n >>> 'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)\n >>> }\n \n >>> name2taxid = ncbi['names'].taxid('Pseudomonadota')\n >>> print(name2taxid)\n 1224\n```\n\nAutomatic detection of `old` taxIDs imported from `merged.dmp`.\n\n\n```python\n >>> from taxadb2.taxid import TaxID\n >>> from taxadb2.names import SciName\n >>> from taxadb2.accessionid import AccessionID\n >>> dbname = \"taxadb2/test/test_db.sqlite\"\n >>> ncbi = {\n >>> 'taxid': TaxID(dbtype='sqlite', dbname=dbname),\n >>> 'names': SciName(dbtype='sqlite', dbname=dbname),\n >>> 'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)\n >>> }\n\n 
>>> taxid2name = ncbi['taxid'].sci_name(30)\n TaxID 30 is deprecated, using 29 instead.\n >>> print(taxid2name)\n Myxococcales\n```\n\nGet the taxonomic information for accession number(s).\n\n```python\n >>> from taxadb2.taxid import TaxID\n >>> from taxadb2.names import SciName\n >>> from taxadb2.accessionid import AccessionID\n >>> dbname = \"taxadb2/test/test_db.sqlite\"\n >>> ncbi = {\n >>> 'taxid': TaxID(dbtype='sqlite', dbname=dbname),\n >>> 'names': SciName(dbtype='sqlite', dbname=dbname),\n >>> 'accessionid': AccessionID(dbtype='sqlite', dbname=dbname)\n >>> }\n\n >>> my_accessions = ['A01460']\n >>> taxids = ncbi['accessionid'].taxid(my_accessions)\n >>> taxids\n <generator object AccessionID.taxid at 0x103e21bd0>\n >>> for ti in taxids:\n print(ti)\n ('A01460', 17)\n```\n\nYou can also use a configuration file in order to automatically set database connection parameters at object build. Either set config parameter to __init__ object method:\n\n```python\n >>> from taxadb2.taxid import TaxID\n >>> from taxadb2.names import SciName\n >>> from taxadb2.accessionid import AccessionID\n >>> config_path = \"taxadb2/test/taxadb2.cfg\"\n >>> ncbi = {\n >>> 'taxid': TaxID(config=config_path),\n >>> 'names': SciName(config=config_path),\n >>> 'accessionid': AccessionID(config=config_path)\n >>> }\n\n >>> ncbi['taxid'].sci_name(2)\n Bacteria\n >>> ...\n```\n\nor set environment variable TAXADB_CONFIG which point to configuration file:\n\n```bash\n $ export TAXADB2_CONFIG='taxadb2/test/taxadb2.cfg'\n```\n\n```python\n >>> from taxadb2.taxid import TaxID\n >>> from taxadb2.names import SciName\n >>> from taxadb2.accessionid import AccessionID\n >>> ncbi = {\n >>> 'taxid': TaxID(),\n >>> 'names': SciName(),\n >>> 'accessionid': AccessionID()\n >>> }\n\n >>> ncbi['taxid'].sci_name(2)\n Bacteria\n >>> ...\n```\n\nCheck documentation for more information.\n\n### Creating the Database\n\n#### Download data\n\nThe following commands will download the necessary files from 
the [ncbi ftp](https://ftp.ncbi.nlm.nih.gov/) into the directory `taxadb`.\n```\n$ taxadb2 download --outdir taxadb --type taxa\n```\n\n#### Insert data\n\n##### SQLite\n\n```\n$ taxadb2 create --division taxa --input taxadb --dbname taxadb.sqlite\n```\nYou can then safely remove the downloaded files\n```\n$ rm -r taxadb\n```\n\nYou can easily rerun the same command, `taxadb2` is able to skip already inserted `taxid` as well as `accession`.\n\n## Tests\n\n**Note:** Relies on the `pytest` module. `pip install pytest`\n\nYou can easily run some tests. Go to the root directory of this projects `cd /path/to/taxadb2` and run\n`pytest -v`.\n\nThis simple command will run tests against an `SQLite` test database called `test_db.sqlite` located in `taxadb2/test`\ndirectory.\n\nIt is also possible to only run tests related to accessionid or taxid as follow\n```\n$ pytest -m 'taxid'\n$ pytest -m 'accessionid'\n```\n\nYou can also use the configuration file located in root distribution `taxadb2.ini` as follow. This file should contain\ndatabase connection settings:\n```\n$ pytest taxadb2/test --config='taxadb2.ini'\n```\n\n## License\n\nCode is under the [MIT](LICENSE) license.\n\n## Issues\n\nFound a bug or have a question? Please open an [issue](https://github.com/kullrich/taxadb2/issues)\n\n## Contributing\n\nThought about a new feature that you'd like us to implement? Open an [issue](https://github.com/kullrich/taxadb2/issues) or fork the repository and submit a [pull request](https://github.com/kullrich/taxadb2/pulls)\n\n## Code of Conduct - Participation guidelines\n\nThis repository adhere to [Contributor Covenant](http://contributor-covenant.org) code of conduct for in any interactions you have within this project. 
(see [Code of Conduct](https://github.com/kullrich/taxadb2/blob/devel/CODE_OF_CONDUCT.md))\n\nSee also the policy against sexualized discrimination, harassment and violence for the Max Planck Society [Code-of-Conduct](https://www.mpg.de/11961177/code-of-conduct-en.pdf).\n\nBy contributing to this project, you agree to abide by its terms.\n\n## References\n\nhttps://github.com/HadrienG/taxadb\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Locally query the NCBI taxonomy",
"version": "0.12.3",
"project_urls": {
"Bug Tracker": "https://github.com/kullrich/taxadb2/issues",
"Homepage": "https://github.com/kullrich/taxadb2",
"documentation": "https://taxadb2.readthedocs.io/en/latest/",
"repository": "https://github.com/kullrich/taxadb2"
},
"split_keywords": [
"ncbi",
" taxonomy"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "37f97764175d85953c622467e8a7fd6907bc80a294b4bd8ef2b27cc7cbaeb676",
"md5": "bac982fa9e88eb5ee0a160e23ca2cfe4",
"sha256": "a259ae7afac435e9ea4b1bb3d6bc0ba71bc573250dda7a50a8c4c2b7e7c3eb38"
},
"downloads": -1,
"filename": "taxadb2-0.12.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "bac982fa9e88eb5ee0a160e23ca2cfe4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 26062,
"upload_time": "2025-02-19T12:10:08",
"upload_time_iso_8601": "2025-02-19T12:10:08.948888Z",
"url": "https://files.pythonhosted.org/packages/37/f9/7764175d85953c622467e8a7fd6907bc80a294b4bd8ef2b27cc7cbaeb676/taxadb2-0.12.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "90d7d0bbe21dc4f559c9eb5c381350c98864d69c720e18346718c8d765718422",
"md5": "02c6ea2d5fdcbb1c10649dc70060bcc4",
"sha256": "c3f8b4add73de45f599e5c3e3aeecc7b0982159f9df471311f0df9d2f7bdb322"
},
"downloads": -1,
"filename": "taxadb2-0.12.3.tar.gz",
"has_sig": false,
"md5_digest": "02c6ea2d5fdcbb1c10649dc70060bcc4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 24358,
"upload_time": "2025-02-19T12:10:11",
"upload_time_iso_8601": "2025-02-19T12:10:11.253641Z",
"url": "https://files.pythonhosted.org/packages/90/d7/d0bbe21dc4f559c9eb5c381350c98864d69c720e18346718c8d765718422/taxadb2-0.12.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-19 12:10:11",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "kullrich",
"github_project": "taxadb2",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "taxadb2"
}