Glottolog ``treedb``
====================
|PyPI version| |License| |Supported Python| |Wheel|
|Build|
This tool loads the content of the `languoids/tree`_ directory from the
Glottolog_ `master repo`_ into a normalized SQLite_ database.
Each file in that directory contains the definition of one Glottolog
languoid_. Loading their content into a relational database makes it possible
to perform some advanced consistency checks (example_) and, more generally, to
execute queries that inspect the languoid tree relations in a compact and
performant way (e.g. without repeatedly traversing the directory tree).
See pyglottolog_ for the more general official Python API, which works with the
repo without a mandatory initial loading step (it also provides programmatic
access to the references_ and a convenient command-line interface).
The database can be exported into a ZIP file containing one CSV file for
each database table, or written into a single denormalized CSV file with one
row per languoid (via a provided `SQL query`_).
As SQLite_ is the `most widely used`_ database engine, the database file itself
(e.g. ``treedb.sqlite3``) can be queried directly from most programming
environments. It can also be examined using graphical interfaces such as
DBeaver_, or via the `sqlite3 cli`_.
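
As a minimal sketch of such direct access, the following uses Python's
standard-library ``sqlite3`` module to walk up the tree with a single recursive
query. So that it runs without a loaded database, it builds a tiny in-memory
stand-in: the ``languoid`` table here has only the ``id``, ``name``, ``level``,
and ``parent_id`` columns, and every row except the ``3adt1234`` one is invented
for illustration (including the ``fami1234`` code). The real schema has more
columns and tables. ``lineage()`` is a hypothetical helper, not part of
``treedb``.

.. code:: python

    import sqlite3

    # Stand-in for sqlite3.connect('treedb.sqlite3'); most rows are invented.
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE languoid ('
                 'id TEXT PRIMARY KEY, name TEXT, level TEXT, parent_id TEXT)')
    conn.executemany('INSERT INTO languoid VALUES (?, ?, ?, ?)', [
        ('fami1234', 'Placeholder-Family', 'family', None),
        ('nort3292', 'Placeholder-Parent', 'language', 'fami1234'),
        ('3adt1234', '3Ad-Tekles', 'dialect', 'nort3292'),
    ])

    # Recursive common table expression: start at one languoid, follow parent_id.
    ANCESTORS = '''
    WITH RECURSIVE tree(id, parent_id, steps) AS (
        SELECT id, parent_id, 0 FROM languoid WHERE id = ?
        UNION ALL
        SELECT p.id, p.parent_id, tree.steps + 1
        FROM tree JOIN languoid AS p ON p.id = tree.parent_id
    )
    SELECT id, steps FROM tree ORDER BY steps
    '''

    def lineage(conn, glottocode):
        """Return [glottocode, parent, grandparent, ...] using one query."""
        return [id_ for id_, _ in conn.execute(ANCESTORS, (glottocode,))]

    print(lineage(conn, '3adt1234'))  # ['3adt1234', 'nort3292', 'fami1234']

Against a real database file, only the ``sqlite3.connect()`` argument would
change, provided the table and column names match the assumptions above.
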
Python users can also use the provided SQLAlchemy_ models_ to build queries or
additional abstractions programmatically using `SQLAlchemy core`_ or the ORM_
(as a more maintainable alternative to hand-written SQL queries).
Links
-----
- GitHub: https://github.com/glottolog/treedb
- PyPI: https://pypi.org/project/treedb/
- Example: https://nbviewer.jupyter.org/github/glottolog/treedb/blob/master/Stats.ipynb
- Changelog: https://github.com/glottolog/treedb/blob/master/CHANGES.rst
- Issue Tracker: https://github.com/glottolog/treedb/issues
- Download: https://pypi.org/project/treedb/#files
Quickstart
----------
Install ``treedb`` (and dependencies):

.. code:: bash

    $ pip install treedb

Clone the Glottolog `master repo`_:

.. code:: bash

    $ git clone https://github.com/glottolog/glottolog.git

Note: ``treedb`` expects to find it under ``./glottolog/`` by default (i.e. under
the current directory). Use ``treedb.set_root()`` to point it to a different
path.

Load ``./glottolog/languoids/tree/**/md.ini`` into an in-memory ``sqlite3`` database.
Write the denormalized example query into ``treedb.query.csv``:

.. code:: bash

    $ python -c "import treedb; treedb.load(); treedb.write_csv()"

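
Before loading, it can help to confirm that the checkout is where ``treedb``
looks for it. The sketch below uses only the standard library and mirrors the
glob pattern quoted above. ``find_md_ini()`` is a hypothetical helper, not part
of ``treedb``, and it makes no claim about how ``treedb`` itself locates the
files.

.. code:: python

    from pathlib import Path

    def find_md_ini(root='./glottolog'):
        """Return the sorted md.ini paths below <root>/languoids/tree."""
        tree = Path(root) / 'languoids' / 'tree'
        if not tree.is_dir():
            raise FileNotFoundError(f'no languoids/tree directory under {str(root)!r}')
        # '**' matches the tree directory itself and all nested subdirectories.
        return sorted(tree.glob('**/md.ini'))

    if __name__ == '__main__':
        try:
            files = find_md_ini()
        except FileNotFoundError as e:
            print(e)
        else:
            print(f'{len(files)} md.ini files found')
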
Usage from Python
------------------
Start a Python shell:

.. code:: bash

    $ python

Import the package:

.. code:: python

    >>> import treedb

Use ``treedb.iterlanguoids()`` to iterate over languoids as (<path>, ``dict``) pairs:

.. code:: python

    >>> next(treedb.iterlanguoids())
    (('abin1243',), {'id': 'abin1243', 'parent_id': None, 'level': 'language', ...

Note: This is a low-level interface, which does not require loading.

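
Because each item is a plain (path, ``dict``) pair, ordinary Python tooling is
enough to summarize them. The following sketch tallies languoids by ``level``.
It is written against a small hand-made list in the same shape as the output
above (the second entry, including its ``xxxx0000`` code, is invented) so that
it runs without a Glottolog checkout. With the repo in place, the result of
``treedb.iterlanguoids()`` could presumably be passed instead. ``count_levels()``
is a hypothetical helper, not part of ``treedb``.

.. code:: python

    from collections import Counter

    def count_levels(pairs):
        """Tally the 'level' value over an iterable of (path, dict) pairs."""
        return Counter(item['level'] for _, item in pairs)

    # Hand-made stand-ins in the documented shape; only the first is from above.
    pairs = [
        (('abin1243',),
         {'id': 'abin1243', 'parent_id': None, 'level': 'language'}),
        (('abin1243', 'xxxx0000'),
         {'id': 'xxxx0000', 'parent_id': 'abin1243', 'level': 'dialect'}),
    ]

    print(count_levels(pairs))
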
Load the database into ``treedb.sqlite3`` (and set the default ``engine``):

.. code:: python

    >>> treedb.load('treedb.sqlite3')
    ...
    <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>

Run consistency checks:

.. code:: python

    >>> treedb.check()
    ...
    True

Export into a ZIP file containing one CSV file per database table:

.. code:: python

    >>> treedb.csv_zipfile()
    ...Path('treedb.zip')

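
The resulting archive can be read back without unpacking it. The following
sketch, using only the standard library, collects the header row of every CSV
member. ``csv_headers()`` is a hypothetical helper, and the member name and
columns in the demo archive are invented, since the actual layout of
``treedb.zip`` is not shown here. UTF-8 encoding is likewise an assumption.

.. code:: python

    import csv
    import io
    import zipfile

    def csv_headers(archive):
        """Map each CSV member name in a ZIP archive to its header row."""
        headers = {}
        with zipfile.ZipFile(archive) as zf:
            for name in zf.namelist():
                if not name.endswith('.csv'):
                    continue
                with zf.open(name) as raw:
                    # Assumes UTF-8; newline='' leaves line endings to csv.
                    text = io.TextIOWrapper(raw, encoding='utf-8', newline='')
                    headers[name] = next(csv.reader(text), [])
        return headers

    # Demo on an in-memory stand-in archive (member name and columns invented).
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, 'w') as zf:
        zf.writestr('languoid.csv', 'id,name,level\r\n3adt1234,3Ad-Tekles,dialect\r\n')

    print(csv_headers(demo))  # {'languoid.csv': ['id', 'name', 'level']}

For the real export, the path returned by ``treedb.csv_zipfile()`` would take
the place of ``demo``.
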
Execute the example query and write it into a CSV file with one row per languoid:

.. code:: python

    >>> treedb.write_csv()
    ...Path('treedb.query.csv')

Rebuild the database (e.g. after an update):

.. code:: python

    >>> treedb.load(rebuild=True)
    ...
    <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>

Execute a simple query with ``sqlalchemy`` core and write it to a CSV file:

.. code:: python

    >>> import sqlalchemy as sa
    >>> treedb.write_csv(sa.select(treedb.Languoid), filename='languoids.csv')
    ...Path('languoids.csv')

Get one row from the ``languoid`` table via ``sqlalchemy`` core (in Glottocode order):

.. code:: python

    >>> next(treedb.iterrows(sa.select(treedb.Languoid)))
    ('3adt1234', '3Ad-Tekles', 'dialect', 'nort3292', None, None, None, None)

Get one ``Languoid`` model instance via the ``sqlalchemy`` ORM (in Glottocode order):

.. code:: python

    >>> session = treedb.Session()
    >>> session.query(treedb.Languoid).first()
    <Languoid id='3adt1234' level='dialect' name='3Ad-Tekles'>
    >>> session.close()

See also
--------
- pyglottolog_ |--| official Python API to access https://github.com/glottolog/glottolog
License
-------
This tool is distributed under the `MIT license`_.
.. _Glottolog: https://glottolog.org/
.. _master repo: https://github.com/glottolog/glottolog
.. _languoids/tree: https://github.com/glottolog/glottolog/tree/master/languoids/tree
.. _SQLite: https://sqlite.org
.. _languoid: https://glottolog.org/meta/glossary#Languoid
.. _example: https://github.com/glottolog/treedb/blob/36c7cdcdd017e7aa4386ef085ee84fb3036c01ca/treedb/checks.py#L154-L169
.. _pyglottolog: https://github.com/glottolog/pyglottolog
.. _references: https://github.com/glottolog/glottolog/tree/master/references
.. _SQL query: https://github.com/glottolog/treedb/blob/master/treedb/queries.py
.. _most widely used: https://www.sqlite.org/mostdeployed.html
.. _DBeaver: https://dbeaver.io/
.. _sqlite3 cli: https://sqlite.org/cli.html
.. _SQLAlchemy: https://www.sqlalchemy.org
.. _models: https://github.com/glottolog/treedb/blob/master/treedb/models.py
.. _SQLAlchemy Core: https://docs.sqlalchemy.org/en/latest/core/
.. _ORM: https://docs.sqlalchemy.org/en/latest/orm/
.. _venv: https://docs.python.org/3/library/venv.html
.. _MIT license: https://opensource.org/licenses/MIT
.. |--| unicode:: U+2013

.. |PyPI version| image:: https://img.shields.io/pypi/v/treedb.svg
    :target: https://pypi.org/project/treedb/
    :alt: Latest PyPI Version
.. |License| image:: https://img.shields.io/pypi/l/treedb.svg
    :target: https://github.com/glottolog/treedb/blob/master/LICENSE.txt
    :alt: License
.. |Supported Python| image:: https://img.shields.io/pypi/pyversions/treedb.svg
    :target: https://pypi.org/project/treedb/
    :alt: Supported Python Versions
.. |Wheel| image:: https://img.shields.io/pypi/wheel/treedb.svg
    :target: https://pypi.org/project/treedb/#files
    :alt: Wheel format

.. |Build| image:: https://github.com/glottolog/treedb/actions/workflows/build.yaml/badge.svg?branch=master
    :target: https://github.com/glottolog/treedb/actions/workflows/build.yaml?query=branch%3Amaster
    :alt: Build
.. |Codecov| image:: https://codecov.io/gh/glottolog/treedb/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/glottolog/treedb
    :alt: Codecov

Raw data
{
"_id": null,
"home_page": "https://github.com/glottolog/treedb",
"name": "treedb",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "glottolog languoids sqlite3 database",
"author": "Sebastian Bank",
"author_email": "sebastian.bank@uni-leipzig.de",
"download_url": "https://files.pythonhosted.org/packages/76/8b/75bf1d7aa451478a6960e03d367e5da23a2e7e3f961931a34bbb01a5e947/treedb-2.6.3.zip",
"platform": "any",
"description": "Glottolog ``treedb``\r\n====================\r\n\r\n|PyPI version| |License| |Supported Python| |Wheel|\r\n\r\n|Build|\r\n\r\nThis tool loads the content of the `languoids/tree`_ directory from the\r\nGlottolog_ `master repo`_ into a normalized SQLite_ database.\r\n\r\nEach file under in that directory contains the definition of one Glottolog\r\nlanguoid_. Loading their content into a relational database allows to perform\r\nsome advanced consistency checks (example_) and in general to execute queries\r\nthat inspect the languoid tree relations in a compact and performant way (e.g.\r\nwithout repeatedly traversing the directory tree).\r\n\r\nSee pyglottolog_ for the more general official Python API to work with the repo\r\nwithout a mandatory initial loading step (also provides programmatic access to\r\nthe references_ and a convenient command-line interface).\r\n\r\nThe database can be exported into a ZIP file containing one CSV file for\r\neach database table, or written into a single denormalized CSV file with one\r\nrow per languoid (via a provided `SQL query`_).\r\n\r\nAs sqlite_ is the `most widely used`_ database, the database file itself\r\n(e.g. ``treedb.sqlite3``) can be queried directly from most programming\r\nenvironments. 
It can also be examined using graphical interfaces such as\r\nDBeaver_, or via the `sqlite3 cli`_.\r\n\r\nPython users can also use the provided SQLAlchemy_ models_ to build queries or\r\nadditional abstractions programmatically using `SQLAlchemy core`_ or the ORM_\r\n(as more maintainable alternative to hand-written SQL queries).\r\n\r\n\r\nLinks\r\n-----\r\n\r\n- GitHub: https://github.com/glottolog/treedb\r\n- PyPI: https://pypi.org/project/treedb/\r\n- Example: https://nbviewer.jupyter.org/github/glottolog/treedb/blob/master/Stats.ipynb\r\n- Changelog: https://github.com/glottolog/treedb/blob/master/CHANGES.rst\r\n- Issue Tracker: https://github.com/glottolog/treedb/issues\r\n- Download: https://pypi.org/project/treedb/#files\r\n\r\n\r\nQuickstart\r\n----------\r\n\r\nInstall ``treedb`` (and dependencies):\r\n\r\n.. code:: bash\r\n\r\n $ pip install treedb\r\n\r\nClone the Glottolog `master repo`_ :\r\n\r\n.. code:: bash\r\n\r\n $ git clone https://github.com/glottolog/glottolog.git\r\n\r\nNote: ``treedb`` expects to find it under ``./glottolog/`` by default (i.e. under\r\nthe current directory), use ``treedb.set_root()`` to point it to a different\r\npath.\r\n\r\nLoad ``./glottolog/languoids/tree/**/md.ini`` into an in-memory ``sqlite3`` database.\r\nWrite the denormalized example query into ``treedb.query.csv``:\r\n\r\n.. code:: bash\r\n\r\n $ python -c \"import treedb; treedb.load(); treedb.write_csv()\"\r\n\r\n\r\nUsage from Python\r\n------------------\r\n\r\nStart a Python shell:\r\n\r\n.. code:: bash\r\n\r\n $ python\r\n\r\nImport the package:\r\n\r\n.. code:: python\r\n\r\n >>> import treedb\r\n\r\nUse ``treedb.iterlanguoids()`` to iterate over languoids as (<path>, ``dict``) pairs:\r\n\r\n.. 
code:: python\r\n\r\n >>> next(treedb.iterlanguoids())\r\n (('abin1243',), {'id': 'abin1243', 'parent_id': None, 'level': 'language', ...\r\n\r\nNote: This is a low-level interface, which does not require loading.\r\n\r\nLoad the database into ``treedb.sqlite3`` (and set the default ``engine``):\r\n\r\n.. code:: python\r\n\r\n >>> treedb.load('treedb.sqlite3')\r\n ...\r\n <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>\r\n\r\nRun consistency checks:\r\n\r\n.. code:: python\r\n\r\n >>> treedb.check()\r\n ...\r\n True\r\n\r\nExport into a ZIP file containing one CSV file per database table:\r\n\r\n.. code:: python\r\n\r\n >>> treedb.csv_zipfile()\r\n ...Path('treedb.zip')\r\n\r\nExecute the example query and write it into a CSV file with one row per languoid:\r\n\r\n.. code:: python\r\n\r\n >>> treedb.write_csv()\r\n ...Path('treedb.query.csv')\r\n\r\nRebuild the database (e.g. after an update):\r\n\r\n.. code:: python\r\n\r\n >>> treedb.load(rebuild=True)\r\n ...\r\n <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>\r\n\r\nExecute a simple query with ``sqlalchemy`` core and write it to a CSV file:\r\n\r\n.. code:: python\r\n\r\n >>> import sqlalchemy as sa\r\n >>> treedb.write_csv(sa.select(treedb.Languoid), filename='languoids.csv')\r\n ...Path('languoids.csv')\r\n\r\nGet one row from the ``languoid`` table via `sqlalchemy` core (in Glottocode order):\r\n\r\n.. code:: python\r\n\r\n >>> next(treedb.iterrows(sa.select(treedb.Languoid)))\r\n ('3adt1234', '3Ad-Tekles', 'dialect', 'nort3292', None, None, None, None)\r\n\r\nGet one ``Languoid`` model instance via ``sqlalchemy`` orm (in Glottocode order):\r\n\r\n.. 
code:: python\r\n\r\n >>> session = treedb.Session()\r\n >>> session.query(treedb.Languoid).first()\r\n <Languoid id='3adt1234' level='dialect' name='3Ad-Tekles'>\r\n >>> session.close()\r\n\r\n\r\nSee also\r\n--------\r\n\r\n- pyglottolog_ |--| official Python API to access https://github.com/glottolog/glottolog\r\n\r\n\r\nLicense\r\n-------\r\n\r\nThis tool is distributed under the `MIT license`_.\r\n\r\n\r\n.. _Glottolog: https://glottolog.org/\r\n.. _master repo: https://github.com/glottolog/glottolog\r\n.. _languoids/tree: https://github.com/glottolog/glottolog/tree/master/languoids/tree\r\n.. _SQLite: https://sqlite.org\r\n.. _languoid: https://glottolog.org/meta/glossary#Languoid\r\n.. _example: https://github.com/glottolog/treedb/blob/36c7cdcdd017e7aa4386ef085ee84fb3036c01ca/treedb/checks.py#L154-L169\r\n.. _pyglottolog: https://github.com/glottolog/pyglottolog\r\n.. _references: https://github.com/glottolog/glottolog/tree/master/references\r\n.. _SQL query: https://github.com/glottolog/treedb/blob/master/treedb/queries.py\r\n.. _most widely used: https://www.sqlite.org/mostdeployed.html\r\n.. _DBeaver: https://dbeaver.io/\r\n.. _sqlite3 cli: https://sqlite.org/cli.html\r\n.. _SQLAlchemy: https://www.sqlalchemy.org\r\n.. _models: https://github.com/glottolog/treedb/blob/master/treedb/models.py\r\n.. _SQLAlchemy Core: https://docs.sqlalchemy.org/en/latest/core/\r\n.. _ORM: https://docs.sqlalchemy.org/en/latest/orm/\r\n.. _venv: https://docs.python.org/3/library/venv.html\r\n\r\n.. _MIT license: https://opensource.org/licenses/MIT\r\n\r\n\r\n.. |--| unicode:: U+2013\r\n\r\n\r\n.. |PyPI version| image:: https://img.shields.io/pypi/v/treedb.svg\r\n :target: https://pypi.org/project/treedb/\r\n :alt: Latest PyPI Version\r\n.. |License| image:: https://img.shields.io/pypi/l/treedb.svg\r\n :target: https://github.com/glottolog/treedb/blob/master/LICENSE.txt\r\n :alt: License\r\n.. 
|Supported Python| image:: https://img.shields.io/pypi/pyversions/treedb.svg\r\n :target: https://pypi.org/project/treedb/\r\n :alt: Supported Python Versions\r\n.. |Wheel| image:: https://img.shields.io/pypi/wheel/treedb.svg\r\n :target: https://pypi.org/project/treedb/#files\r\n :alt: Wheel format\r\n\r\n.. |Build| image:: https://github.com/glottolog/treedb/actions/workflows/build.yaml/badge.svg?branch=master\r\n :target: https://github.com/glottolog/treedb/actions/workflows/build.yaml?query=branch%3Amaster\r\n :alt: Build\r\n.. |Codecov| image:: https://codecov.io/gh/glottolog/treedb/branch/master/graph/badge.svg\r\n :target: https://codecov.io/gh/glottolog/treedb\r\n :alt: Codecov\r\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Glottolog languoid tree as SQLite database",
"version": "2.6.3",
"project_urls": {
"CI": "https://github.com/glottolog/treedb/actions",
"Changelog": "https://github.com/glottolog/treedb/blob/master/CHANGES.rst",
"Coverage": "https://codecov.io/gh/glottolog/treedb",
"Homepage": "https://github.com/glottolog/treedb",
"Issue Tracker": "https://github.com/glottolog/treedb/issues"
},
"split_keywords": [
"glottolog",
"languoids",
"sqlite3",
"database"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d1a7571a75224873b744484f2e163258d12623fab76b196c7345a62d31f12054",
"md5": "6d7bd6e3a8556a4fd73ca774cd337509",
"sha256": "ba4372073c293f7e9142061402dad06ebae7c3d07341afd790c13b9a6e4c811f"
},
"downloads": -1,
"filename": "treedb-2.6.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6d7bd6e3a8556a4fd73ca774cd337509",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 70325,
"upload_time": "2024-03-17T17:03:11",
"upload_time_iso_8601": "2024-03-17T17:03:11.395195Z",
"url": "https://files.pythonhosted.org/packages/d1/a7/571a75224873b744484f2e163258d12623fab76b196c7345a62d31f12054/treedb-2.6.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "768b75bf1d7aa451478a6960e03d367e5da23a2e7e3f961931a34bbb01a5e947",
"md5": "9b3ad773cabc1b4a9429f4142eed5b29",
"sha256": "bc031feb9fe46e6e9e91153a4de079648d97ecf8da3b5193764093df60ff9af5"
},
"downloads": -1,
"filename": "treedb-2.6.3.zip",
"has_sig": false,
"md5_digest": "9b3ad773cabc1b4a9429f4142eed5b29",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 529978,
"upload_time": "2024-03-17T17:03:14",
"upload_time_iso_8601": "2024-03-17T17:03:14.036481Z",
"url": "https://files.pythonhosted.org/packages/76/8b/75bf1d7aa451478a6960e03d367e5da23a2e7e3f961931a34bbb01a5e947/treedb-2.6.3.zip",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-17 17:03:14",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "glottolog",
"github_project": "treedb",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"tox": true,
"lcname": "treedb"
}