- Name: treedb
- Version: 2.6.3
- Home page: https://github.com/glottolog/treedb
- Summary: Glottolog languoid tree as SQLite database
- Upload time: 2024-03-17 17:03:14
- Author: Sebastian Bank
- Requires Python: >=3.8
- License: MIT
- Keywords: glottolog, languoids, sqlite3, database

Glottolog ``treedb``
====================

|PyPI version| |License| |Supported Python| |Wheel|

|Build|

This tool loads the content of the `languoids/tree`_ directory from the
Glottolog_ `master repo`_ into a normalized SQLite_ database.

Each file under that directory contains the definition of one Glottolog
languoid_. Loading their content into a relational database makes it possible
to perform some advanced consistency checks (example_) and, more generally, to
execute queries that inspect the languoid tree relations in a compact and
performant way (e.g. without repeatedly traversing the directory tree).
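
As a rough illustration of such a tree query, the following sketch walks from
one languoid up to its root with a recursive SQL query, using Python's built-in
``sqlite3`` module. It assumes a ``languoid`` table with ``id``, ``name``, and
``parent_id`` columns (inferred from the example output in this document, not a
guaranteed schema); ``ancestors()`` is a hypothetical helper and not part of
``treedb``.

.. code:: python

    import sqlite3

    # Assumed schema: languoid(id, name, parent_id), parent_id NULL for roots.
    ANCESTORS_SQL = """
    WITH RECURSIVE lineage(id, name, parent_id, steps) AS (
        SELECT id, name, parent_id, 0 FROM languoid WHERE id = ?
        UNION ALL
        SELECT p.id, p.name, p.parent_id, lineage.steps + 1
        FROM languoid AS p JOIN lineage ON p.id = lineage.parent_id
    )
    SELECT id, name FROM lineage WHERE steps > 0 ORDER BY steps
    """


    def ancestors(conn, glottocode):
        """Return (id, name) pairs from the direct parent up to the root."""
        return conn.execute(ANCESTORS_SQL, (glottocode,)).fetchall()

With a loaded database file, this could be called as
``ancestors(sqlite3.connect('treedb.sqlite3'), '3adt1234')``. The whole lineage
is resolved inside the database in a single query, so no directory traversal is
needed.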

See pyglottolog_ for the more general official Python API to work with the repo
without a mandatory initial loading step (also provides programmatic access to
the references_ and a convenient command-line interface).

The database can be exported into a ZIP file containing one CSV file for
each database table, or written into a single denormalized CSV file with one
row per languoid (via a provided `SQL query`_).

As SQLite_ is the `most widely used`_ database, the database file itself
(e.g. ``treedb.sqlite3``) can be queried directly from most programming
environments. It can also be examined using graphical interfaces such as
DBeaver_, or via the `sqlite3 cli`_.
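
For instance, the file can be opened with Python's standard-library ``sqlite3``
module, without installing ``treedb`` in the consuming environment. The sketch
below counts languoids per level; it assumes a ``languoid`` table with a
``level`` column (as suggested by the example output in this document), and
``count_by_level()`` is a hypothetical helper name.

.. code:: python

    import sqlite3


    def count_by_level(db_path):
        """Return {level: number of languoids} from the given database file."""
        conn = sqlite3.connect(db_path)
        try:
            # Assumed table and column names: languoid.level
            rows = conn.execute(
                'SELECT level, count(*) FROM languoid GROUP BY level'
            ).fetchall()
        finally:
            conn.close()
        return dict(rows)

A call such as ``count_by_level('treedb.sqlite3')`` would then return a plain
``dict`` keyed by level name.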

Python users can also use the provided SQLAlchemy_ models_ to build queries or
additional abstractions programmatically using `SQLAlchemy core`_ or the ORM_
(as a more maintainable alternative to hand-written SQL queries).


Links
-----

- GitHub: https://github.com/glottolog/treedb
- PyPI: https://pypi.org/project/treedb/
- Example: https://nbviewer.jupyter.org/github/glottolog/treedb/blob/master/Stats.ipynb
- Changelog: https://github.com/glottolog/treedb/blob/master/CHANGES.rst
- Issue Tracker: https://github.com/glottolog/treedb/issues
- Download: https://pypi.org/project/treedb/#files


Quickstart
----------

Install ``treedb`` (and dependencies):

.. code:: bash

    $ pip install treedb

Clone the Glottolog `master repo`_:

.. code:: bash

    $ git clone https://github.com/glottolog/glottolog.git

Note: ``treedb`` expects to find it under ``./glottolog/`` by default (i.e. under
the current directory). Use ``treedb.set_root()`` to point it to a different
path.

Load ``./glottolog/languoids/tree/**/md.ini`` into an in-memory ``sqlite3`` database.
Write the denormalized example query into ``treedb.query.csv``:

.. code:: bash

    $ python -c "import treedb; treedb.load(); treedb.write_csv()"


Usage from Python
------------------

Start a Python shell:

.. code:: bash

    $ python

Import the package:

.. code:: python

    >>> import treedb

Use ``treedb.iterlanguoids()`` to iterate over languoids as (<path>, ``dict``) pairs:

.. code:: python

    >>> next(treedb.iterlanguoids())
    (('abin1243',), {'id': 'abin1243', 'parent_id': None, 'level': 'language', ...

Note: This is a low-level interface, which does not require loading.
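
Because the pairs are plain Python objects, they can be aggregated with the
standard library alone. The following sketch tallies languoids per level; it
relies only on the ``'level'`` key visible in the output above, and
``count_levels()`` is a hypothetical helper, not a ``treedb`` function.

.. code:: python

    from collections import Counter


    def count_levels(pairs):
        """Count languoids per level from (path, item) pairs."""
        return Counter(item['level'] for _path, item in pairs)

It would be used as ``count_levels(treedb.iterlanguoids())``. Any other iterable
of (path, ``dict``) pairs works as well, which keeps the helper easy to test.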

Load the database into ``treedb.sqlite3`` (and set the default ``engine``):

.. code:: python

    >>> treedb.load('treedb.sqlite3')
    ...
    <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>

Run consistency checks:

.. code:: python

    >>> treedb.check()
    ...
    True

Export into a ZIP file containing one CSV file per database table:

.. code:: python

    >>> treedb.csv_zipfile()
    ...Path('treedb.zip')
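
The resulting archive can be inspected without unpacking it. The sketch below
lists each member together with its header row and number of data rows. It
assumes that every member is a UTF-8 encoded CSV file whose first row is the
header (an assumption about the export, not confirmed here);
``summarize_csv_zip()`` is a hypothetical helper.

.. code:: python

    import csv
    import io
    import zipfile


    def summarize_csv_zip(zip_path, encoding='utf-8'):
        """Return {member name: (header row, number of data rows)}."""
        summary = {}
        with zipfile.ZipFile(zip_path) as archive:
            for name in archive.namelist():
                with archive.open(name) as raw:
                    # Decode the binary member stream for the csv module.
                    text = io.TextIOWrapper(raw, encoding=encoding, newline='')
                    reader = csv.reader(text)
                    header = next(reader, [])
                    summary[name] = (header, sum(1 for _ in reader))
        return summary

For example, ``summarize_csv_zip('treedb.zip')`` gives a quick overview of the
exported tables and their sizes.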

Execute the example query and write it into a CSV file with one row per languoid:

.. code:: python

    >>> treedb.write_csv()
    ...Path('treedb.query.csv')

Rebuild the database (e.g. after an update):

.. code:: python

    >>> treedb.load(rebuild=True)
    ...
    <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>

Execute a simple query with ``sqlalchemy`` core and write it to a CSV file:

.. code:: python

    >>> import sqlalchemy as sa
    >>> treedb.write_csv(sa.select(treedb.Languoid), filename='languoids.csv')
    ...Path('languoids.csv')

Get one row from the ``languoid`` table via ``sqlalchemy`` core (in Glottocode order):

.. code:: python

    >>> next(treedb.iterrows(sa.select(treedb.Languoid)))
    ('3adt1234', '3Ad-Tekles', 'dialect', 'nort3292', None, None, None, None)

Get one ``Languoid`` model instance via the ``sqlalchemy`` ORM (in Glottocode order):

.. code:: python

    >>> session = treedb.Session()
    >>> session.query(treedb.Languoid).first()
    <Languoid id='3adt1234' level='dialect' name='3Ad-Tekles'>
    >>> session.close()


See also
--------

- pyglottolog_ |--| official Python API to access https://github.com/glottolog/glottolog


License
-------

This tool is distributed under the `MIT license`_.


.. _Glottolog: https://glottolog.org/
.. _master repo: https://github.com/glottolog/glottolog
.. _languoids/tree: https://github.com/glottolog/glottolog/tree/master/languoids/tree
.. _SQLite: https://sqlite.org
.. _languoid: https://glottolog.org/meta/glossary#Languoid
.. _example: https://github.com/glottolog/treedb/blob/36c7cdcdd017e7aa4386ef085ee84fb3036c01ca/treedb/checks.py#L154-L169
.. _pyglottolog: https://github.com/glottolog/pyglottolog
.. _references: https://github.com/glottolog/glottolog/tree/master/references
.. _SQL query: https://github.com/glottolog/treedb/blob/master/treedb/queries.py
.. _most widely used: https://www.sqlite.org/mostdeployed.html
.. _DBeaver: https://dbeaver.io/
.. _sqlite3 cli: https://sqlite.org/cli.html
.. _SQLAlchemy: https://www.sqlalchemy.org
.. _models: https://github.com/glottolog/treedb/blob/master/treedb/models.py
.. _SQLAlchemy Core: https://docs.sqlalchemy.org/en/latest/core/
.. _ORM: https://docs.sqlalchemy.org/en/latest/orm/
.. _venv: https://docs.python.org/3/library/venv.html

.. _MIT license: https://opensource.org/licenses/MIT


.. |--| unicode:: U+2013


.. |PyPI version| image:: https://img.shields.io/pypi/v/treedb.svg
    :target: https://pypi.org/project/treedb/
    :alt: Latest PyPI Version
.. |License| image:: https://img.shields.io/pypi/l/treedb.svg
    :target: https://github.com/glottolog/treedb/blob/master/LICENSE.txt
    :alt: License
.. |Supported Python| image:: https://img.shields.io/pypi/pyversions/treedb.svg
    :target: https://pypi.org/project/treedb/
    :alt: Supported Python Versions
.. |Wheel| image:: https://img.shields.io/pypi/wheel/treedb.svg
    :target: https://pypi.org/project/treedb/#files
    :alt: Wheel format

.. |Build| image:: https://github.com/glottolog/treedb/actions/workflows/build.yaml/badge.svg?branch=master
    :target: https://github.com/glottolog/treedb/actions/workflows/build.yaml?query=branch%3Amaster
    :alt: Build
.. |Codecov| image:: https://codecov.io/gh/glottolog/treedb/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/glottolog/treedb
    :alt: Codecov

            
