Metadata-Version: 2.1
Name: treedb
Version: 2.6.3
Summary: Glottolog languoid tree as SQLite database
Home-page: https://github.com/glottolog/treedb
Author: Sebastian Bank
Author-email: sebastian.bank@uni-leipzig.de
License: MIT
Project-URL: Changelog, https://github.com/glottolog/treedb/blob/master/CHANGES.rst
Project-URL: Issue Tracker, https://github.com/glottolog/treedb/issues
Project-URL: CI, https://github.com/glottolog/treedb/actions
Project-URL: Coverage, https://codecov.io/gh/glottolog/treedb
Keywords: glottolog languoids sqlite3 database
Platform: any
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE.txt
Requires-Dist: csv23~=0.3
Requires-Dist: pycountry==23.12.11
Requires-Dist: sqlalchemy>=1.4.24
Provides-Extra: dev
Requires-Dist: tox>=3; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: pep8-naming; extra == "dev"
Requires-Dist: wheel; extra == "dev"
Requires-Dist: twine; extra == "dev"
Provides-Extra: test
Requires-Dist: pytest>=6; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: coverage; extra == "test"
Provides-Extra: pretty
Requires-Dist: sqlparse>=0.3; extra == "pretty"
Provides-Extra: pandas
Requires-Dist: pandas>=1; extra == "pandas"

Glottolog ``treedb``
====================

|PyPI version| |License| |Supported Python| |Wheel|

|Build|

This tool loads the content of the `languoids/tree`_ directory from the
Glottolog_ `master repo`_ into a normalized SQLite_ database.

Each file in that directory contains the definition of one Glottolog
languoid_. Loading their content into a relational database makes it possible
to perform some advanced consistency checks (example_) and, in general, to
execute queries that inspect the languoid tree relations in a compact and
performant way (e.g. without repeatedly traversing the directory tree).
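
As a rough illustration of such a tree query, the following sketch walks up
the parent chain of a languoid with a recursive common table expression, using
only the standard library ``sqlite3`` module. The table here is a hypothetical,
simplified stand-in with made-up rows (``id``, ``name``, ``level``,
``parent_id``); the actual ``treedb`` schema has more columns and tables, see
the models_.

.. code:: python

    import sqlite3

    # Hypothetical, simplified stand-in for the languoid table with made-up rows
    # (an assumption for illustration, not the actual treedb schema).
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE languoid ('
                 'id TEXT PRIMARY KEY, name TEXT NOT NULL, level TEXT NOT NULL, '
                 'parent_id TEXT REFERENCES languoid(id))')
    conn.executemany('INSERT INTO languoid VALUES (?, ?, ?, ?)', [
        ('fam00001', 'Example Family', 'family', None),
        ('lan00001', 'Example Language', 'language', 'fam00001'),
        ('dia00001', 'Example Dialect', 'dialect', 'lan00001'),
    ])


    def ancestors(conn, languoid_id):
        """Return the IDs of all ancestors of languoid_id, nearest first."""
        query = '''
            WITH RECURSIVE tree(id, parent_id, steps) AS (
                SELECT id, parent_id, 0 FROM languoid WHERE id = ?
                UNION ALL
                SELECT l.id, l.parent_id, tree.steps + 1
                FROM languoid AS l JOIN tree ON l.id = tree.parent_id
            )
            SELECT id FROM tree WHERE steps > 0 ORDER BY steps
        '''
        return [row[0] for row in conn.execute(query, (languoid_id,))]


    print(ancestors(conn, 'dia00001'))  # ['lan00001', 'fam00001']

A single query replaces one directory lookup per tree level, which is the kind
of saving the paragraph above refers to.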

See pyglottolog_ for the more general official Python API to work with the repo
without a mandatory initial loading step (it also provides programmatic access
to the references_ and a convenient command-line interface).

The database can be exported into a ZIP file containing one CSV file for
each database table, or written into a single denormalized CSV file with one
row per languoid (via a provided `SQL query`_).
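
An exported ZIP file of this kind can be inspected with the standard library
alone. The sketch below maps each CSV member of an archive to its header row.
It assumes that members end in ``.csv``, are UTF-8 encoded, and start with a
header line; ``csv_headers`` is a hypothetical helper and not part of
``treedb``.

.. code:: python

    import csv
    import io
    import zipfile


    def csv_headers(zip_path):
        """Map each CSV member name of the ZIP archive to its header row."""
        headers = {}
        with zipfile.ZipFile(zip_path) as archive:
            for name in archive.namelist():
                if not name.endswith('.csv'):
                    continue  # skip non-CSV members
                with archive.open(name) as raw:
                    # Assumption: UTF-8 encoded CSV with a header in the first line.
                    text = io.TextIOWrapper(raw, encoding='utf-8', newline='')
                    headers[name] = next(csv.reader(text), [])
        return headers

For example, ``csv_headers('treedb.zip')`` would list the columns of every
exported table, given that such a file exists in the current directory.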

As SQLite_ is the `most widely used`_ database, the database file itself
(e.g. ``treedb.sqlite3``) can be queried directly from most programming
environments. It can also be examined using graphical interfaces such as
DBeaver_, or via the `sqlite3 cli`_.
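
For instance, the file can be opened from Python with the standard library
``sqlite3`` module, without importing ``treedb`` at all. The following is a
minimal sketch that lists the tables of a database file; ``list_tables`` is a
hypothetical helper name, and the query only relies on the ``sqlite_master``
catalog table that every SQLite database provides.

.. code:: python

    import sqlite3


    def list_tables(filename):
        """Return the names of all tables in the SQLite file, sorted."""
        conn = sqlite3.connect(filename)
        try:
            rows = conn.execute("SELECT name FROM sqlite_master "
                                "WHERE type = 'table' ORDER BY name")
            return [name for name, in rows]
        finally:
            conn.close()

Calling ``list_tables('treedb.sqlite3')`` after loading would show which tables
are available. Note that ``sqlite3.connect()`` creates an empty file if the
given path does not exist yet, in which case the result is an empty list.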

Python users can also use the provided SQLAlchemy_ models_ to build queries or
additional abstractions programmatically using `SQLAlchemy Core`_ or the ORM_
(as a more maintainable alternative to hand-written SQL queries).


Links
-----

- GitHub: https://github.com/glottolog/treedb
- PyPI: https://pypi.org/project/treedb/
- Example: https://nbviewer.jupyter.org/github/glottolog/treedb/blob/master/Stats.ipynb
- Changelog: https://github.com/glottolog/treedb/blob/master/CHANGES.rst
- Issue Tracker: https://github.com/glottolog/treedb/issues
- Download: https://pypi.org/project/treedb/#files


Quickstart
----------

Install ``treedb`` (and dependencies):

.. code:: bash

    $ pip install treedb

Clone the Glottolog `master repo`_:

.. code:: bash

    $ git clone https://github.com/glottolog/glottolog.git

Note: ``treedb`` expects to find it under ``./glottolog/`` by default (i.e. under
the current directory). Use ``treedb.set_root()`` to point it to a different
path.

Load ``./glottolog/languoids/tree/**/md.ini`` into an in-memory ``sqlite3`` database.
Write the denormalized example query into ``treedb.query.csv``:

.. code:: bash

    $ python -c "import treedb; treedb.load(); treedb.write_csv()"


Usage from Python
------------------

Start a Python shell:

.. code:: bash

    $ python

Import the package:

.. code:: python

    >>> import treedb

Use ``treedb.iterlanguoids()`` to iterate over languoids as (<path>, ``dict``) pairs:

.. code:: python

    >>> next(treedb.iterlanguoids())
    (('abin1243',), {'id': 'abin1243', 'parent_id': None, 'level': 'language', ...

Note: This is a low-level interface that does not require loading the database first.

Load the database into ``treedb.sqlite3`` (and set the default ``engine``):

.. code:: python

    >>> treedb.load('treedb.sqlite3')
    ...
    <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>

Run consistency checks:

.. code:: python

    >>> treedb.check()
    ...
    True

Export into a ZIP file containing one CSV file per database table:

.. code:: python

    >>> treedb.csv_zipfile()
    ...Path('treedb.zip')

Execute the example query and write it into a CSV file with one row per languoid:

.. code:: python

    >>> treedb.write_csv()
    ...Path('treedb.query.csv')

Rebuild the database (e.g. after an update):

.. code:: python

    >>> treedb.load(rebuild=True)
    ...
    <treedb._proxies.SqliteEngineProxy filename='treedb.sqlite3' ...>

Execute a simple query with ``sqlalchemy`` core and write it to a CSV file:

.. code:: python

    >>> import sqlalchemy as sa
    >>> treedb.write_csv(sa.select(treedb.Languoid), filename='languoids.csv')
    ...Path('languoids.csv')

Get one row from the ``languoid`` table via ``sqlalchemy`` core (in Glottocode order):

.. code:: python

    >>> next(treedb.iterrows(sa.select(treedb.Languoid)))
    ('3adt1234', '3Ad-Tekles', 'dialect', 'nort3292', None, None, None, None)

Get one ``Languoid`` model instance via ``sqlalchemy`` orm (in Glottocode order):

.. code:: python

    >>> session = treedb.Session()
    >>> session.query(treedb.Languoid).first()
    <Languoid id='3adt1234' level='dialect' name='3Ad-Tekles'>
    >>> session.close()


See also
--------

- pyglottolog_ |--| official Python API to access https://github.com/glottolog/glottolog


License
-------

This tool is distributed under the `MIT license`_.


.. _Glottolog: https://glottolog.org/
.. _master repo: https://github.com/glottolog/glottolog
.. _languoids/tree: https://github.com/glottolog/glottolog/tree/master/languoids/tree
.. _SQLite: https://sqlite.org
.. _languoid: https://glottolog.org/meta/glossary#Languoid
.. _example: https://github.com/glottolog/treedb/blob/36c7cdcdd017e7aa4386ef085ee84fb3036c01ca/treedb/checks.py#L154-L169
.. _pyglottolog: https://github.com/glottolog/pyglottolog
.. _references: https://github.com/glottolog/glottolog/tree/master/references
.. _SQL query: https://github.com/glottolog/treedb/blob/master/treedb/queries.py
.. _most widely used: https://www.sqlite.org/mostdeployed.html
.. _DBeaver: https://dbeaver.io/
.. _sqlite3 cli: https://sqlite.org/cli.html
.. _SQLAlchemy: https://www.sqlalchemy.org
.. _models: https://github.com/glottolog/treedb/blob/master/treedb/models.py
.. _SQLAlchemy Core: https://docs.sqlalchemy.org/en/latest/core/
.. _ORM: https://docs.sqlalchemy.org/en/latest/orm/
.. _venv: https://docs.python.org/3/library/venv.html

.. _MIT license: https://opensource.org/licenses/MIT


.. |--| unicode:: U+2013


.. |PyPI version| image:: https://img.shields.io/pypi/v/treedb.svg
    :target: https://pypi.org/project/treedb/
    :alt: Latest PyPI Version
.. |License| image:: https://img.shields.io/pypi/l/treedb.svg
    :target: https://github.com/glottolog/treedb/blob/master/LICENSE.txt
    :alt: License
.. |Supported Python| image:: https://img.shields.io/pypi/pyversions/treedb.svg
    :target: https://pypi.org/project/treedb/
    :alt: Supported Python Versions
.. |Wheel| image:: https://img.shields.io/pypi/wheel/treedb.svg
    :target: https://pypi.org/project/treedb/#files
    :alt: Wheel format

.. |Build| image:: https://github.com/glottolog/treedb/actions/workflows/build.yaml/badge.svg?branch=master
    :target: https://github.com/glottolog/treedb/actions/workflows/build.yaml?query=branch%3Amaster
    :alt: Build
.. |Codecov| image:: https://codecov.io/gh/glottolog/treedb/branch/master/graph/badge.svg
    :target: https://codecov.io/gh/glottolog/treedb
    :alt: Codecov
