qante - Query ANnotated TExt
============================

Motivation
----------

Extracting valuable data from unstructured text often results in
hard-to-read, brittle, difficult-to-maintain code. Regular expressions
embedded directly in program control flow operate at too low a level
of abstraction. We propose a query language, based on the tuple
relational calculus, that facilitates data extraction.
Developers express their intent declaratively,
making their code much easier to write, read, and maintain.


Solution
--------

This package lets programmers state what they are searching for in
higher-level terms: queries are built from tags, locations, and
expressions on relations between locations.

The *location* of a string of characters within the document is
the interval defined by its starting and ending positions.

Locations are grouped into sets named by *tags*. Tags can be
used in conjunctions and disjunctions of interval relations to
query for tuples of locations.

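
The concepts above can be illustrated with a minimal plain-Python sketch. It deliberately does not use the qante API: the helper names (``tag_locations``, ``before``, ``gap``), the tag names, and the sample text are all hypothetical and exist only to show what locations, tags, and a conjunction of interval relations look like.

```python
import re
from itertools import product


def tag_locations(text, pattern):
    """Return the set of (start, end) intervals where `pattern` matches."""
    return {m.span() for m in re.finditer(pattern, text)}


def before(a, b):
    """Interval relation: location `a` ends at or before the start of `b`."""
    return a[1] <= b[0]


def gap(a, b):
    """Number of characters between the end of `a` and the start of `b`."""
    return b[0] - a[1]


text = "Invoice total: 42 USD. Shipping total: 7 USD."

# Each tag names a set of locations (hypothetical tag names).
tags = {
    "label": tag_locations(text, r"\w+ total:"),
    "amount": tag_locations(text, r"\d+"),
}

# Query: tuples (label, amount) satisfying a conjunction of two
# interval relations: the label precedes the amount AND they are
# separated by at most one character.
pairs = sorted(
    (label, amount)
    for label, amount in product(tags["label"], tags["amount"])
    if before(label, amount) and gap(label, amount) <= 1
)

# Map each location back to the text it covers.
result = [(text[l0:l1], text[a0:a1]) for (l0, l1), (a0, a1) in pairs]
print(result)
```

Running the sketch pairs each label with the amount that immediately follows it, giving ``[('Invoice total:', '42'), ('Shipping total:', '7')]``. The point of the design is that the search criteria live in one declarative condition instead of being spread across nested loops and regular-expression calls.
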

Documentation
-------------

We invite you to watch the YouTube `video`_ of our `presentation`_, part of the `Playlist`_
for `PyData Global 2022`_.

The material we presented is available in our `GitHub`_ repo:

* `pydataG22.pdf`_: slides of our talk.
* `ipynb/pydata.ipynb`_: a Jupyter notebook with examples.
* `RELEASE_NOTES.rst`_: describes the updates in each release.


Use one of these commands (with Python 3 or above) to install from `PyPI`_::

    pip install qante
    python -m pip install qante

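
To confirm that the installation succeeded, one option is to ask the standard library for the installed version. This is a small sketch that assumes Python 3.8 or later (for ``importlib.metadata``); the helper name ``installed_version`` is hypothetical and is not part of qante.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if qante is installed, otherwise None.
print(installed_version("qante"))
```
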

Use Python docstrings for API documentation::

    python  # version 3 or above
    from qante.tagger import Tagger
    from qante.query import Query
    from qante import loctuple as lt
    from qante.table import get_table

    help(Tagger)     # annotate text with tags using tagRE('tagname', regexp)
    help(Query)      # syntax for querying annotated text
    help(lt)         # predicates used by queries
    help(get_table)  # extract table (as dictionaries) from text


See also: "API Documentation" at the end of our Jupyter notebook.

We welcome your questions by email at: qante{at}asgard.com

.. _`GitHub`: https://github.com/AsgardSystems/qante
.. _`PyPI`: https://pypi.org
.. _`video`: https://www.youtube.com/watch?v=w9UfQ1TKIuE&t=0s
.. _`presentation`: https://global2022.pydata.org/cfp/talk/LUYPAE/
.. _`PyData Global 2022`: https://pydata.org/global2022/
.. _`Playlist`: https://www.youtube.com/playlist?list=PLGVZCDnMOq0qgYUt0yn7F80wmzCnj2dEq
.. _`pydataG22.pdf`: https://github.com/AsgardSystems/qante/blob/main/pydataG22.pdf
.. _`RELEASE_NOTES.rst`: https://github.com/AsgardSystems/qante/blob/main/RELEASE_NOTES.rst
.. _`ipynb/pydata.ipynb`: https://github.com/AsgardSystems/qante/blob/main/ipynb/pydata.ipynb
Raw data
{
"_id": null,
"home_page": "https://github.com/AsgardSystems/qante.git",
"name": "qante",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "text,text processing,search,query,sequences,annotation",
"author": "Martha L. Escobar-Molano, David A. Barrett",
"author_email": "qante@asgard.com",
"download_url": "https://files.pythonhosted.org/packages/2c/19/ce6a7965ab4fe65079e2136d7d17773cba5a9daa1e0165d086469948f2d6/qante-0.0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD-3-Clause",
"summary": "qante - Query ANnotated TExt",
"version": "0.0.5",
"project_urls": {
"Homepage": "https://github.com/AsgardSystems/qante.git"
},
"split_keywords": [
"text",
"text processing",
"search",
"query",
"sequences",
"annotation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2c19ce6a7965ab4fe65079e2136d7d17773cba5a9daa1e0165d086469948f2d6",
"md5": "76e0dbd95008d7ab88463c44240a6d79",
"sha256": "55cbf90454f36c0c9a67c1bfafbaa82ffad30e90b37c53919576db5ce7db6e4f"
},
"downloads": -1,
"filename": "qante-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "76e0dbd95008d7ab88463c44240a6d79",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 30280,
"upload_time": "2023-09-12T18:53:03",
"upload_time_iso_8601": "2023-09-12T18:53:03.564829Z",
"url": "https://files.pythonhosted.org/packages/2c/19/ce6a7965ab4fe65079e2136d7d17773cba5a9daa1e0165d086469948f2d6/qante-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-12 18:53:03",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AsgardSystems",
"github_project": "qante",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "qante"
}