DHParser

Name	DHParser JSON
Version	1.8.1 JSON
	download
home_page	https://gitlab.lrz.de/badw-it/DHParser
Summary	Parser Generator and DSL-construction-kit
upload_time	2024-10-19 10:22:21
maintainer	None
docs_url	None
author	Eckhart Arnold
requires_python	<4.0,>=3.7
license	Apache-2
keywords	parser generator domain specific languages digital humanities parsing expression grammar ebnf
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            DHParser
========

![](https://img.shields.io/pypi/v/DHParser) 
![](https://img.shields.io/pypi/status/DHParser)
![](https://img.shields.io/pypi/l/DHParser)
![](https://img.shields.io/pypi/pyversions/DHParser)

DHParser - Rapid prototyping of formal grammars and 
domain specific languages (DSL) in the Digital Humanities

This software is open source software under the Apache 2.0-License (see section License, below).

Copyright 2016-2024  Eckhart Arnold, Bavarian Academy of Sciences and Humanities


Purpose
-------

DHParser has been developed with three main purposes in mind:

1. Developing parsers for domain specific languages and notations, either existing
   notations, like, LaTeX, or newly created DSLs, like the
   [Medieval-Latin-Dictionary-DSL](https://gitlab.lrz.de/badw-it/mlw-dsl-oeffentlich).

   Typically, these languages are strict formal languages the grammar of
   which can be described with context-free grammars. (In cases where
   this does not hold like TeX, it is often still possible to describe a 
   reasonably large subset of the formal language with a context free grammar.) 

2. Developing parsers for semi-structured or informally structured
   text-data. 
   
   This kind of data is typically what you get when retro-digitizing
   textual data like printed bibliographies, or reference works or
   dictionaries. Often such works can be captured with a formal 
   grammar, but these grammars require a lot of iterations and tests
   to develop and usually become much more ramified than the grammars
   of well-designed formal languages. Thus, DHParser's elaborated
   testing and debugging-framework for grammars.

   (See Florian Zacherl's [Dissertation on the retro-digitalization of dictionary data](https://www.kit.gwi.uni-muenchen.de/?band=82908&v=1)
   for an interesting case study. I am confident that the development of
   a suitable formal grammar is much easier with an elaborated framework
   like DHParser than with the PHP-parsing-expression-grammar-kit that
   Florian Zacherl has used.)

3. Developing processing-pipelines for tree-structured data. 

   In typical digital humanities applications one wants to produce
   different forms of output (say, printed, online-human-readable,
   online-machine-readable) from one and the same source of data.
   Therefore, the parsing stage (if the data source is structured
   text-data) will be followed by more or less intricate bifurcated
   processing pipelines.
   

Features
--------

* Memoizing packrat-parser based on Parsing Expression Grammars. This
  means: 
  
    - Linear parsing time

    - Any EBNF-grammar supported, including left-recursive grammars 
      (via "seed and grow"-algorithm)

    - Unlimited look ahead and look behind

* [Macros](
  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#macros)
  to avoid code-repetition within grammars

* [Declarative tree-transformations](
  https://dhparser.readthedocs.io/en/latest/manuals/03_AST-transformation.html#declarative-tree-transformation)
  for post-processing syntax-trees

* Unit testing framework and post-mortem-debuger for [test-driven grammar
  development](
  https://dhparser.readthedocs.io/en/latest/Overview.html#test-driven-grammar-development)
  and rapid-prototyping of grammars

* [Customizable error reporting](
  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#error-catching),
  [recovery after syntax errors](
  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#skip-and-resume) 
  and support for [fail-tolerant parsers](
  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#fail-tolerant-parsing)

* Support for [Language-servers](https://microsoft.github.io/language-server-protocol/)

* Workflow-support and [data-processing-pipelines](
  https://dhparser.readthedocs.io/en/latest/manuals/04_postprocessing.html#processing-pipelines)

* XML-support like [mapping flat-text to the DOM-tree](
  https://dhparser.readthedocs.io/en/latest/manuals/02_document-trees.html#content-mappings)
  ("node-tree" in DHParser's terminology) and 
  [adding markup in arbitrary places](
  https://dhparser.readthedocs.io/en/latest/manuals/02_document-trees.html#markup-insertion),
  even if this requires splitting tags.

* Full Unicode support

* No dependencies except the Python Standard Library

* [Extensive documentation](https://dhparser.readthedocs.io) and examples


Ease of use
-----------

**Directly compile existing EBNF-grammars:**

DHParser recognizes various dialects of EBNF or PEG-syntax for specifying
grammars. For any already given grammar-specification in EBNF or PEG, 
it is not unlikely that DHParser can generate a parser either right away or 
with only minor changes or additions.

You can try this by compiling the file `XML_W3C_SPEC.ebnf` in the `examples/XML`
of the source-tree which contains the official XML-grammar directly extracted
from [www.w3.org/TR/xml/](https://www.w3.org/TR/xml/):

    $ dhparser examples/XML/XML_W3C_SPEC.ebnf

This command produces a Python-Script `XML_W3C_SPECParser.py` in the same
directory as the EBNF-file. This file can be run on any XML-file and will
yield its concrete syntax tree, e.g.:

    $ python examples/XML/XML_W3C_SPECParser.py examples/XML/example.xml

Note, that the concrete syntax tree of an XML file as returned by the generated
parser is not the same as the data-tree encoded by that very XML-file. In 
order to receive the data tree, further transformations are necessary. See
`examples/XML/XMLParser.py` for an example of how this can be done.

**Use (small) grammars on the fly in Python code:**

Small grammars can also directly be compiled from Python-code. (Here, we
use DHParser's preferred syntax which does not require trailing semicolons
and uses the tilde `~` as a special sign to denote "insignificant" whitespace.)

key_value_store.py:

    #!/usr/bin/env python 
    # A mini-DSL for a key value store
    from DHParser.dsl import create_parser

    # specify the grammar of your DSL in EBNF-notation
    grammar = '''@ drop = whitespace, strings
    key_store   = ~ { entry }
    entry       = key "="~ value          # ~ means: insignificant whitespace 
    key         = /\w+/~                  # Scanner-less parsing: Use regular
    value       = /\"[^"\n]*\"/~          # expressions wherever you like'''

    # generating a parser is almost as simple as compiling a regular expression
    parser = create_parser(grammar)       # parser factory for thread-safety

Now, parse some text and extract the data from the Python-shell:

    >>> from key_value_store import parser
    >>> text = '''
            title    = "Odysee 2001"
            director = "Stanley Kubrick"
        '''
    >>> data = parser(text)
    >>> for entry in data.select('entry'):
            print(entry['key'], entry['value'])

    title "Odysee 2001"
    director "Stanley Kubrick"

Or, serialize as XML:

    >>> print(data.as_xml())

    <key_store>
      <entry>
        <key>title</key>
        <value>"Odysee 2001"</value>
      </entry>
      <entry>
        <key>director</key>
        <value>"Stanley Kubrick"</value>
      </entry>
    </key_store>

**Set up DSL-projects with unit-tests for long-term-development:** 

For larger projects that require testing and incremental grammar development,
use:
  
  $ dhparser NEW_PROJECT_NAME 

to set up a project-directory with all the scaffolding for a new DSL-project,
including the full unit-testing-framework.

Installation
------------

You can install DHParser from the Python package index [pypi.org](https://pypi.org):

    python -m pip install --user DHParser

Alternatively, you can clone the latest version from 
[gitlab.lrz.de/badw-it/DHParser](https://gitlab.lrz.de/badw-it/DHParser)


Getting Started
---------------

See [Introduction.md](https://gitlab.lrz.de/badw-it/DHParser/blob/master/Introduction.md) for the
motivation and an overview how DHParser works or jump right into the
[Step by Step Guide](https://gitlab.lrz.de/badw-it/DHParser/blob/master/documentation_src/StepByStepGuide.rst) to
learn how to set up and use DHParser.
Or have a look at the 
[comprehensive overview of DHParser's features](https://gitlab.lrz.de/badw-it/DHParser/-/blob/master/documentation_src/Overview.rst) 
to see how DHParser supports the construction of domain specific languages.

Documentation
-------------

For the full documentation see: [dhparser.readthedocs.io](https://dhparser.readthedocs.io/en/latest/)

License
-------

DHParser is open source software under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

Copyright 2016-2022  Eckhart Arnold, Bavarian Academy of Sciences and Humanities

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


Optional Post-Installation
--------------------------

It is recommended that you install the `regex`-module
(https://bitbucket.org/mrabarnett/mrab-regex). If present, DHParser
will use `regex` instead of the built-in `re`-module for regular
expressions. `regex` is faster and more powerful than `re`.

In order to speed up DHParser even more, it can be compiled with
the Python to C compiler [Cython](https://cython.org). Cython
version 3.0 or higher is required to compile DHParser. Type:

    pip install cython
 
on the command-line to install cython. Once cython has been 
built and installed, you can run the "dhparser_cythonize"-script 
from the command line:

    dhparser_cythonize

On Linux-systems, in case you want to use clang instead of the
gcc-compiler, type:

    export CC=/usr/bin/clang; dhparser_cythonize 

Using clang may also help to circumvent C-errors like 
"incompatible pointer types".

The Cython-compiled version is about 2-3 times faster than the 
CPython-interpreted version. Compiling can take quite a while. 
If you are in a hurry, you can just can also just call
`dhparser_cythonize_stringview` which just compiles the 
stringview-module, which profits the most from being "cythonized".

Depending on the use case, e.g. when parsing large files, 
[PyPy3](https://www.pypy.org/) yields even greater speed-ups. 
However, in other cases pypy can also be noticeably slower than cpython!
To circumvent the longer startup times of pypy3 in comparison to CPython, 
it is recommended to use the xxxServer.py-scripts rather than calling 
the xxxParser.py-script each time when parsing many documents subsequently.

Another way to speed up your parser is by adding "@ optimizations = all"
at the beginning of your EBNF-grammar-file. DHParser then tries to 
compile (some) non-recursive parts of your grammar to entirely to regular 
rexpressions which yields a 10-20% speedup. Beware that this option
is still experimental!


Sources
-------

Find the sources on [gitlab.lrz.de/badw-it/DHParser](https://gitlab.lrz.de/badw-it/DHParser) .
Get them with:

    git clone https://gitlab.lrz.de/badw-it/DHParser

There exists a mirror of this repository on Github:
https://github.com/jecki/DHParser Be aware, though, that the github-mirror
may occasionally lag behind a few commits.


Packaging
---------

DHParser uses [Poetry](https://python-poetry.org/) for packaging and
dependency-management. In order to build a package from the sources,
type:

    poetry build

on the command line. The packages will then appear in the "dist" subdirectory.


Author
------

Author: Eckhart Arnold, Bavarian Academy of Sciences
Email:  arnold@badw.de

How to cite
-----------

If you use DHParser for Scientific Work, then please cite it as:

DHParser. A Parser-Generator for Digital-Humanities-Applications,  
Division for Digital Humanities Research & Development, Bavarian Academy of Science and Technology, 
Munich Germany 2017, https://gitlab.lrz.de/badw-it/dhparser

References and Acknowledgement
------------------------------

Eckhart Arnold: Domänenspezifische Notationen. Eine (noch)
unterschätzte Technologie in den Digitalen Geisteswissenschaften,
Präsentation auf dem
[dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen),
München 2016. Short-URL: [tiny.badw.de/2JVT][Arnold_2016]

[Arnold_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/EA_Pr%C3%A4sentation_Auszeichnungssprachen.pdf

Brian Ford: Parsing Expression Grammars: A Recognition-Based Syntactic
Foundation, Cambridge
Massachusetts, 2004. Short-URL:[t1p.de/jihs][Ford_2004]

[Ford_2004]: https://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf

[Ford_20XX]: https://bford.info/packrat/

Richard A. Frost, Rahmatullah Hafiz and Paul Callaghan: Parser
Combinators for Ambiguous Left-Recursive Grammars, in: P. Hudak and
D.S. Warren (Eds.): PADL 2008, LNCS 4902, pp. 167–181, Springer-Verlag
Berlin Heidelberg 2008.

Elizabeth Scott and Adrian Johnstone, GLL Parsing,
in: Electronic Notes in Theoretical Computer Science 253 (2010) 177–189,
[dotat.at/tmp/gll.pdf][scott_johnstone_2010]

[scott_johnstone_2010]: https://dotat.at/tmp/gll.pdf

Dominikus Herzberg: Objekt-orientierte Parser-Kombinatoren in Python,
Blog-Post, September, 18th 2008 on denkspuren. gedanken, ideen,
anregungen und links rund um informatik-themen, short-URL:
[t1p.de/bm3k][Herzberg_2008a]

[Herzberg_2008a]: https://denkspuren.blogspot.de/2008/09/objekt-orientierte-parser-kombinatoren.html

Dominikus Herzberg: Eine einfache Grammatik für LaTeX, Blog-Post,
September, 18th 2008 on denkspuren. gedanken, ideen, anregungen und
links rund um informatik-themen, short-URL:
[t1p.de/7jzh][Herzberg_2008b]

[Herzberg_2008b]: https://denkspuren.blogspot.de/2008/09/eine-einfache-grammatik-fr-latex.html

Dominikus Herzberg: Uniform Syntax, Blog-Post, February, 27th 2007 on
denkspuren. gedanken, ideen, anregungen und links rund um
informatik-themen, short-URL: [t1p.de/s0zk][Herzberg_2007]

[Herzberg_2007]: https://denkspuren.blogspot.de/2007/02/uniform-syntax.html

[ISO_IEC_14977]: https://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf

John MacFarlane, David Greenspan, Vicent Marti, Neil Williams,
Benjamin Dumke-von der Ehe, Jeff Atwood: CommonMark. A strongly
defined, highly compatible specification of
Markdown, 2017. [commonmark.org][MacFarlane_et_al_2017]

[MacFarlane_et_al_2017]: https://commonmark.org/

Stefan Müller: DSLs in den digitalen Geisteswissenschaften,
Präsentation auf dem
[dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen),
München 2016. Short-URL: [tiny.badw.de/2JVy][Müller_2016]

[Müller_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/Mueller_Anzeichnung_10_Vortrag_M%C3%BCnchen.pdf

Markus Voelter, Sbastian Benz, Christian Dietrich, Birgit Engelmann,
Mats Helander, Lennart Kats, Eelco Visser, Guido Wachsmuth:
DSL Engineering. Designing, Implementing and Using Domain-Specific Languages, 2013.
[dslbook.org/][voelter_2013]

Christopher Seaton: A Programming Language Where the Syntax and Semantics
are Mutuable at Runtime, University of Bristol 2007,
[chrisseaton.com/katahdin/katahdin.pdf][seaton_2007]

Vegard Øye: General Parser Combinators in Racket, 2012,
[epsil.github.io/gll/][vegard_2012]

[vegard_2012]: https://epsil.github.io/gll/

[seaton_2007]: https://chrisseaton.com/katahdin/katahdin.pdf

[voelter_2013]: http://dslbook.org/

[tex_stackexchange_no_bnf]: https://tex.stackexchange.com/questions/4201/is-there-a-bnf-grammar-of-the-tex-language

[tex_stackexchange_latex_parsers]: http://tex.stackexchange.com/questions/4223/what-parsers-for-latex-mathematics-exist-outside-of-the-tex-engines

[XText_website]: https://www.eclipse.org/Xtext/

and many more...

Raw data

            {
    "_id": null,
    "home_page": "https://gitlab.lrz.de/badw-it/DHParser",
    "name": "DHParser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.7",
    "maintainer_email": null,
    "keywords": "parser generator, domain specific languages, Digital Humanities, parsing expression grammar, EBNF",
    "author": "Eckhart Arnold",
    "author_email": "eckhart.arnold@posteo.de",
    "download_url": "https://files.pythonhosted.org/packages/13/a2/c0fc1ba8bf91aefb3756e90170448b80135504d0ccb1bdb570d8b5474301/dhparser-1.8.1.tar.gz",
    "platform": null,
    "description": "DHParser\n========\n\n![](https://img.shields.io/pypi/v/DHParser) \n![](https://img.shields.io/pypi/status/DHParser)\n![](https://img.shields.io/pypi/l/DHParser)\n![](https://img.shields.io/pypi/pyversions/DHParser)\n\nDHParser - Rapid prototyping of formal grammars and \ndomain specific languages (DSL) in the Digital Humanities\n\nThis software is open source software under the Apache 2.0-License (see section License, below).\n\nCopyright 2016-2024  Eckhart Arnold, Bavarian Academy of Sciences and Humanities\n\n\nPurpose\n-------\n\nDHParser has been developed with three main purposes in mind:\n\n1. Developing parsers for domain specific languages and notations, either existing\n   notations, like, LaTeX, or newly created DSLs, like the\n   [Medieval-Latin-Dictionary-DSL](https://gitlab.lrz.de/badw-it/mlw-dsl-oeffentlich).\n\n   Typically, these languages are strict formal languages the grammar of\n   which can be described with context-free grammars. (In cases where\n   this does not hold like TeX, it is often still possible to describe a \n   reasonably large subset of the formal language with a context free grammar.) \n\n2. Developing parsers for semi-structured or informally structured\n   text-data. \n   \n   This kind of data is typically what you get when retro-digitizing\n   textual data like printed bibliographies, or reference works or\n   dictionaries. Often such works can be captured with a formal \n   grammar, but these grammars require a lot of iterations and tests\n   to develop and usually become much more ramified than the grammars\n   of well-designed formal languages. Thus, DHParser's elaborated\n   testing and debugging-framework for grammars.\n\n   (See Florian Zacherl's [Dissertation on the retro-digitalization of dictionary data](https://www.kit.gwi.uni-muenchen.de/?band=82908&v=1)\n   for an interesting case study. I am confident that the development of\n   a suitable formal grammar is much easier with an elaborated framework\n   like DHParser than with the PHP-parsing-expression-grammar-kit that\n   Florian Zacherl has used.)\n\n3. Developing processing-pipelines for tree-structured data. \n\n   In typical digital humanities applications one wants to produce\n   different forms of output (say, printed, online-human-readable,\n   online-machine-readable) from one and the same source of data.\n   Therefore, the parsing stage (if the data source is structured\n   text-data) will be followed by more or less intricate bifurcated\n   processing pipelines.\n   \n\nFeatures\n--------\n\n* Memoizing packrat-parser based on Parsing Expression Grammars. This\n  means: \n  \n    - Linear parsing time\n\n    - Any EBNF-grammar supported, including left-recursive grammars \n      (via \"seed and grow\"-algorithm)\n\n    - Unlimited look ahead and look behind\n\n* [Macros](\n  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#macros)\n  to avoid code-repetition within grammars\n\n* [Declarative tree-transformations](\n  https://dhparser.readthedocs.io/en/latest/manuals/03_AST-transformation.html#declarative-tree-transformation)\n  for post-processing syntax-trees\n\n* Unit testing framework and post-mortem-debuger for [test-driven grammar\n  development](\n  https://dhparser.readthedocs.io/en/latest/Overview.html#test-driven-grammar-development)\n  and rapid-prototyping of grammars\n\n* [Customizable error reporting](\n  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#error-catching),\n  [recovery after syntax errors](\n  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#skip-and-resume) \n  and support for [fail-tolerant parsers](\n  https://dhparser.readthedocs.io/en/latest/manuals/01_EBNF-grammars.html#fail-tolerant-parsing)\n\n* Support for [Language-servers](https://microsoft.github.io/language-server-protocol/)\n\n* Workflow-support and [data-processing-pipelines](\n  https://dhparser.readthedocs.io/en/latest/manuals/04_postprocessing.html#processing-pipelines)\n\n* XML-support like [mapping flat-text to the DOM-tree](\n  https://dhparser.readthedocs.io/en/latest/manuals/02_document-trees.html#content-mappings)\n  (\"node-tree\" in DHParser's terminology) and \n  [adding markup in arbitrary places](\n  https://dhparser.readthedocs.io/en/latest/manuals/02_document-trees.html#markup-insertion),\n  even if this requires splitting tags.\n\n* Full Unicode support\n\n* No dependencies except the Python Standard Library\n\n* [Extensive documentation](https://dhparser.readthedocs.io) and examples\n\n\nEase of use\n-----------\n\n**Directly compile existing EBNF-grammars:**\n\nDHParser recognizes various dialects of EBNF or PEG-syntax for specifying\ngrammars. For any already given grammar-specification in EBNF or PEG, \nit is not unlikely that DHParser can generate a parser either right away or \nwith only minor changes or additions.\n\nYou can try this by compiling the file `XML_W3C_SPEC.ebnf` in the `examples/XML`\nof the source-tree which contains the official XML-grammar directly extracted\nfrom [www.w3.org/TR/xml/](https://www.w3.org/TR/xml/):\n\n    $ dhparser examples/XML/XML_W3C_SPEC.ebnf\n\nThis command produces a Python-Script `XML_W3C_SPECParser.py` in the same\ndirectory as the EBNF-file. This file can be run on any XML-file and will\nyield its concrete syntax tree, e.g.:\n\n    $ python examples/XML/XML_W3C_SPECParser.py examples/XML/example.xml\n\nNote, that the concrete syntax tree of an XML file as returned by the generated\nparser is not the same as the data-tree encoded by that very XML-file. In \norder to receive the data tree, further transformations are necessary. See\n`examples/XML/XMLParser.py` for an example of how this can be done.\n\n**Use (small) grammars on the fly in Python code:**\n\nSmall grammars can also directly be compiled from Python-code. (Here, we\nuse DHParser's preferred syntax which does not require trailing semicolons\nand uses the tilde `~` as a special sign to denote \"insignificant\" whitespace.)\n\nkey_value_store.py:\n\n    #!/usr/bin/env python \n    # A mini-DSL for a key value store\n    from DHParser.dsl import create_parser\n\n    # specify the grammar of your DSL in EBNF-notation\n    grammar = '''@ drop = whitespace, strings\n    key_store   = ~ { entry }\n    entry       = key \"=\"~ value          # ~ means: insignificant whitespace \n    key         = /\\w+/~                  # Scanner-less parsing: Use regular\n    value       = /\\\"[^\"\\n]*\\\"/~          # expressions wherever you like'''\n\n    # generating a parser is almost as simple as compiling a regular expression\n    parser = create_parser(grammar)       # parser factory for thread-safety\n\nNow, parse some text and extract the data from the Python-shell:\n\n    >>> from key_value_store import parser\n    >>> text = '''\n            title    = \"Odysee 2001\"\n            director = \"Stanley Kubrick\"\n        '''\n    >>> data = parser(text)\n    >>> for entry in data.select('entry'):\n            print(entry['key'], entry['value'])\n\n    title \"Odysee 2001\"\n    director \"Stanley Kubrick\"\n\nOr, serialize as XML:\n\n    >>> print(data.as_xml())\n\n    <key_store>\n      <entry>\n        <key>title</key>\n        <value>\"Odysee 2001\"</value>\n      </entry>\n      <entry>\n        <key>director</key>\n        <value>\"Stanley Kubrick\"</value>\n      </entry>\n    </key_store>\n\n**Set up DSL-projects with unit-tests for long-term-development:** \n\nFor larger projects that require testing and incremental grammar development,\nuse:\n  \n  $ dhparser NEW_PROJECT_NAME \n\nto set up a project-directory with all the scaffolding for a new DSL-project,\nincluding the full unit-testing-framework.\n\nInstallation\n------------\n\nYou can install DHParser from the Python package index [pypi.org](https://pypi.org):\n\n    python -m pip install --user DHParser\n\nAlternatively, you can clone the latest version from \n[gitlab.lrz.de/badw-it/DHParser](https://gitlab.lrz.de/badw-it/DHParser)\n\n\nGetting Started\n---------------\n\nSee [Introduction.md](https://gitlab.lrz.de/badw-it/DHParser/blob/master/Introduction.md) for the\nmotivation and an overview how DHParser works or jump right into the\n[Step by Step Guide](https://gitlab.lrz.de/badw-it/DHParser/blob/master/documentation_src/StepByStepGuide.rst) to\nlearn how to set up and use DHParser.\nOr have a look at the \n[comprehensive overview of DHParser's features](https://gitlab.lrz.de/badw-it/DHParser/-/blob/master/documentation_src/Overview.rst) \nto see how DHParser supports the construction of domain specific languages.\n\nDocumentation\n-------------\n\nFor the full documentation see: [dhparser.readthedocs.io](https://dhparser.readthedocs.io/en/latest/)\n\nLicense\n-------\n\nDHParser is open source software under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).\n\nCopyright 2016-2022  Eckhart Arnold, Bavarian Academy of Sciences and Humanities\n\nLicensed under the Apache License, Version 2.0 (the \"License\");\nyou may not use this file except in compliance with the License.\nYou may obtain a copy of the License at\n\n    https://www.apache.org/licenses/LICENSE-2.0\n\nUnless required by applicable law or agreed to in writing, software\ndistributed under the License is distributed on an \"AS IS\" BASIS,\nWITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\nSee the License for the specific language governing permissions and\nlimitations under the License.\n\n\nOptional Post-Installation\n--------------------------\n\nIt is recommended that you install the `regex`-module\n(https://bitbucket.org/mrabarnett/mrab-regex). If present, DHParser\nwill use `regex` instead of the built-in `re`-module for regular\nexpressions. `regex` is faster and more powerful than `re`.\n\nIn order to speed up DHParser even more, it can be compiled with\nthe Python to C compiler [Cython](https://cython.org). Cython\nversion 3.0 or higher is required to compile DHParser. Type:\n\n    pip install cython\n \non the command-line to install cython. Once cython has been \nbuilt and installed, you can run the \"dhparser_cythonize\"-script \nfrom the command line:\n\n    dhparser_cythonize\n\nOn Linux-systems, in case you want to use clang instead of the\ngcc-compiler, type:\n\n    export CC=/usr/bin/clang; dhparser_cythonize \n\nUsing clang may also help to circumvent C-errors like \n\"incompatible pointer types\".\n\nThe Cython-compiled version is about 2-3 times faster than the \nCPython-interpreted version. Compiling can take quite a while. \nIf you are in a hurry, you can just can also just call\n`dhparser_cythonize_stringview` which just compiles the \nstringview-module, which profits the most from being \"cythonized\".\n\nDepending on the use case, e.g. when parsing large files, \n[PyPy3](https://www.pypy.org/) yields even greater speed-ups. \nHowever, in other cases pypy can also be noticeably slower than cpython!\nTo circumvent the longer startup times of pypy3 in comparison to CPython, \nit is recommended to use the xxxServer.py-scripts rather than calling \nthe xxxParser.py-script each time when parsing many documents subsequently.\n\nAnother way to speed up your parser is by adding \"@ optimizations = all\"\nat the beginning of your EBNF-grammar-file. DHParser then tries to \ncompile (some) non-recursive parts of your grammar to entirely to regular \nrexpressions which yields a 10-20% speedup. Beware that this option\nis still experimental!\n\n\nSources\n-------\n\nFind the sources on [gitlab.lrz.de/badw-it/DHParser](https://gitlab.lrz.de/badw-it/DHParser) .\nGet them with:\n\n    git clone https://gitlab.lrz.de/badw-it/DHParser\n\nThere exists a mirror of this repository on Github:\nhttps://github.com/jecki/DHParser Be aware, though, that the github-mirror\nmay occasionally lag behind a few commits.\n\n\nPackaging\n---------\n\nDHParser uses [Poetry](https://python-poetry.org/) for packaging and\ndependency-management. In order to build a package from the sources,\ntype:\n\n    poetry build\n\non the command line. The packages will then appear in the \"dist\" subdirectory.\n\n\nAuthor\n------\n\nAuthor: Eckhart Arnold, Bavarian Academy of Sciences\nEmail:  arnold@badw.de\n\nHow to cite\n-----------\n\nIf you use DHParser for Scientific Work, then please cite it as:\n\nDHParser. A Parser-Generator for Digital-Humanities-Applications,  \nDivision for Digital Humanities Research & Development, Bavarian Academy of Science and Technology, \nMunich Germany 2017, https://gitlab.lrz.de/badw-it/dhparser\n\nReferences and Acknowledgement\n------------------------------\n\nEckhart Arnold: Dom\u00e4nenspezifische Notationen. Eine (noch)\nuntersch\u00e4tzte Technologie in den Digitalen Geisteswissenschaften,\nPr\u00e4sentation auf dem\n[dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen),\nM\u00fcnchen 2016. Short-URL: [tiny.badw.de/2JVT][Arnold_2016]\n\n[Arnold_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/EA_Pr%C3%A4sentation_Auszeichnungssprachen.pdf\n\nBrian Ford: Parsing Expression Grammars: A Recognition-Based Syntactic\nFoundation, Cambridge\nMassachusetts, 2004. Short-URL:[t1p.de/jihs][Ford_2004]\n\n[Ford_2004]: https://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf\n\n[Ford_20XX]: https://bford.info/packrat/\n\nRichard A. Frost, Rahmatullah Hafiz and Paul Callaghan: Parser\nCombinators for Ambiguous Left-Recursive Grammars, in: P. Hudak and\nD.S. Warren (Eds.): PADL 2008, LNCS 4902, pp. 167\u2013181, Springer-Verlag\nBerlin Heidelberg 2008.\n\nElizabeth Scott and Adrian Johnstone, GLL Parsing,\nin: Electronic Notes in Theoretical Computer Science 253 (2010) 177\u2013189,\n[dotat.at/tmp/gll.pdf][scott_johnstone_2010]\n\n[scott_johnstone_2010]: https://dotat.at/tmp/gll.pdf\n\nDominikus Herzberg: Objekt-orientierte Parser-Kombinatoren in Python,\nBlog-Post, September, 18th 2008 on denkspuren. gedanken, ideen,\nanregungen und links rund um informatik-themen, short-URL:\n[t1p.de/bm3k][Herzberg_2008a]\n\n[Herzberg_2008a]: https://denkspuren.blogspot.de/2008/09/objekt-orientierte-parser-kombinatoren.html\n\nDominikus Herzberg: Eine einfache Grammatik f\u00fcr LaTeX, Blog-Post,\nSeptember, 18th 2008 on denkspuren. gedanken, ideen, anregungen und\nlinks rund um informatik-themen, short-URL:\n[t1p.de/7jzh][Herzberg_2008b]\n\n[Herzberg_2008b]: https://denkspuren.blogspot.de/2008/09/eine-einfache-grammatik-fr-latex.html\n\nDominikus Herzberg: Uniform Syntax, Blog-Post, February, 27th 2007 on\ndenkspuren. gedanken, ideen, anregungen und links rund um\ninformatik-themen, short-URL: [t1p.de/s0zk][Herzberg_2007]\n\n[Herzberg_2007]: https://denkspuren.blogspot.de/2007/02/uniform-syntax.html\n\n[ISO_IEC_14977]: https://www.cl.cam.ac.uk/~mgk25/iso-14977.pdf\n\nJohn MacFarlane, David Greenspan, Vicent Marti, Neil Williams,\nBenjamin Dumke-von der Ehe, Jeff Atwood: CommonMark. A strongly\ndefined, highly compatible specification of\nMarkdown, 2017. [commonmark.org][MacFarlane_et_al_2017]\n\n[MacFarlane_et_al_2017]: https://commonmark.org/\n\nStefan M\u00fcller: DSLs in den digitalen Geisteswissenschaften,\nPr\u00e4sentation auf dem\n[dhmuc-Workshop: Digitale Editionen und Auszeichnungssprachen](https://dhmuc.hypotheses.org/workshop-digitale-editionen-und-auszeichnungssprachen),\nM\u00fcnchen 2016. Short-URL: [tiny.badw.de/2JVy][M\u00fcller_2016]\n\n[M\u00fcller_2016]: https://f.hypotheses.org/wp-content/blogs.dir/1856/files/2016/12/Mueller_Anzeichnung_10_Vortrag_M%C3%BCnchen.pdf\n\nMarkus Voelter, Sbastian Benz, Christian Dietrich, Birgit Engelmann,\nMats Helander, Lennart Kats, Eelco Visser, Guido Wachsmuth:\nDSL Engineering. Designing, Implementing and Using Domain-Specific Languages, 2013.\n[dslbook.org/][voelter_2013]\n\nChristopher Seaton: A Programming Language Where the Syntax and Semantics\nare Mutuable at Runtime, University of Bristol 2007,\n[chrisseaton.com/katahdin/katahdin.pdf][seaton_2007]\n\nVegard \u00d8ye: General Parser Combinators in Racket, 2012,\n[epsil.github.io/gll/][vegard_2012]\n\n[vegard_2012]: https://epsil.github.io/gll/\n\n[seaton_2007]: https://chrisseaton.com/katahdin/katahdin.pdf\n\n[voelter_2013]: http://dslbook.org/\n\n[tex_stackexchange_no_bnf]: https://tex.stackexchange.com/questions/4201/is-there-a-bnf-grammar-of-the-tex-language\n\n[tex_stackexchange_latex_parsers]: http://tex.stackexchange.com/questions/4223/what-parsers-for-latex-mathematics-exist-outside-of-the-tex-engines\n\n[XText_website]: https://www.eclipse.org/Xtext/\n\nand many more...",
    "bugtrack_url": null,
    "license": "Apache-2",
    "summary": "Parser Generator and DSL-construction-kit",
    "version": "1.8.1",
    "project_urls": {
        "Documentation": "https://dhparser.readthedocs.io/en/latest/",
        "Homepage": "https://gitlab.lrz.de/badw-it/DHParser",
        "Repository": "https://gitlab.lrz.de/badw-it/DHParser"
    },
    "split_keywords": [
        "parser generator",
        " domain specific languages",
        " digital humanities",
        " parsing expression grammar",
        " ebnf"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "62e421774ea615a045b4b6a213eca55d919619741fe2eabed628d3e25c8136c1",
                "md5": "4dd8f3df4100c9ca9ad0e86750bcefcc",
                "sha256": "abe8198667d957d0c1770647cd890f3d5e2f050be6b7a6fb074300bd3492c5ff"
            },
            "downloads": -1,
            "filename": "dhparser-1.8.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4dd8f3df4100c9ca9ad0e86750bcefcc",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.7",
            "size": 498899,
            "upload_time": "2024-10-19T10:22:20",
            "upload_time_iso_8601": "2024-10-19T10:22:20.151849Z",
            "url": "https://files.pythonhosted.org/packages/62/e4/21774ea615a045b4b6a213eca55d919619741fe2eabed628d3e25c8136c1/dhparser-1.8.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "13a2c0fc1ba8bf91aefb3756e90170448b80135504d0ccb1bdb570d8b5474301",
                "md5": "3f891d5e775fc37ca8515c0cde68aa32",
                "sha256": "1895be9c8b5b4d9f6838cf390587b656e9822e691b0bebcfbccae80b4e70243b"
            },
            "downloads": -1,
            "filename": "dhparser-1.8.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3f891d5e775fc37ca8515c0cde68aa32",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.7",
            "size": 457689,
            "upload_time": "2024-10-19T10:22:21",
            "upload_time_iso_8601": "2024-10-19T10:22:21.644424Z",
            "url": "https://files.pythonhosted.org/packages/13/a2/c0fc1ba8bf91aefb3756e90170448b80135504d0ccb1bdb570d8b5474301/dhparser-1.8.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-19 10:22:21",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "dhparser"
}

Eckhart Arnold