Name | funcparserlib |
Version | 1.0.1 |
home_page | https://funcparserlib.pirx.ru |
Summary | Recursive descent parsing library based on functional combinators |
upload_time | 2022-11-03 23:10:54 |
maintainer | |
docs_url | None |
author | Andrey Vlasovskikh |
requires_python | >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.* |
license | MIT |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
Funcparserlib
=============
Recursive descent parsing library for Python based on functional combinators.
[![PyPI](https://img.shields.io/pypi/v/funcparserlib)](https://pypi.org/project/funcparserlib/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/funcparserlib)](https://pypi.org/project/funcparserlib/)
Description
-----------
The primary focus of `funcparserlib` is **parsing little languages** or **external DSLs** (domain-specific languages).
Parsers made with `funcparserlib` are pure-Python LL(\*) parsers. This means that it's **very easy to write parsers** without thinking about lookaheads and other hardcore parsing stuff. However, recursive descent parsing is a rather slow method compared to LL(k) or LR(k) algorithms. Still, parsing with `funcparserlib` is **at least twice as fast as PyParsing**, a very popular library for Python.
The source code of `funcparserlib` is only 1.2K lines of code, with lots of comments. Its API is fully type hinted. It reports parse errors at the longest parsed prefix and includes a tiny lexer generator that tracks token positions.
The idea of parser combinators used in `funcparserlib` comes from the [Introduction to Functional Programming](https://www.cl.cam.ac.uk/teaching/Lectures/funprog-jrh-1996/) course. We have converted it from ML into Python.
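To illustrate the general idea of parser combinators, here is a minimal pure-Python sketch. It is not how `funcparserlib` is implemented: the names `token`, `seq`, `alt`, and `many`, and the convention of returning a `(value, position)` pair or `None`, are assumptions chosen only for this illustration.

```python
from typing import Callable, List, Optional, Tuple

# A parser takes a token list and a position. It returns
# (value, new_position) on success or None on failure.
Parser = Callable[[List[str], int], Optional[Tuple[object, int]]]


def token(expected: str) -> Parser:
    """Match exactly one token equal to `expected`."""
    def run(tokens, pos):
        if pos < len(tokens) and tokens[pos] == expected:
            return tokens[pos], pos + 1
        return None
    return run


def seq(first: Parser, second: Parser) -> Parser:
    """Run `first`, then `second`; return both values as a pair."""
    def run(tokens, pos):
        r1 = first(tokens, pos)
        if r1 is None:
            return None
        v1, pos1 = r1
        r2 = second(tokens, pos1)
        if r2 is None:
            return None
        v2, pos2 = r2
        return (v1, v2), pos2
    return run


def alt(first: Parser, second: Parser) -> Parser:
    """Try `first`; if it fails, try `second` from the same position."""
    def run(tokens, pos):
        r = first(tokens, pos)
        return r if r is not None else second(tokens, pos)
    return run


def many(p: Parser) -> Parser:
    """Apply `p` zero or more times and collect the values in a list."""
    def run(tokens, pos):
        values = []
        while True:
            r = p(tokens, pos)
            if r is None:
                return values, pos
            v, pos = r
            values.append(v)
    return run


# "a" followed by any number of ("+" "a") or ("-" "a") pairs.
sign = alt(token("+"), token("-"))
chain = seq(token("a"), many(seq(sign, token("a"))))

print(chain(["a", "+", "a", "-", "a"], 0))
```

Each combinator takes small parsers and returns a bigger one, so a grammar is built up by ordinary function composition. The `first + many(op + operand)` shape of `chain` is the same shape used for `power`, `term`, and `sum` in the example later in this document.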
Installation
------------
You can install `funcparserlib` from [PyPI](https://pypi.org/project/funcparserlib/):
```shell
$ pip install funcparserlib
```
There are no dependencies on other libraries.
Documentation
-------------
* [Getting Started](https://funcparserlib.pirx.ru/getting-started/)
  * Your **starting point** with `funcparserlib`
* [API Reference](https://funcparserlib.pirx.ru/api/)
  * Learn the details of the API
There are several examples available in the `tests/` directory:
* [GraphViz DOT parser](https://github.com/vlasovskikh/funcparserlib/blob/master/tests/dot.py)
* [JSON parser](https://github.com/vlasovskikh/funcparserlib/blob/master/tests/json.py)
See also [the changelog](https://funcparserlib.pirx.ru/changes/).
Example
-------
Let's consider a little language of **numeric expressions** with a syntax similar to Python expressions. Here are some expression strings in this language:
```
0
1 + 2 + 3
-1 + 2 ** 32
3.1415926 * (2 + 7.18281828e-1) * 42
```
Here is **the complete source code** of the tokenizer and the parser for this language written using `funcparserlib`:
```python
from typing import List, Tuple, Union
from dataclasses import dataclass

from funcparserlib.lexer import make_tokenizer, TokenSpec, Token
from funcparserlib.parser import tok, Parser, many, forward_decl, finished


@dataclass
class BinaryExpr:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[BinaryExpr, int, float]


def tokenize(s: str) -> List[Token]:
    specs = [
        TokenSpec("whitespace", r"\s+"),
        TokenSpec("float", r"[+\-]?\d+\.\d*([Ee][+\-]?\d+)*"),
        TokenSpec("int", r"[+\-]?\d+"),
        TokenSpec("op", r"(\*\*)|[+\-*/()]"),
    ]
    tokenizer = make_tokenizer(specs)
    return [t for t in tokenizer(s) if t.type != "whitespace"]


def parse(tokens: List[Token]) -> Expr:
    int_num = tok("int") >> int
    float_num = tok("float") >> float
    number = int_num | float_num

    expr: Parser[Token, Expr] = forward_decl()
    parenthesized = -op("(") + expr + -op(")")
    primary = number | parenthesized
    power = primary + many(op("**") + primary) >> to_expr
    term = power + many((op("*") | op("/")) + power) >> to_expr
    sum = term + many((op("+") | op("-")) + term) >> to_expr
    expr.define(sum)

    document = expr + -finished

    return document.parse(tokens)


def op(name: str) -> Parser[Token, str]:
    return tok("op", name)


def to_expr(args: Tuple[Expr, List[Tuple[str, Expr]]]) -> Expr:
    first, rest = args
    result = first
    for op, expr in rest:
        result = BinaryExpr(op, result, expr)
    return result
```
Now, consider this numeric expression: `3.1415926 * (2 + 7.18281828e-1) * 42`.
Let's `tokenize()` it using the tokenizer we've created with `funcparserlib.lexer`:
```
[
    Token('float', '3.1415926'),
    Token('op', '*'),
    Token('op', '('),
    Token('int', '2'),
    Token('op', '+'),
    Token('float', '7.18281828e-1'),
    Token('op', ')'),
    Token('op', '*'),
    Token('int', '42'),
]
```
Let's `parse()` these tokens into an expression tree using our parser created with `funcparserlib.parser`:
```
BinaryExpr(
    op='*',
    left=BinaryExpr(
        op='*',
        left=3.1415926,
        right=BinaryExpr(op='+', left=2, right=0.718281828),
    ),
    right=42,
)
```
Learn how to write this parser using `funcparserlib` in the [Getting Started](https://funcparserlib.pirx.ru/getting-started/) guide!
Used By
-------
Some open-source projects that use `funcparserlib` as an explicit dependency:
* [Hy](https://github.com/hylang/hy), a Lisp dialect that's embedded in Python
  * 4.2K stars, version `>= 1.0.0a0`, Python 3.7+
* [Splash](https://github.com/scrapinghub/splash), a JavaScript rendering service with HTTP API, by Scrapinghub
  * 3.6K stars, version `*`, Python 3 in Docker
* [graphite-beacon](https://github.com/klen/graphite-beacon), a simple alerting system for Graphite metrics
  * 459 stars, version `==0.3.6`, Python 2 and 3
* [blockdiag](https://github.com/blockdiag/blockdiag), generates block-diagram image files from spec-text files
  * 148 stars, version `>= 1.0.0a0`, Python 3.7+
* [kll](https://github.com/kiibohd/kll), Keyboard Layout Language (KLL) compiler
  * 109 stars, copied source code, Python 3.5+
Next
----
Read the [Getting Started](https://funcparserlib.pirx.ru/getting-started/) guide to start learning `funcparserlib`.
Raw data
{
"_id": null,
"home_page": "https://funcparserlib.pirx.ru",
"name": "funcparserlib",
"maintainer": "",
"docs_url": null,
"requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*",
"maintainer_email": "",
"keywords": "",
"author": "Andrey Vlasovskikh",
"author_email": "andrey.vlasovskikh@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/93/44/a21dfd9c45ad6909257e5186378a4fedaf41406824ce1ec06bc2a6c168e7/funcparserlib-1.0.1.tar.gz",
"platform": null,
"description": "Funcparserlib\n=============\n\nRecursive descent parsing library for Python based on functional combinators.\n\n[![PyPI](https://img.shields.io/pypi/v/funcparserlib)](https://pypi.org/project/funcparserlib/)\n[![PyPI - Downloads](https://img.shields.io/pypi/dm/funcparserlib)](https://pypi.org/project/funcparserlib/)\n\n\nDescription\n-----------\n\nThe primary focus of `funcparserlib` is **parsing little languages** or **external DSLs** (domain specific languages).\n\nParsers made with `funcparserlib` are pure-Python LL(\\*) parsers. It means that it's **very easy to write parsers** without thinking about lookaheads and other hardcore parsing stuff. However, recursive descent parsing is a rather slow method compared to LL(k) or LR(k) algorithms. Still, parsing with `funcparserlib` is **at least twice faster than PyParsing**, a very popular library for Python.\n\nThe source code of `funcparserlib` is only 1.2K lines of code, with lots of comments. Its API is fully type hinted. It features the longest parsed prefix error reporting, as well as a tiny lexer generator for token position tracking.\n\nThe idea of parser combinators used in `funcparserlib` comes from the [Introduction to Functional Programming](https://www.cl.cam.ac.uk/teaching/Lectures/funprog-jrh-1996/) course. 
We have converted it from ML into Python.\n\n\nInstallation\n------------\n\nYou can install `funcparserlib` from [PyPI](https://pypi.org/project/funcparserlib/):\n\n```shell\n$ pip install funcparserlib\n```\n\nThere are no dependencies on other libraries.\n\n\nDocumentation\n-------------\n\n* [Getting Started](https://funcparserlib.pirx.ru/getting-started/)\n * Your **starting point** with `funcparserlib`\n* [API Reference](https://funcparserlib.pirx.ru/api/)\n * Learn the details of the API\n\nThere are several examples available in the `tests/` directory:\n\n* [GraphViz DOT parser](https://github.com/vlasovskikh/funcparserlib/blob/master/tests/dot.py)\n* [JSON parser](https://github.com/vlasovskikh/funcparserlib/blob/master/tests/json.py)\n\nSee also [the changelog](https://funcparserlib.pirx.ru/changes/).\n\n\nExample\n-------\n\nLet's consider a little language of **numeric expressions** with a syntax similar to Python expressions. Here are some expression strings in this language:\n\n```\n0\n1 + 2 + 3\n-1 + 2 ** 32\n3.1415926 * (2 + 7.18281828e-1) * 42\n```\n\n\nHere is **the complete source code** of the tokenizer and the parser for this language written using `funcparserlib`:\n\n```python\nfrom typing import List, Tuple, Union\nfrom dataclasses import dataclass\n\nfrom funcparserlib.lexer import make_tokenizer, TokenSpec, Token\nfrom funcparserlib.parser import tok, Parser, many, forward_decl, finished\n\n\n@dataclass\nclass BinaryExpr:\n op: str\n left: \"Expr\"\n right: \"Expr\"\n\n\nExpr = Union[BinaryExpr, int, float]\n\n\ndef tokenize(s: str) -> List[Token]:\n specs = [\n TokenSpec(\"whitespace\", r\"\\s+\"),\n TokenSpec(\"float\", r\"[+\\-]?\\d+\\.\\d*([Ee][+\\-]?\\d+)*\"),\n TokenSpec(\"int\", r\"[+\\-]?\\d+\"),\n TokenSpec(\"op\", r\"(\\*\\*)|[+\\-*/()]\"),\n ]\n tokenizer = make_tokenizer(specs)\n return [t for t in tokenizer(s) if t.type != \"whitespace\"]\n\n\ndef parse(tokens: List[Token]) -> Expr:\n int_num = tok(\"int\") >> int\n float_num = 
tok(\"float\") >> float\n number = int_num | float_num\n\n expr: Parser[Token, Expr] = forward_decl()\n parenthesized = -op(\"(\") + expr + -op(\")\")\n primary = number | parenthesized\n power = primary + many(op(\"**\") + primary) >> to_expr\n term = power + many((op(\"*\") | op(\"/\")) + power) >> to_expr\n sum = term + many((op(\"+\") | op(\"-\")) + term) >> to_expr\n expr.define(sum)\n\n document = expr + -finished\n\n return document.parse(tokens)\n\n\ndef op(name: str) -> Parser[Token, str]:\n return tok(\"op\", name)\n\n\ndef to_expr(args: Tuple[Expr, List[Tuple[str, Expr]]]) -> Expr:\n first, rest = args\n result = first\n for op, expr in rest:\n result = BinaryExpr(op, result, expr)\n return result\n```\n\nNow, consider this numeric expression: `3.1415926 * (2 + 7.18281828e-1) * 42`.\n\nLet's `tokenize()` it using the tokenizer we've created with `funcparserlib.lexer`:\n\n```\n[\n Token('float', '3.1415926'),\n Token('op', '*'),\n Token('op', '('),\n Token('int', '2'),\n Token('op', '+'),\n Token('float', '7.18281828e-1'),\n Token('op', ')'),\n Token('op', '*'),\n Token('int', '42'),\n]\n```\n\nLet's `parse()` these tokens into an expression tree using our parser created with `funcparserlib.parser`:\n\n```\nBinaryExpr(\n op='*',\n left=BinaryExpr(\n op='*',\n left=3.1415926,\n right=BinaryExpr(op='+', left=2, right=0.718281828),\n ),\n right=42,\n)\n```\n\nLearn how to write this parser using `funcparserlib` in the [Getting Started](https://funcparserlib.pirx.ru/getting-started/) guide!\n\n\nUsed By\n-------\n\nSome open-source projects that use `funcparserlib` as an explicit dependency:\n\n* [Hy](https://github.com/hylang/hy), a Lisp dialect that's embedded in Python\n * 4.2K stars, version `>= 1.0.0a0`, Python 3.7+\n* [Splash](https://github.com/scrapinghub/splash), a JavaScript rendering service with HTTP API, by Scrapinghub\n * 3.6K stars, version `*`. 
Python 3 in Docker\n* [graphite-beacon](https://github.com/klen/graphite-beacon), a simple alerting system for Graphite metrics\n * 459 stars, version `==0.3.6`, Python 2 and 3\n* [blockdiag](https://github.com/blockdiag/blockdiag), generates block-diagram image file from spec-text file\n * 148 stars, version `>= 1.0.0a0`, Python 3.7+\n* [kll](https://github.com/kiibohd/kll), Keyboard Layout Language (KLL) compiler\n * 109 stars, copied source code, Python 3.5+\n\n\nNext\n----\n\nRead the [Getting Started](https://funcparserlib.pirx.ru/getting-started/) guide to start learning `funcparserlib`.\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Recursive descent parsing library based on functional combinators",
"version": "1.0.1",
"project_urls": {
"Homepage": "https://funcparserlib.pirx.ru",
"Repository": "https://github.com/vlasovskikh/funcparserlib"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8366acd740d84f59c655935f586c113a863aa404dfe932052a68a1163d88ea63",
"md5": "e462deccf7e5db08177d311ac65acbf0",
"sha256": "95da15d3f0d00b9b6f4bf04005c708af3faa115f7b45692ace064ebe758c68e8"
},
"downloads": -1,
"filename": "funcparserlib-1.0.1-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "e462deccf7e5db08177d311ac65acbf0",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*",
"size": 17842,
"upload_time": "2022-11-03T23:10:53",
"upload_time_iso_8601": "2022-11-03T23:10:53.434774Z",
"url": "https://files.pythonhosted.org/packages/83/66/acd740d84f59c655935f586c113a863aa404dfe932052a68a1163d88ea63/funcparserlib-1.0.1-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "9344a21dfd9c45ad6909257e5186378a4fedaf41406824ce1ec06bc2a6c168e7",
"md5": "248a9c249886dd72ef2dca2aca2ad1eb",
"sha256": "a2c4a0d7942f7a0e7635c369d921066c8d4cae7f8b5bf7914466bec3c69837f4"
},
"downloads": -1,
"filename": "funcparserlib-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "248a9c249886dd72ef2dca2aca2ad1eb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*",
"size": 17238,
"upload_time": "2022-11-03T23:10:54",
"upload_time_iso_8601": "2022-11-03T23:10:54.553696Z",
"url": "https://files.pythonhosted.org/packages/93/44/a21dfd9c45ad6909257e5186378a4fedaf41406824ce1ec06bc2a6c168e7/funcparserlib-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-11-03 23:10:54",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "vlasovskikh",
"github_project": "funcparserlib",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "funcparserlib"
}