[![CI](https://github.com/cesbit/pyleri/workflows/CI/badge.svg)](https://github.com/cesbit/pyleri/actions)
[![Release Version](https://img.shields.io/github/release/cesbit/pyleri)](https://github.com/cesbit/pyleri/releases)
Python Left-Right Parser
========================
Pyleri is an easy-to-use parser created for [SiriDB](http://siridb.net/). We first used [lrparsing](http://lrparsing.sourceforge.net/doc/html/) and wrote [jsleri](https://github.com/cesbit/jsleri) for auto-completion and suggestions in our web console. Later we found small issues within the `lrparsing` module and also had difficulties keeping the language the same in all projects. That is when we decided to create Pyleri which can export a created grammar to JavaScript, C, Python, Go and Java.
Gabriele Tomassetti [wrote a tutorial](https://tomassetti.me/pyleri-tutorial/) about the pyleri library.
---------------------------------------
* [Related projects](#related-projects)
* [Installation](#installation)
* [Quick usage](#quick-usage)
* [Grammar](#grammar)
* [Grammar.parse()](#parse)
* [Grammar.export_js()](#export_js)
* [Grammar.export_c()](#export_c)
* [Grammar.export_go()](#export_go)
* [Grammar.export_java()](#export_java)
* [Grammar.export_py()](#export_py)
* [Result](#result)
* [is_valid](#is_valid)
* [Position](#position)
* [Tree](#tree)
* [Expecting](#expecting)
* [Elements](#elements)
* [Keyword](#keyword)
* [Regex](#regex)
* [Token](#token)
* [Tokens](#tokens)
* [Sequence](#sequence)
* [Choice](#choice)
* [Repeat](#repeat)
* [List](#list)
* [Optional](#optional)
* [Ref](#ref)
* [Prio](#prio)
---------------------------------------
## Related projects
- [jsleri](https://github.com/cesbit/jsleri): JavaScript parser
- [libcleri](https://github.com/cesbit/libcleri): C parser
- [goleri](https://github.com/cesbit/goleri): Go parser
- [jleri](https://github.com/cesbit/jleri): Java parser
## Installation
The easiest way is to use PyPI:
    sudo pip3 install pyleri
## Quick usage
```python
# Imports, note that we skip the imports in other examples...
from pyleri import (
    Grammar,
    Keyword,
    Regex,
    Sequence)

# Create a Grammar Class to define your language
class MyGrammar(Grammar):
    r_name = Regex('(?:"(?:[^"]*)")+')
    k_hi = Keyword('hi')
    START = Sequence(k_hi, r_name)
# Compile your grammar by creating an instance of the Grammar Class.
my_grammar = MyGrammar()
# Use the compiled grammar to parse 'strings'
print(my_grammar.parse('hi "Iris"').is_valid) # => True
print(my_grammar.parse('bye "Iris"').is_valid) # => False
print(my_grammar.parse('bye "Iris"').as_str()) # => error at position 0, expecting: hi
```
## Grammar
When writing a grammar you should subclass Grammar. A Grammar expects at least a `START` property so the parser knows where to start parsing. Grammar has some default properties which can be overwritten like `RE_KEYWORDS`, which will be explained later. Grammar also has a parse method: `parse()`, and a few export methods: [export_js()](#export_js), [export_c()](#export_c), [export_py()](#export_py), [export_go()](#export_go) and [export_java()](#export_java) which are explained below.
### parse
syntax:
```python
Grammar().parse(string)
```
The `parse()` method returns a result object which has the following properties that are further explained in [Result](#result):
- `expecting`
- `is_valid`
- `pos`
- `tree`
### export_js
syntax:
```python
Grammar().export_js(
js_module_name='jsleri',
js_template=Grammar.JS_TEMPLATE,
js_indent=' ' * 4)
```
Optional keyword arguments:
- `js_module_name`: Name of the JavaScript module. (default: 'jsleri')
- `js_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.JS_TEMPLATE.
- `js_indent`: indentation used in the JavaScript file. (default: 4 spaces)
For example when using our Quick usage grammar, this is the output when running `my_grammar.export_js()`:
```javascript
/* jshint newcap: false */
/*
* This grammar is generated using the Grammar.export_js() method and
* should be used with the jsleri JavaScript module.
*
* Source class: MyGrammar
* Created at: 2015-11-04 10:06:06
*/
'use strict';
(function (
Regex,
Sequence,
Keyword,
Grammar
) {
var r_name = Regex('^(?:"(?:[^"]*)")+');
var k_hi = Keyword('hi');
var START = Sequence(
k_hi,
r_name
);
window.MyGrammar = Grammar(START, '^\w+');
})(
window.jsleri.Regex,
window.jsleri.Sequence,
window.jsleri.Keyword,
window.jsleri.Grammar
);
```
### export_c
syntax:
```python
Grammar().export_c(
target=Grammar.C_TARGET,
c_indent=' ' * 4)
```
Optional keyword arguments:
- `target`: Name of the c module. (default: 'grammar')
- `c_indent`: indentation used in the c files. (default: 4 spaces)
The return value is a tuple containing the source (c) file and header (h) file.
For example when using our Quick usage grammar, this is the output when running `my_grammar.export_c()`:
```c
/*
* grammar.c
*
* This grammar is generated using the Grammar.export_c() method and
* should be used with the libcleri module.
*
* Source class: MyGrammar
* Created at: 2016-05-09 12:16:49
*/
#include "grammar.h"
#include <stdio.h>
#define CLERI_CASE_SENSITIVE 0
#define CLERI_CASE_INSENSITIVE 1
#define CLERI_FIRST_MATCH 0
#define CLERI_MOST_GREEDY 1
cleri_grammar_t * compile_grammar(void)
{
cleri_t * r_name = cleri_regex(CLERI_GID_R_NAME, "^(?:\"(?:[^\"]*)\")+");
cleri_t * k_hi = cleri_keyword(CLERI_GID_K_HI, "hi", CLERI_CASE_INSENSITIVE);
cleri_t * START = cleri_sequence(
CLERI_GID_START,
2,
k_hi,
r_name
);
cleri_grammar_t * grammar = cleri_grammar(START, "^\\w+");
return grammar;
}
```
and the header file...
```c
/*
* grammar.h
*
* This grammar is generated using the Grammar.export_c() method and
* should be used with the libcleri module.
*
* Source class: MyGrammar
* Created at: 2016-05-09 12:16:49
*/
#ifndef CLERI_EXPORT_GRAMMAR_H_
#define CLERI_EXPORT_GRAMMAR_H_
#include <grammar.h>
#include <cleri/cleri.h>
cleri_grammar_t * compile_grammar(void);
enum cleri_grammar_ids {
CLERI_NONE, // used for objects with no name
CLERI_GID_K_HI,
CLERI_GID_R_NAME,
CLERI_GID_START,
CLERI_END // can be used to get the enum length
};
#endif /* CLERI_EXPORT_GRAMMAR_H_ */
```
### export_go
syntax:
```python
Grammar().export_go(
go_template=Grammar.GO_TEMPLATE,
go_indent='\t',
go_package='grammar')
```
Optional keyword arguments:
- `go_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.GO_TEMPLATE.
- `go_indent`: indentation used in the Go file. (default: one tab)
- `go_package`: Name of the go package. (default: 'grammar')
For example when using our Quick usage grammar, this is the output when running `my_grammar.export_go()`:
```go
package grammar
// This grammar is generated using the Grammar.export_go() method and
// should be used with the goleri module.
//
// Source class: MyGrammar
// Created at: 2017-03-14 19:07:09
import (
"regexp"
"github.com/cesbit/goleri"
)
// Element indentifiers
const (
NoGid = iota
GidKHi = iota
GidRName = iota
GidSTART = iota
)
// MyGrammar returns a compiled goleri grammar.
func MyGrammar() *goleri.Grammar {
rName := goleri.NewRegex(GidRName, regexp.MustCompile(`^(?:"(?:[^"]*)")+`))
kHi := goleri.NewKeyword(GidKHi, "hi", false)
START := goleri.NewSequence(
GidSTART,
kHi,
rName,
)
return goleri.NewGrammar(START, regexp.MustCompile(`^\w+`))
}
```
### export_java
syntax:
```python
Grammar().export_java(
java_template=Grammar.JAVA_TEMPLATE,
java_indent=' ' * 4,
java_package=None,
is_public=True)
```
Optional keyword arguments:
- `java_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.JAVA_TEMPLATE.
- `java_indent`: indentation used in the Java file. (default: four spaces)
- `java_package`: Name of the Java package or None when no package is specified. (default: None)
- `is_public`: Class and constructor are defined as public when True, else they will be defined as package private.
For example when using our Quick usage grammar, this is the output when running `my_grammar.export_java()`:
```java
/**
* This grammar is generated using the Grammar.export_java() method and
* should be used with the jleri module.
*
* Source class: MyGrammar
* Created at: 2018-07-04 12:12:34
*/
import jleri.Grammar;
import jleri.Element;
import jleri.Sequence;
import jleri.Regex;
import jleri.Keyword;
public class MyGrammar extends Grammar {
enum Ids {
K_HI,
R_NAME,
START
}
private static final Element R_NAME = new Regex(Ids.R_NAME, "^(?:\"(?:[^\"]*)\")+");
private static final Element K_HI = new Keyword(Ids.K_HI, "hi", false);
private static final Element START = new Sequence(
Ids.START,
K_HI,
R_NAME
);
public MyGrammar() {
super(START, "^\\w+");
}
}
```
### export_py
syntax:
```python
Grammar().export_py(
py_module_name='pyleri',
py_template=Grammar.PY_TEMPLATE,
py_indent=' ' * 4)
```
Optional keyword arguments:
- `py_module_name`: Name of the Pyleri Module. (default: 'pyleri')
- `py_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.PY_TEMPLATE.
- `py_indent`: indentation used in the Python file. (default: 4 spaces)
For example when using our Quick usage grammar, this is the output when running `my_grammar.export_py()`:
```python
"""
This grammar is generated using the Grammar.export_py() method and
should be used with the pyleri python module.
Source class: MyGrammar
Created at: 2017-03-14 19:14:51
"""
import re
from pyleri import Sequence
from pyleri import Keyword
from pyleri import Grammar
from pyleri import Regex
class MyGrammar(Grammar):

    RE_KEYWORDS = re.compile('^\\w+')
    r_name = Regex('^(?:"(?:[^"]*)")+')
    k_hi = Keyword('hi')
    START = Sequence(
        k_hi,
        r_name
    )
```
## Result
The result of the `parse()` method contains 4 properties that will be explained next. A function `as_str(translate=None)` is also available which will
show the result as a string. The `translate` argument should be a function which accepts an element as argument. This function can be used to
return custom strings for certain elements. If the return value of `translate` is `None`, then `as_str()` will fall back to generating a string value for the element itself. If
the return value is an empty string, the element will be ignored.
Example of translate functions:
```python
# In case a translation function returns an empty string, no text is used
def translate(elem):
    return ''  # as a result you get something like: 'error at position x'

# Text may be returned based on the element
def translate(elem):
    if elem is some_elem:
        return 'A'  # something like: error at position x, expecting: A
    elif elem is other_elem:
        return ''  # other_elem will be ignored
    else:
        return None  # fall back to the default string for this element

# A translate function can be used as follow:
print(my_grammar.parse('some string').as_str(translate=translate))
```
### is_valid
`is_valid` returns a boolean value, `True` when the given string is valid according to the given grammar, `False` when not valid.
Let us take the example from Quick usage.
```python
res = my_grammar.parse('bye "Iris"')
print(res.is_valid) # => False
```
### Position
`pos` returns the position where the parser had to stop. (when `is_valid` is `True` this value will be equal to the length of the given string with `str.rstrip()` applied)
Let us take the example from Quick usage.
```python
res = my_grammar.parse('hi Iris')
print(res.is_valid, res.pos) # => False 3
```
### Tree
`tree` contains the parse tree. Even when `is_valid` is `False` the parse tree is returned but will only contain results as far as parsing has succeeded. The tree is the root node which can include several `children` nodes. The structure will be further clarified in the following example which explains a way of visualizing the parse tree.
Example:
```python
import json
from pyleri import Choice
from pyleri import Grammar
from pyleri import Keyword
from pyleri import Regex
from pyleri import Repeat
from pyleri import Sequence


# Create a Grammar Class to define your language
class MyGrammar(Grammar):
    r_name = Regex('(?:"(?:[^"]*)")+')
    k_hi = Keyword('hi')
    k_bye = Keyword('bye')
    START = Repeat(Sequence(Choice(k_hi, k_bye), r_name))


# Returns properties of a node object as a dictionary:
def node_props(node, children):
    return {
        'start': node.start,
        'end': node.end,
        'name': node.element.name if hasattr(node.element, 'name') else None,
        'element': node.element.__class__.__name__,
        'string': node.string,
        'children': children}


# Recursive method to get the children of a node object:
def get_children(children):
    return [node_props(c, get_children(c.children)) for c in children]


# View the parse tree:
def view_parse_tree(res):
    start = res.tree.children[0] \
        if res.tree.children else res.tree
    return node_props(start, get_children(start.children))


if __name__ == '__main__':
    # Compile your grammar by creating an instance of the Grammar Class:
    my_grammar = MyGrammar()
    res = my_grammar.parse('hi "pyleri" bye "pyleri"')
    # The parse tree is visualized as a JSON object:
    print(json.dumps(view_parse_tree(res), indent=2))
```
Part of the output is shown below.
```json
{
"start": 0,
"end": 24,
"name": "START",
"element": "Repeat",
"string": "hi \"pyleri\" bye \"pyleri\"",
"children": [
{
"start": 0,
"end": 11,
"name": null,
"element": "Sequence",
"string": "hi \"pyleri\"",
"children": [
{
"start": 0,
"end": 2,
"name": null,
"element": "Choice",
"string": "hi",
"children": [
{
"start": 0,
"end": 2,
"name": "k_hi",
"element": "Keyword",
"string": "hi",
"children": []
}
]
},
{
"start": 3,
"end": 11,
"name": "r_name",
"element": "Regex",
"string": "\"pyleri\"",
"children": []
}
"..."
"..."
```
A node contains 5 properties that will be explained next:
- `start` returns the position in the string where the node starts.
- `end` returns the position in the string where the node ends.
- `element` returns the [Element](#elements) that matched this part of the string (e.g. a Repeat, Sequence or Keyword); its type can be read with `element.__class__.__name__`. An element can be assigned to a variable; for instance in the example above `Keyword('hi')` was assigned to `k_hi`. With `element.name` the assigned name `k_hi` will be returned. Note that it is not a given that an element is named; in our example `Sequence` was not assigned, thus in this case the element has no attribute `name`.
- `string` returns the part of the string that is parsed by the node.
- `children` returns the child node objects of the node, if there are any. In our example the root node has an element type `Repeat`, starts at 0 and ends at 24, and it has two `children`. These children are node objects that both have an element type `Sequence`, start at 0 and 12 respectively, and so on.
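The properties listed above are enough to walk a parse tree without converting it to a dictionary first. The sketch below is hypothetical helper code and not part of pyleri: `iter_leaves()` relies only on `children`, `start`, `end` and `string`. To keep it runnable on its own, it is exercised on stand-in `SimpleNamespace` objects that mimic those properties.

```python
from types import SimpleNamespace


def iter_leaves(node):
    """Yield (start, end, string) for every node without children."""
    if not node.children:
        yield node.start, node.end, node.string
        return
    for child in node.children:
        yield from iter_leaves(child)


def make_node(start, end, string, children=()):
    # Stand-in for a pyleri node; only mimics the properties used above.
    return SimpleNamespace(
        start=start, end=end, string=string, children=list(children))


# Hand-built tree for the string: hi "pyleri"
text = 'hi "pyleri"'
tree = make_node(0, 11, text, [
    make_node(0, 2, 'hi', [make_node(0, 2, 'hi')]),
    make_node(3, 11, '"pyleri"'),
])

for start, end, string in iter_leaves(tree):
    print(start, end, string)
```

With a real grammar the same function would presumably be called as `iter_leaves(res.tree)`, where `res` is the result of `parse()`.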
### Expecting
`expecting` returns a Python set() containing elements which pyleri expects at `pos`. Even if `is_valid` is true there might be elements in this set, for example when an `Optional()` element could be added to the string. "Expecting" is useful if you want to implement things like auto-completion, syntax error handling, auto-syntax-correction etc. The following example will illustrate a way of implementation.
Example:
```python
import re
import random
from pyleri import Choice
from pyleri import Grammar
from pyleri import Keyword
from pyleri import Repeat
from pyleri import Sequence
from pyleri import end_of_statement


# Create a Grammar Class to define your language.
class MyGrammar(Grammar):
    RE_KEYWORDS = re.compile(r'\S+')
    r_name = Keyword('"pyleri"')
    k_hi = Keyword('hi')
    k_bye = Keyword('bye')
    START = Repeat(Sequence(Choice(k_hi, k_bye), r_name), mi=2)


# Print the expected elements as an indented list, numbered from 1.
def print_expecting(node_expecting, string_expecting):
    for loop, e in enumerate(node_expecting, start=1):
        string_expecting = '{}\n\t({}) {}'.format(string_expecting, loop, e)
    print(string_expecting)


# Complete a string until it is valid according to the grammar.
def auto_correction(string, my_grammar):
    node = my_grammar.parse(string)
    print('\nParsed string: {}'.format(node.tree.string))

    if node.is_valid:
        string_expecting = 'String is valid. \nExpected: '
        print_expecting(node.expecting, string_expecting)
    else:
        string_expecting = 'String is NOT valid.\nExpected: ' \
            if not node.pos \
            else 'String is NOT valid. \nAfter "{}" expected: '.format(
                node.tree.string[:node.pos])
        print_expecting(node.expecting, string_expecting)
        selected = random.choice(list(node.expecting))
        string = '{} {}'.format(node.tree.string[:node.pos],
                                selected
                                if selected
                                is not end_of_statement else '')
        auto_correction(string, my_grammar)


if __name__ == '__main__':
    # Compile your grammar by creating an instance of the Grammar Class.
    my_grammar = MyGrammar()
    string = 'hello "pyleri"'
    auto_correction(string, my_grammar)
```
Output:
```
Parsed string: hello "pyleri"
String is NOT valid.
Expected:
(1) hi
(2) bye
Parsed string: bye
String is NOT valid.
After " bye" expected:
(1) "pyleri"
Parsed string: bye "pyleri"
String is NOT valid.
After " bye "pyleri"" expected:
(1) hi
(2) bye
Parsed string: bye "pyleri" hi
String is NOT valid.
After " bye "pyleri" hi" expected:
(1) "pyleri"
Parsed string: bye "pyleri" hi "pyleri"
String is valid.
Expected:
(1) hi
(2) bye
```
In the above example we parsed a string that is invalid according to the grammar class. The `auto_correction()` function that we built for this example combines all properties of the `parse()` result to create a valid string. The output shows every recursion of `auto_correction()`, each time printing the set of expected elements. It picks one of them at random and adds it to the string. When the string corresponds to the grammar, the property `is_valid` will return `True`. Notably, the `expecting` property still contains elements even when `is_valid` returns `True`. In this example that is because of the [Repeat](#repeat) element, which allows further repetitions.
## Elements
Pyleri has several elements which are all subclasses of [Element](#element) and can be used to create a grammar.
### Keyword
syntax:
```python
Keyword(keyword, ign_case=False)
```
The parser needs to match the keyword, which is just a string. When matching keywords we need to tell the parser which characters are allowed in keywords. By default Pyleri uses `^\w+`, which for ASCII input is equal to `^[A-Za-z0-9_]+` in both Python and JavaScript. We can overwrite the default by setting `RE_KEYWORDS` in the grammar. `Keyword()` accepts one keyword argument `ign_case` to tell the parser whether to match case-insensitively.
Example:
```python
class TicTacToe(Grammar):
    # Let's allow keywords with alphabetic characters and dashes.
    RE_KEYWORDS = re.compile('^[A-Za-z-]+')

    START = Keyword('tic-tac-toe', ign_case=True)

ttt_grammar = TicTacToe()
ttt_grammar.parse('Tic-Tac-Toe').is_valid # => True
```
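To see why `RE_KEYWORDS` matters for the example above, the following sketch uses only the standard `re` module. It illustrates the idea and is not pyleri's actual matching code: the pattern decides how much text counts as one keyword candidate, and the candidate is then compared with the keyword (case-insensitively when `ign_case` is set). The helper name `keyword_matches` is made up for this example.

```python
import re


def keyword_matches(re_keywords, keyword, text, ign_case=False):
    """Return True if `text` starts with `keyword` as a whole word."""
    m = re_keywords.match(text)
    if m is None:
        return False
    candidate = m.group(0)
    if ign_case:
        return candidate.lower() == keyword.lower()
    return candidate == keyword


default = re.compile(r'^\w+')        # stops at the first dash
custom = re.compile(r'^[A-Za-z-]+')  # dashes are part of a keyword

print(keyword_matches(default, 'tic-tac-toe', 'Tic-Tac-Toe', ign_case=True))
# => False, the candidate is only 'Tic'
print(keyword_matches(custom, 'tic-tac-toe', 'Tic-Tac-Toe', ign_case=True))
# => True
print(keyword_matches(custom, 'tic-tac-toe', 'Tic-Tac-Toe'))
# => False, the case differs
```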
### Regex
syntax:
```python
Regex(pattern, flags=0)
```
The parser compiles a regular expression using the `re` module. The current version of pyleri only supports the `re.IGNORECASE` flag.
See the [Quick usage](#quick-usage) example for how to use `Regex`.
### Token
syntax:
```python
Token(token)
```
A token can be one or more characters and is usually used to match operators like `+`, `-`, `//` and so on. When we pass a string object where pyleri expects an element, it will automatically be converted to a `Token()` object.
Example:
```python
class Ni(Grammar):
    t_dash = Token('-')
    # We could just write delimiter='-' because
    # any string will be converted to Token()
    START = List(Keyword('ni'), delimiter=t_dash)

ni = Ni()
ni.parse('ni-ni-ni-ni-ni').is_valid # => True
```
### Tokens
syntax:
```python
Tokens(tokens)
```
Can be used to register multiple tokens at once. The `tokens` argument should be a string with tokens separated by spaces. If given tokens are different in size the parser will try to match the longest tokens first.
Example:
```python
class Ni(Grammar):
    tks = Tokens('+ - !=')
    START = List(Keyword('ni'), delimiter=tks)

ni = Ni()
ni.parse('ni + ni != ni - ni').is_valid # => True
```
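Why the longest tokens are tried first can be shown with plain Python. This is a simplified illustration and not pyleri's implementation; `match_token` is a hypothetical helper.

```python
def match_token(tokens, text, pos=0):
    """Return the longest token found at `pos`, or None."""
    # Sorting by length (longest first) prevents '!' from shadowing '!='.
    for token in sorted(tokens.split(), key=len, reverse=True):
        if text.startswith(token, pos):
            return token
    return None


print(match_token('! != + -', 'ni != ni', pos=3))  # => !=
print(match_token('! != + -', 'ni ! ni', pos=3))   # => !
print(match_token('! != + -', 'ni * ni', pos=3))   # => None
```

If the tokens were tried in the order given, `!` would match first and leave a stray `=` behind.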
### Sequence
syntax:
```python
Sequence(element, element, ...)
```
The parser needs to match each element in a sequence.
Example:
```python
class TicTacToe(Grammar):
    START = Sequence(Keyword('Tic'), Keyword('Tac'), Keyword('Toe'))

ttt_grammar = TicTacToe()
ttt_grammar.parse('Tic Tac Toe').is_valid # => True
```
### Choice
syntax:
```python
Choice(element, element, ..., most_greedy=True)
```
The parser needs to choose between one of the given elements. Choice accepts one keyword argument `most_greedy` which is `True` by default. When `most_greedy` is set to `False` the parser will stop at the first match. When `True` the parser will try each element and return the longest match. Setting `most_greedy` to `False` can provide some extra performance. Note that the parser will try to match each element in the exact same order in which they are passed to Choice.
Example: let us use `Choice` to modify the Quick usage example to allow the string 'bye "Iris"'
```python
class MyGrammar(Grammar):
    r_name = Regex('(?:"(?:[^"]*)")+')
    k_hi = Keyword('hi')
    k_bye = Keyword('bye')
    START = Sequence(Choice(k_hi, k_bye), r_name)

my_grammar = MyGrammar()
my_grammar.parse('hi "Iris"').is_valid # => True
my_grammar.parse('bye "Iris"').is_valid # => True
```
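The difference between the two `most_greedy` modes can be sketched without pyleri. In this simplified illustration each alternative is a plain regular expression rather than a pyleri element, and `choose` is a hypothetical helper.

```python
import re


def choose(patterns, text, most_greedy=True):
    """Return the text matched by a choice between regex alternatives."""
    best = None
    for pattern in patterns:  # alternatives are tried in the given order
        m = re.match(pattern, text)
        if m is None:
            continue
        if not most_greedy:
            return m.group(0)  # first match wins
        if best is None or len(m.group(0)) > len(best):
            best = m.group(0)  # keep the longest match so far
    return best


alternatives = [r'ni', r'ni ni ni']
print(choose(alternatives, 'ni ni ni', most_greedy=False))  # => ni
print(choose(alternatives, 'ni ni ni', most_greedy=True))   # => ni ni ni
```

With first-match behavior the order of the alternatives decides the outcome, which is why the shorter alternative wins in the first call.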
### Repeat
syntax:
```python
Repeat(element, mi=0, ma=None)
```
The parser needs at least `mi` elements and at most `ma` elements. When `ma` is set to `None` an unlimited number of elements is allowed. `mi` can be any integer value equal to or higher than 0 but not larger than `ma`.
Example:
```python
class Ni(Grammar):
    START = Repeat(Keyword('ni'))

ni = Ni()
ni.parse('ni ni ni ni ni').is_valid # => True
```
It is not allowed to bind a name to the same element twice. `Repeat(element, 1, 1)` is a common way to bind the element a second (or further) time.
For example consider the following:
```python
class MyGrammar(Grammar):
    r_name = Regex('(?:"(?:[^"]*)")+')

    # Raises a SyntaxError because we try to bind a second time.
    r_address = r_name # WRONG

    # Instead use Repeat
    r_address = Repeat(r_name, 1, 1) # RIGHT
```
### List
syntax:
```python
List(element, delimiter=',', mi=0, ma=None, opt=False)
```
List is like Repeat but with a delimiter. A comma is used as default delimiter but any element is allowed. When a string is used as delimiter it will be converted to a `Token` element. `mi` and `ma` work exactly like with Repeat. An optional keyword argument `opt` can be set to `True` to allow the list to end with a delimiter. By default this is set to `False` which means the list has to end with an element.
Example:
```python
class Ni(Grammar):
    START = List(Keyword('ni'))

ni = Ni()
ni.parse('ni, ni, ni, ni, ni').is_valid # => True
```
### Optional
syntax:
```python
Optional(element)
```
The parser looks for an optional element. It is like using `Repeat(element, 0, 1)` but we encourage using `Optional` since it is more readable (and slightly faster).
Example:
```python
class MyGrammar(Grammar):
    r_name = Regex('(?:"(?:[^"]*)")+')
    k_hi = Keyword('hi')
    START = Sequence(k_hi, Optional(r_name))

my_grammar = MyGrammar()
my_grammar.parse('hi "Iris"').is_valid # => True
my_grammar.parse('hi').is_valid # => True
```
### Ref
syntax:
```python
Ref()
```
The grammar can make a forward reference to make recursion possible. In the example below we create a forward reference to START but note that
a reference to any element can be made.
>Warning: A reference is not protected against testing the same position in
>a string. This could potentially lead to an infinite loop.
>For example:
>```python
>r = Ref()
>r = Optional(r) # DON'T DO THIS
>```
>Use [Prio](#prio) if such recursive construction is required.
Example:
```python
class NestedNi(Grammar):
    START = Ref()
    ni_item = Choice(Keyword('ni'), START)
    START = Sequence('[', List(ni_item), ']')

nested_ni = NestedNi()
nested_ni.parse('[ni, ni, [ni, [], [ni, ni]]]').is_valid # => True
```
### Prio
syntax:
```python
Prio(element, element, ...)
```
Choose the first match from the prio elements and allow `THIS` for recursive operations. With `THIS` we point to the `Prio` element. The example below shows how `Prio` and `THIS` can be used.
>Note: Use a [Ref](#ref) when possible.
>A `Prio` element is required when the same position in a string is potentially
>checked more than once.
Example:
```python
class Ni(Grammar):
    k_ni = Keyword('ni')
    START = Prio(
        k_ni,
        # '(' and ')' are automatically converted to Token('(') and Token(')')
        Sequence('(', THIS, ')'),
        Sequence(THIS, Keyword('or'), THIS),
        Sequence(THIS, Keyword('and'), THIS))

ni = Ni()
ni.parse('(ni or ni) and (ni or ni)').is_valid # => True
```
Raw data
{
"_id": null,
"home_page": "https://github.com/cesbit/pyleri",
"name": "pyleri",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "parser, grammar, autocompletion",
"author": "Jeroen van der Heijden",
"author_email": "jeroen@cesbit.com",
"download_url": "https://files.pythonhosted.org/packages/93/6a/4a2a8a05a4945b253d40654149056ae03b9d5747f3c1c423bb93f1e6d13f/pyleri-1.4.3.tar.gz",
"platform": null,
"description": "[![CI](https://github.com/cesbit/pyleri/workflows/CI/badge.svg)](https://github.com/cesbit/pyleri/actions)\n[![Release Version](https://img.shields.io/github/release/cesbit/pyleri)](https://github.com/cesbit/pyleri/releases)\n\nPython Left-Right Parser\n========================\nPyleri is an easy-to-use parser created for [SiriDB](http://siridb.net/). We first used [lrparsing](http://lrparsing.sourceforge.net/doc/html/) and wrote [jsleri](https://github.com/cesbit/jsleri) for auto-completion and suggestions in our web console. Later we found small issues within the `lrparsing` module and also had difficulties keeping the language the same in all projects. That is when we decided to create Pyleri which can export a created grammar to JavaScript, C, Python, Go and Java.\n\nGabriele Tomassetti [wrote a tutorial](https://tomassetti.me/pyleri-tutorial/) about the pyleri library.\n\n---------------------------------------\n * [Related projects](#related-projects)\n * [Installation](#installation)\n * [Quick usage](#quick-usage)\n * [Grammar](#grammar)\n * [Grammar.parse()](#parse)\n * [Grammar.export_js()](#export_js)\n * [Grammar.export_c()](#export_c)\n * [Grammar.export_go()](#export_go)\n * [Grammar.export_java()](#export_java)\n * [Grammar.export_py()](#export_py)\n * [Result](#result)\n * [is_valid](#is_valid)\n * [Position](#position)\n * [Tree](#tree)\n * [Expecting](#expecting)\n * [Elements](#elements)\n * [Keyword](#keyword)\n * [Regex](#regex)\n * [Token](#token)\n * [Tokens](#tokens)\n * [Sequence](#sequence)\n * [Choice](#choice)\n * [Repeat](#repeat)\n * [List](#list)\n * [Optional](#optional)\n * [Ref](#ref)\n * [Prio](#prio)\n\n\n---------------------------------------\n## Related projects\n- [jsleri](https://github.com/cesbit/jsleri): JavaScript parser\n- [libcleri](https://github.com/cesbit/libcleri): C parser\n- [goleri](https://github.com/cesbit/goleri): Go parser\n- [jleri](https://github.com/cesbit/jleri): Java parser\n\n## 
Installation\nThe easiest way is to use PyPI:\n\n sudo pip3 install pyleri\n\n## Quick usage\n```python\n# Imports, note that we skip the imports in other examples...\nfrom pyleri import (\n Grammar,\n Keyword,\n Regex,\n Sequence)\n\n# Create a Grammar Class to define your language\nclass MyGrammar(Grammar):\n r_name = Regex('(?:\"(?:[^\"]*)\")+')\n k_hi = Keyword('hi')\n START = Sequence(k_hi, r_name)\n\n# Compile your grammar by creating an instance of the Grammar Class.\nmy_grammar = MyGrammar()\n\n# Use the compiled grammar to parse 'strings'\nprint(my_grammar.parse('hi \"Iris\"').is_valid) # => True\nprint(my_grammar.parse('bye \"Iris\"').is_valid) # => False\nprint(my_grammar.parse('bye \"Iris\"').as_str()) # => error at position 0, expecting: hi\n```\n\n## Grammar\nWhen writing a grammar you should subclass Grammar. A Grammar expects at least a `START` property so the parser knows where to start parsing. Grammar has some default properties which can be overwritten like `RE_KEYWORDS`, which will be explained later. Grammar also has a parse method: `parse()`, and a few export methods: [export_js()](#export_js), [export_c()](#export_c), [export_py()](#export_py), [export_go()](#export_go) and [export_java()](#export_java) which are explained below.\n\n\n### parse\nsyntax:\n```python\nGrammar().parse(string)\n```\nThe `parse()` method returns a result object which has the following properties that are further explained in [Result](#result):\n- `expecting`\n- `is_valid`\n- `pos`\n- `tree`\n\n\n### export_js\nsyntax:\n```python\nGrammar().export_js(\n js_module_name='jsleri',\n js_template=Grammar.JS_TEMPLATE,\n js_indent=' ' * 4)\n```\nOptional keyword arguments:\n- `js_module_name`: Name of the JavaScript module. (default: 'jsleri')\n- `js_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.JS_TEMPLATE.\n- `js_indent`: indentation used in the JavaScript file. 
(default: 4 spaces)\n\nFor example when using our Quick usage grammar, this is the output when running `my_grammar.export_js()`:\n```javascript\n/* jshint newcap: false */\n\n/*\n * This grammar is generated using the Grammar.export_js() method and\n * should be used with the jsleri JavaScript module.\n *\n * Source class: MyGrammar\n * Created at: 2015-11-04 10:06:06\n */\n\n'use strict';\n\n(function (\n Regex,\n Sequence,\n Keyword,\n Grammar\n ) {\n var r_name = Regex('^(?:\"(?:[^\"]*)\")+');\n var k_hi = Keyword('hi');\n var START = Sequence(\n k_hi,\n r_name\n );\n\n window.MyGrammar = Grammar(START, '^\\w+');\n\n})(\n window.jsleri.Regex,\n window.jsleri.Sequence,\n window.jsleri.Keyword,\n window.jsleri.Grammar\n);\n```\n\n### export_c\nsyntax:\n```python\nGrammar().export_c(\n target=Grammar.C_TARGET,\n c_indent=' ' * 4)\n```\nOptional keyword arguments:\n- `target`: Name of the c module. (default: 'grammar')\n- `c_indent`: indentation used in the c files. (default: 4 spaces)\n\nThe return value is a tuple containing the source (c) file and header (h) file.\n\nFor example when using our Quick usage grammar, this is the output when running `my_grammar.export_c()`:\n```c\n/*\n * grammar.c\n *\n * This grammar is generated using the Grammar.export_c() method and\n * should be used with the libcleri module.\n *\n * Source class: MyGrammar\n * Created at: 2016-05-09 12:16:49\n */\n\n#include \"grammar.h\"\n#include <stdio.h>\n\n#define CLERI_CASE_SENSITIVE 0\n#define CLERI_CASE_INSENSITIVE 1\n\n#define CLERI_FIRST_MATCH 0\n#define CLERI_MOST_GREEDY 1\n\ncleri_grammar_t * compile_grammar(void)\n{\n cleri_t * r_name = cleri_regex(CLERI_GID_R_NAME, \"^(?:\\\"(?:[^\\\"]*)\\\")+\");\n cleri_t * k_hi = cleri_keyword(CLERI_GID_K_HI, \"hi\", CLERI_CASE_INSENSITIVE);\n cleri_t * START = cleri_sequence(\n CLERI_GID_START,\n 2,\n k_hi,\n r_name\n );\n\n cleri_grammar_t * grammar = cleri_grammar(START, \"^\\\\w+\");\n\n return grammar;\n}\n```\nand the header 
file...\n```c\n/*\n * grammar.h\n *\n * This grammar is generated using the Grammar.export_c() method and\n * should be used with the libcleri module.\n *\n * Source class: MyGrammar\n * Created at: 2016-05-09 12:16:49\n */\n#ifndef CLERI_EXPORT_GRAMMAR_H_\n#define CLERI_EXPORT_GRAMMAR_H_\n\n#include <grammar.h>\n#include <cleri/cleri.h>\n\ncleri_grammar_t * compile_grammar(void);\n\nenum cleri_grammar_ids {\n CLERI_NONE, // used for objects with no name\n CLERI_GID_K_HI,\n CLERI_GID_R_NAME,\n CLERI_GID_START,\n CLERI_END // can be used to get the enum length\n};\n\n#endif /* CLERI_EXPORT_GRAMMAR_H_ */\n\n```\n### export_go\nsyntax:\n```python\nGrammar().export_go(\n go_template=Grammar.GO_TEMPLATE,\n go_indent='\\t',\n go_package='grammar')\n```\nOptional keyword arguments:\n- `go_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.GO_TEMPLATE.\n- `go_indent`: indentation used in the Go file. (default: one tab)\n- `go_package`: Name of the go package. 
(default: 'grammar')\n\nFor example when using our Quick usage grammar, this is the output when running `my_grammar.export_go()`:\n```go\npackage grammar\n\n// This grammar is generated using the Grammar.export_go() method and\n// should be used with the goleri module.\n//\n// Source class: MyGrammar\n// Created at: 2017-03-14 19:07:09\n\nimport (\n \"regexp\"\n\n \"github.com/cesbit/goleri\"\n)\n\n// Element indentifiers\nconst (\n NoGid = iota\n GidKHi = iota\n GidRName = iota\n GidSTART = iota\n)\n\n// MyGrammar returns a compiled goleri grammar.\nfunc MyGrammar() *goleri.Grammar {\n rName := goleri.NewRegex(GidRName, regexp.MustCompile(`^(?:\"(?:[^\"]*)\")+`))\n kHi := goleri.NewKeyword(GidKHi, \"hi\", false)\n START := goleri.NewSequence(\n GidSTART,\n kHi,\n rName,\n )\n return goleri.NewGrammar(START, regexp.MustCompile(`^\\w+`))\n}\n```\n### export_java\nsyntax:\n```python\nGrammar().export_java(\n java_template=Grammar.JAVA_TEMPLATE,\n java_indent=' ' * 4,\n java_package=None,\n is_public=True)\n```\nOptional keyword arguments:\n- `java_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.JAVA_TEMPLATE.\n- `java_indent`: indentation used in the Java file. (default: four spaces)\n- `java_package`: Name of the Java package or None when no package is specified. 
(default: None)\n- `is_public`: Class and constructor are defined as public when True, else they will be defined as package private.\n\nFor example when using our Quick usage grammar, this is the output when running `my_grammar.export_java()`:\n```java\n/**\n * This grammar is generated using the Grammar.export_java() method and\n * should be used with the jleri module.\n *\n * Source class: MyGrammar\n * Created at: 2018-07-04 12:12:34\n */\n\nimport jleri.Grammar;\nimport jleri.Element;\nimport jleri.Sequence;\nimport jleri.Regex;\nimport jleri.Keyword;\n\npublic class MyGrammar extends Grammar {\n enum Ids {\n K_HI,\n R_NAME,\n START\n }\n\n private static final Element R_NAME = new Regex(Ids.R_NAME, \"^(?:\\\"(?:[^\\\"]*)\\\")+\");\n private static final Element K_HI = new Keyword(Ids.K_HI, \"hi\", false);\n private static final Element START = new Sequence(\n Ids.START,\n K_HI,\n R_NAME\n );\n\n public MyGrammar() {\n super(START, \"^\\\\w+\");\n }\n}\n```\n### export_py\nsyntax:\n```python\nGrammar().export_py(\n py_module_name='pyleri',\n py_template=Grammar.PY_TEMPLATE,\n py_indent=' ' * 4)\n```\nOptional keyword arguments:\n- `py_module_name`: Name of the Pyleri Module. (default: 'pyleri')\n- `py_template`: Template String used for the export. You might want to look at the default string which can be found at Grammar.PY_TEMPLATE.\n- `py_indent`: indentation used in the Python file. 
(default: 4 spaces)\n\nFor example, when using our Quick usage grammar, this is the output when running `my_grammar.export_py()`:\n```python\n\"\"\"\n This grammar is generated using the Grammar.export_py() method and\n should be used with the pyleri python module.\n\n Source class: MyGrammar\n Created at: 2017-03-14 19:14:51\n\"\"\"\nimport re\nfrom pyleri import Sequence\nfrom pyleri import Keyword\nfrom pyleri import Grammar\nfrom pyleri import Regex\n\nclass MyGrammar(Grammar):\n\n    RE_KEYWORDS = re.compile('^\\\\w+')\n    r_name = Regex('^(?:\"(?:[^\"]*)\")+')\n    k_hi = Keyword('hi')\n    START = Sequence(\n        k_hi,\n        r_name\n    )\n```\n\n## Result\nThe result of the `parse()` method contains 4 properties that will be explained next. A function `as_str(translate=None)` is also available which will\nshow the result as a string. The `translate` argument should be a function which accepts an element as argument. This function can be used to\nreturn custom strings for certain elements. If the return value of `translate` is `None` then `as_str` will fall back to generating a string value itself. 
If\nthe return value is an empty string, the value will be ignored.\n\nExample of translate functions:\n```python\n# In case a translation function returns an empty string, no text is used\ndef translate(elem):\n    return ''  # as a result you get something like: 'error at position x'\n\n# Text may be returned based on the element\ndef translate(elem):\n    if elem is some_elem:\n        return 'A'  # something like: error at position x, expecting: A\n    elif elem is other_elem:\n        return ''  # other_elem will be ignored\n    else:\n        return None  # normal parsing\n\n# A translate function can be used as follows:\nprint(my_grammar.parse('some string').as_str(translate=translate))\n```\n\n### is_valid\n`is_valid` returns a boolean value, `True` when the given string is valid according to the given grammar, `False` when not valid.\n\nLet us take the example from Quick usage.\n```python\nres = my_grammar.parse('bye \"Iris\"')\nprint(res.is_valid)  # => False\n```\n\n### Position\n`pos` returns the position where the parser had to stop. When `is_valid` is `True`, this value is equal to the length of the given string with `str.rstrip()` applied.\n\nLet us take the example from Quick usage.\n```python\nresult = my_grammar.parse('hi Iris')\nprint(result.is_valid, result.pos)  # => False, 3\n```\n\n### Tree\n`tree` contains the parse tree. Even when `is_valid` is `False` the parse tree is returned but will only contain results as far as parsing has succeeded. The tree is the root node which can include several `children` nodes. 
The structure will be further clarified in the following example which explains a way of visualizing the parse tree.\n\nExample:\n```python\nimport json\nfrom pyleri import Choice\nfrom pyleri import Grammar\nfrom pyleri import Keyword\nfrom pyleri import Regex\nfrom pyleri import Repeat\nfrom pyleri import Sequence\n\n\n# Create a Grammar Class to define your language\nclass MyGrammar(Grammar):\n    r_name = Regex('(?:\"(?:[^\"]*)\")+')\n    k_hi = Keyword('hi')\n    k_bye = Keyword('bye')\n    START = Repeat(Sequence(Choice(k_hi, k_bye), r_name))\n\n\n# Returns properties of a node object as a dictionary:\ndef node_props(node, children):\n    return {\n        'start': node.start,\n        'end': node.end,\n        'name': node.element.name if hasattr(node.element, 'name') else None,\n        'element': node.element.__class__.__name__,\n        'string': node.string,\n        'children': children}\n\n\n# Recursive method to get the children of a node object:\ndef get_children(children):\n    return [node_props(c, get_children(c.children)) for c in children]\n\n\n# View the parse tree:\ndef view_parse_tree(res):\n    start = res.tree.children[0] \\\n        if res.tree.children else res.tree\n    return node_props(start, get_children(start.children))\n\n\nif __name__ == '__main__':\n    # Compile your grammar by creating an instance of the Grammar Class:\n    my_grammar = MyGrammar()\n    res = my_grammar.parse('hi \"pyleri\" bye \"pyleri\"')\n    # The parse tree is visualized as a JSON object:\n    print(json.dumps(view_parse_tree(res), indent=2))\n```\n\nPart of the output is shown below.\n\n```json\n\n {\n \"start\": 0,\n \"end\": 23,\n \"name\": \"START\",\n \"element\": \"Repeat\",\n \"string\": \"hi \\\"pyleri\\\" bye \\\"pyleri\\\"\",\n \"children\": [\n {\n \"start\": 0,\n \"end\": 11,\n \"name\": null,\n \"element\": \"Sequence\",\n \"string\": \"hi \\\"pyleri\\\"\",\n \"children\": [\n {\n \"start\": 0,\n \"end\": 2,\n \"name\": null,\n \"element\": \"Choice\",\n \"string\": \"hi\",\n \"children\": [\n {\n \"start\": 0,\n \"end\": 2,\n 
\"name\": \"k_hi\",\n \"element\": \"Keyword\",\n \"string\": \"hi\",\n \"children\": []\n }\n ]\n },\n {\n \"start\": 3,\n \"end\": 11,\n \"name\": \"r_name\",\n \"element\": \"Regex\",\n \"string\": \"\\\"pyleri\\\"\",\n \"children\": []\n }\n\n \"...\"\n \"...\"\n\n\n```\nA node contains 5 properties that will be explained next:\n\n- `start` property returns the start of the node object.\n- `end` property returns the end of the node object.\n- `element` returns the [Element](#elements)'s type (e.g. Repeat, Sequence, Keyword, etc.). An element can be assigned to a variable; for instance in the example above `Keyword('hi')` was assigned to `k_hi`. With `element.name` the assigned name `k_hi` will be returned. Note that it is not a given that an element is named; in our example `Sequence` was not assigned, thus in this case the element has no attribute `name`.\n- `string` returns the string that is parsed.\n- `children` can return a node object containing deeper layered nodes provided that there are any. In our example the root node has an element type `Repeat()`, starts at 0 and ends at 24, and it has two `children`. These children are node objects that have both an element type `Sequence`, start at 0 and 12 respectively, and so on.\n\n\n### Expecting\n`expecting` returns a Python set() containing elements which pyleri expects at `pos`. Even if `is_valid` is true there might be elements in this set, for example when an `Optional()` element could be added to the string. \"Expecting\" is useful if you want to implement things like auto-completion, syntax error handling, auto-syntax-correction etc. 
The following example illustrates one way to implement this.\n\nExample:\n```python\nimport re\nimport random\nfrom pyleri import Choice\nfrom pyleri import Grammar\nfrom pyleri import Keyword\nfrom pyleri import Repeat\nfrom pyleri import Sequence\nfrom pyleri import end_of_statement\n\n\n# Create a Grammar Class to define your language.\nclass MyGrammar(Grammar):\n    RE_KEYWORDS = re.compile(r'\\S+')\n    r_name = Keyword('\"pyleri\"')\n    k_hi = Keyword('hi')\n    k_bye = Keyword('bye')\n    START = Repeat(Sequence(Choice(k_hi, k_bye), r_name), mi=2)\n\n\n# Print the expected elements as an indented and numbered list.\ndef print_expecting(node_expecting, string_expecting):\n    # Number the items from 1 so the list matches the output shown below.\n    for loop, e in enumerate(node_expecting, start=1):\n        string_expecting = '{}\\n\\t({}) {}'.format(string_expecting, loop, e)\n    print(string_expecting)\n\n\n# Complete a string until it is valid according to the grammar.\ndef auto_correction(string, my_grammar):\n    node = my_grammar.parse(string)\n    print('\\nParsed string: {}'.format(node.tree.string))\n\n    if node.is_valid:\n        string_expecting = 'String is valid. \\nExpected: '\n        print_expecting(node.expecting, string_expecting)\n\n    else:\n        string_expecting = 'String is NOT valid.\\nExpected: ' \\\n            if not node.pos \\\n            else 'String is NOT valid. 
\\nAfter \"{}\" expected: '.format(\n                node.tree.string[:node.pos])\n        print_expecting(node.expecting, string_expecting)\n\n        selected = random.choice(list(node.expecting))\n        string = '{} {}'.format(node.tree.string[:node.pos],\n                                selected\n                                if selected\n                                is not end_of_statement else '')\n\n        auto_correction(string, my_grammar)\n\n\nif __name__ == '__main__':\n    # Compile your grammar by creating an instance of the Grammar Class.\n    my_grammar = MyGrammar()\n    string = 'hello \"pyleri\"'\n    auto_correction(string, my_grammar)\n\n```\n\nOutput:\n```\nParsed string: hello \"pyleri\"\nString is NOT valid.\nExpected:\n (1) hi\n (2) bye\n\nParsed string: bye\nString is NOT valid.\nAfter \" bye\" expected:\n (1) \"pyleri\"\n\nParsed string: bye \"pyleri\"\nString is NOT valid.\nAfter \" bye \"pyleri\"\" expected:\n (1) hi\n (2) bye\n\nParsed string: bye \"pyleri\" hi\nString is NOT valid.\nAfter \" bye \"pyleri\" hi\" expected:\n (1) \"pyleri\"\n\nParsed string: bye \"pyleri\" hi \"pyleri\"\nString is valid.\nExpected:\n (1) hi\n (2) bye\n\n```\nIn the above example we parsed a string that is invalid according to the grammar class. The `auto_correction()` function that we built for this example combines all properties of the `parse()` result to create a valid string. The output shows every recursive call of `auto_correction()`, each of which prints the set of expected elements, picks one at random and appends it to the string. When the string corresponds to the grammar, the property `is_valid` will return `True`. Notably the `expecting` property still contains elements even when `is_valid` is `True`. In this example that is because of the [Repeat](#repeat) element.\n\n## Elements\nPyleri has several elements which are all subclasses of [Element](#element) and can be used to create a grammar.\n\n### Keyword\nsyntax:\n```python\nKeyword(keyword, ign_case=False)\n```\nThe parser needs to match the keyword which is just a string. 
When matching keywords we need to tell the parser what characters are allowed in keywords. By default Pyleri uses `^\\w+` which is both in Python and JavaScript equal to `^[A-Za-z0-9_]+`. We can overwrite the default by setting `RE_KEYWORDS` in the grammar. `Keyword()` accepts one keyword argument `ign_case` to tell the parser whether to match case-insensitively.\n\nExample:\n\n```python\nclass TicTacToe(Grammar):\n    # Let's allow keywords with alphabetic characters and dashes.\n    RE_KEYWORDS = re.compile('^[A-Za-z-]+')\n\n    START = Keyword('tic-tac-toe', ign_case=True)\n\nttt_grammar = TicTacToe()\nttt_grammar.parse('Tic-Tac-Toe').is_valid  # => True\n```\n\n### Regex\nsyntax:\n```python\nRegex(pattern, flags=0)\n```\nThe parser compiles a regular expression using the `re` module. The current version of pyleri only supports the `re.IGNORECASE` flag.\nSee the [Quick usage](#quick-usage) example for how to use `Regex`.\n\n### Token\nsyntax:\n```python\nToken(token)\n```\nA token can be one or more characters and is usually used to match operators like `+`, `-`, `//` and so on. When we pass a string object where pyleri expects an element, it will automatically be converted to a `Token()` object.\n\nExample:\n```python\nclass Ni(Grammar):\n    t_dash = Token('-')\n    # We could just write delimiter='-' because\n    # any string will be converted to Token()\n    START = List(Keyword('ni'), delimiter=t_dash)\n\nni = Ni()\nni.parse('ni-ni-ni-ni-ni').is_valid  # => True\n```\n\n### Tokens\nsyntax:\n```python\nTokens(tokens)\n```\nCan be used to register multiple tokens at once. The `tokens` argument should be a string with tokens separated by spaces. 
If the given tokens differ in length, the parser will try to match the longest tokens first.\n\nExample:\n```python\nclass Ni(Grammar):\n    tks = Tokens('+ - !=')\n    START = List(Keyword('ni'), delimiter=tks)\n\nni = Ni()\nni.parse('ni + ni != ni - ni').is_valid  # => True\n```\n\n### Sequence\nsyntax:\n```python\nSequence(element, element, ...)\n```\nThe parser needs to match each element in a sequence.\n\nExample:\n```python\nclass TicTacToe(Grammar):\n    START = Sequence(Keyword('Tic'), Keyword('Tac'), Keyword('Toe'))\n\nttt_grammar = TicTacToe()\nttt_grammar.parse('Tic Tac Toe').is_valid  # => True\n```\n\n### Choice\nsyntax:\n```python\nChoice(element, element, ..., most_greedy=True)\n```\nThe parser needs to choose one of the given elements. Choice accepts one keyword argument `most_greedy` which is `True` by default. When `most_greedy` is set to `False` the parser will stop at the first match. When `True` the parser will try each element and return the longest match. Setting `most_greedy` to `False` can provide some extra performance. Note that the parser will try to match each element in the exact same order they are passed to Choice.\n\nExample: let us use `Choice` to modify the Quick usage example to allow the string 'bye \"Iris\"'\n```python\nclass MyGrammar(Grammar):\n    r_name = Regex('(?:\"(?:[^\"]*)\")+')\n    k_hi = Keyword('hi')\n    k_bye = Keyword('bye')\n    START = Sequence(Choice(k_hi, k_bye), r_name)\n\nmy_grammar = MyGrammar()\nmy_grammar.parse('hi \"Iris\"').is_valid  # => True\nmy_grammar.parse('bye \"Iris\"').is_valid  # => True\n```\n\n### Repeat\nsyntax:\n```python\nRepeat(element, mi=0, ma=None)\n```\nThe parser needs at least `mi` elements and at most `ma` elements. When `ma` is set to `None` we allow an unlimited number of elements. 
`mi` can be any integer value greater than or equal to 0 but not larger than `ma`.\n\nExample:\n```python\nclass Ni(Grammar):\n    START = Repeat(Keyword('ni'))\n\nni = Ni()\nni.parse('ni ni ni ni ni').is_valid  # => True\n```\n\nIt is not allowed to bind a name to the same element twice. `Repeat(element, 1, 1)` is a common solution to bind the element a second (or more) time(s).\n\nFor example consider the following:\n```python\nclass MyGrammar(Grammar):\n    r_name = Regex('(?:\"(?:[^\"]*)\")+')\n\n    # Raises a SyntaxError because we try to bind a second time.\n    r_address = r_name  # WRONG\n\n    # Instead use Repeat\n    r_address = Repeat(r_name, 1, 1)  # RIGHT\n```\n\n### List\nsyntax:\n```python\nList(element, delimiter=',', mi=0, ma=None, opt=False)\n```\nList is like Repeat but with a delimiter. A comma is used as default delimiter but any element is allowed. When a string is used as delimiter it will be converted to a `Token` element. `mi` and `ma` work exactly like with Repeat. An optional keyword argument `opt` can be set to `True` to allow the list to end with a delimiter. By default this is set to `False` which means the list has to end with an element.\n\nExample:\n```python\nclass Ni(Grammar):\n    START = List(Keyword('ni'))\n\nni = Ni()\nni.parse('ni, ni, ni, ni, ni').is_valid  # => True\n```\n\n### Optional\nsyntax:\n```python\nOptional(element)\n```\nThe parser looks for an optional element. It is like using `Repeat(element, 0, 1)` but we encourage using `Optional` since it is more readable (and slightly faster).\n\nExample:\n```python\nclass MyGrammar(Grammar):\n    r_name = Regex('(?:\"(?:[^\"]*)\")+')\n    k_hi = Keyword('hi')\n    START = Sequence(k_hi, Optional(r_name))\n\nmy_grammar = MyGrammar()\nmy_grammar.parse('hi \"Iris\"').is_valid  # => True\nmy_grammar.parse('hi').is_valid  # => True\n```\n\n### Ref\nsyntax:\n```python\nRef()\n```\nThe grammar can make a forward reference to make recursion possible. 
In the example below we create a forward reference to START but note that\na reference to any element can be made.\n\n>Warning: A reference is not protected against testing the same position in\n>a string. This could potentially lead to an infinite loop.\n>For example:\n>```python\n>r = Ref()\n>r = Optional(r)  # DON'T DO THIS\n>```\n>Use [Prio](#prio) if such recursive construction is required.\n\nExample:\n```python\nclass NestedNi(Grammar):\n    START = Ref()\n    ni_item = Choice(Keyword('ni'), START)\n    START = Sequence('[', List(ni_item), ']')\n\nnested_ni = NestedNi()\nnested_ni.parse('[ni, ni, [ni, [], [ni, ni]]]').is_valid  # => True\n```\n\n### Prio\nsyntax:\n```python\nPrio(element, element, ...)\n```\nChoose the first match from the prio elements and allow `THIS` for recursive operations. With `THIS` we point to the `Prio` element. The example below shows how `Prio` and `THIS` can be used.\n\n>Note: Use a [Ref](#ref) when possible.\n>A `Prio` element is required when the same position in a string is potentially\n>checked more than once.\n\nExample:\n```python\nclass Ni(Grammar):\n    k_ni = Keyword('ni')\n    START = Prio(\n        k_ni,\n        # '(' and ')' are automatically converted to Token('(') and Token(')')\n        Sequence('(', THIS, ')'),\n        Sequence(THIS, Keyword('or'), THIS),\n        Sequence(THIS, Keyword('and'), THIS))\n\nni = Ni()\nni.parse('(ni or ni) and (ni or ni)').is_valid  # => True\n```\n",
"bugtrack_url": null,
"license": null,
"summary": "Python Left-Right Parser",
"version": "1.4.3",
"project_urls": {
"Download": "https://github.com/cesbit/pyleri/tarball/1.4.3",
"Homepage": "https://github.com/cesbit/pyleri"
},
"split_keywords": [
"parser",
" grammar",
" autocompletion"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "936a4a2a8a05a4945b253d40654149056ae03b9d5747f3c1c423bb93f1e6d13f",
"md5": "7dc0dd922f83ab3eba1e396526375d37",
"sha256": "17ac2a2e934bf1d9432689d558e9787960738d64aa789bc3a6760c2823cb67d2"
},
"downloads": -1,
"filename": "pyleri-1.4.3.tar.gz",
"has_sig": false,
"md5_digest": "7dc0dd922f83ab3eba1e396526375d37",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 36111,
"upload_time": "2024-03-26T09:20:51",
"upload_time_iso_8601": "2024-03-26T09:20:51.415461Z",
"url": "https://files.pythonhosted.org/packages/93/6a/4a2a8a05a4945b253d40654149056ae03b9d5747f3c1c423bb93f1e6d13f/pyleri-1.4.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-26 09:20:51",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cesbit",
"github_project": "pyleri",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "pyleri"
}