:Name: flexparser
:Version: 0.2.1
:Summary: Parsing made fun ... using typing.
:Upload time: 2024-03-08 21:35:39
:Author: Hernan E. Grecco <hernan.grecco@gmail.com>
:Requires Python: >=3.9
:License: BSD-3-Clause
:Keywords: parser, code, parsing, source

.. image:: https://img.shields.io/pypi/v/flexparser.svg
    :target: https://pypi.python.org/pypi/flexparser
    :alt: Latest Version

.. image:: https://img.shields.io/pypi/l/flexparser.svg
    :target: https://pypi.python.org/pypi/flexparser
    :alt: License

.. image:: https://img.shields.io/pypi/pyversions/flexparser.svg
    :target: https://pypi.python.org/pypi/flexparser
    :alt: Python Versions

.. image:: https://github.com/hgrecco/flexparser/workflows/CI/badge.svg
    :target: https://github.com/hgrecco/flexparser/actions?query=workflow%3ACI
    :alt: CI

.. image:: https://github.com/hgrecco/flexparser/workflows/Lint/badge.svg
    :target: https://github.com/hgrecco/flexparser/actions?query=workflow%3ALint
    :alt: LINTER

.. image:: https://coveralls.io/repos/github/hgrecco/flexparser/badge.svg?branch=main
    :target: https://coveralls.io/github/hgrecco/flexparser?branch=main
    :alt: Coverage


flexparser
==========

Why write another parser? I have asked myself the same question while
working on this project. There are clearly excellent parsers out
there, but I wanted to experiment with another way of writing them.

The idea is quite simple. You write a class for every type of content
(called here ``ParsedStatement``) you need to parse. Each class should
have a ``from_string`` constructor. The ``typing`` module is used
extensively to make the output structure easy to use and less error prone.

For example:

.. code-block:: python

    from dataclasses import dataclass

    import flexparser as fp

    @dataclass(frozen=True)
    class Assigment(fp.ParsedStatement):
        """Parses the following `this <- other`
        """

        lhs: str
        rhs: str

        @classmethod
        def from_string(cls, s):
            lhs, rhs = s.split("<-")
            return cls(lhs.strip(), rhs.strip())

(Using a frozen dataclass is not necessary, but it is convenient. Being a
dataclass, you get the init, str, repr, etc. for free. Being frozen, that is,
sort of immutable, makes the instances easier to reason about.)
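
To make that convenience concrete, here is a small sketch that uses only the
standard library (no flexparser involved). ``Pair`` is a hypothetical stand-in
class with the same two fields as the example above:

.. code-block:: python

    from dataclasses import FrozenInstanceError, dataclass


    @dataclass(frozen=True)
    class Pair:
        """Hypothetical stand-in, not a flexparser class."""

        lhs: str
        rhs: str


    p = Pair("this", "other")

    # A readable repr and value-based equality are generated automatically.
    print(repr(p))
    print(p == Pair("this", "other"))

    # Frozen instances reject attribute assignment.
    try:
        p.lhs = "changed"
        modified = True
    except FrozenInstanceError:
        modified = False
        print("cannot modify a frozen instance")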

In certain cases you might want to signal to the parser
that this class is not appropriate for parsing the statement.

.. code-block:: python

    @dataclass(frozen=True)
    class Assigment(fp.ParsedStatement):
        """Parses the following `this <- other`
        """

        lhs: str
        rhs: str

        @classmethod
        def from_string(cls, s):
            if "<-" not in s:
                # This means: I do not know how to parse it
                # try with another ParsedStatement class.
                return None
            lhs, rhs = s.split("<-")
            return cls(lhs.strip(), rhs.strip())


You might also want to indicate that this is the right ``ParsedStatement``
but something is not right:

.. code-block:: python

    @dataclass(frozen=True)
    class InvalidIdentifier(fp.ParsingError):
        value: str


    @dataclass(frozen=True)
    class Assigment(fp.ParsedStatement):
        """Parses the following `this <- other`
        """

        lhs: str
        rhs: str

        @classmethod
        def from_string(cls, s):
            if "<-" not in s:
                # This means: I do not know how to parse it
                # try with another ParsedStatement class.
                return None
            lhs, rhs = (p.strip() for p in s.split("<-"))

            if not str.isidentifier(lhs):
                return InvalidIdentifier(lhs)

            return cls(lhs, rhs)


Put this into ``source.txt``:

.. code-block:: text

    one <- other
    2two <- new
    three <- newvalue
    one == three

and then run the following code:

.. code-block:: python

    parsed = fp.parse("source.txt", Assigment)
    for el in parsed.iter_statements():
        print(repr(el))

This will produce the following output:

.. code-block:: text

    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='37bc23cde7cad3ece96b7abf64906c84decc116de1e0486679eb6ca696f233a403f756e2e431063c82abed4f0e342294c2fe71af69111faea3765b78cb90c03f'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source1.txt'), mtime=1658550284.9419456)
    Assigment(start_line=1, start_col=0, end_line=1, end_col=12, raw='one <- other', lhs='one', rhs='other')
    InvalidIdentifier(start_line=2, start_col=0, end_line=2, end_col=11, raw='2two <- new', value='2two')
    Assigment(start_line=3, start_col=0, end_line=3, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')
    UnknownStatement(start_line=4, start_col=0, end_line=4, end_col=12, raw='one == three')
    EOS(start_line=5, start_col=0, end_line=5, end_col=0, raw=None)


The result is a collection of ``ParsedStatement`` or ``ParsingError`` objects, flanked by
``BOF`` and ``EOS``, indicating beginning of file and end of stream respectively.
Alternatively, it can begin with ``BOR``, which means beginning of resource and
is used when parsing a Python resource provided with a package.
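
The ``BOF`` entry in the output above carries a ``content_hash`` (with
``algorithm_name='blake2b'``) and an ``mtime``. As a rough illustration of where
such values can come from, the sketch below computes a BLAKE2b digest and a
modification time with the standard library only. Which bytes flexparser
actually hashes is not shown here, so hashing the raw file bytes is an
assumption, and ``describe_file`` is a hypothetical helper, not part of the
library.

.. code-block:: python

    import hashlib
    import tempfile
    from pathlib import Path


    def describe_file(path):
        """Return (algorithm name, hex digest, mtime) for a file.

        Hypothetical helper: it hashes the raw bytes of the file,
        which is an assumption made for this sketch.
        """
        path = Path(path)
        digest = hashlib.blake2b(path.read_bytes())
        return digest.name, digest.hexdigest(), path.stat().st_mtime


    # Work on a throwaway file so the example is self-contained.
    with tempfile.TemporaryDirectory() as folder:
        source = Path(folder) / "source.txt"
        source.write_text("one <- other\n")
        name, hexdigest, mtime = describe_file(source)

    print(name, len(hexdigest), mtime)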

Notice that there are two correctly parsed statements (``Assigment``), one
error found (``InvalidIdentifier``) and one unknown (``UnknownStatement``).

Cool, right? Just writing a ``from_string`` method that outputs a data structure
produces a usable structure of parsed objects.

Now what? Let's say we want to support equality comparison. Simply do:

.. code-block:: python

    @dataclass(frozen=True)
    class EqualityComparison(fp.ParsedStatement):
        """Parses the following `this == other`
        """

        lhs: str
        rhs: str

        @classmethod
        def from_string(cls, s):
            if "==" not in s:
                return None
            lhs, rhs = (p.strip() for p in s.split("=="))

            return cls(lhs, rhs)

    parsed = fp.parse("source.txt", (Assigment, EqualityComparison))
    for el in parsed.iter_statements():
        print(repr(el))

and run it again:

.. code-block:: text

    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='37bc23cde7cad3ece96b7abf64906c84decc116de1e0486679eb6ca696f233a403f756e2e431063c82abed4f0e342294c2fe71af69111faea3765b78cb90c03f'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source1.txt'), mtime=1658550284.9419456)
    Assigment(start_line=1, start_col=0, end_line=1, end_col=12, raw='one <- other', lhs='one', rhs='other')
    InvalidIdentifier(start_line=2, start_col=0, end_line=2, end_col=11, raw='2two <- new', value='2two')
    Assigment(start_line=3, start_col=0, end_line=3, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')
    EqualityComparison(start_line=4, start_col=0, end_line=4, end_col=12, raw='one == three', lhs='one', rhs='three')
    EOS(start_line=5, start_col=0, end_line=5, end_col=0, raw=None)


Need to group certain statements together? Welcome to ``Block``.
This construct allows you to group statements between an opening and a closing statement:

.. code-block:: python

    class Begin(fp.ParsedStatement):

        @classmethod
        def from_string(cls, s):
            if s == "begin":
                return cls()

            return None

    class End(fp.ParsedStatement):

        @classmethod
        def from_string(cls, s):
            if s == "end":
                return cls()

            return None

    class AssigmentBlock(fp.Block[Begin, Assigment, End]):
        pass

    parsed = fp.parse("source.txt", (AssigmentBlock, EqualityComparison))


Run the code:

.. code-block:: text

    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='37bc23cde7cad3ece96b7abf64906c84decc116de1e0486679eb6ca696f233a403f756e2e431063c82abed4f0e342294c2fe71af69111faea3765b78cb90c03f'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source1.txt'), mtime=1658550284.9419456)
    UnknownStatement(start_line=1, start_col=0, end_line=1, end_col=12, raw='one <- other')
    UnknownStatement(start_line=2, start_col=0, end_line=2, end_col=11, raw='2two <- new')
    UnknownStatement(start_line=3, start_col=0, end_line=3, end_col=17, raw='three <- newvalue')
    UnknownStatement(start_line=4, start_col=0, end_line=4, end_col=12, raw='one == three')
    EOS(start_line=5, start_col=0, end_line=5, end_col=0, raw=None)


Notice that there are a lot of ``UnknownStatement`` entries now, because we instructed
the parser to look for assignments only within a block. So change your text file to:

.. code-block:: text

    begin
    one <- other
    2two <- new
    three <- newvalue
    end
    one == three

and try again:

.. code-block:: text

    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='3d8ce0051dcdd6f0f80ef789a0df179509d927874f242005ac41ed886ae0b71a30b845b9bfcb30194461c0ef6a3ca324c36f411dfafc7e588611f1eb0269bb5a'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source2.txt'), mtime=1658550707.1248093)
    Begin(start_line=1, start_col=0, end_line=1, end_col=5, raw='begin')
    Assigment(start_line=2, start_col=0, end_line=2, end_col=12, raw='one <- other', lhs='one', rhs='other')
    InvalidIdentifier(start_line=3, start_col=0, end_line=3, end_col=11, raw='2two <- new', value='2two')
    Assigment(start_line=4, start_col=0, end_line=4, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')
    End(start_line=5, start_col=0, end_line=5, end_col=3, raw='end')
    EqualityComparison(start_line=6, start_col=0, end_line=6, end_col=12, raw='one == three', lhs='one', rhs='three')
    EOS(start_line=7, start_col=0, end_line=7, end_col=0, raw=None)


Until now we have used ``parsed.iter_statements`` to iterate over all parsed statements.
But let's look inside ``parsed``, an object of ``ParsedProject`` type. It is a thin wrapper
over a dictionary mapping files to parsed content. Because we have provided a single file
that does not link to another, our ``parsed`` object contains a single element.
The key is ``None``, indicating that the file 'source.txt' was loaded from the root location
(None). The content is a ``ParsedSourceFile`` object with the following attributes:

- **path**: full path of the source file
- **mtime**: modification time of the source file
- **content_hash**: hash of the pickled content
- **config**: extra parameters that can be given to the parser (see below).

.. code-block:: text

    ParsedSource(
        parsed_source=parse.<locals>.CustomRootBlock(
            opening=BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='3d8ce0051dcdd6f0f80ef789a0df179509d927874f242005ac41ed886ae0b71a30b845b9bfcb30194461c0ef6a3ca324c36f411dfafc7e588611f1eb0269bb5a'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source2.txt'), mtime=1658550707.1248093),
            body=(
                Block.subclass_with.<locals>.CustomBlock(
                    opening=Begin(start_line=1, start_col=0, end_line=1, end_col=5, raw='begin'),
                    body=(
                        Assigment(start_line=2, start_col=0, end_line=2, end_col=12, raw='one <- other', lhs='one', rhs='other'),
                        InvalidIdentifier(start_line=3, start_col=0, end_line=3, end_col=11, raw='2two <- new', value='2two'),
                        Assigment(start_line=4, start_col=0, end_line=4, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')
                    ),
                    closing=End(start_line=5, start_col=0, end_line=5, end_col=3, raw='end')),
                EqualityComparison(start_line=6, start_col=0, end_line=6, end_col=12, raw='one == three', lhs='one', rhs='three')),
            closing=EOS(start_line=7, start_col=0, end_line=7, end_col=0, raw=None)),
        config=None
    )


A few things to notice:

1. We were using a block before without knowing it. The ``RootBlock`` is a
   special type of ``Block`` that starts and ends automatically with the
   file.
2. ``opening``, ``body`` and ``closing`` are automatically annotated with the
   possible ``ParsedStatement`` (plus ``ParsingError``),
   so autocompletion works in most IDEs.
3. The same is true for the defined ``ParsedStatement`` (we have used
   ``dataclass`` for a reason). This makes using the actual
   result of the parsing a charm!
4. That annoying ``subclass_with.<locals>`` is there because we have built
   a class on the fly when we used ``Block.subclass_with``. You can
   get rid of it (which is actually useful for pickling) by explicitly
   subclassing ``Block`` in your code (see below).
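
To make the nesting of ``opening``, ``body`` and ``closing`` concrete, here is a
sketch of how a flattened, in-order iteration over such a tree could work. It
uses only the standard library; ``ToyStatement``, ``ToyBlock`` and ``iter_flat``
are hypothetical stand-ins and do not reproduce flexparser's own implementation.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Iterator, Tuple, Union


    @dataclass(frozen=True)
    class ToyStatement:
        """Hypothetical stand-in for a parsed statement."""

        raw: str


    @dataclass(frozen=True)
    class ToyBlock:
        """Hypothetical stand-in for a block: opening, body and closing."""

        opening: ToyStatement
        body: Tuple[Union["ToyBlock", ToyStatement], ...]
        closing: ToyStatement


    def iter_flat(node: Union[ToyBlock, ToyStatement]) -> Iterator[ToyStatement]:
        """Yield every statement of a (possibly nested) block in source order."""
        if isinstance(node, ToyBlock):
            yield node.opening
            for child in node.body:
                yield from iter_flat(child)
            yield node.closing
        else:
            yield node


    inner = ToyBlock(
        ToyStatement("begin"),
        (ToyStatement("one <- other"), ToyStatement("three <- newvalue")),
        ToyStatement("end"),
    )
    root = ToyBlock(
        ToyStatement("<BOF>"),
        (inner, ToyStatement("one == three")),
        ToyStatement("<EOS>"),
    )

    flat = [statement.raw for statement in iter_flat(root)]
    print(flat)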


Multiple source files
---------------------

Most projects have more than one source file, and these files are internally connected.
A file might refer to another that also needs to be parsed (e.g. an
``#include`` statement in C). **flexparser** provides the ``IncludeStatement``
base class specifically for this purpose.

.. code-block:: python

    @dataclass(frozen=True)
    class Include(fp.IncludeStatement):
        """A naive implementation of #include "file"
        """

        value: str

        @classmethod
        def from_string(cls, s):
            if not s.startswith("#include "):
                return None

            value = s[len("#include "):].strip().strip('"')

            return cls(value)

        @property
        def target(self):
            return self.value

The only difference is that you need to implement a ``target`` property
that returns the file name or resource that this statement refers to.
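
The string handling behind such a statement can be checked in isolation. The
following standard-library sketch mirrors the prefix test and quote stripping
shown above. Both ``include_target`` and ``resolve`` are hypothetical helpers,
and resolving the target relative to the including file is an assumption made
for illustration, not a statement about how flexparser locates files.

.. code-block:: python

    from pathlib import PurePosixPath
    from typing import Optional


    def include_target(s: str) -> Optional[str]:
        """Return the file named by an ``#include "file"`` line, else None."""
        prefix = "#include "
        if not s.startswith(prefix):
            return None
        return s[len(prefix):].strip().strip('"')


    def resolve(current_file: str, target: str) -> PurePosixPath:
        """Resolve a target relative to the including file (an assumption)."""
        return PurePosixPath(current_file).parent / target


    target = include_target('#include "defs.txt"')
    print(target)
    print(resolve("project/main.txt", target))
    print(include_target("one <- other"))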


Customizing statementization
----------------------------

Statementi ... what? **flexparser** works by trying to parse each statement with
one of the known classes. So it is fair to ask what a statement is in this
context and how you can configure it to your needs. A text file is split into
non-overlapping strings called **statements**. Parsing works as follows:

1. Each file is split into statements (which can be single or multi line).
2. Each statement is parsed with the first of the contextually
   available ``ParsedStatement`` or ``Block`` subclasses that returns
   a ``ParsedStatement`` or ``ParsingError``.
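
The two steps above can be sketched in plain Python. This is a simplified
illustration of the "first class that does not return ``None`` wins" idea, not
flexparser's actual code: the classes ``Assignment``, ``Comparison`` and
``Unknown`` and the function ``parse_text`` are hypothetical, and each line is
treated as exactly one statement.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Optional


    @dataclass(frozen=True)
    class Unknown:
        """Fallback used when no class recognises the statement."""

        raw: str


    @dataclass(frozen=True)
    class Assignment:
        lhs: str
        rhs: str

        @classmethod
        def from_string(cls, s: str) -> Optional["Assignment"]:
            if "<-" not in s:
                return None  # not mine, let the next class try
            lhs, rhs = (p.strip() for p in s.split("<-"))
            return cls(lhs, rhs)


    @dataclass(frozen=True)
    class Comparison:
        lhs: str
        rhs: str

        @classmethod
        def from_string(cls, s: str) -> Optional["Comparison"]:
            if "==" not in s:
                return None
            lhs, rhs = (p.strip() for p in s.split("=="))
            return cls(lhs, rhs)


    def parse_text(text, classes):
        """Split the text into statements, then try each class in order."""
        results = []
        for line in text.splitlines():
            statement = line.strip()
            for cls in classes:
                parsed = cls.from_string(statement)
                if parsed is not None:
                    results.append(parsed)
                    break
            else:
                # No class accepted the statement.
                results.append(Unknown(statement))
        return results


    parsed = parse_text("one <- other\none == three\nhello", (Assignment, Comparison))
    for el in parsed:
        print(el)

Because classes are tried in order, the position of a class in the tuple
matters whenever two classes could accept the same statement.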

You can customize how each line is split into statements with two arguments
provided to ``parse``:

- **strip_spaces** (``bool``): indicates that leading and trailing spaces must
  be removed before attempting to parse.
  (default: True)
- **delimiters** (``dict``): indicates how each line must be subsplit.
  (default: do not divide)

A delimiter example might be
``{";": (fp.DelimiterInclude.SKIP, fp.DelimiterAction.CONTINUE)}``
which tells the statementizer (sorry) that when a ";" is found a new statement should
begin. ``DelimiterInclude.SKIP`` indicates that ";" should be added neither to the previous
statement nor to the next. Other valid values are ``SPLIT_AFTER`` and ``SPLIT_BEFORE``,
which append the delimiter character to the previous statement or prepend it to the next one.
The second element tells the statementizer (sorry again) what to do next.
Valid values are ``CONTINUE``, ``CAPTURE_NEXT_TIL_EOL``, ``STOP_PARSING_LINE`` and
``STOP_PARSING``.

This is useful with comments. For example,
``{"#": (fp.DelimiterInclude.SPLIT_BEFORE, fp.DelimiterAction.CAPTURE_NEXT_TIL_EOL)}``
tells the statementizer (it is not funny anymore) that after the first "#"
it should stop splitting and capture everything up to the end of the line.

This allows:

.. code-block:: text

    ## This will work as a single statement
    # This will work as a single statement #
    # This will work as # a single statement #
    a = 3 # this will produce two statements (a=3, and the rest)
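
The splitting behaviour described in this section can be imitated with a few
lines of plain Python. The sketch below is only an approximation under stated
assumptions: ``split_line`` is a hypothetical function, the string constants
(``"skip"``, ``"with_previous"``, ``"with_next"``, ``"continue"``,
``"capture_til_eol"``) stand in for the enum values discussed above, and
dropping empty pieces is a simplification.

.. code-block:: python

    def split_line(line, delimiters):
        """Split one line into statements.

        ``delimiters`` maps a character to ``(include, action)``.
        ``include`` says where the delimiter character goes and
        ``action`` says whether splitting continues afterwards.
        All names are hypothetical and only mimic the described behaviour.
        """
        statements = []
        current = ""
        for pos, char in enumerate(line):
            if char not in delimiters:
                current += char
                continue
            include, action = delimiters[char]
            if include == "with_previous":
                current += char
            statements.append(current)
            current = char if include == "with_next" else ""
            if action == "capture_til_eol":
                # Stop splitting: the rest of the line is one statement.
                current += line[pos + 1:]
                break
        statements.append(current)
        # Strip surrounding spaces and drop empty pieces.
        return [s.strip() for s in statements if s.strip()]


    rules = {
        ";": ("skip", "continue"),
        "#": ("with_next", "capture_til_eol"),
    }

    print(split_line("x <- 1; y <- 2", rules))
    print(split_line("# This will work as # a single statement #", rules))
    print(split_line("a = 3 # this will produce two statements", rules))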


Explicit Block classes
----------------------

.. code-block:: python

    from typing import Union

    class AssigmentBlock(fp.Block[Begin, Assigment, End]):
        pass

    class EntryBlock(fp.RootBlock[Union[AssigmentBlock, EqualityComparison]]):
        pass

    parsed = fp.parse("source.txt", EntryBlock)


Customizing parsing
-------------------

In certain cases you might want to leave some configuration details
to the user. We have a method for that! Instead of overriding ``from_string``,
override ``from_string_and_config``. The second argument is an object
that can be given to the parser, which in turn will pass it to each
``ParsedStatement`` class.

.. code-block:: python

    import numbers

    @dataclass(frozen=True)
    class NumericAssigment(fp.ParsedStatement):
        """Parses the following `this <- other`
        """

        lhs: str
        rhs: numbers.Number

        @classmethod
        def from_string_and_config(cls, s, config):
            if "<-" not in s:
                # This means: I do not know how to parse it
                # try with another ParsedStatement class.
                return None
            lhs, rhs = s.split("<-")
            # Convert the right-hand side with the user-provided numeric type.
            return cls(lhs.strip(), config.numeric_type(rhs.strip()))

    class Config:

        numeric_type = float

    parsed = fp.parse("source.txt", NumericAssigment, Config)
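
The effect of swapping the configuration object can be seen without the parser
itself. In this standard-library sketch, ``parse_numeric``, ``FloatConfig`` and
``ExactConfig`` are hypothetical names. The only thing taken from the example
above is the idea that the statement reads ``numeric_type`` from the config it
receives.

.. code-block:: python

    from fractions import Fraction


    def parse_numeric(s, config):
        """Parse ``name <- number`` and convert with ``config.numeric_type``."""
        if "<-" not in s:
            return None
        lhs, rhs = (p.strip() for p in s.split("<-"))
        return lhs, config.numeric_type(rhs)


    class FloatConfig:
        numeric_type = float


    class ExactConfig:
        # Fraction accepts decimal strings such as "0.1" and keeps them exact.
        numeric_type = Fraction


    print(parse_numeric("x <- 0.1", FloatConfig))
    print(parse_numeric("x <- 0.1", ExactConfig))

The same statement logic yields a ``float`` or an exact ``Fraction`` depending
only on the object handed in, which is the point of leaving such choices to the
user.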

----

This project was started as a part of Pint_, the Python units package.

See AUTHORS_ for a list of the maintainers.

To review an ordered list of notable changes for each version of the project,
see CHANGES_.

.. _`AUTHORS`: https://github.com/hgrecco/flexparser/blob/main/AUTHORS
.. _`CHANGES`: https://github.com/hgrecco/flexparser/blob/main/CHANGES
.. _`Pint`: https://github.com/hgrecco/pint

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "flexparser",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "\"Hernan E. Grecco\" <hernan.grecco@gmail.com>",
    "keywords": "parser,code,parsing,source",
    "author": "",
    "author_email": "\"Hernan E. Grecco\" <hernan.grecco@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/30/d0/e646499ef11597258677625bbec43dc5ec450ed14e45329c1f3b717d93fe/flexparser-0.2.1.tar.gz",
    "platform": null,
    "description": ".. image:: https://img.shields.io/pypi/v/flexparser.svg\n    :target: https://pypi.python.org/pypi/flexparser\n    :alt: Latest Version\n\n.. image:: https://img.shields.io/pypi/l/flexparser.svg\n    :target: https://pypi.python.org/pypi/flexparser\n    :alt: License\n\n.. image:: https://img.shields.io/pypi/pyversions/flexparser.svg\n    :target: https://pypi.python.org/pypi/flexparser\n    :alt: Python Versions\n\n.. image:: https://github.com/hgrecco/flexparser/workflows/CI/badge.svg\n    :target: https://github.com/hgrecco/flexparser/actions?query=workflow%3ACI\n    :alt: CI\n\n.. image:: https://github.com/hgrecco/flexparser/workflows/Lint/badge.svg\n    :target: https://github.com/hgrecco/flexparser/actions?query=workflow%3ALint\n    :alt: LINTER\n\n.. image:: https://coveralls.io/repos/github/hgrecco/flexparser/badge.svg?branch=main\n    :target: https://coveralls.io/github/hgrecco/flexparser?branch=main\n    :alt: Coverage\n\n\nflexparser\n==========\n\nWhy write another parser? I have asked myself the same question while\nworking on this project. It is clear that there are excellent parsers out\nthere but I wanted to experiment with another way of writing them.\n\nThe idea is quite simple. You write a class for every type of content\n(called here ``ParsedStatement``) you need to parse. Each class should\nhave a ``from_string`` constructor. We used extensively the ``typing``\nmodule to make the output structure easy to use and less error prone.\n\nFor example:\n\n.. 
code-block:: python\n\n    from dataclasses import dataclass\n\n    import flexparser as fp\n\n    @dataclass(frozen=True)\n    class Assigment(fp.ParsedStatement):\n        \"\"\"Parses the following `this <- other`\n        \"\"\"\n\n        lhs: str\n        rhs: str\n\n        @classmethod\n        def from_string(cls, s):\n            lhs, rhs = s.split(\"<-\")\n            return cls(lhs.strip(), rhs.strip())\n\n(using a frozen dataclass is not necessary but it convenient. Being a\ndataclass you get the init, str, repr, etc for free. Being frozen, sort\nof immutable, makes them easier to reason around)\n\nIn certain cases you might want to signal the parser\nthat his class is not appropriate to parse the statement.\n\n.. code-block:: python\n\n    @dataclass(frozen=True)\n    class Assigment(fp.ParsedStatement):\n        \"\"\"Parses the following `this <- other`\n        \"\"\"\n\n        lhs: str\n        rhs: str\n\n        @classmethod\n        def from_string(cls, s):\n            if \"<-\" not in s:\n                # This means: I do not know how to parse it\n                # try with another ParsedStatement class.\n                return None\n            lhs, rhs = s.split(\"<-\")\n            return cls(lhs.strip(), rhs.strip())\n\n\nYou might also want to indicate that this is the right ``ParsedStatement``\nbut something is not right:\n\n.. 
code-block:: python\n\n    @dataclass(frozen=True)\n    class InvalidIdentifier(fp.ParsingError):\n        value: str\n\n\n    @dataclass(frozen=True)\n    class Assigment(fp.ParsedStatement):\n        \"\"\"Parses the following `this <- other`\n        \"\"\"\n\n        lhs: str\n        rhs: str\n\n        @classmethod\n        def from_string(cls, s):\n            if \"<-\" not in s:\n                # This means: I do not know how to parse it\n                # try with another ParsedStatement class.\n                return None\n            lhs, rhs = (p.strip() for p in s.split(\"<-\"))\n\n            if not str.isidentifier(lhs):\n                return InvalidIdentifier(lhs)\n\n            return cls(lhs, rhs)\n\n\nPut this into ``source.txt``\n\n.. code-block:: text\n\n    one <- other\n    2two <- new\n    three <- newvalue\n    one == three\n\nand then run the following code:\n\n.. code-block:: python\n\n    parsed = fp.parse(\"source.txt\", Assigment)\n    for el in parsed.iter_statements():\n        print(repr(el))\n\nwill produce the following output:\n\n.. 
code-block:: text\n\n    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='37bc23cde7cad3ece96b7abf64906c84decc116de1e0486679eb6ca696f233a403f756e2e431063c82abed4f0e342294c2fe71af69111faea3765b78cb90c03f'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source1.txt'), mtime=1658550284.9419456)\n    Assigment(start_line=1, start_col=0, end_line=1, end_col=12, raw='one <- other', lhs='one', rhs='other')\n    InvalidIdentifier(start_line=2, start_col=0, end_line=2, end_col=11, raw='2two <- new', value='2two')\n    Assigment(start_line=3, start_col=0, end_line=3, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')\n    UnknownStatement(start_line=4, start_col=0, end_line=4, end_col=12, raw='one == three')\n    EOS(start_line=5, start_col=0, end_line=5, end_col=0, raw=None)\n\n\nThe result is a collection of ``ParsedStatement`` or ``ParsingError`` (flanked by\n``BOF`` and ``EOS`` indicating beginning of file and ending of stream respectively\nAlternative, it can beginning with ``BOR`` with means beginning of resource and it\nis used when parsing a Python Resource provided with a package).\n\nNotice that there are two correctly parsed statements (``Assigment``), one\nerror found (``InvalidIdentifier``) and one unknown (``UnknownStatement``).\n\nCool, right? Just writing a ``from_string`` method that outputs a datastructure\nproduces a usable structure of parsed objects.\n\nNow what? Let's say we want to support equality comparison. Simply do:\n\n.. 
code-block:: python\n\n    @dataclass(frozen=True)\n    class EqualityComparison(fp.ParsedStatement):\n        \"\"\"Parses the following `this == other`\n        \"\"\"\n\n        lhs: str\n        rhs: str\n\n        @classmethod\n        def from_string(cls, s):\n            if \"==\" not in s:\n                return None\n            lhs, rhs = (p.strip() for p in s.split(\"==\"))\n\n            return cls(lhs, rhs)\n\n    parsed = fp.parse(\"source.txt\", (Assigment, Equality))\n    for el in parsed.iter_statements():\n        print(repr(el))\n\nand run it again:\n\n.. code-block:: text\n\n    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='37bc23cde7cad3ece96b7abf64906c84decc116de1e0486679eb6ca696f233a403f756e2e431063c82abed4f0e342294c2fe71af69111faea3765b78cb90c03f'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source1.txt'), mtime=1658550284.9419456)\n    Assigment(start_line=1, start_col=0, end_line=1, end_col=12, raw='one <- other', lhs='one', rhs='other')\n    InvalidIdentifier(start_line=2, start_col=0, end_line=2, end_col=11, raw='2two <- new', value='2two')\n    Assigment(start_line=3, start_col=0, end_line=3, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')\n    EqualityComparison(start_line=4, start_col=0, end_line=4, end_col=12, raw='one == three', lhs='one', rhs='three')\n    EOS(start_line=5, start_col=0, end_line=5, end_col=0, raw=None)\n\n\nYou need to group certain statements together: welcome to ``Block``\nThis construct allows you to group\n\n.. 
code-block:: python\n\n    class Begin(fp.ParsedStatement):\n\n        @classmethod\n        def from_string(cls, s):\n            if s == \"begin\":\n                return cls()\n\n            return None\n\n    class End(fp.ParsedStatement):\n\n        @classmethod\n        def from_string(cls, s):\n            if s == \"end\":\n                return cls()\n\n            return None\n\n    class AssigmentBlock(fp.Block[Begin, Assigment, End]):\n        pass\n\n    parsed = fp.parse(\"source.txt\", (AssigmentBlock, Equality))\n\n\nRun the code:\n\n.. code-block:: text\n\n    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='37bc23cde7cad3ece96b7abf64906c84decc116de1e0486679eb6ca696f233a403f756e2e431063c82abed4f0e342294c2fe71af69111faea3765b78cb90c03f'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source1.txt'), mtime=1658550284.9419456)\n    UnknownStatement(start_line=1, start_col=0, end_line=1, end_col=12, raw='one <- other')\n    UnknownStatement(start_line=2, start_col=0, end_line=2, end_col=11, raw='2two <- new')\n    UnknownStatement(start_line=3, start_col=0, end_line=3, end_col=17, raw='three <- newvalue')\n    UnknownStatement(start_line=4, start_col=0, end_line=4, end_col=12, raw='one == three')\n    EOS(start_line=5, start_col=0, end_line=5, end_col=0, raw=None)\n\n\nNotice that there are a lot of ``UnknownStatement`` now, because we instructed\nthe parser to only look for assignment within a block. So change your text file to:\n\n.. code-block:: text\n\n    begin\n    one <- other\n    2two <- new\n    three <- newvalue\n    end\n    one == three\n\nand try again:\n\n.. 
code-block:: text\n\n    BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='3d8ce0051dcdd6f0f80ef789a0df179509d927874f242005ac41ed886ae0b71a30b845b9bfcb30194461c0ef6a3ca324c36f411dfafc7e588611f1eb0269bb5a'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source2.txt'), mtime=1658550707.1248093)\n    Begin(start_line=1, start_col=0, end_line=1, end_col=5, raw='begin')\n    Assigment(start_line=2, start_col=0, end_line=2, end_col=12, raw='one <- other', lhs='one', rhs='other')\n    InvalidIdentifier(start_line=3, start_col=0, end_line=3, end_col=11, raw='2two <- new', value='2two')\n    Assigment(start_line=4, start_col=0, end_line=4, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')\n    End(start_line=5, start_col=0, end_line=5, end_col=3, raw='end')\n    EqualityComparison(start_line=6, start_col=0, end_line=6, end_col=12, raw='one == three', lhs='one', rhs='three')\n    EOS(start_line=7, start_col=0, end_line=7, end_col=0, raw=None)\n\n\nUntil now we have used ``parsed.iter_statements`` to iterate over all parsed statements.\nBut let's look inside ``parsed``, an object of ``ParsedProject`` type. It is a thin wrapper\nover a dictionary mapping files to parsed content. Because we have provided a single file\nand this does not contain a link another, our ``parsed`` object contains a single element.\nThe key is ``None`` indicating that the file 'source.txt' was loaded from the root location\n(None). The content is a ``ParsedSourceFile`` object with the following attributes:\n\n- **path**: full path of the source file\n- **mtime**: modification file of the source file\n- **content_hash**: hash of the pickled content\n- **config**: extra parameters that can be given to the parser (see below).\n\n.. 
code-block:: text\n\n    ParsedSource(\n        parsed_source=parse.<locals>.CustomRootBlock(\n            opening=BOF(start_line=0, start_col=0, end_line=0, end_col=0, raw=None, content_hash=Hash(algorithm_name='blake2b', hexdigest='3d8ce0051dcdd6f0f80ef789a0df179509d927874f242005ac41ed886ae0b71a30b845b9bfcb30194461c0ef6a3ca324c36f411dfafc7e588611f1eb0269bb5a'), path=PosixPath('/Users/grecco/Documents/code/flexparser/examples/in_readme/source2.txt'), mtime=1658550707.1248093),\n            body=(\n                Block.subclass_with.<locals>.CustomBlock(\n                    opening=Begin(start_line=1, start_col=0, end_line=1, end_col=5, raw='begin'),\n                    body=(\n                        Assigment(start_line=2, start_col=0, end_line=2, end_col=12, raw='one <- other', lhs='one', rhs='other'),\n                        InvalidIdentifier(start_line=3, start_col=0, end_line=3, end_col=11, raw='2two <- new', value='2two'),\n                        Assigment(start_line=4, start_col=0, end_line=4, end_col=17, raw='three <- newvalue', lhs='three', rhs='newvalue')\n                    ),\n                    closing=End(start_line=5, start_col=0, end_line=5, end_col=3, raw='end')),\n                EqualityComparison(start_line=6, start_col=0, end_line=6, end_col=12, raw='one == three', lhs='one', rhs='three')),\n            closing=EOS(start_line=7, start_col=0, end_line=7, end_col=0, raw=None)),\n        config=None\n    )\n\n\nA few things to notice:\n\n1. We were using a block before without knowing. The ``RootBlock`` is a\n   special type of Block that starts and ends automatically with the\n   file.\n2. ``opening``, ``body``, ``closing`` are automatically annotated with the\n   possible ``ParsedStatement`` (plus `ParsingError`),\n   therefore autocompletes works in most IDEs.\n3. The same is true for the defined ``ParsedStatement`` (we have use\n   ``dataclass`` for a reason). This makes using the actual\n   result of the parsing a charm!.\n4. 
That annoying ``subclass_with.<locals>`` is because we have built\n   a class on the fly when we used ``Block.subclass_with``. You can\n   get rid of it (which is actually useful for pickling) by explicit\n   subclassing Block in your code (see below).\n\n\nMultiple source files\n---------------------\n\nMost projects have more than one source file internally connected.\nA file might refer to another that also need to be parsed (e.g. an\n`#include` statement in c). **flexparser** provides the ``IncludeStatement``\nbase class specially for this purpose.\n\n.. code-block:: python\n\n    @dataclass(frozen=True)\n    class Include(fp.IncludeStatement):\n        \"\"\"A naive implementation of #include \"file\"\n        \"\"\"\n\n        value: str\n\n        @classmethod\n        def from_string(cls, s):\n            if s.startwith(\"#include \"):\n                return None\n\n            value = s[len(\"#include \"):].strip().strip('\"')\n\n            return cls(value)\n\n        @propery\n        def target(self):\n            return self.value\n\nThe only difference is that you need to implement a ``target`` property\nthat returns the file name or resource that this statement refers to.\n\n\nCustomizing statementization\n----------------------------\n\nstatementi ... what? **flexparser** works by trying to parse each statement with\none of the known classes. So it is fair to ask what is an statement in this\ncontext and how can you configure it to your needs. A text file is split into\nnon overlapping strings called **statements**. Parsing work as follows:\n\n1. each file is split into statements (can be single or multi line).\n2. 
each statement is parsed with the first of the contextually\n   available ParsedStatement or Block subclasses that returns\n   a ``ParsedStatement`` or ``ParsingError``\n\nYou can customize how to split each line into statements with two arguments\nprovided to ``parse``:\n\n- **strip_spaces** (``bool``): indicates that leading and trailing spaces must\n  be removed before attempting to parse.\n  (default: True)\n- **delimiters** (``dict``): indicates how each line must be subsplit.\n  (default: do not divide)\n\nA delimiter example might be\n``{\";\": (fp.DelimiterInclude.SPLIT, fp.DelimiterAction.CONTINUE)}``\nwhich tells the statementizer (sorry) that when a \";\" is found a new statement should\nbegin. ``DelimiterInclude.SPLIT`` means that the \";\" is added neither to the previous\nstatement nor to the next one. Other valid values are ``SPLIT_AFTER`` and ``SPLIT_BEFORE``,\nwhich append the delimiter to the previous statement or prepend it to the next one.\nThe second element tells the statementizer (sorry again) what to do next.\nValid values are ``CONTINUE``, ``CAPTURE_NEXT_TIL_EOL``, ``STOP_PARSING_LINE``, and\n``STOP_PARSING``.\n\nThis is useful with comments. For example,\n``{\"#\": (fp.DelimiterInclude.SPLIT_BEFORE, fp.DelimiterAction.CAPTURE_NEXT_TIL_EOL)}``\ntells the statementizer (it is not funny anymore) that after the first \"#\"\nit should stop splitting and capture everything up to the end of the line.\n\nThis allows:\n\n.. code-block:: text\n\n    ## This will work as a single statement\n    # This will work as a single statement #\n    # This will work as # a single statement #\n    a = 3 # this will produce two statements (a=3, and the rest)\n\n\nExplicit Block classes\n----------------------\n\nInstead of building a block on the fly with ``Block.subclass_with``, you can\nsubclass ``Block`` and ``RootBlock`` explicitly:\n\n.. 
code-block:: python\n\n    from typing import Union\n\n    class AssigmentBlock(fp.Block[Begin, Assigment, End]):\n        pass\n\n    class EntryBlock(fp.RootBlock[Union[AssigmentBlock, EqualityComparison]]):\n        pass\n\n    parsed = fp.parse(\"source.txt\", EntryBlock)\n\n\nCustomizing parsing\n-------------------\n\nIn certain cases you might want to leave some configuration details to the\nuser. We have a method for that! Instead of overriding ``from_string``,\noverride ``from_string_and_config``. The second argument is an object\nthat can be given to the parser, which in turn will be passed to each\n``ParsedStatement`` class.\n\n.. code-block:: python\n\n    import numbers\n\n    @dataclass(frozen=True)\n    class NumericAssigment(fp.ParsedStatement):\n        \"\"\"Parses the following `this <- other`\n        \"\"\"\n\n        lhs: str\n        rhs: numbers.Number\n\n        @classmethod\n        def from_string_and_config(cls, s, config):\n            if \"<-\" not in s:\n                # This means: I do not know how to parse it\n                # try with another ParsedStatement class.\n                return None\n            lhs, rhs = s.split(\"<-\")\n            # The numeric type is chosen by the user through the config object.\n            return cls(lhs.strip(), config.numeric_type(rhs.strip()))\n\n    class Config:\n\n        numeric_type = float\n\n    parsed = fp.parse(\"source.txt\", NumericAssigment, Config)\n\n----\n\nThis project was started as a part of Pint_, the Python units package.\n\nSee AUTHORS_ for a list of the maintainers.\n\nTo review an ordered list of notable changes for each version of the project,\nsee CHANGES_.\n\n.. _`AUTHORS`: https://github.com/hgrecco/flexparser/blob/main/AUTHORS\n.. _`CHANGES`: https://github.com/hgrecco/flexparser/blob/main/CHANGES\n.. _`Pint`: https://github.com/hgrecco/pint\n",
    "bugtrack_url": null,
    "license": "BSD-3-Clause",
    "summary": "Parsing made fun ... using typing.",
    "version": "0.2.1",
    "project_urls": {
        "Homepage": "https://github.com/hgrecco/flexparser"
    },
    "split_keywords": [
        "parser",
        "code",
        "parsing",
        "source"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cbc23682a77629e509057380a8b44c9184079d3050df64966eb98ee7d21947c6",
                "md5": "aa8142b121bdb9ec4468d3c6795841ad",
                "sha256": "6b7076cbfd29626bdd83806910befcf8e7f1595053afc93ee627be946e391029"
            },
            "downloads": -1,
            "filename": "flexparser-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "aa8142b121bdb9ec4468d3c6795841ad",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 26877,
            "upload_time": "2024-03-08T21:35:37",
            "upload_time_iso_8601": "2024-03-08T21:35:37.234290Z",
            "url": "https://files.pythonhosted.org/packages/cb/c2/3682a77629e509057380a8b44c9184079d3050df64966eb98ee7d21947c6/flexparser-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "30d0e646499ef11597258677625bbec43dc5ec450ed14e45329c1f3b717d93fe",
                "md5": "778708b0ec08f420640dd4209be7736a",
                "sha256": "47892d375bb9b6f5b3a41216e78e17c829eba9a3fbd81a620c3f551f479d456f"
            },
            "downloads": -1,
            "filename": "flexparser-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "778708b0ec08f420640dd4209be7736a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 31151,
            "upload_time": "2024-03-08T21:35:39",
            "upload_time_iso_8601": "2024-03-08T21:35:39.290130Z",
            "url": "https://files.pythonhosted.org/packages/30/d0/e646499ef11597258677625bbec43dc5ec450ed14e45329c1f3b717d93fe/flexparser-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-08 21:35:39",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hgrecco",
    "github_project": "flexparser",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [],
    "lcname": "flexparser"
}
        