aweson
======

:Version: 3.0.0
:Requires: Python >= 3.10
:Keywords: JSON, Path, JSON Path, safe navigation
:Homepage: https://github.com/dosarf/aweson
:Released: 2025-07-26

Traversing and manipulating hierarchical data (think JSON) using
pythonic `JSON Path`_ -like expressions. This library doesn't support
every JSON Path notation, but it has its own tricks to offer, e.g.
``with_values()``.


Importing:

>>> from aweson import JP, find_all, find_all_duplicate, find_all_unique, find_next, with_values


Iterating over hierarchical data
--------------------------------

>>> content = {"employees": [
...     {"name": "Doe, John", "age": 32, "account": "johndoe"},
...     {"name": "Doe, Jane", "age": -23, "account": "janedoe"},
...     {"name": "Deer, Jude", "age": 42, "account": "judedeer"},
... ]}
>>> list(find_all(content, JP.employees[:].name))
['Doe, John', 'Doe, Jane', 'Deer, Jude']

    The JSON Path-like expression ``JP.employees[:].name`` is `not` a string.
    Most JSON Path libraries, such as `python-jsonpath`_ and `jsonpath-rfc9535`_,
    take the JSON Path as a string, which they parse.
    With this library You build a `Python expression`, parsed and interpreted
    by Python itself, so Your IDE can be of actual help.

To address all items in a list, the Pythonic slice expression
``[:]`` is used. Naturally, other indexing and slice expressions also work:

>>> list(find_all(content, JP.employees[1].name))
['Doe, Jane']
>>> list(find_all(content, JP.employees[-1].name))
['Deer, Jude']
>>> list(find_all(content, JP.employees[:2].name))
['Doe, John', 'Doe, Jane']

    These indexing and slicing expressions are valid for both `JSON Path`_
    and Python. The more conventional JSON Path notation for selecting all items of a list,
    ``$.some_array[*]``, is (sort of) supported, but only as ``JP.some_array["*"]``.
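
    A quick sketch of that (assuming the equivalence of ``JP["*"]`` and ``JP[:]``
    described later in this README):

    >>> list(find_all(content, JP.employees["*"].name))
    ['Doe, John', 'Doe, Jane', 'Deer, Jude']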


Selecting list items by boolean expressions
-------------------------------------------

Dictionaries in lists can also be selected by simple boolean expressions evaluated within
the context of each such dictionary, for instance

>>> list(find_all(content, JP.employees[JP.age > 35]))
[{'name': 'Deer, Jude', 'age': 42, 'account': 'judedeer'}]

Only simple comparisons are supported, and only these operators: ``==``, ``!=``,
``<``, ``<=``, ``>``, ``>=``.

    Both operands can be dict keys in a list item, e.g. expressions like
    ``JP.years[JP.planned_budget < JP.realized_budget]`` are supported.
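
    A minimal sketch of that, with hypothetical budget data:

    >>> budget_plan = {"years": [
    ...     {"year": 2023, "planned_budget": 100, "realized_budget": 120},
    ...     {"year": 2024, "planned_budget": 100, "realized_budget": 90},
    ... ]}
    >>> list(find_all(budget_plan, JP.years[JP.planned_budget < JP.realized_budget]))
    [{'year': 2023, 'planned_budget': 100, 'realized_budget': 120}]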

In addition, the existence of a sub-item or path can also be used as
a list item selector, e.g. ``JP.years[JP.planned_budget]`` would select only
the years where the key ``planned_budget`` exists.
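
A minimal sketch of the existence selector, with hypothetical data (the second
year below has no ``planned_budget``):

>>> plans = {"years": [{"year": 2023, "planned_budget": 100}, {"year": 2024}]}
>>> list(find_all(plans, JP.years[JP.planned_budget]))
[{'year': 2023, 'planned_budget': 100}]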


Field name by regular expressions
---------------------------------

Consider the following ``dict`` content:

>>> content = {
...     "apple": [{"name": "red delicious"}, {"name": "punakaneli"}],
...     "pear": [{"name": "wilhelm"}, {"name": "conference"}]
... }

If You want to iterate over `all` fruit items, both apples and pears,
You can do so:

>>> list(find_all(content, JP["apple|pear"][:].name))
['red delicious', 'punakaneli', 'wilhelm', 'conference']

or even

>>> list(find_all(content, JP[".*"][:].name))
['red delicious', 'punakaneli', 'wilhelm', 'conference']

if You are interested in everything, not only apples and pears.


Paths to items iterated
-----------------------

You may be interested in the actual path of an item being returned.

    When You use ``enumerate()`` with a ``list``, You obtain the index of
    each item alongside the item's value during iteration. For
    instance,

    >>> list(enumerate(["a", "b"]))
    [(0, 'a'), (1, 'b')]

    and You can use that index to refer to the item itself, even to retrieve
    it again from the list.

Similarly, when iterating within a hierarchical data structure, You
may want to obtain a `pointer` (i.e. path object) alongside the item's
value:

>>> content = {"employees": [
...     {"name": "Doe, John", "age": 32, "account": "johndoe"},
...     {"name": "Doe, Jane", "age": -23, "account": "janedoe"},
...     {"name": "Deer, Jude", "age": 42, "account": "judedeer"},
... ]}
>>> path, item = next(tup for tup in find_all(
...     content,
...     JP.employees[JP.age < 0],
...     with_path=True
... ))
>>> item
{'name': 'Doe, Jane', 'age': -23, 'account': 'janedoe'}

The path to the item found is:

>>> str(path)
'$.employees[1]'

The path object yielded alongside is a JSON Path-like object, just as if You
had constructed it as ``JP.employees[1]``.

    With argument ``with_path=True`` passed, ``find_all()`` yields tuples
    instead of items only. The first item of a yielded tuple is the path object,
    and the second item is the item itself. This is consistent with ``enumerate()``
    behavior.
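
    As a quick check, the yielded ``path`` compares equal to a hand-constructed
    one (a sketch, relying on path object equality the same way the test example
    at the end of this README does):

    >>> path == JP.employees[1]
    True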

Also, JSON Path-like objects have a field called ``.parent``, which gives You
access to the containing data structure. Consider a path object You have
obtained; You can dig out its respective value:

>>> path = JP.employees[1].name
>>> next(find_all(content, path))
'Doe, Jane'

But if You want access to the containing structure, use ``.parent``:

>>> next(find_all(content, path.parent))
{'name': 'Doe, Jane', 'age': -23, 'account': 'janedoe'}


.. _subitems:

Selecting sub-items
-------------------

You can select sub-items of iterated items, which comes in handy when turning
one structure into another, e.g. a list of records into a ``dict``:

>>> {account: name for account, name in find_all(content, JP.employees[:](JP.account, JP.name))}
{'johndoe': 'Doe, John', 'janedoe': 'Doe, Jane', 'judedeer': 'Deer, Jude'}

    This is roughly equivalent to:

    >>> {item["account"]: item["name"] for item in find_all(content, JP.employees[:])}
    {'johndoe': 'Doe, John', 'janedoe': 'Doe, Jane', 'judedeer': 'Deer, Jude'}

    The sub-item selection, while slightly more verbose, is arguably more
    declarative.

You can also make a sub-items selection produce dictionaries by explicitly naming sub-paths:

>>> list(find_all(content, JP.employees[:](id=JP.account, username=JP.name)))
[{'id': 'johndoe', 'username': 'Doe, John'}, {'id': 'janedoe', 'username': 'Doe, Jane'}, {'id': 'judedeer', 'username': 'Deer, Jude'}]

In the code above, the key ``"account"`` is rendered as ``id``,
and ``"name"`` as ``username``.


Variable field name selection
-----------------------------

The forms ``JP["field_name"]`` and ``JP.field_name`` are equivalent:

>>> from functools import reduce
>>> def my_sum(content, field_name_to_sum, initial):
...     return reduce(
...         lambda x, y: x + y,
...         find_all(content, JP.employees[:][field_name_to_sum]),
...         initial
...     )
>>> my_sum(content, "age", 0)
51
>>> my_sum(content, "account", "")
'johndoejanedoejudedeer'

    At this point, some disambiguation is due:

    - ``JP["field"]`` is equivalent to ``JP.field``; both select a key/value pair
      of a dictionary,

    - ``JP[".*"]`` is a regular expression, selecting all key/value pairs of a
      dictionary,

    - ``JP["*"]`` selects all items in a list, equivalent to ``JP[:]``.


.. _withvalues:

Utility ``with_values()``
-------------------------

You can produce a copy of Your hierarchical data with some values overwritten (or
even added):

>>> content = [{"msg": "hallo"}, {"msg": "hello"}, {"msg": "bye"}]
>>> with_values(content, JP[1].msg, "moi")
[{'msg': 'hallo'}, {'msg': 'moi'}, {'msg': 'bye'}]

    Note that the original ``content`` is not mutated:

    >>> content
    [{'msg': 'hallo'}, {'msg': 'hello'}, {'msg': 'bye'}]

You can also overwrite values at multiple places:

>>> with_values(content, JP[:].msg, "moi")
[{'msg': 'moi'}, {'msg': 'moi'}, {'msg': 'moi'}]

or even insert new key / value pairs into ``dict`` s:

>>> with_values(content, JP[:].id, -1)
[{'msg': 'hallo', 'id': -1}, {'msg': 'hello', 'id': -1}, {'msg': 'bye', 'id': -1}]

Writing or adding the same value in multiple places is perhaps not that
useful. However, You `can` use an iterator to supply the values for
overwriting or adding:

>>> with_values(content, JP[:].id, iter(range(100)))
[{'msg': 'hallo', 'id': 0}, {'msg': 'hello', 'id': 1}, {'msg': 'bye', 'id': 2}]

    or, more elegantly, if the ``stop=100`` of ``range`` irks You, use ``itertools.count()``:

    >>> from itertools import count
    >>> with_values(content, JP[:].id, count(0, 1))
    [{'msg': 'hallo', 'id': 0}, {'msg': 'hello', 'id': 1}, {'msg': 'bye', 'id': 2}]

You can also provide a (unary) function, taking the current value as its
argument and calculating the new value to be inserted:

>>> with_values(content, JP[:].msg, lambda msg: msg.upper())
[{'msg': 'HALLO'}, {'msg': 'HELLO'}, {'msg': 'BYE'}]

In the example above, the value for dictionary key ``"msg"`` is given as an
argument to the function; this form is good for re-calculating an existing
value. If You want to add a new key/value pair to a dictionary, You can
achieve that in one of two ways:

- Iterate over dictionaries of the list, receiving each dictionary as argument to Your
  function, and re-calculate entire dictionaries:

>>> with_values(
...     content,
...     JP[:],
...     lambda d: d | {"msg_startswith_h": d["msg"].startswith("h")}
... )
[{'msg': 'hallo', 'msg_startswith_h': True}, {'msg': 'hello', 'msg_startswith_h': True}, {'msg': 'bye', 'msg_startswith_h': False}]

- Iterate over dictionaries of the list, receiving each dictionary as argument to
  Your function just as above, but use the `sub-item expression` to compose the
  dictionary content for You, e.g. adding two keys (``"id"`` and
  ``"msg_startswith_h"``) to each dictionary item:

>>> counter = count(0, 1)
>>> with_values(
...     content,
...     JP[:](JP.id, JP.msg_startswith_h),
...     lambda d: (next(counter), d["msg"].startswith("h"))
... )
[{'msg': 'hallo', 'id': 0, 'msg_startswith_h': True}, {'msg': 'hello', 'id': 1, 'msg_startswith_h': True}, {'msg': 'bye', 'id': 2, 'msg_startswith_h': False}]

    Above, You declare what keys You are interested in overwriting or adding
    (``"id"`` and ``"msg_startswith_h"``), and Your function returns a tuple of
    just those values, based on the parent dictionary given as argument to it.


    The function ``with_values()`` is similar in spirit to `JSON Patch`_, except
    there is no need for a full-fledged patching facility: Python list
    and dictionary comprehensions already go a long way in manipulating hierarchical content.


Utility ``find_next()``
-----------------------

Often, You just need the first value, roughly equivalent to a ``next(find_all(...))``
invocation. You can use ``find_next()`` for this, for instance:

>>> find_next([{"hello": 5}, {"hello": 42}], JP[:].hello)
5
>>> find_next([{"hello": 5}, {"hello": 42}], JP[1].hello)
42

You can also ask for the path of the value returned, in the style of ``with_path=True``
above:

>>> path, value = find_next([{"hello": 5}, {"hello": 42}], JP[-1].hello, with_path=True)
>>> str(path)
'$[1].hello'
>>> value
42

You can also supply a default value for ``find_next()``, just like for ``next()``:

>>> find_next([{"hello": 5}, {"hello": 42}], JP[3].hello, default=17)
17


Utilities ``find_all_unique()``, ``find_all_duplicate()``
---------------------------------------------------------

A common task is to find only unique items in data, e.g.

>>> content = [{"hi": 1}, {"hi": 2}, {"hi": 1}, {"hi": 3}, {"hi": -22}, {"hi": 3}]
>>> list(find_all_unique(content, JP[:].hi))
[1, 2, 3, -22]

and of course You can ask for the paths, too:

>>> content = [{"hi": 1}, {"hi": 2}, {"hi": 1}, {"hi": 3}, {"hi": -22}, {"hi": 3}]
>>> [(str(path), item) for path, item in find_all_unique(content, JP[:].hi, with_path=True)]
[('$[0].hi', 1), ('$[1].hi', 2), ('$[3].hi', 3), ('$[4].hi', -22)]

A related common task is to find duplicates, e.g.

>>> content = {
...     "apple": [{"name": "red delicious", "id": 123}, {"name": "punakaneli", "id": 234}],
...     "pear": [{"name": "wilhelm", "id": 345}, {"name": "conference", "id": 123}]
... }
>>> [f"Duplicate ID: {item} at {path.parent}" for path, item in find_all_duplicate(content, JP["apple|pear"][:].id, with_path=True)]
['Duplicate ID: 123 at $.pear[1]']


Suppressing indexing and key errors, safe navigation operator
-------------------------------------------------------------

By default, path expressions are strict, e.g. for non-existent ``list`` indexes
You get an ``IndexError``:

>>> list(find_all([0, 1], JP[2]))
Traceback (most recent call last):
    ...
IndexError: list index out of range

which is consistent with how a ``list`` behaves. Similarly, for
non-existent ``dict`` keys You get a ``KeyError``:

>>> list(find_all({"hello": 42}, JP.hi))
Traceback (most recent call last):
    ...
KeyError: 'hi'

You can suppress these errors and simply have nothing yielded, for ``list`` indexes:

>>> list(find_all([0, 1], JP[2], lenient=True))
[]

and for ``dict`` keys:

>>> list(find_all({"hello": 42}, JP.hi, lenient=True))
[]

In fact, ``find_next()``, which delegates its call to ``find_all()``, passes
``lenient=True`` along whenever a default value is defined for ``find_next()``
itself. Thus, supplying ``None`` as the default value to ``find_next()``:

>>> empty_content = []
>>> type( find_next(empty_content, JP[3].hello[:].hi[:3], default=None) )
<class 'NoneType'>

is as close to a `safe navigation operator` implementation as You can get
given that `PEP 505`_ has deferred status.


Use Case: JSON content validator and tests
------------------------------------------

The utilities above may benefit You in writing production code, but they can
also make unit tests more readable and self-explanatory.

Imagine You have JSON content like this in a request body:

>>> fruits = {
...    "apple": [{"name": "red delicious"}, {"name": "punakaneli"}],
...    "pear": [{"name": "conference"}, {"name": "wilhelm"}],
... }

with the type of a fruit (apple, pear) encoded in the hierarchy itself.

    This is often the case, since processing items of a certain type is easy,
    e.g. in Python:

    >>> [apple["name"] for apple in fruits["apple"]]
    ['red delicious', 'punakaneli']

Let's say Your business analyst says the name of a fruit is unique at document
scope, i.e. no two fruits can have the same name regardless of whether they are
of the same type or not, and You must validate this uniqueness constraint for all requests.

You wish the JSON format were flat, something like
``[{"name": "red delicious", "type": "apple"}, ...]``, encoding the type in
a key, because then You could use the JSON Schema facility
`uniqueKeys <https://docs.json-everything.net/schema/vocabs/uniquekeys/#schema-uniquekeys-keyword>`__.
But You are not in control of the JSON format: You need a custom validator.
With this library, it's easy enough to fashion something like the one below:

>>> def verify_unique_fruit_names(content: dict) -> None | tuple:
...    """
...    Return the (path, name) tuple of the first fruit name
...    duplicate within the entire document if any, None otherwise.
...    """
...    return next(
...       find_all_duplicate(content, JP[".*"][:].name, with_path=True),
...       None
...    )

First off, You want to test that Your implementation regards the valid document
``fruits`` as valid:

>>> assert verify_unique_fruit_names(fruits) is None

Then, You want to verify that a document with name duplicates will not
pass verification, and that the expected error info tuple is returned. At this
point test suites normally choose between two alternatives, the bad and the ugly:

- The bad: the input document is small and simple. The test is easy to read
  and maintain, as it's easy to spot where the input is broken, but one is left
  with the nagging question: will ``verify_unique_fruit_names()`` work
  for more complex inputs, too?

- The ugly: the input document is big and complex. Now You know for sure
  that ``verify_unique_fruit_names()`` works for bigger input, except now the
  test is not readable or maintainable, as it's not clear at all, at first glance,
  where the input is broken. You now have a so-called `MD5 test`: no one knows
  why it breaks when it does.

Can we have the good? Can we have complex input `and` make it clear
where it's broken? Yes we can, using ``with_values()``; consider this:

>>> an_apple_name = find_next(fruits, JP.apple[0].name)

that is, we have a known apple name.

>>> an_apple_name
'red delicious'

Let's use that name to introduce a duplicate:

>>> broken_path = JP.pear[0].name
>>> fruits_with_duplicate_names = with_values(fruits, broken_path, an_apple_name)

Now our fixture explains where and how it's broken! Let's check,
just to satisfy our curiosity, what the broken input looks like:

>>> fruits_with_duplicate_names
{'apple': [{'name': 'red delicious'}, {'name': 'punakaneli'}], 'pear': [{'name': 'red delicious'}, {'name': 'wilhelm'}]}

After this, the expectations in our tests will be self-explanatory:

>>> error_path, error_value = verify_unique_fruit_names(fruits_with_duplicate_names)
>>> assert error_path == broken_path
>>> assert error_value == an_apple_name

Best of all, You can make a parametrized test with both small and big inputs,
giving You full coverage that is readable and maintainable, as sketched below.
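
    A minimal sketch of such a parametrized test (pytest-style; ``SMALL_FRUITS``
    and ``BIG_FRUITS`` are hypothetical fixtures shaped like ``fruits`` above,
    assumed to contain at least one apple and one pear and no pre-existing
    duplicate names)::

        import pytest

        from aweson import JP, find_next, with_values

        # verify_unique_fruit_names() is the validator defined above,
        # imported from wherever You keep it in Your code base

        @pytest.mark.parametrize("document", [SMALL_FRUITS, BIG_FRUITS])
        def test_duplicate_fruit_name_is_detected(document):
            # the document is valid to begin with
            assert verify_unique_fruit_names(document) is None

            # break it in a known place, with a known value
            an_apple_name = find_next(document, JP.apple[0].name)
            broken_path = JP.pear[0].name
            broken_document = with_values(document, broken_path, an_apple_name)

            # the validator reports exactly that place and value
            assert verify_unique_fruit_names(broken_document) == (broken_path, an_apple_name)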

.. _JSON Path: https://www.rfc-editor.org/rfc/rfc9535
.. _python-jsonpath: https://pypi.org/project/python-jsonpath
.. _jsonpath-rfc9535: https://pypi.org/project/jsonpath-rfc9535
.. _JSON Patch: https://jsonpatch.com/
.. _PEP 505: https://peps.python.org/pep-0505/

            
