deep-collections


- Name: deep-collections
- Version: 0.4.2
- home_page: https://github.com/terminal-labs/deep_collections
- Summary: Easy access to items in deep collections.
- upload_time: 2023-04-05 22:42:31
- docs_url: None
- author: Joseph Nix
- requires_python: >=3.7,<4.0
- license: BSD
- keywords: deep, nested, dict, json, data
- requirements: No requirements were recorded.
- Travis-CI: No Travis.
- coveralls test coverage: No coveralls.

## Deep Collections

[![PyPI version](https://badge.fury.io/py/deep-collections.svg)](https://pypi.org/project/deep-collections/)
[![codecov](https://codecov.io/gh/terminal-labs/deep_collections/branch/main/graph/badge.svg?token=F1JVYFDCJI)](https://codecov.io/gh/terminal-labs/deep_collections)
<a href="https://github.com/ambv/black"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a>
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/terminal-labs/deep_collections/main.svg)](https://results.pre-commit.ci/latest/github/terminal-labs/deep_collections/main)

deep_collections is a Python library that provides tooling for easy access to deep collections (dicts, lists, deques, etc.) while preserving most of each collection's original API. The DeepCollection class automatically subclasses the type of the collection it is given and adds several quality-of-life extensions that make deep collections much more pleasant to work with.

Got a bundle of JSON from an API? A large Python object from some data science problem? A lengthy set of instructions from an infrastructure-as-code tool like Ansible or SaltStack? Explore and modify it with ease.

DeepCollection can take virtually any kind of object, including all built-in container types ([dict](https://docs.python.org/3/library/stdtypes.html#dict), [list](https://docs.python.org/3/library/stdtypes.html#list), [set](https://docs.python.org/3/library/stdtypes.html#set), and [tuple](https://docs.python.org/3/library/stdtypes.html#tuple)), everything in the [collections module](https://docs.python.org/3/library/collections.html), and [dotty-dicts](https://github.com/pawelzny/dotty_dict), with all of these nested in any fashion.

### Features

- Path traversal by supplying a list of path components as a key. This works for getting, setting, and deleting.
- Accessing nested components by supplying only path fragments.
- Setting paths when parent parts do not exist.
- Path traversal through dict-like collections by dot chaining, for getting.
- Finding all paths to keys or subpaths.
- Finding all values for keys or subpaths, and deduping them.
- Providing all of the above through a class that is:
    - easily instantiable
    - a native subclass of the type it was instantiated with
    - easily subclassable
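
To make "setting paths when parent parts do not exist" concrete, here is a minimal sketch in plain Python. The helper name `set_path_sketch` is hypothetical, and creating empty dicts for missing parents is an assumed policy chosen for illustration; it is not taken from the library's source.

```python
def set_path_sketch(collection, path, value):
    """Set `value` at `path`, creating missing intermediate dicts along the way."""
    *parents, last = path
    current = collection
    for part in parents:
        if isinstance(current, dict):
            # Assumed policy: a missing parent key becomes an empty dict.
            current = current.setdefault(part, {})
        else:
            # Lists and other sequences must already contain the index.
            current = current[part]
    current[last] = value


config = {}
set_path_sketch(config, ["server", "tls", "enabled"], True)
print(config)  # {'server': {'tls': {'enabled': True}}}
```

The sketch only auto-creates dict parents; a missing list index still raises `IndexError`.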


### Path concept

DeepCollections have a concept of a "path" for nested collections, where a path is a sequence of keys or indices that, if followed in order, traverses the deep collection. As a quick example, `{'a': ['b', {'c': 'd'}]}` could be traversed with the path `['a', 1, 'c']` to find the value `'d'`.
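
The path idea can be illustrated without the library at all. The sketch below uses only the standard library; `follow_path` is a hypothetical helper name, and it performs strict lookups with no pattern matching.

```python
from functools import reduce
from operator import getitem


def follow_path(collection, path):
    """Apply each key or index in `path` in order, starting from `collection`."""
    return reduce(getitem, path, collection)


nested = {"a": ["b", {"c": "d"}]}
print(follow_path(nested, ["a", 1, "c"]))  # d
```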

DeepCollections natively use paths as well as simple keys and indices. For `dc = DeepCollection(foo)`, items can be retrieved through the familiar `dc[path]`, where `path` is either a simple key or index, or a non-stringlike iterable path (strings are assumed to be literal keys). This is done with a custom `__getitem__` method. Similarly, `__delitem__` and `__setitem__` support paths. The same flexibility exists for familiar methods like `.get`, which behaves the same as `dict.get` but accepts a path as well as a key.
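
As a rough illustration of how a custom `__getitem__` can tell a simple key from a path, consider the sketch below. `PathDict` is a hypothetical class written for this example, not the library's implementation; it assumes that only tuples and lists count as paths, so a string is always a literal key.

```python
class PathDict(dict):
    """Hypothetical dict subclass: tuple or list keys are treated as paths."""

    def __getitem__(self, key):
        if isinstance(key, (list, tuple)):
            current = self
            for part in key:
                # Each step is an ordinary lookup on whatever object we reached.
                current = current[part]
            return current
        # Strings and other simple keys fall through to the normal dict lookup.
        return super().__getitem__(key)

    def get(self, key, default=None):
        """Like dict.get, but `key` may also be a path."""
        try:
            return self[key]
        except (KeyError, IndexError, TypeError):
            return default


d = PathDict({"a": {"b": [10, 20]}, "ab": 1})
print(d["a", "b", 1])                   # 20
print(d["ab"])                          # 1
print(d.get(("a", "x"), "missing"))     # missing
```

One consequence of this simplified design is that a tuple can no longer be used as a literal dict key, because it is always read as a path.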

### Matching

Path elements are interpreted as patterns to match against keys and indices. By default this feature is on and uses globbing.

#### Recursion

`"**"` recurses to any depth to find a match for the next pattern given. For example:

```python
from deep_collections import DeepCollection

dc = DeepCollection({"a": {"b": {"c": {"d": 5}}}, "d": 4})
dc["a", "**", "d"] == 5
```

Coupling this with another matching style, like globbing, allows some powerful filtering:

```python
dc = DeepCollection({"a": {"b": {"c": {"xd": {"e": 0}, "yd": {"e": 1}, "zf": {"e": 2}}}}, "e": 3})
dc["a", "**", "?d", "e"] == [0, 1]
```

This feature is independent of other matching patterns. In other words, you could swap globbing out for another matching style, but `"**"` would remain usable unless disabled on its own. For example, you might want to use regex throughout your path but pair that with recursion.
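
The idea behind recursing to any depth can be sketched with a small generator over plain dicts and lists. `values_at_any_depth` is a hypothetical name, and the sketch compares keys by equality only; how the library combines `"**"` with the pattern that follows it is not shown here.

```python
def values_at_any_depth(obj, key):
    """Yield every value stored under `key` at any depth in nested dicts and lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            # Keep descending, since deeper levels may hold the key as well.
            yield from values_at_any_depth(v, key)
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from values_at_any_depth(item, key)


data = {"a": {"b": {"c": {"d": 5}}}, "d": 4}
print(list(values_at_any_depth(data["a"], "d")))  # [5]
print(list(values_at_any_depth(data, "d")))       # [5, 4]
```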

#### Matching numeric keys and indices

So that pattern matching (like globbing) makes sense for indices and numeric keys, a path element that is a string and appears to use globbing is matched against the stringified index or key. In other words:

```python
dc = DeepCollection(["a", "b", "c"])
dc["[0-1]"] == DeepCollection(["a", "b"])
dc["[5]"] == DeepCollection([])  # Matching pattern detected (globbing), so no results yields an empty list.
dc["5"]  # Raises TypeError. No matching pattern detected, so direct use of `"5"` was attempted and not cast to an int.

dc = DeepCollection({1: 'i', '1': 'j', 'a': 'k'})
dc['*[!1]'] == "k"
```

This is a compromise that makes pattern matching on indices and numeric keys possible. As with deeper path traversal, since we're matching a pattern, zero hits is not treated as a KeyError or IndexError; an empty list is returned instead.
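
Matching against stringified keys and indices can be sketched with `fnmatchcase` from the standard library. `match_stringified` is a hypothetical helper, and unlike the examples above it always returns a list, even for a single hit.

```python
from fnmatch import fnmatchcase


def match_stringified(collection, pattern):
    """Return values whose key (dict) or index (sequence) matches `pattern` after str()."""
    if isinstance(collection, dict):
        items = collection.items()
    else:
        items = enumerate(collection)
    return [value for key, value in items if fnmatchcase(str(key), pattern)]


print(match_stringified(["a", "b", "c"], "[0-1]"))                # ['a', 'b']
print(match_stringified(["a", "b", "c"], "[5]"))                  # []
print(match_stringified({1: "i", "1": "j", "a": "k"}, "*[!1]"))   # ['k']
```

Because both `1` and `"1"` stringify to `"1"`, a pattern cannot distinguish between them in this sketch.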

The often relied-upon KeyError and IndexError are both preserved when pattern matching is not detected.

```python
dc = DeepCollection(["a", "b", "c"])
dc[5]
...
IndexError: list index out of range

DeepCollection({})["a"]
...
KeyError: 'a'
```

### Matching Styles

Deep Collections supports the following matching styles:

- glob
- regex
- equality
- hash
- glob+regex
- custom (_built in soon_)

Many functions accept a matching style through an argument such as `match_with="regex"`.

As noted above, the special use of `"**"` is independent and currently always on. Future versions will allow toggling it off as well.

To abandon all matching styles and traverse paths as quickly as possible, use `getitem_by_path_strict`.

#### Matching Style: Globbing

Any given path element is matched with `fnmatchcase` from [the Python stdlib](https://docs.python.org/3/library/fnmatch.html#fnmatch.fnmatchcase). This style is used in the above examples.
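
For reference, this is how `fnmatchcase` itself behaves on a handful of keys. The key names are made up for the example; the point is that matching is case-sensitive and that `?` stands for exactly one character.

```python
from fnmatch import fnmatchcase

keys = ["xd", "yd", "zf", "XD"]

# "?d" needs exactly one character followed by a lowercase "d".
print([k for k in keys if fnmatchcase(k, "?d")])     # ['xd', 'yd']

# "[!x]d" excludes keys whose first character is "x".
print([k for k in keys if fnmatchcase(k, "[!x]d")])  # ['yd']
```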

#### Matching Style: Regex

Any given path element is matched with `re.compile().match()` from [the Python stdlib](https://docs.python.org/3/library/re.html).
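
One detail worth keeping in mind is that `match()` anchors a pattern at the start of the string but not at the end. The standalone sketch below shows this with made-up keys; it exercises only the `re` module, not the library.

```python
import re

keys = ["item1", "item22", "my_item3", "other"]

# match() only requires the pattern to match at the beginning of the key.
prefix = re.compile(r"item\d")
print([k for k in keys if prefix.match(k)])  # ['item1', 'item22']

# Adding \Z forces the pattern to consume the whole key.
whole = re.compile(r"item\d\Z")
print([k for k in keys if whole.match(k)])   # ['item1']
```

If a regex path element should match a key exactly, ending the pattern with `\Z` (or `$`) is one way to avoid unintended prefix matches.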

### DeepCollection object API

DeepCollections are instantiated like a normal class, optionally with an initial collection as an argument.

```python
from deep_collections import DeepCollection

dc = DeepCollection()
# or
dc = DeepCollection({"a": {"b": {"c": "d"}}})
# or
dc = DeepCollection(["a", ["b", ["c", "d"]]])
```

These are the noteworthy methods available on all DCs:

- `__getitem__`
- `__delitem__`
- `__setitem__`
- `get`
- `paths_to_value`
- `paths_to_key`
- `values_for_key`
- `deduped_values_for_key`

There are also corresponding functions available that work on any native object that could be deep but is not a `DeepCollection`, like a normal nested `dict` or `list`. This can be a convenient way to traverse an object you already have ad hoc, and it is also faster because it avoids the initialization cost of a DeepCollection object. So if speed matters, use a function.

### deep_collections function API

All of the useful methods on DeepCollection objects are also available as functions that take a collection as an argument, along with several supporting functions.

The core functions are focused on using the same path concept. The available functions and their related DC methods are:

- `getitem_by_path` - `DeepCollection().__getitem__`
- `get_by_path` - `DeepCollection().get`
- `set_by_path` - `DeepCollection().set_by_path`
- `del_by_path` - `DeepCollection().del_by_path`
- `paths_to_value` - `DeepCollection().paths_to_value`
- `paths_to_key` - `DeepCollection().paths_to_key`
- `values_for_key` - `DeepCollection().values_for_key`
- `deduped_values_for_key` - `DeepCollection().deduped_values_for_key`
- `dedupe_items`
- `resolve_path`
- `matched_keys`
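
To give a feel for what finding paths to a key and deduping values involve, here is a standalone sketch over plain dicts and lists. The names `find_paths_to_key`, `value_at`, and `dedupe` are hypothetical and deliberately differ from the library's functions; their signatures and return types are assumptions made for illustration.

```python
def find_paths_to_key(obj, key, prefix=()):
    """Yield each path (as a tuple) that ends at `key` anywhere in nested dicts and lists."""
    if isinstance(obj, dict):
        children = obj.items()
    elif isinstance(obj, (list, tuple)):
        children = enumerate(obj)
    else:
        return
    for k, v in children:
        path = prefix + (k,)
        if k == key:
            yield path
        yield from find_paths_to_key(v, key, path)


def value_at(obj, path):
    """Follow `path` strictly, one key or index at a time."""
    for part in path:
        obj = obj[part]
    return obj


def dedupe(values):
    """Drop repeated values while keeping order; works for unhashable values too."""
    unique = []
    for value in values:
        if value not in unique:
            unique.append(value)
    return unique


data = {"a": {"x": 1}, "b": [{"x": 1}, {"x": {"x": 2}}]}
paths = list(find_paths_to_key(data, "x"))
print(paths)   # [('a', 'x'), ('b', 0, 'x'), ('b', 1, 'x'), ('b', 1, 'x', 'x')]
print(dedupe(value_at(data, p) for p in paths))  # [1, {'x': 2}, 2]
```

The deduplication here compares by equality rather than by hash, which is slower for long lists but tolerates values such as dicts and lists.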

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/terminal-labs/deep_collections",
    "name": "deep-collections",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7,<4.0",
    "maintainer_email": "",
    "keywords": "deep,nested,dict,json,data",
    "author": "Joseph Nix",
    "author_email": "nixjdm@terminallabs.com",
    "download_url": "https://files.pythonhosted.org/packages/97/cc/729a17c19c86f00ba15e1cec2c0bd3610eb318da64b2519f25660c041152/deep_collections-0.4.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD",
    "summary": "Easy access to items in deep collections.",
    "version": "0.4.2",
    "split_keywords": [
        "deep",
        "nested",
        "dict",
        "json",
        "data"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bb76fb44a31120bdd7d2a5f7e829a651361dfe48eb7530d2dce2499da3a0c79a",
                "md5": "60ba29e9fbe228d5ef898d4c25984a07",
                "sha256": "69d1f24966536c96055b2d456a9d833f76da962ee0793580969b357178b180c5"
            },
            "downloads": -1,
            "filename": "deep_collections-0.4.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "60ba29e9fbe228d5ef898d4c25984a07",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7,<4.0",
            "size": 13576,
            "upload_time": "2023-04-05T22:42:29",
            "upload_time_iso_8601": "2023-04-05T22:42:29.350626Z",
            "url": "https://files.pythonhosted.org/packages/bb/76/fb44a31120bdd7d2a5f7e829a651361dfe48eb7530d2dce2499da3a0c79a/deep_collections-0.4.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "97cc729a17c19c86f00ba15e1cec2c0bd3610eb318da64b2519f25660c041152",
                "md5": "e3cefdfe7461ac95b6615f018a1a69af",
                "sha256": "e4e6444fd238427ffa77f7bc459678ded97c69b5eb7c8dc88deb0e3116980c3b"
            },
            "downloads": -1,
            "filename": "deep_collections-0.4.2.tar.gz",
            "has_sig": false,
            "md5_digest": "e3cefdfe7461ac95b6615f018a1a69af",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7,<4.0",
            "size": 15390,
            "upload_time": "2023-04-05T22:42:31",
            "upload_time_iso_8601": "2023-04-05T22:42:31.458335Z",
            "url": "https://files.pythonhosted.org/packages/97/cc/729a17c19c86f00ba15e1cec2c0bd3610eb318da64b2519f25660c041152/deep_collections-0.4.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-05 22:42:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "terminal-labs",
    "github_project": "deep_collections",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "deep-collections"
}
        