:Name: flexcache
:Version: 0.3
:Summary: Saves and loads to the cache a transformed version of a source object.
:Upload time: 2024-03-09 03:21:07
:Requires Python: >=3.9
:License: BSD
:Keywords: cache, optimization, storage, disk

.. image:: https://img.shields.io/pypi/v/flexcache.svg
    :target: https://pypi.python.org/pypi/flexcache
    :alt: Latest Version

.. image:: https://img.shields.io/pypi/l/flexcache.svg
    :target: https://pypi.python.org/pypi/flexcache
    :alt: License

.. image:: https://img.shields.io/pypi/pyversions/flexcache.svg
    :target: https://pypi.python.org/pypi/flexcache
    :alt: Python Versions

.. image:: https://github.com/hgrecco/flexcache/workflows/CI/badge.svg
    :target: https://github.com/hgrecco/flexcache/actions?query=workflow%3ACI
    :alt: CI

.. image:: https://github.com/hgrecco/flexcache/workflows/Lint/badge.svg
    :target: https://github.com/hgrecco/flexcache/actions?query=workflow%3ALint
    :alt: LINTER

.. image:: https://coveralls.io/repos/github/hgrecco/flexcache/badge.svg?branch=main
    :target: https://coveralls.io/github/hgrecco/flexcache?branch=main
    :alt: Coverage


flexcache
=========

A robust and extensible package for caching the results of expensive
calculations on disk.

Consider an expensive function `parse` that takes a path and returns a
parsed version:

.. code-block:: python

    >>> content = parse("source.txt")

It would be nice to cache this result automatically and persistently,
and this is where flexcache comes in.

First, we create a `DiskCache` object:

.. code-block:: python

    >>> from flexcache import DiskCacheByMTime
    >>> dc = DiskCacheByMTime(cache_folder="/my/cache/folder")

and then load the content through it:

.. code-block:: python

    >>> content, basename = dc.load("source.txt", converter=parse)

On the first call, no cached result is available, so `parse` is
called on `source.txt` and its output is saved and returned. On later
calls, the cached result is loaded and returned.

When the source changes, the DiskCache detects that the cached
file is older than the source and calls `parse` again, storing and
returning the new result.
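
The modification-time check described above can be sketched with the
standard library alone. This is only an illustration of the idea, not
flexcache's implementation: the function name `load_cached_by_mtime` and
the one-pickle-per-file-name scheme are assumptions made for the example.

.. code-block:: python

    import pickle
    from pathlib import Path


    def load_cached_by_mtime(source, converter, cache_folder):
        """Return converter(source), reusing a pickled result while it is fresh."""
        source = Path(source).resolve()
        cache_folder = Path(cache_folder)
        cache_folder.mkdir(parents=True, exist_ok=True)
        # Hypothetical naming scheme: one pickle per source file name.
        cached = cache_folder / (source.name + ".pickle")

        # The cache is valid only if it exists and is newer than the source.
        if cached.exists() and cached.stat().st_mtime > source.stat().st_mtime:
            with cached.open("rb") as fi:
                return pickle.load(fi)

        # Missing or stale: convert again, then store and return the new result.
        result = converter(source)
        with cached.open("wb") as fo:
            pickle.dump(result, fo)
        return result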

In certain cases you would rather detect that the file has changed
by hashing the file. Simply use `DiskCacheByHash` instead of
`DiskCacheByMTime`.
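
To see why hashing can be preferable, consider a standard-library sketch
in which the cache entry is named after a digest of the file content, so
that timestamps play no role. The function name `load_cached_by_hash`
and the choice of SHA-1 are assumptions for this example and do not
describe how `DiskCacheByHash` is implemented.

.. code-block:: python

    import hashlib
    import pickle
    from pathlib import Path


    def load_cached_by_hash(source, converter, cache_folder):
        """Return converter(source), keyed by a digest of the file content."""
        source = Path(source)
        cache_folder = Path(cache_folder)
        cache_folder.mkdir(parents=True, exist_ok=True)
        # The digest of the bytes identifies the entry; timestamps are ignored.
        digest = hashlib.sha1(source.read_bytes()).hexdigest()
        cached = cache_folder / (digest + ".pickle")

        # Same content means same digest, so an existing entry is reused.
        if cached.exists():
            with cached.open("rb") as fi:
                return pickle.load(fi)

        # New content: convert, then store the result under the new digest.
        result = converter(source)
        with cached.open("wb") as fo:
            pickle.dump(result, fo)
        return result

With this scheme, touching a file without changing its content does not
trigger a new conversion, at the cost of reading the whole file on
every load.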

Cached files are saved using the pickle protocol, and each has
a companion JSON file with the header content.
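
As a rough illustration of this layout, the sketch below writes a pickle
file and a JSON header side by side under a shared basename. The helper
names, the file extensions, and the header keys used in the test data
are hypothetical and are not taken from flexcache.

.. code-block:: python

    import json
    import pickle
    from pathlib import Path


    def save_with_header(cache_folder, basename, obj, header):
        """Store obj as a pickle and header as a companion JSON file."""
        cache_folder = Path(cache_folder)
        cache_folder.mkdir(parents=True, exist_ok=True)
        with (cache_folder / (basename + ".pickle")).open("wb") as fo:
            pickle.dump(obj, fo)
        # The header must contain only JSON-serializable values.
        with (cache_folder / (basename + ".json")).open("w", encoding="utf-8") as fo:
            json.dump(header, fo, indent=2)


    def load_with_header(cache_folder, basename):
        """Return the (obj, header) pair stored under basename."""
        cache_folder = Path(cache_folder)
        with (cache_folder / (basename + ".pickle")).open("rb") as fi:
            obj = pickle.load(fi)
        with (cache_folder / (basename + ".json")).open("r", encoding="utf-8") as fi:
            header = json.load(fi)
        return obj, header

Keeping the header in a plain-text format means it can be inspected
without unpickling the cached object.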

This idea is completely flexible and applies not only to parsers.
In **flexcache** we say there are two types of objects: the **source object**
and the **converted object**. The conversion function maps the former
into the latter. The cache stores the latter by looking at a customizable
aspect of the former.


Building your own caching logic
-------------------------------

In certain cases you may want to customize how caching and
invalidation are done.

You can achieve this by subclassing `DiskCache`.

.. code-block:: python

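    >>> import pathlib
    >>> from dataclasses import dataclass
    >>> from flexcache import (
    ...     BasicPythonHeader,
    ...     DiskCache,
    ...     InvalidateByExist,
    ...     NameByPath,
    ... )
    >>> class MyDiskCache(DiskCache):
    ...
    ...     @dataclass(frozen=True)
    ...     class MyHeader(NameByPath, InvalidateByExist, BasicPythonHeader):
    ...         pass
    ...
    ...     # Use MyHeader whenever the source object is a pathlib.Path.
    ...     _header_classes = {pathlib.Path: MyHeader}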

Here we create a custom Header class and use it to handle `pathlib.Path`
objects. You can even have multiple headers registered in the same class
to handle different source object types.

We provide a convenient set of mixable classes to achieve almost any behavior.
These are divided into three categories, and you must choose at least one
from each.

Headers
~~~~~~~

These classes store the information that will be saved alongside the cached file.

- **BaseHeader**: source object and identifier of the converter function.
- **BasicPythonHeader**: source object and identifier of the converter function,
  platform, Python implementation, Python version.


Invalidate
~~~~~~~~~~

These classes define how the cache decides whether the cached converted object
is still an up-to-date representation of the source object.

- **InvalidateByExist**: the cached file must exist.
- **InvalidateByPathMTime**: the cached file exists and is newer than the source object
  (which has to be `pathlib.Path`).
- **InvalidateByMultiPathsMtime**: the cached file exists and is newer than each path
  in the source object (which has to be `tuple[pathlib.Path]`).
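
The multi-path rule can be pictured as a single predicate over
modification times. The sketch below uses only the standard library; the
function name `is_cache_valid` is an assumption for illustration and is
not part of the flexcache API.

.. code-block:: python

    from pathlib import Path


    def is_cache_valid(cache_path, source_paths):
        """True if cache_path exists and is newer than every source path."""
        cache_path = Path(cache_path)
        if not cache_path.exists():
            return False
        cache_mtime = cache_path.stat().st_mtime
        # A single newer (or equally old) source makes the cache stale.
        return all(cache_mtime > Path(p).stat().st_mtime for p in source_paths)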


Naming
~~~~~~

These classes define how the name is generated. The basename for the cache file is
a hash hexdigest built by hashing a collection of values determined by the Header object.

- **NameByFields**: all fields except the `source_object`.
- **NameByPath**: resolved path of the source object
  (which has to be `pathlib.Path`).
- **NameByMultiPaths**: resolved path of each path in the source object
  (which has to be `tuple[pathlib.Path]`), sorted in ascending order.
- **NameByFileContent**: the bytes content of the file referred to by the source object
  (which has to be `pathlib.Path`).
- **NameByHashIter**: the values in the source object
  (which has to be `tuple[str]`), sorted in ascending order.
- **NameByObj**: the pickled version of the source object
  (which has to be picklable), using the highest available protocol.
  This also adds `pickle_protocol` to the header.
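
To make the naming idea concrete, here is a minimal standard-library
sketch that derives a basename from several paths, in the spirit of the
multi-path rule: resolve each path, sort the results, and feed them to a
hash. The function name `basename_from_paths`, the use of SHA-1, and the
NUL separator are assumptions for the example; flexcache's own hashing
may differ.

.. code-block:: python

    import hashlib
    from pathlib import Path


    def basename_from_paths(paths):
        """Return a hex digest that does not depend on the order of paths."""
        # Sorting the resolved paths makes the name order-independent.
        resolved = sorted(str(Path(p).resolve()) for p in paths)
        hasher = hashlib.sha1()
        for value in resolved:
            hasher.update(value.encode("utf-8"))
            # Separator so that ("ab", "c") and ("a", "bc") hash differently.
            hasher.update(b"\0")
        return hasher.hexdigest()

Because the digest depends only on the resolved paths, the same set of
source files always maps to the same cache file.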


You can mix and match as you see fit and, of course, you can make your own.

Finally, you can also avoid saving the header by setting the `_store_header`
class attribute to `False`.

----

This project was started as a part of Pint_, the Python units package.

See AUTHORS_ for a list of the maintainers.

To review an ordered list of notable changes for each version of the project,
see CHANGES_.

.. _`AUTHORS`: https://github.com/hgrecco/flexcache/blob/main/AUTHORS
.. _`CHANGES`: https://github.com/hgrecco/flexcache/blob/main/CHANGES
.. _`Pint`: https://github.com/hgrecco/pint

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "flexcache",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "\"Hernan E. Grecco\" <hernan.grecco@gmail.com>",
    "keywords": "cache,optimization,storage,disk",
    "author": "",
    "author_email": "\"Hernan E. Grecco\" <hernan.grecco@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/55/b0/8a21e330561c65653d010ef112bf38f60890051d244ede197ddaa08e50c1/flexcache-0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD",
    "summary": "Saves and loads to the cache a transformed versions of a source object.",
    "version": "0.3",
    "project_urls": {
        "Homepage": "https://github.com/hgrecco/flexcache"
    },
    "split_keywords": [
        "cache",
        "optimization",
        "storage",
        "disk"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "27cdc883e1a7c447479d6e13985565080e3fea88ab5a107c21684c813dba1875",
                "md5": "bf9972ca7d2645390c1cbf4e9fb943ae",
                "sha256": "d43c9fea82336af6e0115e308d9d33a185390b8346a017564611f1466dcd2e32"
            },
            "downloads": -1,
            "filename": "flexcache-0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bf9972ca7d2645390c1cbf4e9fb943ae",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 13263,
            "upload_time": "2024-03-09T03:21:05",
            "upload_time_iso_8601": "2024-03-09T03:21:05.635813Z",
            "url": "https://files.pythonhosted.org/packages/27/cd/c883e1a7c447479d6e13985565080e3fea88ab5a107c21684c813dba1875/flexcache-0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "55b08a21e330561c65653d010ef112bf38f60890051d244ede197ddaa08e50c1",
                "md5": "11e710fd4049053b2c1a939aa46fcc54",
                "sha256": "18743bd5a0621bfe2cf8d519e4c3bfdf57a269c15d1ced3fb4b64e0ff4600656"
            },
            "downloads": -1,
            "filename": "flexcache-0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "11e710fd4049053b2c1a939aa46fcc54",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 15816,
            "upload_time": "2024-03-09T03:21:07",
            "upload_time_iso_8601": "2024-03-09T03:21:07.555508Z",
            "url": "https://files.pythonhosted.org/packages/55/b0/8a21e330561c65653d010ef112bf38f60890051d244ede197ddaa08e50c1/flexcache-0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-09 03:21:07",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hgrecco",
    "github_project": "flexcache",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "flexcache"
}
        