binfootprint


- Name: binfootprint
- Version: 1.2.1
- Home page: https://github.com/richard-hartmann/binfootprint
- Summary: unique serialization of python objects
- Upload time: 2023-11-14 12:07:05
- Author: Richard Hartmann
- Requires python: >=3.8,<4.0
- License: MIT
- Keywords: serialization, binary key, cache, function cache, scientific computing

# binfootprint - unique serialization of python objects

[![PyPI version](https://badge.fury.io/py/binfootprint.svg)](https://badge.fury.io/py/binfootprint)

## Why unique serialization

When caching computationally expensive function calls, the input arguments (`*args`, `**kwargs`)
serve as the key to look up the result of the function.
To perform efficient lookups, these keys (often a large number of nested python objects) need to be hashable.
Since python's built-in hash function is randomly seeded (and applies to a few data types only), it is not
suited for persistent caching.
Alternatively, standard hash functions, as provided by the
[hashlib library](https://docs.python.org/3/library/hashlib.html), can be used.
As they rely on byte sequences as input, python objects need to be converted to such a sequence first.
Python's pickle module provides such a serialization, but for our purpose it has the drawback that
the byte sequence is not guaranteed to be unique (e.g., a dictionary can be stored as different byte sequences,
since the order of the (key, value) pairs is irrelevant).
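
The dictionary case can be checked with the standard library alone. The following is a minimal sketch (the variable names are chosen for illustration only): two dictionaries that compare equal are pickled to different byte sequences because their items were inserted in a different order, so the digests differ as well.

```python
import hashlib
import pickle

# Two dictionaries with the same content but different insertion order.
d1 = {"a": 1, "b": 2}
d2 = {"b": 2, "a": 1}

# They compare equal ...
same_content = d1 == d2

# ... but pickle writes the items in insertion order, so the bytes differ
# and so do the digests computed from them.
p1 = pickle.dumps(d1, protocol=4)
p2 = pickle.dumps(d2, protocol=4)
same_bytes = p1 == p2
digest1 = hashlib.sha256(p1).hexdigest()
digest2 = hashlib.sha256(p2).hexdigest()

print("equal dicts:", same_content)
print("equal pickles:", same_bytes)
print("equal digests:", digest1 == digest2)
```

Used as a cache key, such a digest would miss an existing entry whenever an equal dictionary happens to be built in a different order.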

The binfootprint module fills that gap.
It guarantees that a particular python object will have a unique binary representation which 
can serve as input for any hash function.  

## Quick start

`binfootprint.dump(data)` generates a unique binary representation
(binary footprint) of `data`.
```python
b = binfootprint.dump(['hallo', 42])
```

Its output can serve as suitable input for a hash function.
```python
hashlib.sha256(b).hexdigest()
```

`binfootprint.load(data)` reconstructs the original python object.
```python
ob = binfootprint.load(b)
```

Numpy arrays can be serialized.
```python
a = numpy.asarray([0, 2.3, 4])
b = binfootprint.dump(a)
```

Classes which implement `__getstate__` (pickle interface) or `__bfkey__` can be
serialized too.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __getstate__(self):
        return [self.x, self.y]
        
ob = Point(4, -2)
b = binfootprint.dump(ob)
```
If `__bfkey__` is implemented, it is used over `__getstate__`.

*New since version 1.2.0:*
[`functools.partial`](https://docs.python.org/3/library/functools.html?highlight=partial#functools.partial)
objects can now be serialized too. This makes it possible to cache a function which takes a `functools.partial`
as argument.

```python
import functools
import math

import binfootprint


def gaussian(x, a, sigma, x0):
    return a * math.exp(-(x-x0)**2 / 2 / sigma**2)

@binfootprint.util.ShelveCacheDec()
def quad(f, x_min, x_max, dx):
    r = 0
    x = x_min
    while x < x_max:
        r += f(x)
        x += dx
    return dx*r

g = functools.partial(gaussian, a=1, sigma=1, x0=-2.34)
quad(g, x_min=-10, x_max=10, dx=0.001)

```

### cache decorator 

Utilizing the unique binary representation of python objects, the `ShelveCache` class
provides a persistent cache for functions whose arguments can be serialized.
The decorator `ShelveCacheDec` applies it to a function:

```python
@binfootprint.ShelveCacheDec(path='.cache')
def area(p):
    return p.x * p.y
```

### parameter base class

To conveniently organize a set of parameters suitable as key for caching,
you can subclass `ABCParameter`. Why should you do that?

- The `__bfkey__` method of `ABCParameter` ignores parameters that are `None`.
  This makes it possible to extend your function interface without losing access to
  cached results from earlier stages.
- You can add supplementary information to the `__non_key__` member, which
  is not included in the binary representation of the parameter class.

```python
class Param(bf.ABCParameter):
    __slots__ = ["x", "y", "__non_key__"]

    def __init__(self, x, y, msg=""):
        super().__init__()
        self.x = x
        self.y = y
        self.__non_key__ = dict()
        self.__non_key__["msg"] = msg
```


## Which data types can be serialized

Python's **fundamental data types** are supported:
* integer
* float (64bit)
* complex (128bit)
* strings
* byte arrays
* special built-in constants: `True`, `False`, `None`

as well as their **nested combination** by means of the **native data structures**
- tuple
- list
- dictionary
- namedtuple.

In addition, the following types are supported:
- numpy `ndarray`: 
  The serialization makes use of numpy's 
  [format.write_array()](https://numpy.org/devdocs/reference/generated/numpy.lib.format.write_array.html) 
  function using version 1.0.
- `functools.partial` objects (*new since version 1.2.0*)

Furthermore, any class that implements

- `__getstate__` (python's pickle interface)

can be serialized as well, provided that the data returned by `__getstate__` can be serialized
**and is not `None`**.
Objects of different classes are distinguished by adding the class name and the name of the module that defines
the class to the binary data.
This in turn also makes it possible to reconstruct the original object by means of the `__setstate__` method.

In case the `__getstate__` method is not suitable, you can implement

- `__bfkey__`

which should return the necessary data to distinguish different objects.
The spirit of `__bfkey__` is very similar to that of `__getstate__`, although it is meant
for serialization only, and not for reconstructing the original object.

Note that, if `__bfkey__` is implemented it will be used, regardless of `__getstate__`.

Note: dumping in the format of older versions is no longer supported. If backwards compatibility is needed, check out older
code from git. Converters should/will be written if needed.

### be careful with functions

Since function objects seem to implement `__getstate__`, which, however, returns `None`,
dumping a function will fail.
**Whether this makes sense can be discussed.**
Implementing your own callable or using `functools.partial` objects can circumvent this.
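
As a minimal sketch of the first option, a plain function can be replaced by a callable class that exposes its parameters through `__getstate__`. The class name `Gaussian` and its attributes are hypothetical and only serve as illustration; following the rules described above, an instance should then be accepted wherever a serializable object is required.

```python
import math


class Gaussian:
    """Callable with explicit state, usable where a plain function would be."""

    def __init__(self, a, sigma, x0):
        self.a = a
        self.sigma = sigma
        self.x0 = x0

    def __call__(self, x):
        return self.a * math.exp(-((x - self.x0) ** 2) / 2 / self.sigma**2)

    def __getstate__(self):
        # The parameters fully determine the behaviour of the callable.
        return [self.a, self.sigma, self.x0]

    def __setstate__(self, state):
        self.a, self.sigma, self.x0 = state


g = Gaussian(a=1, sigma=1, x0=-2.34)
state = g.__getstate__()   # data that identifies this callable
peak = g(-2.34)            # evaluated at its center
```

Two `Gaussian` objects with the same parameters return the same state, whereas two plain functions with identical bodies offer no such data to build a key from.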

## Installation

### pip
Install the latest version using pip

    pip install binfootprint

### poetry
Using poetry allows you to include this package in your project as a dependency.

### git
Check out the code from github

    git clone https://github.com/richard-hartmann/binfootprint.git

### dependencies

- python3
- numpy

## How to use the binfootprint module

### data serialization

Generating the binary footprint is done using the `dump(obj)` function.

#### very simple
```python
import binfootprint as bf
bf.dump(['hallo', 42])
```

#### more complex
```python
import hashlib
import binfootprint as bf

SIGMA_Z = 0x34
data = {
    'Færøerne': {
        'area': (1399, 'km^2'),
        'population': 54000
    },
    SIGMA_Z: [[-1, 0],
              [0, 1]],
    'usefulness': None
}
b = bf.dump(data)
print("MD5 check sum:", hashlib.md5(b).hexdigest())
```

### reconstruct serialized data

Although the primary focus of this module is the binary representation,
for reasons of convenience or debugging it might be useful to restore the original
python object from the binary data.
Calling the `load(bin_data)` function achieves that task.
  
```python
import binfootprint as bf

data = ['hallo', 42]
b = bf.dump(data)
data_prime = bf.load(b)
print(data_prime)
```

### python objects - `__getstate__`

Since `__getstate__` is assumed to uniquely represent the state of an
object by means of the returned data, it can be used to generate a unique binary
representation.

```python
import binfootprint as bf

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __getstate__(self):
        return [self.x, self.y]
    def __setstate__(self, state):
        self.x = state[0]
        self.y = state[1]

ob = Point(4, -2)
b = bf.dump(ob)
```
Since `__setstate__` is implemented as well, the original object can be
reconstructed.
```python
ob_prime = bf.load(b)
print("type:", type(ob_prime))
# type: <class '__main__.Point'>
print("member x:", ob_prime.x)
# member x: 4
print("member y:", ob_prime.y)
# member y: -2
```

### implement `__bfkey__` if `__getstate__` is not suited

In case `__getstate__` returns data which is not sufficient to uniquely label
an object, or if the data cannot be serialized by the binfootprint module,
the method `__bfkey__` should be implemented.
It is expected to return serializable data which uniquely identifies the state
of the object.
Note that, if `__bfkey__` is present, `__getstate__` is ignored.

**Importantly**, when deserializing the binary data from an object implementing
`__bfkey__`, the python object **is not returned**, since there is no
`__setstate__` equivalent. Instead, the class name, the name of the module defining
the class and the data returned by `__bfkey__` are recovered.
This should not pose a problem, since the main focus of the binfootprint module is
the unique binary serialization of an object.
If the object itself needs to be restored, use python's `pickle` module.

```python
class Point2(Point):
    def __bfkey__(self):
        return {'x': self.x, 'y': self.y}

ob = Point2(5, 3)
b = bf.dump(ob)

ob_prime = bf.load(b)
print("load on bfkey:", ob_prime)
# load on bfkey: ('Point2', '__main__', {'x': 5, 'y': 3})
```

### numpy ndarrays

Numpy `ndarray` objects are supported by relying on numpy's binary serialization
using [format.write_array()](https://numpy.org/devdocs/reference/generated/numpy.lib.format.write_array.html).

```python
import binfootprint as bf
import numpy as np

a1 = np.asarray([0, 1, 1, 0])
b1 = bf.dump(a1)
```

As expected, changing the shape or data type yields a different binary representation:

```python
import hashlib

a2 = a1.reshape(2, 2)
b2 = bf.dump(a2)
a3 = np.asarray(a1, dtype=np.complex128)
b3 = bf.dump(a3)
print("            MD5 of int array :", hashlib.md5(b1).hexdigest())
# 949bfba1237c48007a066398f744a161
print("MD5 of int array shape (2,2) :", hashlib.md5(b2).hexdigest())
# e9049a19f82c6f282d65466a72360cd8
print("        MD5 of complex array :", hashlib.md5(b3).hexdigest())
# 2274ea54925d88ec4d53853050e55a82
```

## Caching

With the binfootprint module, caching function calls is straightforward.
An implementation of such a cache using python's `shelve` for persistent storage
is provided by the `ShelveCacheDec` class.

```python
@binfootprint.ShelveCacheDec()
def area(p):
    print(" * f(p(x={},y={})) called".format(p.x, p.y))
    return p.x * p.y
```

It is safe to use the `ShelveCacheDec` with the same data location (`path`)
on different functions, since the name of the function and the name of the
module defining the function determine the name of the underlying database.

In addition to caching, the decorator extends the function signature by the
kwarg `_cache_flag`, which modifies the caching behavior as follows:

- `_cache_flag = 'no_cache'`: Simple call of `fnc` with no caching.
- `_cache_flag = 'update'`: Call `fnc` and update the cache with recent return value.
- `_cache_flag = 'has_key'`: Return `True` if the call has already been cached, otherwise `False`.
- `_cache_flag = 'cache_only'`: Raises a `KeyError` if the result has not been cached yet.

```python
p = Point(10, 10)
print("first call results in")
print(area(p))
# * f(p(x=10,y=10)) called
# 100

print("second call results in")
print(area(p))
# 100
p = Point(10, 11)

print("f(p(10, 11)) is in cache?")
print(area(p, _cache_flag='has_key'))
# False
```
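
To make the meaning of the flags concrete, here is a toy in-memory model of the behavior listed above. It is not the implementation of `ShelveCacheDec`: the name `toy_cache`, the dictionary used as storage and the `repr`-based key are stand-ins chosen for illustration, and it assumes that `'cache_only'` returns the stored value when one exists.

```python
import functools
import hashlib


def toy_cache(fnc):
    """In-memory illustration of the `_cache_flag` semantics (not persistent)."""
    store = {}

    @functools.wraps(fnc)
    def wrapper(*args, _cache_flag=None, **kwargs):
        # Stand-in key: repr() of the arguments. The real decorator would use
        # the binary footprint of the arguments instead.
        raw = repr((args, sorted(kwargs.items()))).encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()

        if _cache_flag == "no_cache":
            return fnc(*args, **kwargs)      # bypass the cache entirely
        if _cache_flag == "has_key":
            return key in store              # only report, never compute
        if _cache_flag == "cache_only":
            return store[key]                # raises KeyError if not cached
        if _cache_flag == "update" or key not in store:
            store[key] = fnc(*args, **kwargs)
        return store[key]

    return wrapper


calls = []


@toy_cache
def area(x, y):
    calls.append((x, y))  # record every real evaluation
    return x * y


first = area(10, 10)
second = area(10, 10)                           # served from the cache
known = area(10, 10, _cache_flag="has_key")
unknown = area(10, 11, _cache_flag="has_key")
```

After these calls, `calls` holds a single entry, since the second call and both `'has_key'` queries never evaluate `area`.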

## Pitfalls

### ints and floats

Since the binary representations of ints and floats differ, `1` and `1.0`
are treated as different things.
This means that the cached value of a function call with the argument `1` is
not found when passing `1.0` as argument,
although the result of the function will most likely be the same.
The same holds true for numpy arrays of different `dtype`.
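
One way to avoid such cache misses is to normalize numeric arguments before they reach the cached function. The decorator below is a sketch under that assumption; `ints_to_floats` is a hypothetical helper and not part of binfootprint. It converts plain `int` arguments to `float` and leaves everything else untouched.

```python
import functools


def ints_to_floats(fnc):
    """Convert plain int arguments to float before calling `fnc`."""

    def convert(value):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            return float(value)
        return value

    @functools.wraps(fnc)
    def wrapper(*args, **kwargs):
        args = tuple(convert(a) for a in args)
        kwargs = {k: convert(v) for k, v in kwargs.items()}
        return fnc(*args, **kwargs)

    return wrapper


@ints_to_floats
def show_types(x, y=2):
    # Stand-in for a cached function: report what actually arrives.
    return type(x).__name__, type(y).__name__


types_from_ints = show_types(1, y=3)
types_from_floats = show_types(1.0, y=3.0)
```

To have an effect on the cache key, such a wrapper would have to be applied outside the caching decorator, so that the conversion happens before the key is computed. Note that only top-level arguments are converted; numbers nested inside lists or dictionaries are left as they are.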

## Parameter class

The abstract base class `ABCParameter` makes it convenient to manage a set
of parameters.

Relevant parameters, explicitly specified as data members via the `__slots__`
mechanism, are returned by the `__bfkey__` method (see above).
Their order in the `__slots__` definition is irrelevant.
**Importantly**, class members are included only if they are not `None`.
In this way a parameter class definition can be extended while still being
able to reproduce the binary footprint of an older class definition.

If present, the class member `__non_key__` has a special meaning.
It is not included in the parameter-values list returned by `__bfkey__`.
It is expected to be dictionary-like and allows storing
additional information.
This is also reflected by the string representation of the class.

```python
class Param(binfootprint.ABCParameter):
    __slots__ = ["x", "y", "__non_key__"]

    def __init__(self, x, y, msg=""):
        super().__init__()
        self.x = x
        self.y = y
        self.__non_key__ = dict()
        self.__non_key__['msg'] = msg


p = Param(3, 4.5)
bfp = binfootprint.dump(p)
print("{}\n has hex hash value {}...".format(
    p, binfootprint.hash_hex_from_bin_data(bfp)[:6])
)
# x : 3
# y : 4.5
# --- extra info ---
# msg : 
# has hex hash value 38dbe8...

p = Param(3, 4.5, msg="I told you, don't use x=3!")
bfp = binfootprint.dump(p)
print("{}\n has hex hash value {}...".format(
    p, binfootprint.hash_hex_from_bin_data(bfp)[:6])
)
# x : 3
# y : 4.5
# --- extra info ---
# msg : I told you, don't use x=3!
# has hex hash value 38dbe8...
```
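
The effect of skipping `None` members can be illustrated without the library. The class below is a toy model of the idea, not the code of `ABCParameter`: `ToyParam` and its member `z` are hypothetical, and the key is simply a name-sorted list of the members that are not `None`.

```python
class ToyParam:
    """Illustrates a key that skips members set to None."""

    __slots__ = ["x", "y", "z"]

    def __init__(self, x, y, z=None):
        self.x = x
        self.y = y
        self.z = z  # parameter added later; None means "not used"

    def __bfkey__(self):
        # Sort by name so the order in __slots__ does not matter,
        # and drop members that are None.
        return [
            (name, getattr(self, name))
            for name in sorted(self.__slots__)
            if getattr(self, name) is not None
        ]


old_style = ToyParam(3, 4.5).__bfkey__()
extended = ToyParam(3, 4.5, z=None).__bfkey__()
with_z = ToyParam(3, 4.5, z=1).__bfkey__()
```

Here `old_style` and `extended` are identical, so results cached before `z` was introduced would still be found, while `with_z` yields a new key.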
            
