# Pottery: Redis for Humans 🌎🌍🌏
[Redis](http://redis.io/) is awesome, but [Redis
commands](http://redis.io/commands) are not always intuitive. Pottery is a
Pythonic way to access Redis. If you know how to use Python dicts, then you
already know how to use Pottery. Pottery makes Redis easier to access and
helps you implement microservice resilience patterns, and it has been
battle-tested in production at scale.
[![Build status](https://img.shields.io/github/workflow/status/brainix/pottery/Python%20package/master)](https://github.com/brainix/pottery/actions?query=branch%3Amaster)
[![Security status](https://img.shields.io/badge/security-bandit-dark.svg)](https://github.com/PyCQA/bandit)
[![Latest released version](https://badge.fury.io/py/pottery.svg)](https://badge.fury.io/py/pottery)
![Supported Python versions](https://img.shields.io/pypi/pyversions/pottery)
![Number of lines of code](https://img.shields.io/tokei/lines/github/brainix/pottery)
[![Total number of downloads](https://pepy.tech/badge/pottery)](https://pepy.tech/project/pottery)
[![Downloads per month](https://pepy.tech/badge/pottery/month)](https://pepy.tech/project/pottery)
[![Downloads per week](https://pepy.tech/badge/pottery/week)](https://pepy.tech/project/pottery)
## Table of Contents
- [Dicts 📖](#dicts)
- [Sets 🛍️](#sets)
- [Lists ⛓](#lists)
- [Counters 🧮](#counters)
- [Deques 🖇️](#deques)
- [Queues 🚶‍♂️🚶‍♀️🚶‍♂️](#queues)
- [Redlock 🔒](#redlock)
    - [synchronize() 👯‍♀️](#synchronize)
- [AIORedlock 🔒](#aioredlock)
- [NextID 🔢](#nextid)
- [redis_cache()](#redis_cache)
- [CachedOrderedDict](#cachedordereddict)
- [Bloom filters 🌸](#bloom-filters)
- [HyperLogLogs 🪵](#hyperloglogs)
- [ContextTimer ⏱️](#contexttimer)
## Installation
```shell
$ pip3 install pottery
```
## Usage
First, set up your Redis client:
```python
>>> from redis import Redis
>>> redis = Redis.from_url('redis://localhost:6379/1')
>>>
```
## <a name="dicts"></a>Dicts 📖
`RedisDict` is a Redis-backed container compatible with Python’s
[`dict`](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).
Here is a small example using a `RedisDict`:
```python
>>> from pottery import RedisDict
>>> tel = RedisDict({'jack': 4098, 'sape': 4139}, redis=redis, key='tel')
>>> tel['guido'] = 4127
>>> tel
RedisDict{'jack': 4098, 'sape': 4139, 'guido': 4127}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
RedisDict{'jack': 4098, 'guido': 4127, 'irv': 4127}
>>> list(tel)
['jack', 'guido', 'irv']
>>> sorted(tel)
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False
>>>
```
Notice the first two keyword arguments to `RedisDict()`: The first is your
Redis client. The second is the Redis key name for your dict. Other than
that, you can use your `RedisDict` the same way that you use any other Python
`dict`.
*Limitations:*
1. Keys and values must be JSON serializable.
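As a rough pre-flight check for this limitation, you could test candidate keys and values with the standard `json` module before storing them. This is only a sketch: `is_json_serializable` is a hypothetical helper, not part of Pottery, and it assumes that Pottery's encoding behaves like a plain `json.dumps()` call.

```python
import json


def is_json_serializable(value):
    """Return True if the standard json module can encode value.

    Hypothetical helper for illustration; not part of Pottery.
    """
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


dict_ok = is_json_serializable({'jack': 4098})   # str -> int mapping
set_ok = is_json_serializable({1, 2, 3})         # sets are not JSON types
round_trip = json.loads(json.dumps((1, 2)))      # tuples decode as lists
print(dict_ok, set_ok, round_trip)
```

Note that even encodable values can change type on the way back out of JSON, as the tuple in the last line does.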
## <a name="sets"></a>Sets 🛍️
`RedisSet` is a Redis-backed container compatible with Python’s
[`set`](https://docs.python.org/3/tutorial/datastructures.html#sets).
Here is a brief demonstration:
```python
>>> from pottery import RedisSet
>>> basket = RedisSet({'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}, redis=redis, key='basket')
>>> sorted(basket)
['apple', 'banana', 'orange', 'pear']
>>> 'orange' in basket
True
>>> 'crabgrass' in basket
False
>>> a = RedisSet('abracadabra', redis=redis, key='magic')
>>> b = set('alacazam')
>>> sorted(a)
['a', 'b', 'c', 'd', 'r']
>>> sorted(a - b)
['b', 'd', 'r']
>>> sorted(a | b)
['a', 'b', 'c', 'd', 'l', 'm', 'r', 'z']
>>> sorted(a & b)
['a', 'c']
>>> sorted(a ^ b)
['b', 'd', 'l', 'm', 'r', 'z']
>>>
```
Notice the two keyword arguments to `RedisSet()`: The first is your Redis
client. The second is the Redis key name for your set. Other than that, you
can use your `RedisSet` the same way that you use any other Python `set`.
Do more efficient membership testing for multiple elements using
`.contains_many()`:
```python
>>> nirvana = RedisSet({'kurt', 'krist', 'dave'}, redis=redis, key='nirvana')
>>> tuple(nirvana.contains_many('kurt', 'krist', 'chat', 'dave'))
(True, True, False, True)
>>>
```
*Limitations:*
1. Elements must be JSON serializable.
## <a name="lists"></a>Lists ⛓
`RedisList` is a Redis-backed container compatible with Python’s
[`list`](https://docs.python.org/3/tutorial/introduction.html#lists).
```python
>>> from pottery import RedisList
>>> squares = RedisList([1, 4, 9, 16, 25], redis=redis, key='squares')
>>> squares
RedisList[1, 4, 9, 16, 25]
>>> squares[0]
1
>>> squares[-1]
25
>>> squares[-3:]
[9, 16, 25]
>>> squares[:]
[1, 4, 9, 16, 25]
>>> squares + [36, 49, 64, 81, 100]
RedisList[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
>>>
```
Notice the two keyword arguments to `RedisList()`: The first is your Redis
client. The second is the Redis key name for your list. Other than that, you
can use your `RedisList` the same way that you use any other Python `list`.
*Limitations:*
1. Elements must be JSON serializable.
2. Under the hood, Python implements `list` using an array. Redis implements
list using a
[doubly linked list](https://redis.io/topics/data-types-intro#redis-lists).
As such, inserting elements at the head or tail of a `RedisList` is fast,
O(1). However, accessing `RedisList` elements by index is slow, O(n). So
in terms of performance and ideal use cases, `RedisList` is more similar to
Python’s `deque` than Python’s `list`. Instead of `RedisList`,
consider using [`RedisDeque`](#deques).
## <a name="counters"></a>Counters 🧮
`RedisCounter` is a Redis-backed container compatible with Python’s
[`collections.Counter`](https://docs.python.org/3/library/collections.html#collections.Counter).
```python
>>> from pottery import RedisCounter
>>> c = RedisCounter(redis=redis, key='my-counter')
>>> c = RedisCounter('gallahad', redis=redis, key='my-counter')
>>> c.clear()
>>> c = RedisCounter({'red': 4, 'blue': 2}, redis=redis, key='my-counter')
>>> c.clear()
>>> c = RedisCounter(redis=redis, key='my-counter', cats=4, dogs=8)
>>> c.clear()
>>> c = RedisCounter(['eggs', 'ham'], redis=redis, key='my-counter')
>>> c['bacon']
0
>>> c['sausage'] = 0
>>> del c['sausage']
>>> c.clear()
>>> c = RedisCounter(redis=redis, key='my-counter', a=4, b=2, c=0, d=-2)
>>> sorted(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']
>>> c.clear()
>>> RedisCounter('abracadabra', redis=redis, key='my-counter').most_common(3)
[('a', 5), ('b', 2), ('r', 2)]
>>> c.clear()
>>> c = RedisCounter(redis=redis, key='my-counter', a=4, b=2, c=0, d=-2)
>>> from collections import Counter
>>> d = Counter(a=1, b=2, c=3, d=4)
>>> c.subtract(d)
>>> c
RedisCounter{'a': 3, 'b': 0, 'c': -3, 'd': -6}
>>>
```
Notice the first two keyword arguments to `RedisCounter()`: The first is your
Redis client. The second is the Redis key name for your counter. Other than
that, you can use your `RedisCounter` the same way that you use any other
Python `Counter`.
*Limitations:*
1. Keys must be JSON serializable.
## <a name="deques"></a>Deques 🖇️
`RedisDeque` is a Redis-backed container compatible with Python’s
[`collections.deque`](https://docs.python.org/3/library/collections.html#collections.deque).
Example:
```python
>>> from pottery import RedisDeque
>>> d = RedisDeque('ghi', redis=redis, key='letters')
>>> for elem in d:
...     print(elem.upper())
G
H
I
>>> d.append('j')
>>> d.appendleft('f')
>>> d
RedisDeque(['f', 'g', 'h', 'i', 'j'])
>>> d.pop()
'j'
>>> d.popleft()
'f'
>>> list(d)
['g', 'h', 'i']
>>> d[0]
'g'
>>> d[-1]
'i'
>>> list(reversed(d))
['i', 'h', 'g']
>>> 'h' in d
True
>>> d.extend('jkl')
>>> d
RedisDeque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> d.rotate(1)
>>> d
RedisDeque(['l', 'g', 'h', 'i', 'j', 'k'])
>>> d.rotate(-1)
>>> d
RedisDeque(['g', 'h', 'i', 'j', 'k', 'l'])
>>> RedisDeque(reversed(d), redis=redis)
RedisDeque(['l', 'k', 'j', 'i', 'h', 'g'])
>>> d.clear()
>>> d.extendleft('abc')
>>> d
RedisDeque(['c', 'b', 'a'])
>>>
```
Notice the two keyword arguments to `RedisDeque()`: The first is your Redis
client. The second is the Redis key name for your deque. Other than that, you
can use your `RedisDeque` the same way that you use any other Python `deque`.
*Limitations:*
1. Elements must be JSON serializable.
## <a name="queues"></a>Queues 🚶‍♂️🚶‍♀️🚶‍♂️
`RedisSimpleQueue` is a Redis-backed multi-producer, multi-consumer FIFO queue
compatible with Python’s
[`queue.SimpleQueue`](https://docs.python.org/3/library/queue.html#simplequeue-objects).
In general, use Python’s `queue.Queue` to share a queue between threads, use
`multiprocessing.Queue` to share one between processes, and use
`RedisSimpleQueue` to share one across machines or when your queue must
persist across application crashes or restarts.
Instantiate a `RedisSimpleQueue`:
```python
>>> from pottery import RedisSimpleQueue
>>> cars = RedisSimpleQueue(redis=redis, key='cars')
>>>
```
Notice the two keyword arguments to `RedisSimpleQueue()`: The first is your
Redis client. The second is the Redis key name for your queue. Other than
that, you can use your `RedisSimpleQueue` the same way that you use any other
Python `queue.SimpleQueue`.
Check the queue state, put some items in the queue, and get those items back
out:
```python
>>> cars.empty()
True
>>> cars.qsize()
0
>>> cars.put('Jeep')
>>> cars.put('Honda')
>>> cars.put('Audi')
>>> cars.empty()
False
>>> cars.qsize()
3
>>> cars.get()
'Jeep'
>>> cars.get()
'Honda'
>>> cars.get()
'Audi'
>>> cars.empty()
True
>>> cars.qsize()
0
>>>
```
*Limitations:*
1. Items must be JSON serializable.
## <a name="redlock"></a>Redlock 🔒
`Redlock` is a safe and reliable lock to coordinate access to a resource shared
across threads, processes, and even machines, without a single point of
failure. [Rationale and algorithm
description.](http://redis.io/topics/distlock)
`Redlock` implements Python’s excellent
[`threading.Lock`](https://docs.python.org/3/library/threading.html#lock-objects)
API as closely as is feasible. In other words, you can use `Redlock` the same
way that you use `threading.Lock`. The main reason to use `Redlock` over
`threading.Lock` is that `Redlock` can coordinate access to a resource shared
across different machines; `threading.Lock` can’t.
Instantiate a `Redlock`:
```python
>>> from pottery import Redlock
>>> printer_lock = Redlock(key='printer', masters={redis}, auto_release_time=.2)
>>>
```
The `key` argument represents the resource, and the `masters` argument
specifies your Redis masters across which to distribute the lock. In
production, you should have 5 Redis masters. This is to eliminate a single
point of failure — you can lose up to 2 out of the 5 Redis masters and
your `Redlock` will remain available and performant. Now you can protect
access to your resource:
```python
>>> if printer_lock.acquire():
...     # Critical section - print stuff here.
...     print('printer_lock is locked')
...     printer_lock.release()
printer_lock is locked
>>> bool(printer_lock.locked())
False
>>>
```
Or you can protect access to your resource inside a context manager:
```python
>>> with printer_lock:
...     # Critical section - print stuff here.
...     print('printer_lock is locked')
printer_lock is locked
>>> bool(printer_lock.locked())
False
>>>
```
It’s safest to instantiate a new `Redlock` object every time you need to
protect your resource and to not share `Redlock` instances across different
parts of code. In other words, think of the `key` as identifying the resource;
don’t think of any particular `Redlock` as identifying the resource.
Instantiating a new `Redlock` every time you need a lock sidesteps bugs by
decoupling how you use `Redlock` from the forking/threading model of your
application/service.
`Redlock`s are automatically released (by default, after 10 seconds). You
should take care to ensure that your critical section completes well within
that timeout. The reasons that `Redlock`s are automatically released are to
preserve
[“liveness”](http://redis.io/topics/distlock#liveness-arguments)
and to avoid deadlocks (in the event that a process dies inside a critical
section before it releases its lock).
```python
>>> import time
>>> if printer_lock.acquire():
...     # Critical section - print stuff here.
...     time.sleep(printer_lock.auto_release_time)
>>> bool(printer_lock.locked())
False
>>>
```
If 10 seconds isn’t enough to complete executing your critical section,
then you can specify your own auto release time (in seconds):
```python
>>> printer_lock = Redlock(key='printer', masters={redis}, auto_release_time=.2)
>>> if printer_lock.acquire():
...     # Critical section - print stuff here.
...     time.sleep(printer_lock.auto_release_time / 2)
>>> bool(printer_lock.locked())
True
>>> time.sleep(printer_lock.auto_release_time / 2)
>>> bool(printer_lock.locked())
False
>>>
```
By default, `.acquire()` blocks indefinitely until the lock is acquired. You
can make `.acquire()` return immediately with the `blocking` argument.
`.acquire()` returns `True` if the lock was acquired; `False` if not.
```python
>>> printer_lock_1 = Redlock(key='printer', masters={redis}, auto_release_time=.2)
>>> printer_lock_2 = Redlock(key='printer', masters={redis}, auto_release_time=.2)
>>> printer_lock_1.acquire(blocking=False)
True
>>> printer_lock_2.acquire(blocking=False) # Returns immediately.
False
>>> printer_lock_1.release()
>>>
```
You can make `.acquire()` block but not indefinitely by specifying the
`timeout` argument (in seconds):
```python
>>> printer_lock_1.acquire()
True
>>> printer_lock_2.acquire(timeout=printer_lock_1.auto_release_time / 2) # Waits 100 milliseconds.
False
>>> import contextlib
>>> from pottery import ReleaseUnlockedLock
>>> with contextlib.suppress(ReleaseUnlockedLock):
...     printer_lock_1.release()
>>>
```
You can similarly configure the Redlock context manager’s
blocking/timeout behavior during Redlock initialization. If the context
manager fails to acquire the lock, it raises the `QuorumNotAchieved` exception.
```python
>>> import contextlib
>>> from pottery import QuorumNotAchieved
>>> printer_lock_1 = Redlock(key='printer', masters={redis}, context_manager_blocking=True, context_manager_timeout=0.2)
>>> printer_lock_2 = Redlock(key='printer', masters={redis}, context_manager_blocking=True, context_manager_timeout=0.2)
>>> with printer_lock_1:
...     with contextlib.suppress(QuorumNotAchieved):
...         with printer_lock_2:  # Waits 200 milliseconds; raises QuorumNotAchieved.
...             pass
...     print(f"printer_lock_1 is {'locked' if printer_lock_1.locked() else 'unlocked'}")
...     print(f"printer_lock_2 is {'locked' if printer_lock_2.locked() else 'unlocked'}")
printer_lock_1 is locked
printer_lock_2 is unlocked
>>>
```
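The earlier statement that 5 masters can tolerate losing 2 follows from majority voting: per the linked algorithm description, a lock counts as acquired only when more than half of the masters grant it. A minimal sketch of that arithmetic (the helper names are illustrative and not part of Pottery's API):

```python
def quorum(num_masters):
    """Smallest number of masters that forms a strict majority."""
    return num_masters // 2 + 1


def tolerated_failures(num_masters):
    """Masters that can be lost while a majority is still reachable."""
    return num_masters - quorum(num_masters)


# Illustrative helpers only; not Pottery functions.
table = [(n, quorum(n), tolerated_failures(n)) for n in (1, 3, 5)]
for row in table:
    print(row)
```

With a single master, as in the examples in this README, there is no fault tolerance at all, which is why several masters are recommended in production.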
### <a name="synchronize"></a>synchronize() 👯‍♀️
`synchronize()` is a decorator that allows only one thread to execute a
function at a time. Under the hood, `synchronize()` uses a Redlock, so refer
to the [Redlock documentation](#redlock) for more details.
Here’s how to use `synchronize()`:
```python
>>> from pottery import synchronize
>>> @synchronize(key='synchronized-func', masters={redis}, auto_release_time=1.5, blocking=True, timeout=-1)
... def func():
...     # Only one thread can execute this function at a time.
...     return True
...
>>> func()
True
>>>
```
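For intuition, here is an in-process analogue of such a decorator built on `threading.Lock`. It is a sketch under stated assumptions: `synchronize_locally` is a hypothetical name, it only coordinates threads inside one process, and it is not how Pottery implements `synchronize()` (which, as noted above, uses a Redlock).

```python
import functools
import threading


def synchronize_locally(func):
    """Allow only one thread at a time to run func (single process only)."""
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:  # Other threads block here until the lock is free.
            return func(*args, **kwargs)

    return wrapper


state = {'count': 0}


@synchronize_locally
def increment():
    # Read-modify-write that would be racy without the lock.
    current = state['count']
    state['count'] = current + 1


threads = [threading.Thread(target=increment) for _ in range(50)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(state['count'])
```

Swapping the local lock for a distributed one is what extends the same guarantee across processes and machines.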
## <a name="aioredlock"></a>AIORedlock 🔒
`AIORedlock` is the asyncio implementation of Redlock, compatible with
Python’s
[`asyncio.Lock`](https://docs.python.org/3/library/asyncio-sync.html#lock).
Instantiate an `AIORedlock` and protect a resource:
```python
>>> import asyncio
>>> from redis.asyncio import Redis as AIORedis
>>> from pottery import AIORedlock
>>> async def main():
...     aioredis = AIORedis.from_url('redis://localhost:6379/1')
...     shower = AIORedlock(key='shower', masters={aioredis})
...     if await shower.acquire():
...         # Critical section - no other coroutine can enter while we hold the lock.
...         print(f"shower is {'occupied' if await shower.locked() else 'available'}")
...         await shower.release()
...     print(f"shower is {'occupied' if await shower.locked() else 'available'}")
...
>>> asyncio.run(main(), debug=True)
shower is occupied
shower is available
>>>
```
Or you can protect access to your resource inside a context manager:
```python
>>> asyncio.set_event_loop(asyncio.new_event_loop())
>>> async def main():
...     aioredis = AIORedis.from_url('redis://localhost:6379/1')
...     shower = AIORedlock(key='shower', masters={aioredis})
...     async with shower:
...         # Critical section - no other coroutine can enter while we hold the lock.
...         print(f"shower is {'occupied' if await shower.locked() else 'available'}")
...     print(f"shower is {'occupied' if await shower.locked() else 'available'}")
...
>>> asyncio.run(main(), debug=True)
shower is occupied
shower is available
>>>
```
## <a name="nextid"></a>NextID 🔢
`NextID` safely and reliably produces increasing IDs across threads, processes,
and even machines, without a single point of failure. [Rationale and algorithm
description.](http://antirez.com/news/102)
Instantiate an ID generator:
```python
>>> from pottery import NextID
>>> tweet_ids = NextID(key='tweet-ids', masters={redis})
>>>
```
The `key` argument represents the sequence (so that you can have different
sequences for user IDs, comment IDs, etc.), and the `masters` argument
specifies your Redis masters across which to distribute ID generation (in
production, you should have 5 Redis masters). Now, whenever you need a tweet
ID, call `next()` on the ID generator:
```python
>>> next(tweet_ids)
1
>>> next(tweet_ids)
2
>>> next(tweet_ids)
3
>>>
```
Two caveats:
1. If many clients are generating IDs concurrently, then there may be
“holes” in the sequence of IDs (e.g.: 1, 2, 6, 10, 11, 21,
…).
2. This algorithm scales to about 5,000 IDs per second (with 5 Redis masters).
If you need IDs faster than that, then you may want to consider other
techniques.
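To give a feel for how IDs can be agreed on across several masters, here is a toy, in-memory simulation of a majority-based scheme in the spirit of the linked description. Everything in it is an assumption made for illustration: `FakeMaster` and `next_id` are hypothetical names, real masters are Redis servers, and Pottery's actual implementation may differ.

```python
class FakeMaster:
    """In-memory stand-in for one Redis master (hypothetical)."""

    def __init__(self):
        self.value = 0
        self.up = True

    def get(self):
        if not self.up:
            raise ConnectionError('master is down')
        return self.value

    def set_if_greater(self, candidate):
        """Store candidate only if it is larger than the current value."""
        if not self.up:
            raise ConnectionError('master is down')
        if candidate > self.value:
            self.value = candidate
            return True
        return False


def next_id(masters):
    quorum = len(masters) // 2 + 1
    # Read the current ID from every reachable master.
    seen = []
    for master in masters:
        try:
            seen.append(master.get())
        except ConnectionError:
            pass
    if len(seen) < quorum:
        raise RuntimeError('quorum not achieved while reading')
    candidate = max(seen) + 1
    # The candidate is ours only if a majority of masters accepts it.
    accepted = 0
    for master in masters:
        try:
            accepted += int(master.set_if_greater(candidate))
        except ConnectionError:
            pass
    if accepted < quorum:
        raise RuntimeError('quorum not achieved while writing')
    return candidate


masters = [FakeMaster() for _ in range(5)]
first, second = next_id(masters), next_id(masters)
masters[0].up = masters[1].up = False  # Lose 2 of the 5 masters.
third = next_id(masters)
print(first, second, third)
```

In a scheme like this, a client that fails to win a majority has to retry with a higher candidate, which is one plausible source of the holes mentioned in the first caveat.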
## redis_cache()
`redis_cache()` is a simple lightweight unbounded function return value cache,
sometimes called
[“memoize”](https://en.wikipedia.org/wiki/Memoization).
`redis_cache()` implements Python’s excellent
[`functools.cache()`](https://docs.python.org/3/library/functools.html#functools.cache)
API as closely as is feasible. In other words, you can use `redis_cache()` the
same way that you use `functools.cache()`.
*Limitations:*
1. Arguments to the function must be hashable.
2. Return values from the function must be JSON serializable.
3. Just like `functools.cache()`, `redis_cache()` does not allow for a maximum
   size, and does not evict old values, and grows unbounded. Only use
   `redis_cache()` in one of these cases:
    1. Your function’s argument space has a known small cardinality.
    2. You specify a `timeout` when calling `redis_cache()` to decorate your
       function, to dump your _entire_ return value cache `timeout` seconds
       after the last cache access (hit or miss).
    3. You periodically call `.cache_clear()` to dump your _entire_ return
       value cache.
    4. You’re ok with your return value cache growing unbounded, and you
       [understand the implications](https://docs.redislabs.com/latest/rs/administering/database-operations/eviction-policy/)
       of this for your underlying Redis instance.
In general, you should only use `redis_cache()` when you want to reuse
previously computed values. Accordingly, it doesn’t make sense to cache
functions with side-effects or impure functions such as `time()` or `random()`.
Decorate a function:
```python
>>> import time
>>> from pottery import redis_cache
>>> @redis_cache(redis=redis, key='expensive-function-cache')
... def expensive_function(n):
...     time.sleep(.1)  # Simulate an expensive computation or database lookup.
...     return n
...
>>>
```
Notice the two keyword arguments to `redis_cache()`: The first is your Redis
client. The second is the Redis key name for your function’s return
value cache.
Call your function and observe the cache hit/miss rates:
```python
>>> expensive_function(5)
5
>>> expensive_function.cache_info()
CacheInfo(hits=0, misses=1, maxsize=None, currsize=1)
>>> expensive_function(5)
5
>>> expensive_function.cache_info()
CacheInfo(hits=1, misses=1, maxsize=None, currsize=1)
>>> expensive_function(6)
6
>>> expensive_function.cache_info()
CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
>>>
```
Notice that the first call to `expensive_function()` takes about 0.1 seconds (the simulated work) and results
in a cache miss; but the second call returns almost immediately and results in
a cache hit. This is because after the first call, `redis_cache()` cached the
return value for the call when `n == 5`.
You can access your original undecorated underlying `expensive_function()` as
`expensive_function.__wrapped__`. This is useful for introspection, for
bypassing the cache, or for rewrapping the original function with a different
cache.
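Because `redis_cache()` follows the `functools.cache()` API, you can try the `__wrapped__` behavior in-process without a Redis server. The sketch below uses only the standard library (Python 3.9+); it assumes, per the description above, that `redis_cache()` exposes `cache_info()` and `__wrapped__` in the same way.

```python
import functools


@functools.cache
def square(n):
    return n * n


square(5)  # First call: cache miss.
square(5)  # Second call: cache hit.
info_before = square.cache_info()

# The undecorated function skips the cache entirely,
# so the hit/miss counters do not move.
direct = square.__wrapped__(5)
info_after = square.cache_info()
print(direct, info_before == info_after)
```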
You can force a cache reset for a particular combination of `args`/`kwargs`
with `expensive_function.__bypass__`. A call to
`expensive_function.__bypass__(*args, **kwargs)` bypasses the cache lookup,
calls the original underlying function, then caches the results for future
calls to `expensive_function(*args, **kwargs)`. Note that a call to
`expensive_function.__bypass__(*args, **kwargs)` results in neither a cache hit
nor a cache miss.
Finally, clear/invalidate your function’s entire return value cache with
`expensive_function.cache_clear()`:
```python
>>> expensive_function.cache_info()
CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
>>> expensive_function.cache_clear()
>>> expensive_function.cache_info()
CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
>>>
```
## CachedOrderedDict
The best way that I can explain `CachedOrderedDict` is through an example
use-case. Imagine that your search engine returns document IDs, which then you
have to hydrate into full documents via the database to return to the client.
The data structure used to represent such search results must have the
following properties:
1. It must preserve the order of the document IDs returned by the search engine.
2. It must map document IDs to hydrated documents.
3. It must cache previously hydrated documents.
Properties 1 and 2 are satisfied by Python’s
[`collections.OrderedDict`](https://docs.python.org/3/library/collections.html#collections.OrderedDict).
However, `CachedOrderedDict` extends Python’s `OrderedDict` to also
satisfy property 3.
The most common usage pattern for `CachedOrderedDict` is as follows:
1. Instantiate `CachedOrderedDict` with the IDs that you must look up or
compute passed in as the `dict_keys` argument to the initializer.
2. Compute and store the cache misses for future lookups.
3. Return some representation of your `CachedOrderedDict` to the client.
Instantiate a `CachedOrderedDict`:
```python
>>> from pottery import CachedOrderedDict
>>> search_results_1 = CachedOrderedDict(
...     redis_client=redis,
...     redis_key='search-results',
...     dict_keys=(1, 2, 3, 4, 5),
... )
>>>
```
The `redis_client` argument to the initializer is your Redis client, and the
`redis_key` argument is the Redis key for the Redis Hash backing your cache.
The `dict_keys` argument represents an ordered iterable of keys to be looked up
and automatically populated in your `CachedOrderedDict` (on cache hits), or
that you’ll have to compute and populate for future lookups (on cache
misses). Regardless of whether keys are cache hits or misses,
`CachedOrderedDict` preserves the order of `dict_keys` (like a list), maps
those keys to values (like a dict), and maintains an underlying cache for
future key lookups.
In the beginning, the cache is empty, so let’s populate it:
```python
>>> sorted(search_results_1.misses())
[1, 2, 3, 4, 5]
>>> search_results_1[1] = 'one'
>>> search_results_1[2] = 'two'
>>> search_results_1[3] = 'three'
>>> search_results_1[4] = 'four'
>>> search_results_1[5] = 'five'
>>> sorted(search_results_1.misses())
[]
>>>
```
Note that `CachedOrderedDict` preserves the order of `dict_keys`:
```python
>>> for key, value in search_results_1.items():
...     print(f'{key}: {value}')
1: one
2: two
3: three
4: four
5: five
>>>
```
Now, let’s look at a combination of cache hits and misses:
```python
>>> search_results_2 = CachedOrderedDict(
...     redis_client=redis,
...     redis_key='search-results',
...     dict_keys=(2, 4, 6, 8, 10),
... )
>>> sorted(search_results_2.misses())
[6, 8, 10]
>>> search_results_2[2]
'two'
>>> search_results_2[6] = 'six'
>>> search_results_2[8] = 'eight'
>>> search_results_2[10] = 'ten'
>>> sorted(search_results_2.misses())
[]
>>> for key, value in search_results_2.items():
...     print(f'{key}: {value}')
2: two
4: four
6: six
8: eight
10: ten
>>>
```
*Limitations:*
1. Keys and values must be JSON serializable.
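The three-step usage pattern described earlier (look up, fill in the misses, return) can be sketched without Redis using a plain `dict` as the cache. This is an illustration only: `hydrate` and `lookup` are hypothetical names, and the module-level `cache` dict merely stands in for the Redis Hash that backs a real `CachedOrderedDict`.

```python
from collections import OrderedDict

cache = {}  # Stand-in for the Redis Hash that backs the real cache.


def hydrate(doc_id):
    """Hypothetical slow lookup, e.g. a database query."""
    return f'document {doc_id}'


def lookup(doc_ids):
    # 1. Preserve the requested order; fill in cache hits, note the misses.
    results = OrderedDict((doc_id, cache.get(doc_id)) for doc_id in doc_ids)
    misses = [doc_id for doc_id, doc in results.items() if doc is None]
    # 2. Compute and store the misses for future lookups.
    for doc_id in misses:
        results[doc_id] = cache[doc_id] = hydrate(doc_id)
    # 3. Return the ordered results (plus the misses, for illustration).
    return results, misses


_, first_misses = lookup((1, 2, 3))
results, second_misses = lookup((2, 4, 3))
print(first_misses, second_misses, list(results))
```

The second lookup only has to hydrate document 4, and the results still come back in the order requested.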
## <a name="bloom-filters"></a>Bloom filters 🌸
Bloom filters are a powerful data structure that help you to answer the
questions, _“Have I seen this element before?”_ and _“How
many distinct elements have I seen?”_; but not the question, _“What
are all of the elements that I’ve seen before?”_ So think of Bloom
filters as Python sets that you can add elements to, use to test element
membership, and get the length of; but that you can’t iterate through or
get elements back out of.
Bloom filters are probabilistic, which means that they can sometimes generate
false positives (as in, they may report that you’ve seen a particular
element before even though you haven’t). But they will never generate
false negatives (so every time that they report that you haven’t seen a
particular element before, you really must never have seen it). You can tune
your acceptable false positive probability, though at the expense of the
storage size and the element insertion/lookup time of your Bloom filter.
Create a `BloomFilter`:
```python
>>> from pottery import BloomFilter
>>> dilberts = BloomFilter(
...     num_elements=100,
...     false_positives=0.01,
...     redis=redis,
...     key='dilberts',
... )
>>>
```
Here, `num_elements` represents the number of elements that you expect to
insert into your `BloomFilter`, and `false_positives` represents your
acceptable false positive probability. Using these two parameters,
`BloomFilter` automatically computes its own storage size and number of times
to run its hash functions on element insertion/lookup such that it can
guarantee a false positive rate at or below what you can tolerate, given that
you’re going to insert your specified number of elements.
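For a sense of scale, the textbook Bloom filter formulas give the bit-array size and hash count for a target capacity and false positive probability. The sketch below applies them to the example above; whether `BloomFilter` uses exactly these formulas or this rounding is an assumption, and `bloom_parameters` is a hypothetical helper.

```python
import math


def bloom_parameters(num_elements, false_positives):
    """Textbook sizing: (number of bits, number of hash functions)."""
    # m = -n * ln(p) / (ln 2)^2
    size = -num_elements * math.log(false_positives) / math.log(2) ** 2
    size = math.ceil(size)
    # k = (m / n) * ln 2
    num_hashes = math.ceil(size / num_elements * math.log(2))
    return size, num_hashes


size, num_hashes = bloom_parameters(num_elements=100, false_positives=0.01)
print(size, num_hashes)
```

Under these formulas, 100 elements at a 1% false positive rate need roughly 960 bits and 7 hash functions; tightening the rate increases both.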
Insert an element into the `BloomFilter`:
```python
>>> dilberts.add('rajiv')
>>>
```
Test for membership in the `BloomFilter`:
```python
>>> 'rajiv' in dilberts
True
>>> 'raj' in dilberts
False
>>> 'dan' in dilberts
False
>>>
```
See how many elements we’ve inserted into the `BloomFilter`:
```python
>>> len(dilberts)
1
>>>
```
Note that `BloomFilter.__len__()` is an approximation, not an exact value,
though it’s quite accurate.
Insert multiple elements into the `BloomFilter`:
```python
>>> dilberts.update({'raj', 'dan'})
>>>
```
Do more efficient membership testing for multiple elements using
`.contains_many()`:
```python
>>> tuple(dilberts.contains_many('rajiv', 'raj', 'dan', 'luis'))
(True, True, True, False)
>>>
```
Remove all of the elements from the `BloomFilter`:
```python
>>> dilberts.clear()
>>> len(dilberts)
0
>>>
```
*Limitations:*
1. Elements must be JSON serializable.
2. `len(bf)` is probabilistic in that it’s an accurate approximation. You
can tune how accurate you want it to be with the `num_elements` and
`false_positives` arguments to `.__init__()`, at the expense of storage space
and insertion/lookup time.
3. Membership testing against a Bloom filter is probabilistic in that it *may*
return false positives, but *never* returns false negatives. This means that
if `element in bf` evaluates to `True`, then you *may* have inserted the
element into the Bloom filter. But if `element in bf` evaluates to `False`,
then you *must not* have inserted it. Again, you can tune accuracy with the
`num_elements` and `false_positives` arguments to `.__init__()`, at the
expense of storage space and insertion/lookup time.
## <a name="hyperloglogs"></a>HyperLogLogs 🪵
HyperLogLogs are an interesting data structure designed to answer the question,
_“How many distinct elements have I seen?”_; but not the questions,
_“Have I seen this element before?”_ or _“What are all of the
elements that I’ve seen before?”_ So think of HyperLogLogs as
Python sets that you can add elements to and get the length of; but that you
can’t use to test element membership, iterate through, or get elements
out of.
HyperLogLogs are probabilistic, which means that their counts are approximate:
Redis documents a standard error of 0.81% for its implementation. In exchange,
they can reasonably accurately estimate the cardinality (size) of vast datasets
(like the number of unique Google searches issued in a day) with a tiny amount
of storage (at most 12 KB per HyperLogLog in Redis).
Create a `HyperLogLog`:
```python
>>> from pottery import HyperLogLog
>>> google_searches = HyperLogLog(redis=redis, key='google-searches')
>>>
```
Insert an element into the `HyperLogLog`:
```python
>>> google_searches.add('sonic the hedgehog video game')
>>>
```
See how many elements we’ve inserted into the `HyperLogLog`:
```python
>>> len(google_searches)
1
>>>
```
Insert multiple elements into the `HyperLogLog`:
```python
>>> google_searches.update({
...     'google in 1998',
...     'minesweeper',
...     'joey tribbiani',
...     'wizard of oz',
...     'rgb to hex',
...     'pac-man',
...     'breathing exercise',
...     'do a barrel roll',
...     'snake',
... })
>>> len(google_searches)
10
>>>
```
Through a clever hack, we can do membership testing against a `HyperLogLog`,
even though it was never designed for this purpose. The way that the hack works
is that it creates a temporary copy of the `HyperLogLog`, then inserts the
element that you’re running the membership test for into the temporary
copy. If the insertion changes the temporary `HyperLogLog`’s cardinality,
then the element must not have been inserted into the original `HyperLogLog`.
```python
>>> 'joey tribbiani' in google_searches
True
>>> 'jennifer aniston' in google_searches
False
>>>
```
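The copy-and-insert hack described above can be sketched against the HyperLogLog commands that redis-py exposes (`pfmerge`, `pfadd`, `pfcount`, `delete`). So that it runs without a server, the sketch uses a hypothetical `FakeRedis` stand-in that counts exactly with Python sets; `probably_contains` and the `tmp-hll` key are also illustrative, and Pottery's own implementation may differ.

```python
class FakeRedis:
    """In-memory stand-in that counts exactly (hypothetical, for illustration)."""

    def __init__(self):
        self.keys = {}

    def pfadd(self, name, *values):
        self.keys.setdefault(name, set()).update(values)

    def pfcount(self, name):
        return len(self.keys.get(name, ()))

    def pfmerge(self, dest, *sources):
        merged = self.keys.setdefault(dest, set())
        for source in sources:
            merged.update(self.keys.get(source, ()))

    def delete(self, *names):
        for name in names:
            self.keys.pop(name, None)


def probably_contains(client, key, element, tmp_key='tmp-hll'):
    client.pfmerge(tmp_key, key)      # Temporary copy of the original.
    before = client.pfcount(tmp_key)
    client.pfadd(tmp_key, element)    # Insert into the copy only.
    after = client.pfcount(tmp_key)
    client.delete(tmp_key)            # Clean up the temporary key.
    return after == before            # Unchanged count: probably seen before.


client = FakeRedis()
client.pfadd('google-searches', 'joey tribbiani')
seen = probably_contains(client, 'google-searches', 'joey tribbiani')
unseen = probably_contains(client, 'google-searches', 'jennifer aniston')
print(seen, unseen)
```

Against a real HyperLogLog the counts are estimates, so an unchanged count only suggests that the element was seen, which is where the false positives come from.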
Do more efficient membership testing for multiple elements using
`.contains_many()`:
```python
>>> tuple(google_searches.contains_many('joey tribbiani', 'jennifer aniston'))
(True, False)
>>>
```
Remove all of the elements from the `HyperLogLog`:
```python
>>> google_searches.clear()
>>> len(google_searches)
0
>>>
```
*Limitations:*
1. Elements must be JSON serializable.
2. `len(hll)` is probabilistic in that it’s an accurate approximation.
3. Membership testing against a HyperLogLog is probabilistic in that it *may*
return false positives, but *never* returns false negatives. This means that
if `element in hll` evaluates to `True`, then you *may* have inserted the
element into the HyperLogLog. But if `element in hll` evaluates to `False`,
then you *must not* have inserted it.
## <a name="contexttimer"></a>ContextTimer ⏱️
`ContextTimer` helps you easily and accurately measure elapsed time. Note that
`ContextTimer` measures wall (real-world) time, not CPU time; and that
`elapsed()` returns time in milliseconds.
You can use `ContextTimer` stand-alone…
```python
>>> import time
>>> from pottery import ContextTimer
>>> timer = ContextTimer()
>>> timer.start()
>>> time.sleep(0.1)
>>> 100 <= timer.elapsed() < 200
True
>>> timer.stop()
>>> time.sleep(0.1)
>>> 100 <= timer.elapsed() < 200
True
>>>
```
…or as a context manager:
```python
>>> tests = []
>>> with ContextTimer() as timer:
...     time.sleep(0.1)
...     tests.append(100 <= timer.elapsed() < 200)
>>> time.sleep(0.1)
>>> tests.append(100 <= timer.elapsed() < 200)
>>> tests
[True, True]
>>>
```
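To make the start/stop/elapsed semantics concrete, here is a minimal wall-clock timer with the same shape, built on `time.monotonic()`. It is a sketch, not Pottery's implementation: `WallTimer` is a hypothetical class, and the choice of clock and rounding are assumptions.

```python
import time


class WallTimer:
    """Minimal wall-clock timer that reports elapsed milliseconds."""

    def __init__(self):
        self._started = None
        self._stopped = None

    def start(self):
        self._started = time.monotonic()

    def stop(self):
        self._stopped = time.monotonic()

    def elapsed(self):
        # While running, measure up to now; once stopped, the value is frozen.
        end = self._stopped if self._stopped is not None else time.monotonic()
        return round((end - self._started) * 1000)

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *exc_info):
        self.stop()


with WallTimer() as timer:
    time.sleep(0.05)
frozen = timer.elapsed()
time.sleep(0.05)
print(frozen == timer.elapsed())
```

As in the examples above, sleeping after the timer has stopped does not change the reported elapsed time.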
## Contributing
### Obtain source code
1. Clone the git repo:
    1. `$ git clone git@github.com:brainix/pottery.git`
    2. `$ cd pottery/`
2. Install project-level dependencies:
    1. `$ make install`
### Run tests
1. In one Terminal session:
    1. `$ cd pottery/`
    2. `$ redis-server`
2. In a second Terminal session:
    1. `$ cd pottery/`
    2. `$ make test`
    3. `$ make test-readme`
`make test` runs all of the unit tests as well as the coverage test. When
debugging, however, it can be useful to run an individual test module, class,
or method:
1. In one Terminal session:
    1. `$ cd pottery/`
    2. `$ redis-server`
2. In a second Terminal session:
    1. Run a test module with: `$ make test tests=tests.test_dict`
    2. Run a test class with: `$ make test tests=tests.test_dict.DictTests`
    3. Run a test method with: `$ make test tests=tests.test_dict.DictTests.test_keyexistserror`
`make test-readme` doctests the Python code examples in this README to ensure
that they’re correct.
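The `test-readme` recipe itself isn’t shown here. As a rough illustration of the idea, assuming it relies on Python’s standard `doctest` module, the sketch below doctests a tiny stand-in README. The sample text and the temporary file are hypothetical:

```python
import doctest
import pathlib
import tempfile

# Hypothetical miniature README containing one doctest-style example.
SAMPLE = '''\
Example usage:

    >>> sorted({'pear', 'apple'})
    ['apple', 'pear']
'''

with tempfile.TemporaryDirectory() as tmp:
    readme = pathlib.Path(tmp) / 'README.md'
    readme.write_text(SAMPLE)
    # module_relative=False treats the argument as an ordinary filesystem path.
    results = doctest.testfile(str(readme), module_relative=False)

# results is a (failed, attempted) pair; failed == 0 means every example passed.
print(results)
```

Because this README’s real examples talk to Redis, doctesting it also requires the running `redis-server` from the steps above.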
Raw data
{
"_id": null,
"home_page": "https://github.com/brainix/pottery",
"name": "pottery-prod",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7, <4",
"maintainer_email": "",
"keywords": "Redis client persistent storage",
"author": "Rajiv Bakulesh Shah",
"author_email": "brainix@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/a9/6c/1f28d8c6bc620d45621bb4a0d7fd85bb8b85f64ae2e2fe73016e6cfc5bf1/pottery-prod-3.0.0.tar.gz",
"platform": null,
"description": "# Pottery: Redis for Humans \ud83c\udf0e\ud83c\udf0d\ud83c\udf0f\n\n[Redis](http://redis.io/) is awesome, but [Redis\ncommands](http://redis.io/commands) are not always intuitive. Pottery is a\nPythonic way to access Redis. If you know how to use Python dicts, then you\nalready know how to use Pottery. Pottery is useful for accessing Redis more\neasily, and also for implementing microservice resilience patterns; and it has\nbeen battle tested in production at scale.\n\n[![Build status](https://img.shields.io/github/workflow/status/brainix/pottery/Python%20package/master)](https://github.com/brainix/pottery/actions?query=branch%3Amaster)\n[![Security status](https://img.shields.io/badge/security-bandit-dark.svg)](https://github.com/PyCQA/bandit)\n[![Latest released version](https://badge.fury.io/py/pottery.svg)](https://badge.fury.io/py/pottery)\n\n![Supported Python versions](https://img.shields.io/pypi/pyversions/pottery)\n![Number of lines of code](https://img.shields.io/tokei/lines/github/brainix/pottery)\n\n[![Total number of downloads](https://pepy.tech/badge/pottery)](https://pepy.tech/project/pottery)\n[![Downloads per month](https://pepy.tech/badge/pottery/month)](https://pepy.tech/project/pottery)\n[![Downloads per week](https://pepy.tech/badge/pottery/week)](https://pepy.tech/project/pottery)\n\n\n\n## Table of Contents\n- [Dicts \ud83d\udcd6](#dicts)\n- [Sets \ud83d\udecd\ufe0f](#sets)\n- [Lists \u26d3](#lists)\n- [Counters \ud83e\uddee](#counters)\n- [Deques \ud83d\udd87\ufe0f](#deques)\n- [Queues \ud83d\udeb6\u200d\u2642\ufe0f\ud83d\udeb6\u200d\u2640\ufe0f\ud83d\udeb6\u200d\u2642\ufe0f](#queues)\n- [Redlock \ud83d\udd12](#redlock)\n - [synchronize() \ud83d\udc6f\u200d\u2640\ufe0f](#synchronize)\n- [AIORedlock \ud83d\udd12](#aioredlock)\n- [NextID \ud83d\udd22](#nextid)\n- [redis_cache()](#redis_cache)\n- [CachedOrderedDict](#cachedordereddict)\n- [Bloom filters \ud83c\udf38](#bloom-filters)\n- [HyperLogLogs 
\ud83e\udeb5](#hyperloglogs)\n- [ContextTimer \u23f1\ufe0f](#contexttimer)\n\n\n\n## Installation\n\n```shell\n$ pip3 install pottery\n```\n\n## Usage\n\nFirst, set up your Redis client:\n\n```python\n>>> from redis import Redis\n>>> redis = Redis.from_url('redis://localhost:6379/1')\n>>>\n```\n\n\n\n## <a name=\"dicts\"></a>Dicts \ud83d\udcd6\n\n`RedisDict` is a Redis-backed container compatible with Python’s\n[`dict`](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).\n\nHere is a small example using a `RedisDict`:\n\n```python\n>>> from pottery import RedisDict\n>>> tel = RedisDict({'jack': 4098, 'sape': 4139}, redis=redis, key='tel')\n>>> tel['guido'] = 4127\n>>> tel\nRedisDict{'jack': 4098, 'sape': 4139, 'guido': 4127}\n>>> tel['jack']\n4098\n>>> del tel['sape']\n>>> tel['irv'] = 4127\n>>> tel\nRedisDict{'jack': 4098, 'guido': 4127, 'irv': 4127}\n>>> list(tel)\n['jack', 'guido', 'irv']\n>>> sorted(tel)\n['guido', 'irv', 'jack']\n>>> 'guido' in tel\nTrue\n>>> 'jack' not in tel\nFalse\n>>>\n```\n\nNotice the first two keyword arguments to `RedisDict()`: The first is your\nRedis client. The second is the Redis key name for your dict. Other than\nthat, you can use your `RedisDict` the same way that you use any other Python\n`dict`.\n\n*Limitations:*\n\n1. 
Keys and values must be JSON serializable.\n\n\n\n## <a name=\"sets\"></a>Sets \ud83d\udecd\ufe0f\n\n`RedisSet` is a Redis-backed container compatible with Python’s\n[`set`](https://docs.python.org/3/tutorial/datastructures.html#sets).\n\nHere is a brief demonstration:\n\n```python\n>>> from pottery import RedisSet\n>>> basket = RedisSet({'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}, redis=redis, key='basket')\n>>> sorted(basket)\n['apple', 'banana', 'orange', 'pear']\n>>> 'orange' in basket\nTrue\n>>> 'crabgrass' in basket\nFalse\n\n>>> a = RedisSet('abracadabra', redis=redis, key='magic')\n>>> b = set('alacazam')\n>>> sorted(a)\n['a', 'b', 'c', 'd', 'r']\n>>> sorted(a - b)\n['b', 'd', 'r']\n>>> sorted(a | b)\n['a', 'b', 'c', 'd', 'l', 'm', 'r', 'z']\n>>> sorted(a & b)\n['a', 'c']\n>>> sorted(a ^ b)\n['b', 'd', 'l', 'm', 'r', 'z']\n>>>\n```\n\nNotice the two keyword arguments to `RedisSet()`: The first is your Redis\nclient. The second is the Redis key name for your set. Other than that, you\ncan use your `RedisSet` the same way that you use any other Python `set`.\n\nDo more efficient membership testing for multiple elements using\n`.contains_many()`:\n\n```python\n>>> nirvana = RedisSet({'kurt', 'krist', 'dave'}, redis=redis, key='nirvana')\n>>> tuple(nirvana.contains_many('kurt', 'krist', 'chat', 'dave'))\n(True, True, False, True)\n>>>\n```\n\n*Limitations:*\n\n1. 
Elements must be JSON serializable.\n\n\n\n## <a name=\"lists\"></a>Lists \u26d3\n\n`RedisList` is a Redis-backed container compatible with Python’s\n[`list`](https://docs.python.org/3/tutorial/introduction.html#lists).\n\n```python\n>>> from pottery import RedisList\n>>> squares = RedisList([1, 4, 9, 16, 25], redis=redis, key='squares')\n>>> squares\nRedisList[1, 4, 9, 16, 25]\n>>> squares[0]\n1\n>>> squares[-1]\n25\n>>> squares[-3:]\n[9, 16, 25]\n>>> squares[:]\n[1, 4, 9, 16, 25]\n>>> squares + [36, 49, 64, 81, 100]\nRedisList[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]\n>>>\n```\n\nNotice the two keyword arguments to `RedisList()`: The first is your Redis\nclient. The second is the Redis key name for your list. Other than that, you\ncan use your `RedisList` the same way that you use any other Python `list`.\n\n*Limitations:*\n\n1. Elements must be JSON serializable.\n2. Under the hood, Python implements `list` using an array. Redis implements\n list using a\n [doubly linked list](https://redis.io/topics/data-types-intro#redis-lists).\n As such, inserting elements at the head or tail of a `RedisList` is fast,\n O(1). However, accessing `RedisList` elements by index is slow, O(n). So\n in terms of performance and ideal use cases, `RedisList` is more similar to\n Python’s `deque` than Python’s `list`. 
Instead of `RedisList`,\n consider using [`RedisDeque`](#deques).\n\n\n\n## <a name=\"counters\"></a>Counters \ud83e\uddee\n\n`RedisCounter` is a Redis-backed container compatible with Python’s\n[`collections.Counter`](https://docs.python.org/3/library/collections.html#collections.Counter).\n\n```python\n>>> from pottery import RedisCounter\n>>> c = RedisCounter(redis=redis, key='my-counter')\n>>> c = RedisCounter('gallahad', redis=redis, key='my-counter')\n>>> c.clear()\n>>> c = RedisCounter({'red': 4, 'blue': 2}, redis=redis, key='my-counter')\n>>> c.clear()\n>>> c = RedisCounter(redis=redis, key='my-counter', cats=4, dogs=8)\n>>> c.clear()\n\n>>> c = RedisCounter(['eggs', 'ham'], redis=redis, key='my-counter')\n>>> c['bacon']\n0\n>>> c['sausage'] = 0\n>>> del c['sausage']\n>>> c.clear()\n\n>>> c = RedisCounter(redis=redis, key='my-counter', a=4, b=2, c=0, d=-2)\n>>> sorted(c.elements())\n['a', 'a', 'a', 'a', 'b', 'b']\n>>> c.clear()\n\n>>> RedisCounter('abracadabra', redis=redis, key='my-counter').most_common(3)\n[('a', 5), ('b', 2), ('r', 2)]\n>>> c.clear()\n\n>>> c = RedisCounter(redis=redis, key='my-counter', a=4, b=2, c=0, d=-2)\n>>> from collections import Counter\n>>> d = Counter(a=1, b=2, c=3, d=4)\n>>> c.subtract(d)\n>>> c\nRedisCounter{'a': 3, 'b': 0, 'c': -3, 'd': -6}\n>>>\n```\n\nNotice the first two keyword arguments to `RedisCounter()`: The first is your\nRedis client. The second is the Redis key name for your counter. Other than\nthat, you can use your `RedisCounter` the same way that you use any other\nPython `Counter`.\n\n*Limitations:*\n\n1. Keys must be JSON serializable.\n\n\n\n## <a name=\"deques\"></a>Deques \ud83d\udd87\ufe0f\n\n`RedisDeque` is a Redis-backed container compatible with Python’s\n[`collections.deque`](https://docs.python.org/3/library/collections.html#collections.deque).\n\nExample:\n\n```python\n>>> from pottery import RedisDeque\n>>> d = RedisDeque('ghi', redis=redis, key='letters')\n>>> for elem in d:\n... 
print(elem.upper())\nG\nH\nI\n\n>>> d.append('j')\n>>> d.appendleft('f')\n>>> d\nRedisDeque(['f', 'g', 'h', 'i', 'j'])\n\n>>> d.pop()\n'j'\n>>> d.popleft()\n'f'\n>>> list(d)\n['g', 'h', 'i']\n>>> d[0]\n'g'\n>>> d[-1]\n'i'\n\n>>> list(reversed(d))\n['i', 'h', 'g']\n>>> 'h' in d\nTrue\n>>> d.extend('jkl')\n>>> d\nRedisDeque(['g', 'h', 'i', 'j', 'k', 'l'])\n>>> d.rotate(1)\n>>> d\nRedisDeque(['l', 'g', 'h', 'i', 'j', 'k'])\n>>> d.rotate(-1)\n>>> d\nRedisDeque(['g', 'h', 'i', 'j', 'k', 'l'])\n\n>>> RedisDeque(reversed(d), redis=redis)\nRedisDeque(['l', 'k', 'j', 'i', 'h', 'g'])\n>>> d.clear()\n\n>>> d.extendleft('abc')\n>>> d\nRedisDeque(['c', 'b', 'a'])\n>>>\n```\n\nNotice the two keyword arguments to `RedisDeque()`: The first is your Redis\nclient. The second is the Redis key name for your deque. Other than that, you\ncan use your `RedisDeque` the same way that you use any other Python `deque`.\n\n*Limitations:*\n\n1. Elements must be JSON serializable.\n\n\n\n## <a name=\"queues\"></a>Queues \ud83d\udeb6\u200d\u2642\ufe0f\ud83d\udeb6\u200d\u2640\ufe0f\ud83d\udeb6\u200d\u2642\ufe0f\n\n`RedisSimpleQueue` is a Redis-backed multi-producer, multi-consumer FIFO queue\ncompatible with Python’s\n[`queue.SimpleQueue`](https://docs.python.org/3/library/queue.html#simplequeue-objects).\nIn general, use a Python `queue.Queue` if you’re using it in one or more\nthreads, use `multiprocessing.Queue` if you’re using it between processes,\nand use `RedisSimpleQueue` if you’re sharing it across machines or if you\nneed for your queue to persist across application crashes or restarts.\n\nInstantiate a `RedisSimpleQueue`:\n\n```python\n>>> from pottery import RedisSimpleQueue\n>>> cars = RedisSimpleQueue(redis=redis, key='cars')\n>>>\n```\n\nNotice the two keyword arguments to `RedisSimpleQueue()`: The first is your\nRedis client. The second is the Redis key name for your queue. 
Other than\nthat, you can use your `RedisSimpleQueue` the same way that you use any other\nPython `queue.SimpleQueue`.\n\nCheck the queue state, put some items in the queue, and get those items back\nout:\n\n```python\n>>> cars.empty()\nTrue\n>>> cars.qsize()\n0\n>>> cars.put('Jeep')\n>>> cars.put('Honda')\n>>> cars.put('Audi')\n>>> cars.empty()\nFalse\n>>> cars.qsize()\n3\n>>> cars.get()\n'Jeep'\n>>> cars.get()\n'Honda'\n>>> cars.get()\n'Audi'\n>>> cars.empty()\nTrue\n>>> cars.qsize()\n0\n>>>\n```\n\n*Limitations:*\n\n1. Items must be JSON serializable.\n\n\n\n## <a name=\"redlock\"></a>Redlock \ud83d\udd12\n\n`Redlock` is a safe and reliable lock to coordinate access to a resource shared\nacross threads, processes, and even machines, without a single point of\nfailure. [Rationale and algorithm\ndescription.](http://redis.io/topics/distlock)\n\n`Redlock` implements Python’s excellent\n[`threading.Lock`](https://docs.python.org/3/library/threading.html#lock-objects)\nAPI as closely as is feasible. In other words, you can use `Redlock` the same\nway that you use `threading.Lock`. The main reason to use `Redlock` over\n`threading.Lock` is that `Redlock` can coordinate access to a resource shared\nacross different machines; `threading.Lock` can’t.\n\nInstantiate a `Redlock`:\n\n```python\n>>> from pottery import Redlock\n>>> printer_lock = Redlock(key='printer', masters={redis}, auto_release_time=.2)\n>>>\n```\n\nThe `key` argument represents the resource, and the `masters` argument\nspecifies your Redis masters across which to distribute the lock. In\nproduction, you should have 5 Redis masters. This is to eliminate a single\npoint of failure — you can lose up to 2 out of the 5 Redis masters and\nyour `Redlock` will remain available and performant. Now you can protect\naccess to your resource:\n\n```python\n>>> if printer_lock.acquire():\n... # Critical section - print stuff here.\n... print('printer_lock is locked')\n... 
printer_lock.release()\nprinter_lock is locked\n>>> bool(printer_lock.locked())\nFalse\n>>>\n```\n\nOr you can protect access to your resource inside a context manager:\n\n```python\n>>> with printer_lock:\n... # Critical section - print stuff here.\n... print('printer_lock is locked')\nprinter_lock is locked\n>>> bool(printer_lock.locked())\nFalse\n>>>\n```\n\nIt’s safest to instantiate a new `Redlock` object every time you need to\nprotect your resource and to not share `Redlock` instances across different\nparts of code. In other words, think of the `key` as identifying the resource;\ndon’t think of any particular `Redlock` as identifying the resource.\nInstantiating a new `Redlock` every time you need a lock sidesteps bugs by\ndecoupling how you use `Redlock` from the forking/threading model of your\napplication/service.\n\n`Redlock`s are automatically released (by default, after 10 seconds). You\nshould take care to ensure that your critical section completes well within\nthat timeout. The reasons that `Redlock`s are automatically released are to\npreserve\n[“liveness”](http://redis.io/topics/distlock#liveness-arguments)\nand to avoid deadlocks (in the event that a process dies inside a critical\nsection before it releases its lock).\n\n```python\n>>> import time\n>>> if printer_lock.acquire():\n... # Critical section - print stuff here.\n... time.sleep(printer_lock.auto_release_time)\n>>> bool(printer_lock.locked())\nFalse\n>>>\n```\n\nIf 10 seconds isn’t enough to complete executing your critical section,\nthen you can specify your own auto release time (in seconds):\n\n```python\n>>> printer_lock = Redlock(key='printer', masters={redis}, auto_release_time=.2)\n>>> if printer_lock.acquire():\n... # Critical section - print stuff here.\n... 
time.sleep(printer_lock.auto_release_time / 2)\n>>> bool(printer_lock.locked())\nTrue\n>>> time.sleep(printer_lock.auto_release_time / 2)\n>>> bool(printer_lock.locked())\nFalse\n>>>\n```\n\nBy default, `.acquire()` blocks indefinitely until the lock is acquired. You\ncan make `.acquire()` return immediately with the `blocking` argument.\n`.acquire()` returns `True` if the lock was acquired; `False` if not.\n\n```python\n>>> printer_lock_1 = Redlock(key='printer', masters={redis}, auto_release_time=.2)\n>>> printer_lock_2 = Redlock(key='printer', masters={redis}, auto_release_time=.2)\n>>> printer_lock_1.acquire(blocking=False)\nTrue\n>>> printer_lock_2.acquire(blocking=False) # Returns immediately.\nFalse\n>>> printer_lock_1.release()\n>>>\n```\n\nYou can make `.acquire()` block but not indefinitely by specifying the\n`timeout` argument (in seconds):\n\n```python\n>>> printer_lock_1.acquire()\nTrue\n>>> printer_lock_2.acquire(timeout=printer_lock_1.auto_release_time / 2) # Waits 100 milliseconds.\nFalse\n>>> import contextlib\n>>> from pottery import ReleaseUnlockedLock\n>>> with contextlib.suppress(ReleaseUnlockedLock):\n... printer_lock_1.release()\n>>>\n```\n\nYou can similarly configure the Redlock context manager’s\nblocking/timeout behavior during Redlock initialization. If the context\nmanager fails to acquire the lock, it raises the `QuorumNotAchieved` exception.\n\n```python\n>>> import contextlib\n>>> from pottery import QuorumNotAchieved\n>>> printer_lock_1 = Redlock(key='printer', masters={redis}, context_manager_blocking=True, context_manager_timeout=0.2)\n>>> printer_lock_2 = Redlock(key='printer', masters={redis}, context_manager_blocking=True, context_manager_timeout=0.2)\n>>> with printer_lock_1:\n... with contextlib.suppress(QuorumNotAchieved):\n... with printer_lock_2: # Waits 200 milliseconds; raises QuorumNotAchieved.\n... pass\n... print(f\"printer_lock_1 is {'locked' if printer_lock_1.locked() else 'unlocked'}\")\n... 
print(f\"printer_lock_2 is {'locked' if printer_lock_2.locked() else 'unlocked'}\")\nprinter_lock_1 is locked\nprinter_lock_2 is unlocked\n>>>\n```\n\n\n\n### <a name=\"synchronize\"></a>synchronize() \ud83d\udc6f\u200d\u2640\ufe0f\n\n`synchronize()` is a decorator that allows only one thread to execute a\nfunction at a time. Under the hood, `synchronize()` uses a Redlock, so refer\nto the [Redlock documentation](#redlock) for more details.\n\nHere’s how to use `synchronize()`:\n\n```python\n>>> from pottery import synchronize\n>>> @synchronize(key='synchronized-func', masters={redis}, auto_release_time=1.5, blocking=True, timeout=-1)\n... def func():\n... # Only one thread can execute this function at a time.\n... return True\n...\n>>> func()\nTrue\n>>>\n```\n\n\n\n## <a name=\"aioredlock\"></a>AIORedlock \ud83d\udd12\n\n`AIORedlock` is the asyncio implementation of Redlock, compatible with\nPython’s\n[`asyncio.Lock`](https://docs.python.org/3/library/asyncio-sync.html#lock).\n\nInstantiate an `AIORedlock` and protect a resource:\n\n```python\n>>> import asyncio\n>>> from redis.asyncio import Redis as AIORedis\n>>> from pottery import AIORedlock\n>>> async def main():\n... aioredis = AIORedis.from_url('redis://localhost:6379/1')\n... shower = AIORedlock(key='shower', masters={aioredis})\n... if await shower.acquire():\n... # Critical section - no other coroutine can enter while we hold the lock.\n... print(f\"shower is {'occupied' if await shower.locked() else 'available'}\")\n... await shower.release()\n... print(f\"shower is {'occupied' if await shower.locked() else 'available'}\")\n...\n>>> asyncio.run(main(), debug=True)\nshower is occupied\nshower is available\n>>>\n```\n\nOr you can protect access to your resource inside a context manager:\n\n```python\n>>> asyncio.set_event_loop(asyncio.new_event_loop())\n>>> async def main():\n... aioredis = AIORedis.from_url('redis://localhost:6379/1')\n... shower = AIORedlock(key='shower', masters={aioredis})\n... 
async with shower:\n... # Critical section - no other coroutine can enter while we hold the lock.\n... print(f\"shower is {'occupied' if await shower.locked() else 'available'}\")\n... print(f\"shower is {'occupied' if await shower.locked() else 'available'}\")\n...\n>>> asyncio.run(main(), debug=True)\nshower is occupied\nshower is available\n>>>\n```\n\n\n\n## <a name=\"nextid\"></a>NextID \ud83d\udd22\n\n`NextID` safely and reliably produces increasing IDs across threads, processes,\nand even machines, without a single point of failure. [Rationale and algorithm\ndescription.](http://antirez.com/news/102)\n\nInstantiate an ID generator:\n\n```python\n>>> from pottery import NextID\n>>> tweet_ids = NextID(key='tweet-ids', masters={redis})\n>>>\n```\n\nThe `key` argument represents the sequence (so that you can have different\nsequences for user IDs, comment IDs, etc.), and the `masters` argument\nspecifies your Redis masters across which to distribute ID generation (in\nproduction, you should have 5 Redis masters). Now, whenever you need a user\nID, call `next()` on the ID generator:\n\n```python\n>>> next(tweet_ids)\n1\n>>> next(tweet_ids)\n2\n>>> next(tweet_ids)\n3\n>>>\n```\n\nTwo caveats:\n\n1. If many clients are generating IDs concurrently, then there may be\n “holes” in the sequence of IDs (e.g.: 1, 2, 6, 10, 11, 21,\n …).\n2. This algorithm scales to about 5,000 IDs per second (with 5 Redis masters).\n If you need IDs faster than that, then you may want to consider other\n techniques.\n\n\n\n## redis_cache()\n\n`redis_cache()` is a simple lightweight unbounded function return value cache,\nsometimes called\n[“memoize”](https://en.wikipedia.org/wiki/Memoization).\n`redis_cache()` implements Python’s excellent\n[`functools.cache()`](https://docs.python.org/3/library/functools.html#functools.cache)\nAPI as closely as is feasible. In other words, you can use `redis_cache()` the\nsame way that you use `functools.cache()`.\n\n*Limitations:*\n\n1. 
Arguments to the function must be hashable.\n2. Return values from the function must be JSON serializable.\n3. Just like `functools.cache()`, `redis_cache()` does not allow for a maximum\n size, and does not evict old values, and grows unbounded. Only use\n `redis_cache()` in one of these cases:\n 1. Your function’s argument space has a known small cardinality.\n 2. You specify a `timeout` when calling `redis_cache()` to decorate your\n function, to dump your _entire_ return value cache `timeout` seconds\n after the last cache access (hit or miss).\n 3. You periodically call `.cache_clear()` to dump your _entire_ return\n value cache.\n 4. You’re ok with your return value cache growing unbounded, and you\n [understand the implications](https://docs.redislabs.com/latest/rs/administering/database-operations/eviction-policy/)\n of this for your underlying Redis instance.\n\nIn general, you should only use `redis_cache()` when you want to reuse\npreviously computed values. Accordingly, it doesn’t make sense to cache\nfunctions with side-effects or impure functions such as `time()` or `random()`.\n\nDecorate a function:\n\n```python\n>>> import time\n>>> from pottery import redis_cache\n>>> @redis_cache(redis=redis, key='expensive-function-cache')\n... def expensive_function(n):\n... time.sleep(.1) # Simulate an expensive computation or database lookup.\n... return n\n...\n>>>\n```\n\nNotice the two keyword arguments to `redis_cache()`: The first is your Redis\nclient. 
The second is the Redis key name for your function’s return\nvalue cache.\n\nCall your function and observe the cache hit/miss rates:\n\n```python\n>>> expensive_function(5)\n5\n>>> expensive_function.cache_info()\nCacheInfo(hits=0, misses=1, maxsize=None, currsize=1)\n>>> expensive_function(5)\n5\n>>> expensive_function.cache_info()\nCacheInfo(hits=1, misses=1, maxsize=None, currsize=1)\n>>> expensive_function(6)\n6\n>>> expensive_function.cache_info()\nCacheInfo(hits=1, misses=2, maxsize=None, currsize=2)\n>>>\n```\n\nNotice that the first call to `expensive_function()` takes 1 second and results\nin a cache miss; but the second call returns almost immediately and results in\na cache hit. This is because after the first call, `redis_cache()` cached the\nreturn value for the call when `n == 5`.\n\nYou can access your original undecorated underlying `expensive_function()` as\n`expensive_function.__wrapped__`. This is useful for introspection, for\nbypassing the cache, or for rewrapping the original function with a different\ncache.\n\nYou can force a cache reset for a particular combination of `args`/`kwargs`\nwith `expensive_function.__bypass__`. A call to\n`expensive_function.__bypass__(*args, **kwargs)` bypasses the cache lookup,\ncalls the original underlying function, then caches the results for future\ncalls to `expensive_function(*args, **kwargs)`. Note that a call to\n`expensive_function.__bypass__(*args, **kwargs)` results in neither a cache hit\nnor a cache miss.\n\nFinally, clear/invalidate your function’s entire return value cache with\n`expensive_function.cache_clear()`:\n\n```python\n>>> expensive_function.cache_info()\nCacheInfo(hits=1, misses=2, maxsize=None, currsize=2)\n>>> expensive_function.cache_clear()\n>>> expensive_function.cache_info()\nCacheInfo(hits=0, misses=0, maxsize=None, currsize=0)\n>>>\n```\n\n\n\n## CachedOrderedDict\n\nThe best way that I can explain `CachedOrderedDict` is through an example\nuse-case. 
Imagine that your search engine returns document IDs, which then you\nhave to hydrate into full documents via the database to return to the client.\nThe data structure used to represent such search results must have the\nfollowing properties:\n\n1. It must preserve the order of the document IDs returned by the search engine.\n2. It must map document IDs to hydrated documents.\n3. It must cache previously hydrated documents.\n\nProperties 1 and 2 are satisfied by Python’s\n[`collections.OrderedDict`](https://docs.python.org/3/library/collections.html#collections.OrderedDict).\nHowever, `CachedOrderedDict` extends Python’s `OrderedDict` to also\nsatisfy property 3.\n\nThe most common usage pattern for `CachedOrderedDict` is as follows:\n\n1. Instantiate `CachedOrderedDict` with the IDs that you must look up or\n compute passed in as the `dict_keys` argument to the initializer.\n2. Compute and store the cache misses for future lookups.\n3. Return some representation of your `CachedOrderedDict` to the client.\n\nInstantiate a `CachedOrderedDict`:\n\n```python\n>>> from pottery import CachedOrderedDict\n>>> search_results_1 = CachedOrderedDict(\n... redis_client=redis,\n... redis_key='search-results',\n... dict_keys=(1, 2, 3, 4, 5),\n... )\n>>>\n```\n\nThe `redis_client` argument to the initializer is your Redis client, and the\n`redis_key` argument is the Redis key for the Redis Hash backing your cache.\nThe `dict_keys` argument represents an ordered iterable of keys to be looked up\nand automatically populated in your `CachedOrderedDict` (on cache hits), or\nthat you’ll have to compute and populate for future lookups (on cache\nmisses). 
Regardless of whether keys are cache hits or misses,\n`CachedOrderedDict` preserves the order of `dict_keys` (like a list), maps\nthose keys to values (like a dict), and maintains an underlying cache for\nfuture key lookups.\n\nIn the beginning, the cache is empty, so let’s populate it:\n\n```python\n>>> sorted(search_results_1.misses())\n[1, 2, 3, 4, 5]\n>>> search_results_1[1] = 'one'\n>>> search_results_1[2] = 'two'\n>>> search_results_1[3] = 'three'\n>>> search_results_1[4] = 'four'\n>>> search_results_1[5] = 'five'\n>>> sorted(search_results_1.misses())\n[]\n>>>\n```\n\nNote that `CachedOrderedDict` preserves the order of `dict_keys`:\n\n```python\n>>> for key, value in search_results_1.items():\n... print(f'{key}: {value}')\n1: one\n2: two\n3: three\n4: four\n5: five\n>>>\n```\n\nNow, let’s look at a combination of cache hits and misses:\n\n```python\n>>> search_results_2 = CachedOrderedDict(\n... redis_client=redis,\n... redis_key='search-results',\n... dict_keys=(2, 4, 6, 8, 10),\n... )\n>>> sorted(search_results_2.misses())\n[6, 8, 10]\n>>> search_results_2[2]\n'two'\n>>> search_results_2[6] = 'six'\n>>> search_results_2[8] = 'eight'\n>>> search_results_2[10] = 'ten'\n>>> sorted(search_results_2.misses())\n[]\n>>> for key, value in search_results_2.items():\n... print(f'{key}: {value}')\n2: two\n4: four\n6: six\n8: eight\n10: ten\n>>>\n```\n\n*Limitations:*\n\n1. 
Keys and values must be JSON serializable.\n\n\n\n## <a name=\"bloom-filters\"></a>Bloom filters \ud83c\udf38\n\nBloom filters are a powerful data structure that help you to answer the\nquestions, _“Have I seen this element before?”_ and _“How\nmany distinct elements have I seen?”_; but not the question, _“What\nare all of the elements that I’ve seen before?”_ So think of Bloom\nfilters as Python sets that you can add elements to, use to test element\nmembership, and get the length of; but that you can’t iterate through or\nget elements back out of.\n\nBloom filters are probabilistic, which means that they can sometimes generate\nfalse positives (as in, they may report that you’ve seen a particular\nelement before even though you haven’t). But they will never generate\nfalse negatives (so every time that they report that you haven’t seen a\nparticular element before, you really must never have seen it). You can tune\nyour acceptable false positive probability, though at the expense of the\nstorage size and the element insertion/lookup time of your Bloom filter.\n\nCreate a `BloomFilter`:\n\n```python\n>>> from pottery import BloomFilter\n>>> dilberts = BloomFilter(\n... num_elements=100,\n... false_positives=0.01,\n... redis=redis,\n... key='dilberts',\n... )\n>>>\n```\n\nHere, `num_elements` represents the number of elements that you expect to\ninsert into your `BloomFilter`, and `false_positives` represents your\nacceptable false positive probability. 
Using these two parameters,\n`BloomFilter` automatically computes its own storage size and number of times\nto run its hash functions on element insertion/lookup such that it can\nguarantee a false positive rate at or below what you can tolerate, given that\nyou’re going to insert your specified number of elements.\n\nInsert an element into the `BloomFilter`:\n\n```python\n>>> dilberts.add('rajiv')\n>>>\n```\n\nTest for membership in the `BloomFilter`:\n\n```python\n>>> 'rajiv' in dilberts\nTrue\n>>> 'raj' in dilberts\nFalse\n>>> 'dan' in dilberts\nFalse\n>>>\n```\n\nSee how many elements we’ve inserted into the `BloomFilter`:\n\n```python\n>>> len(dilberts)\n1\n>>>\n```\n\nNote that `BloomFilter.__len__()` is an approximation, not an exact value,\nthough it’s quite accurate.\n\nInsert multiple elements into the `BloomFilter`:\n\n```python\n>>> dilberts.update({'raj', 'dan'})\n>>>\n```\n\nDo more efficient membership testing for multiple elements using\n`.contains_many()`:\n\n```python\n>>> tuple(dilberts.contains_many('rajiv', 'raj', 'dan', 'luis'))\n(True, True, True, False)\n>>>\n```\n\nRemove all of the elements from the `BloomFilter`:\n\n```python\n>>> dilberts.clear()\n>>> len(dilberts)\n0\n>>>\n```\n\n*Limitations:*\n\n1. Elements must be JSON serializable.\n2. `len(bf)` is probabilistic in that it’s an accurate approximation. You\n can tune how accurate you want it to be with the `num_elements` and\n `false_positives` arguments to `.__init__()`, at the expense of storage space\n and insertion/lookup time.\n3. Membership testing against a Bloom filter is probabilistic in that it *may*\n return false positives, but *never* returns false negatives. This means that\n if `element in bf` evaluates to `True`, then you *may* have inserted the\n element into the Bloom filter. But if `element in bf` evaluates to `False`,\n then you *must not* have inserted it. 
Again, you can tune accuracy with the\n `num_elements` and `false_positives` arguments to `.__init__()`, at the\n expense of storage space and insertion/lookup time.\n\n\n\n## <a name=\"hyperloglogs\"></a>HyperLogLogs \ud83e\udeb5\n\nHyperLogLogs are an interesting data structure designed to answer the question,\n_“How many distinct elements have I seen?”_; but not the questions,\n_“Have I seen this element before?”_ or _“What are all of the\nelements that I’ve seen before?”_ So think of HyperLogLogs as\nPython sets that you can add elements to and get the length of; but that you\ncan’t use to test element membership, iterate through, or get elements\nout of.\n\nHyperLogLogs are probabilistic, which means that they’re accurate within\na margin of error up to 2%. However, they can reasonably accurately estimate\nthe cardinality (size) of vast datasets (like the number of unique Google\nsearches issued in a day) with a tiny amount of storage (1.5 KB).\n\nCreate a `HyperLogLog`:\n\n```python\n>>> from pottery import HyperLogLog\n>>> google_searches = HyperLogLog(redis=redis, key='google-searches')\n>>>\n```\n\nInsert an element into the `HyperLogLog`:\n\n```python\n>>> google_searches.add('sonic the hedgehog video game')\n>>>\n```\n\nSee how many elements we’ve inserted into the `HyperLogLog`:\n\n```python\n>>> len(google_searches)\n1\n>>>\n```\n\nInsert multiple elements into the `HyperLogLog`:\n\n```python\n>>> google_searches.update({\n... 'google in 1998',\n... 'minesweeper',\n... 'joey tribbiani',\n... 'wizard of oz',\n... 'rgb to hex',\n... 'pac-man',\n... 'breathing exercise',\n... 'do a barrel roll',\n... 'snake',\n... })\n>>> len(google_searches)\n10\n>>>\n```\n\nThrough a clever hack, we can do membership testing against a `HyperLogLog`,\neven though it was never designed for this purpose. 
The way that the hack works\nis that it creates a temporary copy of the `HyperLogLog`, then inserts the\nelement that you’re running the membership test for into the temporary\ncopy. If the insertion changes the temporary `HyperLogLog`’s cardinality,\nthen the element must not have been inserted into the original `HyperLogLog`.\n\n```python\n>>> 'joey tribbiani' in google_searches\nTrue\n>>> 'jennifer aniston' in google_searches\nFalse\n>>>\n```\n\nDo more efficient membership testing for multiple elements using\n`.contains_many()`:\n\n```python\n>>> tuple(google_searches.contains_many('joey tribbiani', 'jennifer aniston'))\n(True, False)\n>>>\n```\n\nRemove all of the elements from the `HyperLogLog`:\n\n```python\n>>> google_searches.clear()\n>>> len(google_searches)\n0\n>>>\n```\n\n*Limitations:*\n\n1. Elements must be JSON serializable.\n2. `len(hll)` is probabilistic in that it’s an accurate approximation.\n3. Membership testing against a HyperLogLog is probabilistic in that it *may*\n return false positives, but *never* returns false negatives. This means that\n if `element in hll` evaluates to `True`, then you *may* have inserted the\n element into the HyperLogLog. But if `element in hll` evaluates to `False`,\n then you *must not* have inserted it.\n\n\n\n## <a name=\"contexttimer\"></a>ContextTimer \u23f1\ufe0f\n\n`ContextTimer` helps you easily and accurately measure elapsed time. Note that\n`ContextTimer` measures wall (real-world) time, not CPU time; and that\n`elapsed()` returns time in milliseconds.\n\nYou can use `ContextTimer` stand-alone…\n\n```python\n>>> import time\n>>> from pottery import ContextTimer\n>>> timer = ContextTimer()\n>>> timer.start()\n>>> time.sleep(0.1)\n>>> 100 <= timer.elapsed() < 200\nTrue\n>>> timer.stop()\n>>> time.sleep(0.1)\n>>> 100 <= timer.elapsed() < 200\nTrue\n>>>\n```\n\n…or as a context manager:\n\n```python\n>>> tests = []\n>>> with ContextTimer() as timer:\n...     time.sleep(0.1)\n... 
    tests.append(100 <= timer.elapsed() < 200)\n>>> time.sleep(0.1)\n>>> tests.append(100 <= timer.elapsed() < 200)\n>>> tests\n[True, True]\n>>>\n```\n\n\n\n## Contributing\n\n### Obtain source code\n\n1. Clone the git repo:\n    1. `$ git clone git@github.com:brainix/pottery.git`\n    2. `$ cd pottery/`\n2. Install project-level dependencies:\n    1. `$ make install`\n\n### Run tests\n\n1. In one Terminal session:\n    1. `$ cd pottery/`\n    2. `$ redis-server`\n2. In a second Terminal session:\n    1. `$ cd pottery/`\n    2. `$ make test`\n    3. `$ make test-readme`\n\n`make test` runs all of the unit tests as well as the coverage test. However,\nwhen debugging, it can be useful to run an individual test module,\nclass, or method:\n\n1. In one Terminal session:\n    1. `$ cd pottery/`\n    2. `$ redis-server`\n2. In a second Terminal session:\n    1. Run a test module with: `$ make test tests=tests.test_dict`\n    2. Run a test class with: `$ make test tests=tests.test_dict.DictTests`\n    3. Run a test method with: `$ make test tests=tests.test_dict.DictTests.test_keyexistserror`\n\n`make test-readme` doctests the Python code examples in this README to ensure\nthat they’re correct.\n",
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "Redis is awesome, but Redis commands are not always intuitive. Pottery is a",
"version": "3.0.0",
"project_urls": {
"Homepage": "https://github.com/brainix/pottery"
},
"split_keywords": [
"redis",
"client",
"persistent",
"storage"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "a96c1f28d8c6bc620d45621bb4a0d7fd85bb8b85f64ae2e2fe73016e6cfc5bf1",
"md5": "9ae4a8fb5574ba636c983a27dba083cc",
"sha256": "64cf9a5073173a02d47bade43027ed1977ccda9a25bbd2ff92fde5ad06db5006"
},
"downloads": -1,
"filename": "pottery-prod-3.0.0.tar.gz",
"has_sig": false,
"md5_digest": "9ae4a8fb5574ba636c983a27dba083cc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7, <4",
"size": 89118,
"upload_time": "2023-06-02T08:22:37",
"upload_time_iso_8601": "2023-06-02T08:22:37.987372Z",
"url": "https://files.pythonhosted.org/packages/a9/6c/1f28d8c6bc620d45621bb4a0d7fd85bb8b85f64ae2e2fe73016e6cfc5bf1/pottery-prod-3.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-02 08:22:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "brainix",
"github_project": "pottery",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [],
"lcname": "pottery-prod"
}