# Redis-index: Inverted Index using efficient Redis set
Redis-index helps delegate part of the work from the database to the cache.
It is useful for high-load projects with complex search logic under the hood.
[![Build Status](https://github.com/ErhoSen/redis-index/workflows/Build/badge.svg)](https://github.com/ErhoSen/redis-index/actions?query=workflow:Build)
[![codecov](https://codecov.io/gh/ErhoSen/redis-index/branch/master/graph/badge.svg)](https://codecov.io/gh/ErhoSen/redis-index)
![Python versions](https://img.shields.io/pypi/pyversions/redis-index.svg)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![License](https://img.shields.io/github/license/erhosen/redis-index.svg)](https://pypi.org/project/redis-index/)
## Introduction
Suppose you have to implement a service that will fetch data for a given set of filters.
```http
GET /api/companies?region=US&currency=USD&search_ids=233,816,266,...
```
Filters can be expensive for the database: each of them involves joining multiple tables. With a solution written in raw SQL, we risk running into database performance problems.
Such "heavy" queries can be precalculated and stored in a Redis SET.
We can then intersect the resulting SETs with each other, thereby greatly simplifying our SQL.
```python
search_ids = {233, 816, 266, ...}
us_companies_ids = {266, 112, 643, ...}
usd_companies_ids = {816, 54, 8395, ...}
filtered_ids = search_ids & us_companies_ids & usd_companies_ids  # intersection
...
"SELECT * from companies where id in {filtered_ids}"
```
But transferring such precalculated SETs from Redis into Python memory could become another bottleneck:
filters can be really large, and we don't want to transfer a lot of data between servers.
The solution is to intersect these SETs directly in Redis.
This is exactly what the redis-index library does.
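
To make the idea concrete, here is a minimal sketch of a server-side intersection. It is not the library's actual implementation: the function `intersect_in_redis`, the key names, and the `InMemoryClient` stand-in are assumptions made for illustration. The function only relies on the `sadd`, `sinter` and `delete` commands, so a redis-py client should work in place of the stand-in.

```python
import uuid
from typing import Iterable, List, Set


def intersect_in_redis(client, search_ids: Iterable[int], filter_keys: List[str]) -> Set[int]:
    """Intersect request ids with precalculated filter SETs on the Redis side."""
    ids = list(search_ids)
    if not ids:
        return set()
    # Hypothetical key layout: a unique temporary SET per request.
    tmp_key = f"tmp:search_ids:{uuid.uuid4().hex}"
    client.sadd(tmp_key, *ids)
    try:
        # SINTER runs on the server; only the (small) result crosses the network.
        members = client.sinter([tmp_key, *filter_keys])
    finally:
        client.delete(tmp_key)
    return {int(member) for member in members}


class InMemoryClient:
    """Tiny stand-in for the three commands used above, so the sketch runs without a server."""

    def __init__(self):
        self.data = {}

    def sadd(self, name, *values):
        # Members are stored as bytes, similar to what a Redis client returns by default.
        self.data.setdefault(name, set()).update(str(v).encode() for v in values)

    def sinter(self, keys):
        sets = [self.data.get(key, set()) for key in keys]
        return set.intersection(*sets) if sets else set()

    def delete(self, *names):
        for name in names:
            self.data.pop(name, None)


client = InMemoryClient()  # with redis-py this could be redis.Redis(host="localhost", port=6379)
client.sadd("filter:region:US", 266, 112, 643)
client.sadd("filter:currency:USD", 266, 816, 54)

result = intersect_in_redis(
    client, [233, 816, 266], ["filter:region:US", "filter:currency:USD"]
)
print(result)
```

Only id `266` is present in all three sets, so it is the only one that survives the intersection. The temporary key is removed afterwards, so nothing but the precalculated filter SETs stays in the cache.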
## Installation
Use `pip` to install `redis-index`.
```bash
pip install redis-index
```
## Usage
1) Declare your filters. They must inherit from the `BaseFilter` class.
```python
from typing import List

import psycopg2
from redis_index import BaseFilter


class RegionFilter(BaseFilter):
    def get_ids(self, region, **kwargs) -> List[int]:
        """
        get_ids should return a precalculated list of ints.
        """
        with psycopg2.connect(...) as conn:
            with conn.cursor() as cursor:
                cursor.execute('SELECT id FROM companies WHERE region = %s', (region, ))
                # fetchall() yields one-element tuples; unpack them into plain ints
                return [row[0] for row in cursor.fetchall()]


class CurrencyFilter(BaseFilter):
    def get_ids(self, currency, **kwargs) -> List[int]:
        with psycopg2.connect(...) as conn:
            with conn.cursor() as cursor:
                cursor.execute('SELECT id FROM companies WHERE currency = %s', (currency, ))
                return [row[0] for row in cursor.fetchall()]
```
2) Initialize the `RedisFiltering` object.
```python
from redis_index import RedisFiltering
from hot_redis import HotClient
redis_client = HotClient(host="localhost", port=6379)
filtering = RedisFiltering(redis_client)
```
3) Now you can use `filtering` as a singleton in your project.
Simply call the `filter()` method with your `search_ids` and the specific filters:
```python
search_ids = request.GET["search_ids"]  # input list
result = filtering.filter(search_ids, [RegionFilter("US"), CurrencyFilter("USD")])
```
The result will be a list containing only the ids that satisfy both RegionFilter and CurrencyFilter.
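
In the request from the introduction, `search_ids` arrives as a comma-separated string, so it most likely needs to be converted into a list of ints before it is passed to `filter()`. The helper below is a small sketch of that step; `parse_ids` is a hypothetical name and not part of redis-index.

```python
from typing import List


def parse_ids(raw: str) -> List[int]:
    """Turn a string like '233,816,266' into [233, 816, 266], skipping empty items."""
    return [int(part) for part in raw.split(",") if part.strip()]


search_ids = parse_ids("233,816,266")
print(search_ids)
```

Non-numeric items raise `ValueError`, which is usually what you want for an API endpoint: the request can then be rejected before anything is sent to Redis.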
## How to warm the cache?
You can warm up the cache in various ways, for example, with a cron job:
```crontab
*/5 * * * * python warm_filters
```
Inside such a command, you can use the dedicated `warm_filters` method:
```python
result = filtering.filter(search_ids, [RegionFilter("US"), CurrencyFilter("USD")])
```
Or use the `RedisIndex` class directly:
```python
for _filter in [RegionFilter("US"), CurrencyFilter("USD")]:
    filter_index = RedisIndex(_filter, redis_client)
    filter_index.warm()
```
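
For intuition, warming a single filter roughly means recomputing its ids and writing them into a Redis SET. The sketch below shows one possible way to do that without readers ever seeing a half-filled SET: build the data under a temporary key, then swap it in with `RENAME`. This is an assumption about how warming could work, not a description of what `RedisIndex.warm()` does; `warm_set` and the key names are hypothetical. A `MagicMock` stands in for the Redis client so the example runs without a server.

```python
from typing import Iterable
from unittest.mock import MagicMock, call


def warm_set(client, key: str, ids: Iterable[int]) -> int:
    """Rebuild a filter SET under a temporary key, then swap it into place."""
    ids = list(ids)
    tmp_key = f"{key}:warming"  # hypothetical naming scheme
    client.delete(tmp_key)  # drop leftovers from an interrupted run
    if not ids:
        client.delete(key)  # an empty SET does not exist in Redis
        return 0
    client.sadd(tmp_key, *ids)
    client.rename(tmp_key, key)  # RENAME overwrites the old SET in one step
    return len(ids)


client = MagicMock()  # stand-in; a redis-py client exposes the same three commands
written = warm_set(client, "filter:region:US", [266, 112])
print(written, client.method_calls)
```

With a real client, the ids would come from a filter's `get_ids()` call, and the function could be run from the cron command shown above.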
## Statsd integration
Redis-index optionally supports statsd-integration.
![Redis-Index performance](https://github.com/ErhoSen/redis-index/raw/master/images/redis_index_performance.png "Redis-Index performance")
![Redis-Index by filters](https://github.com/ErhoSen/redis-index/raw/master/images/redis_index_by_filters.png "Redis-Index by filters")
## Code of Conduct
Everyone interacting in the project's codebases, issue trackers, chat rooms, and mailing lists is expected to follow the [PyPA Code of Conduct](https://www.pypa.io/en/latest/code-of-conduct/).
## History
### [0.1.11] - 2019-11-08
#### Added
- Added code for initial release
Raw data
{
"_id": null,
"home_page": "https://github.com/ErhoSen/redis-index",
"name": "redis-index",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8,<4.0",
"maintainer_email": "",
"keywords": "redis,index,gin,intersection,filters",
"author": "Vladimir Vyazovetskov",
"author_email": "erhosen@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/e1/9d/73bb9e907eb22a71497ae50f903995cc68337ba4b32f0352875b5c5be7b1/redis_index-0.8.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Inverted Index using efficient Redis set",
"version": "0.8.0",
"project_urls": {
"Homepage": "https://github.com/ErhoSen/redis-index",
"Repository": "https://github.com/ErhoSen/redis-index"
},
"split_keywords": [
"redis",
"index",
"gin",
"intersection",
"filters"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d54efe7e159dcbd50bcbca8d09ea3054a7c8e12da980dc49360ba28cdad5114a",
"md5": "1767569289c7adfc8aa03322e0a9f8c7",
"sha256": "7f2e21a8b68c77c112bf91d0e29aee7bac13216bd898052725c3f0c39f67faed"
},
"downloads": -1,
"filename": "redis_index-0.8.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1767569289c7adfc8aa03322e0a9f8c7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8,<4.0",
"size": 6736,
"upload_time": "2023-12-30T13:02:03",
"upload_time_iso_8601": "2023-12-30T13:02:03.398471Z",
"url": "https://files.pythonhosted.org/packages/d5/4e/fe7e159dcbd50bcbca8d09ea3054a7c8e12da980dc49360ba28cdad5114a/redis_index-0.8.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e19d73bb9e907eb22a71497ae50f903995cc68337ba4b32f0352875b5c5be7b1",
"md5": "ae0404fe61035debeb609c5b6e47b985",
"sha256": "ac5354dca5b86775ee84058bfaab493f582dca83bc642bc8ee2087c4e3201cb8"
},
"downloads": -1,
"filename": "redis_index-0.8.0.tar.gz",
"has_sig": false,
"md5_digest": "ae0404fe61035debeb609c5b6e47b985",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8,<4.0",
"size": 5450,
"upload_time": "2023-12-30T13:02:05",
"upload_time_iso_8601": "2023-12-30T13:02:05.207825Z",
"url": "https://files.pythonhosted.org/packages/e1/9d/73bb9e907eb22a71497ae50f903995cc68337ba4b32f0352875b5c5be7b1/redis_index-0.8.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-30 13:02:05",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ErhoSen",
"github_project": "redis-index",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"lcname": "redis-index"
}