# sentry-nodestore-opensearch
Sentry nodestore OpenSearch backend
[![image](https://img.shields.io/pypi/v/sentry-nodestore-opensearch.svg)](https://pypi.python.org/pypi/sentry-nodestore-opensearch)
Supports Sentry 24.x and OpenSearch 2.x.
Use an OpenSearch cluster to store node objects from Sentry.
By default, self-hosted Sentry uses a PostgreSQL database for settings and nodestore, which under high load becomes a bottleneck, causing the database size to grow rapidly and slowing down the entire system.
Switching nodestore to a dedicated OpenSearch cluster provides better scalability:
- OpenSearch clusters can be scaled horizontally by adding more data nodes (PostgreSQL cannot).
- Data in OpenSearch can be sharded and replicated between data nodes, which increases throughput.
- OpenSearch automatically rebalances when new data nodes are added.
- Scheduled Sentry cleanup runs much faster and more reliably with an OpenSearch nodestore, because it boils down to deleting old daily indices (cleaning up a terabyte-sized PostgreSQL nodestore is very challenging); see the sketch after this list.
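As an illustration of that last point, retention can be enforced by dropping whole daily indices. The sketch below is not part of this package; the 30-day window and the client setup are placeholders, and it relies on the `sentry-YYYY-MM-DD` index naming shown later in this README.

``` python
from datetime import datetime, timedelta, timezone
from opensearchpy import OpenSearch

# Example only: delete daily nodestore indices older than 30 days.
# Connection details and the retention window are placeholders.
os_client = OpenSearch(['https://username:password@opensearch:9200'])
cutoff = datetime.now(timezone.utc) - timedelta(days=30)

for index in os_client.indices.get(index='sentry-*'):
    # Index names follow the sentry-YYYY-MM-DD pattern used by this backend.
    day = datetime.strptime(index.removeprefix('sentry-'), '%Y-%m-%d').replace(tzinfo=timezone.utc)
    if day < cutoff:
        os_client.indices.delete(index=index)
```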
## Installation
Rebuild the Sentry Docker image with the nodestore package installed:
```dockerfile
FROM getsentry/sentry:24.4.1
RUN pip install sentry-nodestore-opensearch
```
## Configuration
Set `SENTRY_NODESTORE` in your `sentry.conf.py`:
``` python
# Default first line of sentry.conf.py. Keep this wildcard import above the
# SENTRY_NODESTORE settings below, otherwise it overwrites them with the defaults.
from sentry.conf.server import *

from opensearchpy import OpenSearch

os_client = OpenSearch(
    ['https://username:password@opensearch:9200'],
    http_compress=True,
    request_timeout=60,
    max_retries=3,
    retry_on_timeout=True,
    # ❯ openssl s_client -connect opensearch:9200 < /dev/null 2>/dev/null | openssl x509 -fingerprint -noout -in /dev/stdin
    ssl_assert_fingerprint=(
        "PUT_FINGERPRINT_HERE"
    )
)

SENTRY_NODESTORE = 'sentry_nodestore_opensearch.OpenSearchNodeStorage'
SENTRY_NODESTORE_OPTIONS = {
    'es': os_client,
    'refresh': False,  # ref: https://opensearch.org/docs/latest/opensearch/rest-api/index-apis/refresh/
    # other OpenSearch-related options
}

INSTALLED_APPS = list(INSTALLED_APPS)
INSTALLED_APPS.append('sentry_nodestore_opensearch')
INSTALLED_APPS = tuple(INSTALLED_APPS)
```
## Usage
### Set up the OpenSearch index template
Ensure OpenSearch is up and running before this step. The following command creates the index template in OpenSearch:
``` shell
sentry upgrade --with-nodestore
```
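To confirm the template was created, you can query it by name (a sketch reusing the `os_client` from the Configuration section):

``` python
# Check that the "sentry" index template now exists.
# Reuses the os_client defined in the Configuration section.
print(os_client.indices.get_index_template(name="sentry"))
```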
Alternatively, you can create the index template manually from the JSON below. It may be customized to your needs, but the template name must be `sentry`, because the nodestore initialization script expects that name.
``` json
{
  "template": {
    "settings": {
      "index": {
        "number_of_shards": "3",
        "number_of_replicas": "0",
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        }
      }
    },
    "mappings": {
      "dynamic": "false",
      "dynamic_templates": [],
      "properties": {
        "data": {
          "type": "text",
          "index": false,
          "store": true
        },
        "timestamp": {
          "type": "date",
          "store": true
        }
      }
    },
    "aliases": {
      "sentry": {}
    }
  }
}
```
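For reference, the same template can also be created from Python with the `opensearch-py` client used elsewhere in this README. This is a sketch: the `index_patterns` value is an assumption based on the daily `sentry-YYYY-MM-DD` index names created by the migration script below, and the routing settings from the JSON above are omitted for brevity.

``` python
# A sketch: create the index template without running `sentry upgrade --with-nodestore`.
# Reuses the os_client from the Configuration section. The template name must be
# "sentry"; index_patterns is an assumption matching the daily sentry-YYYY-MM-DD indices.
sentry_template = {
    "index_patterns": ["sentry-*"],
    "template": {
        "settings": {
            "index": {
                "number_of_shards": "3",
                "number_of_replicas": "0"
            }
        },
        "mappings": {
            "dynamic": "false",
            "properties": {
                "data": {"type": "text", "index": False, "store": True},
                "timestamp": {"type": "date", "store": True}
            }
        },
        "aliases": {"sentry": {}}
    }
}

os_client.indices.put_index_template(name="sentry", body=sentry_template)
```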
### Migrate Data from Default PostgreSQL Nodestore to OpenSearch
PostgreSQL and OpenSearch must be accessible from the machine where you run this code.
``` python
from opensearchpy import OpenSearch
from opensearchpy.helpers import bulk
import psycopg2

os_client = OpenSearch(
    ['https://username:password@opensearch:9200'],
    http_compress=True,
    request_timeout=60,
    max_retries=3,
    retry_on_timeout=True,
    # ❯ openssl s_client -connect opensearch:9200 < /dev/null 2>/dev/null | openssl x509 -fingerprint -noout -in /dev/stdin
    ssl_assert_fingerprint=(
        "PUT_FINGERPRINT_HERE"
    )
)

conn = psycopg2.connect(
    dbname="sentry",
    user="sentry",
    password="password",
    host="hostname",
    port="5432"
)

# Estimate the number of rows to migrate (uses table statistics, so it is fast).
cur = conn.cursor()
cur.execute("SELECT reltuples AS estimate FROM pg_class WHERE relname = 'nodestore_node'")
count = int(cur.fetchone()[0])
print(f"Estimated rows: {count}")
cur.close()

# A named (server-side) cursor streams rows in batches instead of
# loading the whole nodestore table into memory.
cursor = conn.cursor(name='fetch_nodes')
cursor.execute("SELECT * FROM nodestore_node ORDER BY timestamp ASC")

while True:
    records = cursor.fetchmany(size=2000)
    if not records:
        break

    bulk_data = []
    for r in records:
        node_id = r[0]
        data = r[1]
        date = r[2].strftime("%Y-%m-%d")
        ts = r[2].isoformat()
        # One index per day, matching the sentry-YYYY-MM-DD naming used by the backend
        index = f"sentry-{date}"

        doc = {
            'data': data,
            'timestamp': ts
        }

        action = {
            "_index": index,
            "_id": node_id,
            "_source": doc
        }

        bulk_data.append(action)

    bulk(os_client, bulk_data)
    count -= len(records)
    print(f"Remaining rows: {count}")

cursor.close()
conn.close()
```
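Once the migration finishes, a quick sanity check is to compare the OpenSearch document count against the PostgreSQL row estimate printed at the start (a sketch reusing the `os_client` from the script above):

``` python
# Make freshly indexed documents visible, then count them across all daily indices.
os_client.indices.refresh(index="sentry-*")
total = os_client.count(index="sentry-*")["count"]
print(f"Documents in OpenSearch nodestore: {total}")
```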