# sz_semantics
Transform JSON output from the [Senzing SDK](https://senzing.com/docs/python/)
for use with graph technologies, semantics, and downstream LLM integration.
## Install
This library uses [`poetry`](https://python-poetry.org/docs/) for
demos:
```bash
poetry update
```
Otherwise, to use the library:
```bash
pip install sz_semantics
```
For the [gRPC server](https://github.com/senzing-garage/serve-grpc),
if you don't already have Senzing and its gRPC server installed,
pull the latest Docker image:
```bash
docker pull senzing/serve-grpc:latest
```
## Usage: Masking PII
Mask the PII values within Senzing JSON output with tokens which can
be substituted back later. For example, _mask_ PII values before
calling a remote service (such as an LLM-based chat) then _unmask_
returned text after the roundtrip, to maintain _data privacy_.
```python
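import json
from sz_semantics import Mask

data: dict = { "ENTITY_NAME": "Robert Smith" }

# replace the PII values with tokens
sz_mask: Mask = Mask()
masked_data: dict = sz_mask.mask_data(data)

# serialize the masked data, e.g., to send to a remote service
masked_text: str = json.dumps(masked_data)
print(masked_text)

# substitute the original values back into the returned text
unmasked: str = sz_mask.unmask_text(masked_text)
print(unmasked)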
```
For an example, run the `demo1.py` script with a data file which
captures Senzing JSON output:
```bash
poetry run python3 demo1.py data/get.json
```
The two lists `Mask.KNOWN_KEYS` and `Mask.MASKED_KEYS` enumerate,
respectively:
* keys for known elements which do not require masking
* keys for PII elements which require masking
Any other keys encountered are masked by default and reported as
warnings in the log. Adjust these lists as needed for a given use
case.
For work with large numbers of entities, subclass `KeyValueStore` to
provide a distributed key/value store (other than the Python built-in
`dict` default) to use for scale-out.
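
The interface of `KeyValueStore` is not shown here, so the sketch below does not subclass it. Instead it shows the general shape of a `dict`-like store kept outside process memory, which one might adapt to whatever methods `KeyValueStore` actually requires. The class name `SqliteTokenStore` is hypothetical, and SQLite serves only as a local stand-in for a distributed key/value service.

```python
import sqlite3
from collections.abc import Iterator, MutableMapping


class SqliteTokenStore(MutableMapping[str, str]):
    """Dict-like token store backed by SQLite (illustrative stand-in)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def __getitem__(self, key: str) -> str:
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key: str, value: str) -> None:
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
            )

    def __delitem__(self, key: str) -> None:
        with self.conn:
            cur = self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self) -> Iterator[str]:
        rows = self.conn.execute("SELECT k FROM kv").fetchall()
        return iter([k for (k,) in rows])

    def __len__(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
```

Implementing `MutableMapping` keeps the store interchangeable with a plain `dict` in calling code, so the backend can be swapped without touching the masking logic.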
## Usage: Semantic Representation
Starting with a small [SKOS-based taxonomy](https://www.w3.org/2004/02/skos/)
in the `domain.ttl` file, parse the Senzing
[_entity resolution_](https://senzing.com/what-is-entity-resolution/)
(ER) results to generate an
[`RDFlib`](https://rdflib.readthedocs.io/) _semantic graph_.
In other words, generate the "backbone" for constructing an
[_Entity Resolved Knowledge Graph_](https://senzing.com/entity-resolved-knowledge-graphs/),
as a core component of a
[_semantic layer_](https://enterprise-knowledge.com/what-is-a-semantic-layer-components-and-enterprise-applications/).
The example code below serializes the _thesaurus_ generated from
Senzing ER results as `"thesaurus.ttl"` combined with the Senzing
_taxonomy_ definitions, which can be used for constructing knowledge
graphs:
```python
import pathlib
from sz_semantics import Thesaurus

thesaurus: Thesaurus = Thesaurus()
thesaurus.load_source(Thesaurus.DOMAIN_TTL)

export_path: pathlib.Path = pathlib.Path("data/truth/export.json")

with open(export_path, "r", encoding = "utf-8") as fp_json:
    for line in fp_json:
        for rdf_frag in thesaurus.parse_iter(line, language = "en"):
            thesaurus.load_source_text(
                Thesaurus.RDF_PREAMBLE + rdf_frag,
                format = "turtle",
            )

thesaurus_path: pathlib.Path = pathlib.Path("thesaurus.ttl")
thesaurus.save_source(thesaurus_path, format = "turtle")
```
For an example, run the `demo2.py` script to process the JSON file
`data/export.json` which captures Senzing ER exported results:
```bash
poetry run python3 demo2.py
```
Then check the RDF definitions in the generated `thesaurus.ttl` file.
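
For a rough idea of how one resolved entity can map onto a SKOS definition in Turtle, here is a hand-rolled sketch which uses only the standard library. Everything specific in it is an assumption for illustration: the `ex:` namespace, the IRI scheme, the `entity_to_turtle` helper, the input record shape (a `RESOLVED_ENTITY` object with `ENTITY_ID` and `ENTITY_NAME`), and the choice of `skos:Concept` with `skos:prefLabel`. The RDF which `Thesaurus.parse_iter` actually emits may look quite different.

```python
import json

# Hypothetical prefixes; not the library's `Thesaurus.RDF_PREAMBLE`.
PREAMBLE = """\
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <http://example.org/> .
"""


def entity_to_turtle(line: str, language: str = "en") -> str:
    """Render one JSON line of ER output as a Turtle fragment."""
    entity = json.loads(line)["RESOLVED_ENTITY"]
    iri = f"ex:entity_{int(entity['ENTITY_ID'])}"
    # JSON string escaping is compatible with Turtle string literals,
    # provided that non-ASCII characters are left unescaped
    label = json.dumps(entity["ENTITY_NAME"], ensure_ascii=False)
    return f"{iri} a skos:Concept ;\n    skos:prefLabel {label}@{language} .\n"


if __name__ == "__main__":
    sample = json.dumps(
        {"RESOLVED_ENTITY": {"ENTITY_ID": 1, "ENTITY_NAME": "Robert Smith"}}
    )
    print(PREAMBLE + entity_to_turtle(sample))
```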
## Usage: gRPC Client/Server
`SzClient` simplifies access to the Senzing SDK via a gRPC server.
For a demo which runs _entity resolution_ on the "truthset"
collection of sample datasets, first launch this container and keep it
running in the background:
```bash
docker run -it --publish 8261:8261 --rm senzing/serve-grpc
```
Then run:
```bash
poetry run python3 demo3.py
```
Restart the container each time before re-running the `demo3.py`
script.
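
Since the demo depends on the container being up, a small pre-flight check can save a confusing failure. The sketch below is not part of `sz_semantics`; the `grpc_port_open` helper is hypothetical and uses only the standard library to test whether something accepts TCP connections on the published port `8261`. A successful connection shows that the port is open, not that the Senzing engine has finished initializing.

```python
import socket


def grpc_port_open(
    host: str = "localhost",
    port: int = 8261,
    timeout: float = 2.0,
) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers refused connections, timeouts, and name lookup errors
        return False


if __name__ == "__main__":
    if grpc_port_open():
        print("gRPC port 8261 is accepting connections")
    else:
        print("nothing is listening on port 8261; is the container running?")
```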
---
<details>
<summary>License and Copyright</summary>
Source code for `sz_semantics` plus any logo, documentation, and
examples have an [MIT license](https://spdx.org/licenses/MIT.html)
which is succinct and simplifies use in commercial applications.
All materials herein are Copyright © 2025 Senzing, Inc.
</details>
Kudos to
[@brianmacy](https://github.com/brianmacy),
[@jbutcher21](https://github.com/jbutcher21),
[@docktermj](https://github.com/docktermj),
[@cj2001](https://github.com/cj2001),
[@503jmt](https://github.com/503jmt),
and the kind folks at [GraphGeeks](https://graphgeeks.org/) for their support.
## Star History
[Star History Chart](https://star-history.com/#senzing-garage/sz-semantics&Date)
Raw data
{
"_id": null,
"home_page": "https://github.com/senzing-garage/sz-semantics",
"name": "sz_semantics",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "context-engineering, data-privacy, entity-resolution, entity-resolved-knowledge-graph, grpc, ontology, rdf, semantic-layer, semantics, skos, taxonomy, thesaurus",
"author": "Paco Nathan",
"author_email": "paco@senzing.com",
"download_url": "https://files.pythonhosted.org/packages/ff/74/8f0d2495fca4d6170b50833ae2d81133081ae7911f30d840eabcbf818fc6/sz_semantics-1.2.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Transform JSON output from Senzing SDK for use with graph technologies, semantics, and downstream LLM integration",
"version": "1.2.3",
"project_urls": {
"Homepage": "https://github.com/senzing-garage/sz-semantics",
"package": "https://pypi.org/project/sz_semantics/",
"semantics": "https://github.com/senzing-garage/sz-semantics/wiki/ns"
},
"split_keywords": [
"context-engineering",
" data-privacy",
" entity-resolution",
" entity-resolved-knowledge-graph",
" grpc",
" ontology",
" rdf",
" semantic-layer",
" semantics",
" skos",
" taxonomy",
" thesaurus"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8c06bb26c61f67bd32e607f2e1fd986b492b9b28fff45a8fe9da015b51d647e1",
"md5": "efdf3688353d32a230b855ed55035c3e",
"sha256": "68c8aedc278bc77d44b359838dd3edc98aa0544690ce050734c3f493123885d3"
},
"downloads": -1,
"filename": "sz_semantics-1.2.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "efdf3688353d32a230b855ed55035c3e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 13468,
"upload_time": "2025-11-01T20:31:29",
"upload_time_iso_8601": "2025-11-01T20:31:29.344415Z",
"url": "https://files.pythonhosted.org/packages/8c/06/bb26c61f67bd32e607f2e1fd986b492b9b28fff45a8fe9da015b51d647e1/sz_semantics-1.2.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ff748f0d2495fca4d6170b50833ae2d81133081ae7911f30d840eabcbf818fc6",
"md5": "ea61579f7205a34da0ece157b4cce351",
"sha256": "7e96505836833ce60e61f00a2422d9fb1286a460c50cffe489edbb572e30438b"
},
"downloads": -1,
"filename": "sz_semantics-1.2.3.tar.gz",
"has_sig": false,
"md5_digest": "ea61579f7205a34da0ece157b4cce351",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 12467,
"upload_time": "2025-11-01T20:31:30",
"upload_time_iso_8601": "2025-11-01T20:31:30.585562Z",
"url": "https://files.pythonhosted.org/packages/ff/74/8f0d2495fca4d6170b50833ae2d81133081ae7911f30d840eabcbf818fc6/sz_semantics-1.2.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-11-01 20:31:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "senzing-garage",
"github_project": "sz-semantics",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "sz_semantics"
}