# FHIR Terminology Encoder
This is a [scikit-learn](https://scikit-learn.org/) compatible encoder that uses
a FHIR terminology server to encode ontological features.

It currently supports subsumption relationships and properties.

You supply a scope in the form of a FHIR ValueSet URI, and a FHIR terminology
endpoint.
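
The scope URI used in the usage example in this README is a SNOMED CT implicit value set with a percent-encoded ECL expression. As a minimal sketch, such a URI can be built with the Python standard library. The helper name `ecl_scope` is hypothetical and is not part of this package.

```python
from urllib.parse import quote

# Prefix for SNOMED CT implicit value sets defined by an ECL expression.
SNOMED_ECL_PREFIX = "http://snomed.info/sct?fhir_vs=ecl/"


def ecl_scope(expression: str) -> str:
    """Build a ValueSet URI from an ECL expression (hypothetical helper)."""
    # Keep parentheses literal and percent-encode the other reserved
    # characters, such as ">" and the space.
    return SNOMED_ECL_PREFIX + quote(expression, safe="()")


# ">> 363346000" selects the concept 363346000 and all of its ancestors.
scope = ecl_scope("(>> 363346000)")
print(scope)
# prints http://snomed.info/sct?fhir_vs=ecl/(%3E%3E%20363346000)
```
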
Each input concept is encoded as a multi-hot vector, and the encoded rows are
returned as a sparse matrix, suitable for input into most models and estimators.
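
To illustrate what a multi-hot subsumption encoding looks like, here is a minimal pure-Python sketch. It is not this package's implementation: the hierarchy, the codes, and the function names are all hypothetical, and it returns plain lists rather than a sparse matrix. Each concept sets the column for itself and for every one of its ancestors.

```python
# Hypothetical toy hierarchy: each code maps to its direct parents.
PARENTS = {
    "root": [],
    "finding": ["root"],
    "disease": ["finding"],
    "neoplasm": ["disease"],
}


def ancestors_or_self(code, parents):
    """Return the code together with all of its transitive ancestors."""
    seen = set()
    stack = [code]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def multi_hot(codes, parents):
    """Encode each code as a 0/1 row with one column per concept in scope."""
    features = sorted(parents)
    rows = []
    for code in codes:
        active = ancestors_or_self(code, parents)
        rows.append([1 if name in active else 0 for name in features])
    return features, rows


features, rows = multi_hot(["disease", "neoplasm"], PARENTS)
print(features)  # ['disease', 'finding', 'neoplasm', 'root']
print(rows)      # [[1, 1, 0, 1], [1, 1, 1, 1]]
```

Because the more specific concept shares every column set by its ancestor, a model can see that the two inputs are related, which a plain one-hot encoding would not convey.
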
## Installation
```bash
pip install fhir-tx-encoder
```
## Usage
```python
from fhir_tx import FhirTerminologyEncoder
import numpy as np
encoder = FhirTerminologyEncoder(
    # The SNOMED CT concept "Malignant neoplastic disease" (363346000) and all
    # of its ancestors (the ECL operator ">>" means "ancestor or self")
    scope="http://snomed.info/sct?fhir_vs=ecl/(%3E%3E%20363346000)",
    # Include "Associated morphology" (116676008) as a property
    properties=["116676008"],
)
# Encode two SNOMED CT concepts:
# - "Neoplasm and/or hamartoma" (399981008)
# - "Malignant neoplastic disease" (363346000)
result = encoder.fit_transform(np.array([["399981008", "363346000"]]))
# Print out the result and its shape.
print(f"result.shape: {result.shape}")
print(f"result:\n{result.toarray()}")
# Print out the feature names.
print(f"encoder.feature_names_: {encoder.feature_names_}")
```
This produces the following output:
```
Expanding value set: http://snomed.info/sct?fhir_vs=ecl/(%3E%3E%20363346000)
Expanding (6 items, offset 0, total 6)
Expansion complete
Generating one-hot encoding... (6, 6)
Creating index... 6 items
Applying transitive closure...
Batch 1 of 1, 6 items... 15 pairs added
Subsumption encoding complete: (6, 6)
Encoding properties... (6, 9)
result.shape: (2, 9)
result:
[[1. 1. 0. 1. 0. 1. 0. 0. 1.]
 [1. 1. 1. 1. 1. 1. 0. 1. 0.]]
encoder.feature_names_: ['404684003', '64572001', '363346000', '399981008', '55342001', '138875005', '609096000.116676008=108369006', '609096000.116676008=1240414004', '609096000.116676008=400177003']
```
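
To read the encoded rows, pair each column with its entry in `feature_names_`. The sketch below copies the feature names and the dense rows from the output above into plain Python lists, so it runs without a terminology server. The helper `active_features` is hypothetical and not part of the package. The names containing `=` appear to be the property features; the first six are the concept codes in scope.

```python
# Feature names and dense rows copied from the example output above.
feature_names = [
    "404684003", "64572001", "363346000", "399981008", "55342001",
    "138875005", "609096000.116676008=108369006",
    "609096000.116676008=1240414004", "609096000.116676008=400177003",
]
dense = [
    [1, 1, 0, 1, 0, 1, 0, 0, 1],  # row for 399981008
    [1, 1, 1, 1, 1, 1, 0, 1, 0],  # row for 363346000
]


def active_features(row, names):
    """Return the names of the columns that are set in one encoded row."""
    return [name for name, value in zip(names, row) if value]


for row in dense:
    print(active_features(row, feature_names))
```

The first row lists `399981008` and its ancestors plus one property value, while the second row additionally sets `363346000` and `55342001`.
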
## Important note
This software is currently in alpha. It is not yet ready for production use.

Copyright © 2023, Commonwealth Scientific and Industrial Research Organisation
(CSIRO) ABN 41 687 119 230. Licensed under
the [Apache License, version 2.0](https://www.apache.org/licenses/LICENSE-2.0).