FHIR Validator
==============
- `FHIR Validator <#fhir-validator>`__
- `Background <#background>`__
- `Objective <#objective>`__
- `Example: CLI validation
usage <#example-cli-validation-usage>`__
- `Installation <#installation>`__
- `Using pip <#using-pip>`__
- `Using Poetry <#using-poetry>`__
- `CLI Usage <#cli-usage>`__
- `Validate a FHIR File: <#validate-a-fhir-file>`__
- `Identify the Content
Structure: <#identify-the-content-structure>`__
- `Options: <#options>`__
- `Chunk size <#chunk-size>`__
- `Integration <#integration>`__
- `Example: Validate a FHIR
File <#example-validate-a-fhir-file>`__
- `Development <#development>`__
- `Setting Up Your Development
Environment <#setting-up-your-development-environment>`__
- `Tests <#tests>`__
Background
----------
While testing Google’s FHIR Store and following the provided
documentation, we encountered an issue where the import process wasn’t
working as expected. A great tool from MITRE called
`Synthea <https://github.com/synthetichealth/synthea/>`__ generates
synthetic patient FHIR records, and it’s even recommended by Google in
their examples. However, either due to unclear documentation or our
oversight, the import of this generated data failed. After struggling
with over 60,000 “invalid JSON” error messages in Google Healthcare, we
realized we were missing a crucial content-structure flag. It took us an
entire day to figure out the issue.
This got us thinking—what happens when you have an ETL process dealing
with hundreds of thousands of files?
We explored existing FHIR validation tools, including those from HL7.
However, we found that even for a small 2MB patient file, some
validators took up to 6 minutes and produced over 1,000 warnings and
errors—most of which were related to external terminologies and content
that was valid and parsable by the FHIR store.
This led us to develop a simple validator designed to quickly check if
your FHIR files conform to the FHIR R4 schema. The goal is to quickly
reject problematic files before they clutter your logs and overwhelm
your monitoring systems.
Objective
---------
The objective of ``fhir-validator`` is to quickly and efficiently
validate the structure of FHIR (Fast Healthcare Interoperability
Resources) files against the FHIR schema.

Most validators are rules-based and delve deep into the contents of FHIR
messages. They are often embedded directly in FHIR stores or in software
used to process FHIR messages, and their output is very verbose.

``fhir-validator`` is meant to be a lightweight, fast check that files
conform to the FHIR structure.

It also identifies the content structure of FHIR messages, as used
primarily by the Google FHIR Store (``BUNDLE``, ``RESOURCE``,
``BUNDLE_PRETTY``, or ``RESOURCE_PRETTY``), so that you can determine
the appropriate switch for an import.
Example: CLI validation usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: sh

   $ fhir-validator --path data/samples/fhir --action identify
   Content structure of data/samples/fhir/practitionerInformation1728333795898.json: BUNDLE_PRETTY
   Content structure of data/samples/fhir/hospitalInformation1728333795898.json: BUNDLE_PRETTY
   Content structure of data/samples/fhir/Maricela194_Heidenreich818_9a998c27-9e98-29c2-8878-e214c9cef5ed.json: BUNDLE_PRETTY
   Content structure of data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json: BUNDLE_PRETTY

   # Performing a Google import
   $ gcloud healthcare fhir-stores import gcs fhir-store \
       --dataset=fhir-dataset \
       --gcs-uri=gs://$BUCKET_NAME/*.json \
       --content-structure=bundle-pretty
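
The value reported by ``identify`` corresponds to the
``--content-structure`` flag of the import command above. The following
is a minimal sketch of that mapping. ``gcloud_content_structure`` and
``build_import_command`` are hypothetical helpers and not part of
``fhir-validator``; the store, dataset, and bucket names are
placeholders.

```python
import shlex

# Values reported by `fhir-validator --action identify`.
KNOWN_STRUCTURES = {"BUNDLE", "RESOURCE", "BUNDLE_PRETTY", "RESOURCE_PRETTY"}


def gcloud_content_structure(identified: str) -> str:
    """Convert e.g. 'BUNDLE_PRETTY' into the flag value 'bundle-pretty'."""
    if identified not in KNOWN_STRUCTURES:
        raise ValueError(f"Unknown content structure: {identified!r}")
    return identified.lower().replace("_", "-")


def build_import_command(identified, store, dataset, gcs_uri):
    """Assemble the gcloud import command as an argument list (hypothetical helper)."""
    return [
        "gcloud", "healthcare", "fhir-stores", "import", "gcs", store,
        f"--dataset={dataset}",
        f"--gcs-uri={gcs_uri}",
        f"--content-structure={gcloud_content_structure(identified)}",
    ]


if __name__ == "__main__":
    # Placeholder names; substitute your own store, dataset, and bucket.
    cmd = build_import_command(
        "BUNDLE_PRETTY", "fhir-store", "fhir-dataset", "gs://my-bucket/*.json"
    )
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe
to pass to ``subprocess.run`` without shell quoting issues.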
Installation
------------
You can install ``fhir-validator`` using either ``pip`` or ``Poetry``.
Using pip
~~~~~~~~~
.. code:: bash
pip install fhir-validator
Using Poetry
~~~~~~~~~~~~
.. code:: bash
poetry add fhir-validator
CLI Usage
---------
Once installed, you can use the ``fhir-validator`` CLI to validate FHIR
files or identify their content structure.
.. code:: sh

   $ fhir-validator --help
   usage: fhir-validator [-h] [--path PATH] [--action {validate,identify}] [--chunk-size CHUNK_SIZE]

   FHIR Bundle Validator and Content Structure Identifier

   optional arguments:
     -h, --help            show this help message and exit
     --path PATH           File or directory path to validate or identify content structure
     --action {validate,identify}
                           Action to perform: validate the FHIR bundles or identify the content structure
     --chunk-size CHUNK_SIZE
                           Number of entries per chunk for validation (default: 100)
Validate a FHIR File:
~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
fhir-validator --path path/to/fhir_file.json --action validate
Identify the Content Structure:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: bash
fhir-validator --path path/to/fhir_file.json --action identify
This returns one of the following content-structure values:
+---------------------+------------------------------------------------------------+
| Flag                | Description                                                |
+=====================+============================================================+
| ``BUNDLE``          | The source file contains one or more lines of              |
|                     | newline-delimited JSON (ndjson). Each line is a bundle,    |
|                     | which contains one or more resources. If you don’t specify |
|                     | ContentStructure, it defaults to BUNDLE.                   |
+---------------------+------------------------------------------------------------+
| ``RESOURCE``        | The source file contains one or more lines of              |
|                     | newline-delimited JSON (ndjson). Each line is a single     |
|                     | resource.                                                  |
+---------------------+------------------------------------------------------------+
| ``RESOURCE_PRETTY`` | The entire source file is one JSON resource. The JSON can  |
|                     | span multiple lines.                                       |
+---------------------+------------------------------------------------------------+
| ``BUNDLE_PRETTY``   | The entire source file is one JSON bundle. The JSON can    |
|                     | span multiple lines.                                       |
+---------------------+------------------------------------------------------------+
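
To make the four cases in the table concrete, here is a rough sketch of
how such a classification could be done with only the standard library.
It is an illustration under stated assumptions, not the logic
``fhir-validator`` actually uses: ``guess_content_structure`` is a
hypothetical function, and it treats a single-line file as ndjson.

```python
import json


def guess_content_structure(text: str):
    """Classify file contents as one of the four content structures, or None.

    Hypothetical sketch: a file whose non-empty lines each parse as JSON is
    treated as ndjson; otherwise the whole file is parsed as one document.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return None

    # Case 1: newline-delimited JSON, where every line parses on its own.
    try:
        first = json.loads(lines[0])
        for line in lines[1:]:
            json.loads(line)
    except json.JSONDecodeError:
        first = None
    if isinstance(first, dict):
        return "BUNDLE" if first.get("resourceType") == "Bundle" else "RESOURCE"

    # Case 2: the entire file is a single (possibly multi-line) JSON object.
    try:
        document = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(document, dict):
        return None
    if document.get("resourceType") == "Bundle":
        return "BUNDLE_PRETTY"
    return "RESOURCE_PRETTY"


if __name__ == "__main__":
    pretty_bundle = json.dumps({"resourceType": "Bundle", "entry": []}, indent=2)
    print(guess_content_structure(pretty_bundle))
```

The sketch only looks at the first line's ``resourceType`` in the ndjson
case, so a file that mixes bundles and plain resources would be labeled
by its first line.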
Options:
~~~~~~~~
- ``--path``: Specify the file or directory path to validate or
identify.
- ``--action``: Choose ``validate`` to validate the file or
``identify`` to determine the content structure.
- ``--chunk-size``: (Optional) Number of entries per chunk for
validation, defaults to 100.
Chunk size
~~~~~~~~~~
Breaks the file into its entry components, allowing faster validation
against chunks of the JSON file rather than the whole document at once.
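
As an illustration of the idea, the sketch below splits a bundle's
entries into fixed-size chunks. It assumes the entries live under the
bundle's ``entry`` array; ``iter_entry_chunks`` is a hypothetical helper,
and how ``fhir-validator`` validates each chunk internally is not shown
here.

```python
from itertools import islice


def iter_entry_chunks(bundle: dict, chunk_size: int = 100):
    """Yield lists of at most `chunk_size` entries from a bundle (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    entries = iter(bundle.get("entry", []))
    while True:
        chunk = list(islice(entries, chunk_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # A synthetic bundle with 250 entries, for demonstration only.
    bundle = {
        "resourceType": "Bundle",
        "entry": [{"resource": {"resourceType": "Patient", "id": str(i)}} for i in range(250)],
    }
    print([len(chunk) for chunk in iter_entry_chunks(bundle)])  # [100, 100, 50]
```

A smaller chunk size reduces how much is checked in one pass; a larger
one means fewer passes.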
Integration
-----------
You can also use ``fhir-validator`` directly in your Python code. Here’s
an example of how to integrate the validation or content structure
identification into a Python project:
Example: Validate a FHIR File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: python

   import json

   from fhir_validator import (compile_fhir_schema,
                               identify_content_structure,
                               load_consolidated_fhir_schema,
                               validate_fhir_bundle_in_chunks,
                               BUNDLE_PRETTY)

   file_path = "data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json"
   content_structure = identify_content_structure(file_path)

   print(f"Content structure: {content_structure}")

   # By default loads the r4 schema
   schema_json = load_consolidated_fhir_schema('schemas/r4/fhir.schema.json')
   compiled_validator = compile_fhir_schema(schema_json)

   # If content structure is a bundle, validate it
   if content_structure == BUNDLE_PRETTY:
       with open(file_path, 'r') as f:
           bundle = json.load(f)
       is_valid = validate_fhir_bundle_in_chunks(bundle, compiled_validator)
       print(f"File : {file_path} is valid ? {is_valid}")
This simple Python snippet demonstrates how to check the content
structure of a FHIR file and, if it’s a ``BUNDLE_PRETTY``, how to
validate its content.
--------------
Development
-----------
To contribute to the ``fhir-validator`` project, you’ll need to install
the necessary dependencies, including the ``dev`` and ``test`` groups
for development tools and testing. The ``pre-commit`` hooks are part of
the ``dev`` group, and ``pytest`` is part of the ``test`` group.
Setting Up Your Development Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. **Clone the repository**:
.. code:: bash
git clone https://github.com/thevgergroup/fhir-validator.git
cd fhir-validator
2. **Install dependencies using Poetry**:
Install both the ``dev`` and ``test`` groups to ensure you have all
the necessary tools for development and testing:
.. code:: bash
poetry install --with dev,test
This command installs the base dependencies along with the ``dev``
group (which includes tools like ``pre-commit``) and the ``test``
group (which includes tools like ``pytest``).
We use pandoc to generate ``README.rst`` for PyPI so that links are
correctly structured; see `Installing
Pandoc <https://pandoc.org/installing.html>`__. Make any necessary
changes in ``README.md`` and the pre-commit hook will perform the
conversion.
3. **Install the Pre-commit Hooks**:
The project uses ``pre-commit`` to automate tasks such as converting
``README.md`` to ``README.rst`` before commits. To set up the
pre-commit hooks locally, run:
.. code:: bash
poetry run pre-commit install
This will configure the Git hooks to automatically run when you make
a commit.
Tests
~~~~~
We use pytest; the unit tests are in ``tests``. To run them:
.. code:: sh
poetry run pytest
Raw data
{
"_id": null,
"home_page": "https://github.com/thevgergroup/fhir-validator",
"name": "fhir-validator",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.9",
"maintainer_email": null,
"keywords": "FHIR, HL7, validator, identifier",
"author": "patrick o'leary",
"author_email": "pjaol@pjaol.com",
"download_url": "https://files.pythonhosted.org/packages/3c/66/42ac644f81553ff229e0aee31eca22c74cfbfd511a69f46d26e4577da015/fhir_validator-0.2.2.tar.gz",
"platform": null,
"description": "FHIR Validator\n==============\n\n- `FHIR Validator <#fhir-validator>`__\n\n - `Background <#background>`__\n - `Objective <#objective>`__\n\n - `Example: CLI validation\n usage <#example-cli-validation-usage>`__\n\n - `Installation <#installation>`__\n\n - `Using pip <#using-pip>`__\n - `Using Poetry <#using-poetry>`__\n\n - `CLI Usage <#cli-usage>`__\n\n - `Validate a FHIR File: <#validate-a-fhir-file>`__\n - `Identify the Content\n Structure: <#identify-the-content-structure>`__\n - `Options: <#options>`__\n - `Chunk size <#chunk-size>`__\n\n - `Integration <#integration>`__\n\n - `Example: Validate a FHIR\n File <#example-validate-a-fhir-file>`__\n\n - `Development <#development>`__\n\n - `Setting Up Your Development\n Environment <#setting-up-your-development-environment>`__\n - `Tests <#tests>`__\n\nBackground\n----------\n\nWhile testing Google\u2019s FHIR Store and following the provided\ndocumentation, we encountered an issue where the import process wasn\u2019t\nworking as expected. A great tool from MITRE called\n`Synthea <https://github.com/synthetichealth/synthea/>`__ generates\nsynthetic patient FHIR records, and it\u2019s even recommended by Google in\ntheir examples. However, either due to unclear documentation or our\noversight, the import of this generated data failed. After struggling\nwith over 60,000 \u201cinvalid JSON\u201d error messages in Google Healthcare, we\nrealized we were missing a crucial content-structure flag. 
It took us an\nentire day to figure out the issue.\n\nThis got us thinking\u2014what happens when you have an ETL process dealing\nwith hundreds of thousands of files?\n\nWe explored existing FHIR validation tools, including those from HL7.\nHowever, we found that even for a small 2MB patient file, some\nvalidators took up to 6 minutes and produced over 1,000 warnings and\nerrors\u2014most of which were related to external terminologies and content\nthat was valid and parsable by the FHIR store.\n\nThis led us to develop a simple validator designed to quickly check if\nyour FHIR files conform to the FHIR R4 schema. The goal is to quickly\nreject problematic files before they clutter your logs and overwhelm\nyour monitoring systems.\n\nObjective\n---------\n\nThe objective of ``fhir-validator`` is to quickly and efficiently\nvalidate FHIR (Fast Healthcare Interoperability Resources) files i\nagainst the FHIR schema for structure.\n\nMost validators are rules based delving deep into contents of the FHIR\nmessages, and are often embedded directly into FHIR stores of software\nused to process FHIR messages and are heavily verbose.\n\nThis is meant to be a lightweight fast validation ensure conformity\nagainst the FHIR structure.\n\nThis script also identifies the FHIR messages content structure used\nprimarily in Google FHIR Store. (e.g., ``BUNDLE``, ``RESOURCE``,\n``BUNDLE_PRETTY``, ``RESOURCE_PRETTY``)\n\nAllowing you to determine the appropriate switch for import\n\nExample: CLI validation usage\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: sh\n\n $ fhir-validator --path data/samples/fhir --action identify\n Content structure of data/samples/fhir/practitionerInformation1728333795898.json: BUNDLE_PRETTY\n Content structure of data/samples/fhir/hospitalInformation1728333795898.json: BUNDLE_PRETTY\n Content structure of data/samples/fhir/Maricela194_Heidenreich818_9a998c27-9e98-29c2-8878-e214c9cef5ed.json: BUNDLE_PRETTY\n Content structure of data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json: BUNDLE_PRETTY\n\n # Performing a google import\n $ gcloud healthcare fhir-stores import gcs fhir-store \\\n --dataset=fhir-dataset \\\n --gcs-uri=gs://$BUCKET_NAME/*.json \\\n --content-structure=bundle-pretty\n\nInstallation\n------------\n\nYou can install ``fhir-validator`` using either ``pip`` or ``Poetry``.\n\nUsing pip\n~~~~~~~~~\n\n.. code:: bash\n\n pip install fhir-validator\n\nUsing Poetry\n~~~~~~~~~~~~\n\n.. code:: bash\n\n poetry add fhir-validator\n\nCLI Usage\n---------\n\nOnce installed, you can use the ``fhir-validator`` CLI to validate FHIR\nfiles or identify their content structure.\n\n.. code:: sh\n\n $ fhir-validator --help\n usage: fhir-validator [-h] [--path PATH] [--action {validate,identify}] [--chunk-size CHUNK_SIZE]\n\n FHIR Bundle Validator and Content Structure Identifier\n\n optional arguments:\n -h, --help show this help message and exit\n --path PATH File or directory path to validate or identify content structure\n --action {validate,identify}\n Action to perform: validate the FHIR bundles or identify the content structure\n --chunk-size CHUNK_SIZE\n Number of entries per chunk for validation (default: 100)\n\nValidate a FHIR File:\n~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: bash\n\n fhir-validator --path path/to/fhir_file.json --action validate\n\nIdentify the Content Structure:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: bash\n\n fhir-validator --path path/to/fhir_file.json --action identify\n\nThis will return\n\n+---------+------------------------------------------------------------+\n| FLAG | Description |\n+=========+============================================================+\n| ``B | The source file contains one or more lines of |\n| UNDLE`` | newline-delimited JSON (ndjson). Each line is a bundle, |\n| | which contains one or more resources. If you don\u2019t specify |\n| | ContentStructure, it defaults to BUNDLE. |\n+---------+------------------------------------------------------------+\n| ``RES | The source file contains one or more lines of |\n| OURCE`` | newline-delimited JSON (ndjson). Each line is a single |\n| | resource. |\n+---------+------------------------------------------------------------+\n| ``RES | The entire source file is one JSON resource. The JSON can |\n| OURCE_P | span multiple lines. |\n| RETTY`` | |\n+---------+------------------------------------------------------------+\n| ``B | The entire source file is one JSON bundle. The JSON can |\n| UNDLE_P | span multiple lines. |\n| RETTY`` | |\n+---------+------------------------------------------------------------+\n\nOptions:\n~~~~~~~~\n\n- ``--path``: Specify the file or directory path to validate or\n identify.\n- ``--action``: Choose ``validate`` to validate the file or\n ``identify`` to determine the content structure.\n- ``--chunk-size``: (Optional) Number of entries per chunk for\n validation, defaults to 100.\n\nChunk size\n~~~~~~~~~~\n\nBreaks the file into it\u2019s entry components allowing for faster\nvalidation against chunks of the json files.\n\nIntegration\n-----------\n\nYou can also use ``fhir-validator`` directly in your Python code. Here\u2019s\nan example of how to integrate the validation or content structure\nidentification into a Python project:\n\nExample: Validate a FHIR File\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: python\n\n from fhir_validator import (compile_fhir_schema, \n identify_content_structure, \n load_consolidated_fhir_schema,\n validate_fhir_bundle_in_chunks,\n BUNDLE_PRETTY) \n import json \n\n file_path = \"data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json\"\n content_structure = identify_content_structure(file_path)\n\n print(f\"Content structure: {content_structure}\")\n\n # By default loads the r4 schema\n schema_json = load_consolidated_fhir_schema('schemas/r4/fhir.schema.json')\n compiled_validator = compile_fhir_schema(schema_json)\n\n # If content structure is a bundle, validate it\n if content_structure == BUNDLE_PRETTY:\n with open(file_path, 'r') as f:\n bundle = json.load(f)\n is_valid = validate_fhir_bundle_in_chunks(bundle, compiled_validator)\n print(f\"File : {file_path} is valid ? {is_valid}\")\n\nThis simple Python snippet demonstrates how to check the content\nstructure of a FHIR file and, if it\u2019s a ``BUNDLE_PRETTY``, how to\nvalidate its content.\n\n--------------\n\nDevelopment\n-----------\n\nTo contribute to the ``fhir-validator`` project, you\u2019ll need to install\nthe necessary dependencies, including the ``dev`` and ``test`` groups\nfor development tools and testing. The ``pre-commit`` hooks are part of\nthe ``dev`` group, and ``pytest`` is part of the ``test`` group.\n\nSetting Up Your Development Environment\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n1. **Clone the repository**:\n\n .. code:: bash\n\n git clone https://github.com/thevgergroup/fhir-validator.git\n cd fhir-validator\n\n2. **Install dependencies using Poetry**:\n\n Install both the ``dev`` and ``test`` groups to ensure you have all\n the necessary tools for development and testing:\n\n .. 
code:: bash\n\n poetry install --with dev,test\n\n This command installs the base dependencies along with the ``dev``\n group (which includes tools like ``pre-commit``) and the ``test``\n group (which includes tools like ``pytest``).\n\n We use pandoc to generate the README.rst for pypi to ensure links are\n correctly structured see [Installing\n Pandoc](https://pandoc.org/installing.html] Update the any necessary\n changes in ``README.md`` and the pre-commit hook will perform the\n conversion.\n\n3. **Install the Pre-commit Hooks**:\n\n The project uses ``pre-commit`` to automate tasks such as converting\n ``README.md`` to ``README.rst`` before commits. To set up the\n pre-commit hooks locally, run:\n\n .. code:: bash\n\n poetry run pre-commit install\n\n This will configure the Git hooks to automatically run when you make\n a commit.\n\nTests\n~~~~~\n\nWe use pytest see the unit tests in ``tests``\n\n.. code:: sh\n\n poetry run pytest\n\n.. _fhir-validator-1:\n\nFHIR Validator\n==============\n\n- `FHIR Validator <#fhir-validator>`__\n\n - `Background <#background>`__\n - `Objective <#objective>`__\n\n - `Example: CLI validation\n usage <#example-cli-validation-usage>`__\n\n - `Installation <#installation>`__\n\n - `Using pip <#using-pip>`__\n - `Using Poetry <#using-poetry>`__\n\n - `CLI Usage <#cli-usage>`__\n\n - `Validate a FHIR File: <#validate-a-fhir-file>`__\n - `Identify the Content\n Structure: <#identify-the-content-structure>`__\n - `Options: <#options>`__\n - `Chunk size <#chunk-size>`__\n\n - `Integration <#integration>`__\n\n - `Example: Validate a FHIR\n File <#example-validate-a-fhir-file>`__\n\n - `Development <#development>`__\n\n - `Setting Up Your Development\n Environment <#setting-up-your-development-environment>`__\n - `Tests <#tests>`__\n\n.. 
_background-1:\n\nBackground\n----------\n\nWhile testing Google\u2019s FHIR Store and following the provided\ndocumentation, we encountered an issue where the import process wasn\u2019t\nworking as expected. A great tool from MITRE called\n`Synthea <https://github.com/synthetichealth/synthea/>`__ generates\nsynthetic patient FHIR records, and it\u2019s even recommended by Google in\ntheir examples. However, either due to unclear documentation or our\noversight, the import of this generated data failed. After struggling\nwith over 60,000 \u201cinvalid JSON\u201d error messages in Google Healthcare, we\nrealized we were missing a crucial content-structure flag. It took us an\nentire day to figure out the issue.\n\nThis got us thinking\u2014what happens when you have an ETL process dealing\nwith hundreds of thousands of files?\n\nWe explored existing FHIR validation tools, including those from HL7.\nHowever, we found that even for a small 2MB patient file, some\nvalidators took up to 6 minutes and produced over 1,000 warnings and\nerrors\u2014most of which were related to external terminologies and content\nthat was valid and parsable by the FHIR store.\n\nThis led us to develop a simple validator designed to quickly check if\nyour FHIR files conform to the FHIR R4 schema. The goal is to quickly\nreject problematic files before they clutter your logs and overwhelm\nyour monitoring systems.\n\n.. 
_objective-1:\n\nObjective\n---------\n\nThe objective of ``fhir-validator`` is to quickly and efficiently\nvalidate FHIR (Fast Healthcare Interoperability Resources) files i\nagainst the FHIR schema for structure.\n\nMost validators are rules based delving deep into contents of the FHIR\nmessages, and are often embedded directly into FHIR stores of software\nused to process FHIR messages and are heavily verbose.\n\nThis is meant to be a lightweight fast validation ensure conformity\nagainst the FHIR structure.\n\nThis script also identifies the FHIR messages content structure used\nprimarily in Google FHIR Store. (e.g., ``BUNDLE``, ``RESOURCE``,\n``BUNDLE_PRETTY``, ``RESOURCE_PRETTY``)\n\nAllowing you to determine the appropriate switch for import\n\n.. _example-cli-validation-usage-1:\n\nExample: CLI validation usage\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: sh\n\n $ fhir-validator --path data/samples/fhir --action identify\n Content structure of data/samples/fhir/practitionerInformation1728333795898.json: BUNDLE_PRETTY\n Content structure of data/samples/fhir/hospitalInformation1728333795898.json: BUNDLE_PRETTY\n Content structure of data/samples/fhir/Maricela194_Heidenreich818_9a998c27-9e98-29c2-8878-e214c9cef5ed.json: BUNDLE_PRETTY\n Content structure of data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json: BUNDLE_PRETTY\n\n # Performing a google import\n $ gcloud healthcare fhir-stores import gcs fhir-store \\\n --dataset=fhir-dataset \\\n --gcs-uri=gs://$BUCKET_NAME/*.json \\\n --content-structure=bundle-pretty\n\n.. _installation-1:\n\nInstallation\n------------\n\nYou can install ``fhir-validator`` using either ``pip`` or ``Poetry``.\n\n.. _using-pip-1:\n\nUsing pip\n~~~~~~~~~\n\n.. code:: bash\n\n pip install fhir-validator\n\n.. _using-poetry-1:\n\nUsing Poetry\n~~~~~~~~~~~~\n\n.. code:: bash\n\n poetry add fhir-validator\n\n.. 
_cli-usage-1:\n\nCLI Usage\n---------\n\nOnce installed, you can use the ``fhir-validator`` CLI to validate FHIR\nfiles or identify their content structure.\n\n.. code:: sh\n\n   $ fhir-validator --help\n   usage: fhir-validator [-h] [--path PATH] [--action {validate,identify}] [--chunk-size CHUNK_SIZE]\n\n   FHIR Bundle Validator and Content Structure Identifier\n\n   optional arguments:\n     -h, --help            show this help message and exit\n     --path PATH           File or directory path to validate or identify content structure\n     --action {validate,identify}\n                           Action to perform: validate the FHIR bundles or identify the content structure\n     --chunk-size CHUNK_SIZE\n                           Number of entries per chunk for validation (default: 100)\n\n.. _validate-a-fhir-file-1:\n\nValidate a FHIR File:\n~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: bash\n\n   fhir-validator --path path/to/fhir_file.json --action validate\n\n.. _identify-the-content-structure-1:\n\nIdentify the Content Structure:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: bash\n\n   fhir-validator --path path/to/fhir_file.json --action identify\n\nThis returns one of the following content structure flags:\n\n+---------------------+--------------------------------------------------------------+\n| Flag                | Description                                                  |\n+=====================+==============================================================+\n| ``BUNDLE``          | The source file contains one or more lines of                |\n|                     | newline-delimited JSON (ndjson). Each line is a bundle,      |\n|                     | which contains one or more resources. If you do not specify  |\n|                     | ContentStructure, it defaults to BUNDLE.                     |\n+---------------------+--------------------------------------------------------------+\n| ``RESOURCE``        | The source file contains one or more lines of                |\n|                     | newline-delimited JSON (ndjson). Each line is a single       |\n|                     | resource.                                                    |\n+---------------------+--------------------------------------------------------------+\n| ``RESOURCE_PRETTY`` | The entire source file is one JSON resource. The JSON can    |\n|                     | span multiple lines.                                         |\n+---------------------+--------------------------------------------------------------+\n| ``BUNDLE_PRETTY``   | The entire source file is one JSON bundle. The JSON can      |\n|                     | span multiple lines.                                         |\n+---------------------+--------------------------------------------------------------+\n\n.. _options-1:\n\nOptions:\n~~~~~~~~\n\n- ``--path``: Specify the file or directory path to validate or\n  identify.\n- ``--action``: Choose ``validate`` to validate the file or\n  ``identify`` to determine the content structure.\n- ``--chunk-size``: (Optional) Number of entries per chunk for\n  validation; defaults to 100.\n\n.. _chunk-size-1:\n\nChunk size\n~~~~~~~~~~\n\nThe validator breaks the file into its entry components and validates\nthem in chunks of ``--chunk-size`` entries, which allows faster\nvalidation of the JSON files.\n\n.. _integration-1:\n\nIntegration\n-----------\n\nYou can also use ``fhir-validator`` directly in your Python code. Here\u2019s\nan example of how to integrate the validation or content structure\nidentification into a Python project:\n\n.. _example-validate-a-fhir-file-1:\n\nExample: Validate a FHIR File\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: python\n\n   import json\n\n   from fhir_validator import (\n       BUNDLE_PRETTY,\n       compile_fhir_schema,\n       identify_content_structure,\n       load_consolidated_fhir_schema,\n       validate_fhir_bundle_in_chunks,\n   )\n\n   file_path = \"data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json\"\n   content_structure = identify_content_structure(file_path)\n\n   print(f\"Content structure: {content_structure}\")\n\n   # By default loads the R4 schema\n   schema_json = load_consolidated_fhir_schema('schemas/r4/fhir.schema.json')\n   compiled_validator = compile_fhir_schema(schema_json)\n\n   # If the content structure is a pretty-printed bundle, validate it\n   if content_structure == BUNDLE_PRETTY:\n       with open(file_path, 'r') as f:\n           bundle = json.load(f)\n       is_valid = validate_fhir_bundle_in_chunks(bundle, compiled_validator)\n       print(f\"File : {file_path} is valid ? {is_valid}\")\n\nThis simple Python snippet demonstrates how to check the content\nstructure of a FHIR file and, if it\u2019s a ``BUNDLE_PRETTY``, how to\nvalidate its content.\n\n--------------\n\n.. _development-1:\n\nDevelopment\n-----------\n\nTo contribute to the ``fhir-validator`` project, you\u2019ll need to install\nthe necessary dependencies, including the ``dev`` and ``test`` groups\nfor development tools and testing. The ``pre-commit`` hooks are part of\nthe ``dev`` group, and ``pytest`` is part of the ``test`` group.\n\n.. _setting-up-your-development-environment-1:\n\nSetting Up Your Development Environment\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n1. **Clone the repository**:\n\n   .. code:: bash\n\n      git clone https://github.com/thevgergroup/fhir-validator.git\n      cd fhir-validator\n\n2. **Install dependencies using Poetry**:\n\n   Install both the ``dev`` and ``test`` groups to ensure you have all\n   the necessary tools for development and testing:\n\n   .. code:: bash\n\n      poetry install --with dev,test\n\n   This command installs the base dependencies along with the ``dev``\n   group (which includes tools like ``pre-commit``) and the ``test``\n   group (which includes tools like ``pytest``).\n\n   We use pandoc to generate ``README.rst`` for PyPI so that links are\n   correctly structured; see `Installing\n   Pandoc <https://pandoc.org/installing.html>`__. Make any necessary\n   changes in ``README.md`` and the pre-commit hook will perform the\n   conversion.\n\n3. **Install the Pre-commit Hooks**:\n\n   The project uses ``pre-commit`` to automate tasks such as converting\n   ``README.md`` to ``README.rst`` before commits. To set up the\n   pre-commit hooks locally, run:\n\n   .. code:: bash\n\n      poetry run pre-commit install\n\n   This will configure the Git hooks to automatically run when you make\n   a commit.\n\n.. _tests-1:\n\nTests\n~~~~~\n\nWe use pytest; see the unit tests in ``tests``.\n\n.. code:: sh\n\n   poetry run pytest\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "FHIR Validator and Identifier for resource vs bundle type",
"version": "0.2.2",
"project_urls": {
"Homepage": "https://github.com/thevgergroup/fhir-validator",
"Repository": "https://github.com/thevgergroup/fhir-validator.git"
},
"split_keywords": [
"fhir",
" hl7",
" validator",
" identifier"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8a6606c9e7e39f30b1d98bad6b96e181941781819206fd0e62a41fcc9b812a4d",
"md5": "7c4f2ddec4197b7c49f1b456047caccb",
"sha256": "805e026e4d9504bb9df34372f32a4b5f326c17f39046979f24b520b830076f52"
},
"downloads": -1,
"filename": "fhir_validator-0.2.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7c4f2ddec4197b7c49f1b456047caccb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.9",
"size": 9464,
"upload_time": "2024-10-09T19:07:51",
"upload_time_iso_8601": "2024-10-09T19:07:51.712476Z",
"url": "https://files.pythonhosted.org/packages/8a/66/06c9e7e39f30b1d98bad6b96e181941781819206fd0e62a41fcc9b812a4d/fhir_validator-0.2.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3c6642ac644f81553ff229e0aee31eca22c74cfbfd511a69f46d26e4577da015",
"md5": "4e43f3f48858ff7d62e27bada090a140",
"sha256": "747171da268104cd8f2a874b03fc1c6bc84ab0c1054b758b19e0372114ce0bfa"
},
"downloads": -1,
"filename": "fhir_validator-0.2.2.tar.gz",
"has_sig": false,
"md5_digest": "4e43f3f48858ff7d62e27bada090a140",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.9",
"size": 8672,
"upload_time": "2024-10-09T19:07:52",
"upload_time_iso_8601": "2024-10-09T19:07:52.873964Z",
"url": "https://files.pythonhosted.org/packages/3c/66/42ac644f81553ff229e0aee31eca22c74cfbfd511a69f46d26e4577da015/fhir_validator-0.2.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-09 19:07:52",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "thevgergroup",
"github_project": "fhir-validator",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "fhir-validator"
}