fhir-validator

Name	fhir-validator
Version	0.2.2
home_page	https://github.com/thevgergroup/fhir-validator
Summary	FHIR Validator and Identifier for resource vs bundle type
upload_time	2024-10-09 19:07:52
maintainer	None
docs_url	None
author	patrick o'leary
requires_python	<3.12,>=3.9
license	MIT
keywords	fhir hl7 validator identifier

FHIR Validator
==============

-  `FHIR Validator <#fhir-validator>`__

   -  `Background <#background>`__
   -  `Objective <#objective>`__

      -  `Example: CLI validation
         usage <#example-cli-validation-usage>`__

   -  `Installation <#installation>`__

      -  `Using pip <#using-pip>`__
      -  `Using Poetry <#using-poetry>`__

   -  `CLI Usage <#cli-usage>`__

      -  `Validate a FHIR File: <#validate-a-fhir-file>`__
      -  `Identify the Content
         Structure: <#identify-the-content-structure>`__
      -  `Options: <#options>`__
      -  `Chunk size <#chunk-size>`__

   -  `Integration <#integration>`__

      -  `Example: Validate a FHIR
         File <#example-validate-a-fhir-file>`__

   -  `Development <#development>`__

      -  `Setting Up Your Development
         Environment <#setting-up-your-development-environment>`__
      -  `Tests <#tests>`__

Background
----------

While testing Google’s FHIR Store and following the provided
documentation, we encountered an issue where the import process wasn’t
working as expected. A great tool from MITRE called
`Synthea <https://github.com/synthetichealth/synthea/>`__ generates
synthetic patient FHIR records, and it’s even recommended by Google in
their examples. However, either due to unclear documentation or our
oversight, the import of this generated data failed. After struggling
with over 60,000 “invalid JSON” error messages in Google Healthcare, we
realized we were missing a crucial content-structure flag. It took us an
entire day to figure out the issue.

This got us thinking—what happens when you have an ETL process dealing
with hundreds of thousands of files?

We explored existing FHIR validation tools, including those from HL7.
However, we found that even for a small 2 MB patient file, some
validators took up to 6 minutes and produced over 1,000 warnings and
errors, most of which related to external terminologies and to content
that was valid and parsable by the FHIR store.

This led us to develop a simple validator that quickly checks whether
your FHIR files conform to the FHIR R4 schema. The goal is to reject
problematic files early, before they clutter your logs and overwhelm
your monitoring systems.

Objective
---------

The objective of ``fhir-validator`` is to quickly and efficiently
validate the structure of FHIR (Fast Healthcare Interoperability
Resources) files against the FHIR schema.

Most validators are rules-based and delve deep into the contents of
FHIR messages. They are often embedded directly in FHIR stores or in
software used to process FHIR messages, and their output is heavily
verbose.

``fhir-validator`` is meant to be a lightweight, fast check that ensures
conformity with the FHIR structure.

It also identifies the content structure of FHIR files, a concept used
primarily by Google FHIR Store (``BUNDLE``, ``RESOURCE``,
``BUNDLE_PRETTY``, or ``RESOURCE_PRETTY``), so that you can determine
the appropriate switch for an import.

Example: CLI validation usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: sh

   $ fhir-validator --path data/samples/fhir --action identify
   Content structure of data/samples/fhir/practitionerInformation1728333795898.json: BUNDLE_PRETTY
   Content structure of data/samples/fhir/hospitalInformation1728333795898.json: BUNDLE_PRETTY
   Content structure of data/samples/fhir/Maricela194_Heidenreich818_9a998c27-9e98-29c2-8878-e214c9cef5ed.json: BUNDLE_PRETTY
   Content structure of data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json: BUNDLE_PRETTY

   # Performing a google import
   $ gcloud healthcare fhir-stores import gcs fhir-store \
       --dataset=fhir-dataset \
       --gcs-uri=gs://$BUCKET_NAME/*.json \
       --content-structure=bundle-pretty
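
The flag printed by ``--action identify`` corresponds to the value
passed to ``--content-structure`` in the import command above. The
following is a minimal sketch of that conversion, assuming the
lowercase, hyphenated spelling shown in the example applies to all four
flags. ``to_gcloud_flag`` is a hypothetical helper written for
illustration; it is not part of ``fhir-validator``.

.. code:: python

   # Hypothetical helper, not part of fhir-validator.
   # Assumption: gcloud expects the lowercase, hyphenated form of each flag.
   GCLOUD_CONTENT_STRUCTURES = {
       "bundle",
       "resource",
       "bundle-pretty",
       "resource-pretty",
   }


   def to_gcloud_flag(content_structure: str) -> str:
       """Convert e.g. 'BUNDLE_PRETTY' to 'bundle-pretty'."""
       value = content_structure.strip().lower().replace("_", "-")
       if value not in GCLOUD_CONTENT_STRUCTURES:
           raise ValueError(f"Unknown content structure: {content_structure!r}")
       return value


   print(to_gcloud_flag("BUNDLE_PRETTY"))  # → bundle-pretty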

Installation
------------

You can install ``fhir-validator`` using either ``pip`` or ``Poetry``.

Using pip
~~~~~~~~~

.. code:: bash

   pip install fhir-validator

Using Poetry
~~~~~~~~~~~~

.. code:: bash

   poetry add fhir-validator

CLI Usage
---------

Once installed, you can use the ``fhir-validator`` CLI to validate FHIR
files or identify their content structure.

.. code:: sh

   $ fhir-validator --help
   usage: fhir-validator [-h] [--path PATH] [--action {validate,identify}] [--chunk-size CHUNK_SIZE]

   FHIR Bundle Validator and Content Structure Identifier

   optional arguments:
     -h, --help            show this help message and exit
     --path PATH           File or directory path to validate or identify content structure
     --action {validate,identify}
                           Action to perform: validate the FHIR bundles or identify the content structure
     --chunk-size CHUNK_SIZE
                           Number of entries per chunk for validation (default: 100)

Validate a FHIR File:
~~~~~~~~~~~~~~~~~~~~~

.. code:: bash

   fhir-validator --path path/to/fhir_file.json --action validate

Identify the Content Structure:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: bash

   fhir-validator --path path/to/fhir_file.json --action identify

This returns one of the following content-structure flags:

+---------------------+------------------------------------------------------------+
| Flag                | Description                                                |
+=====================+============================================================+
| ``BUNDLE``          | The source file contains one or more lines of              |
|                     | newline-delimited JSON (ndjson). Each line is a bundle,    |
|                     | which contains one or more resources. If you don’t specify |
|                     | ContentStructure, it defaults to BUNDLE.                   |
+---------------------+------------------------------------------------------------+
| ``RESOURCE``        | The source file contains one or more lines of              |
|                     | newline-delimited JSON (ndjson). Each line is a single     |
|                     | resource.                                                  |
+---------------------+------------------------------------------------------------+
| ``RESOURCE_PRETTY`` | The entire source file is one JSON resource. The JSON can  |
|                     | span multiple lines.                                       |
+---------------------+------------------------------------------------------------+
| ``BUNDLE_PRETTY``   | The entire source file is one JSON bundle. The JSON can    |
|                     | span multiple lines.                                       |
+---------------------+------------------------------------------------------------+
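
To make the four flags concrete, here is a rough sketch of how a file's
text could be classified using only the standard library. It is an
illustration of the definitions in the table, not the implementation
used by ``fhir-validator``; ``guess_content_structure`` is a
hypothetical name, and the heuristic only inspects the first ndjson
line to decide between bundle and resource.

.. code:: python

   import json


   def guess_content_structure(text: str) -> str:
       """Heuristically classify FHIR JSON text into one of the four flags.

       Hypothetical sketch; raises json.JSONDecodeError if the text is
       neither ndjson nor a single JSON document.
       """

       def kind(obj) -> str:
           is_bundle = isinstance(obj, dict) and obj.get("resourceType") == "Bundle"
           return "BUNDLE" if is_bundle else "RESOURCE"

       lines = [line for line in text.splitlines() if line.strip()]
       try:
           # ndjson: every non-empty line parses as a JSON object on its own.
           docs = [json.loads(line) for line in lines]
           if docs and all(isinstance(doc, dict) for doc in docs):
               return kind(docs[0])
       except json.JSONDecodeError:
           pass
       # Otherwise the whole file must be one (possibly multi-line) document.
       return kind(json.loads(text)) + "_PRETTY"

Note that a file holding a single compact JSON object on one line
satisfies both definitions; this sketch reports it as ndjson
(``BUNDLE`` or ``RESOURCE``).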

Options:
~~~~~~~~

-  ``--path``: Specify the file or directory path to validate or
   identify.
-  ``--action``: Choose ``validate`` to validate the file or
   ``identify`` to determine the content structure.
-  ``--chunk-size``: (Optional) Number of entries per chunk for
   validation, defaults to 100.

Chunk size
~~~~~~~~~~

The ``--chunk-size`` option breaks the file into its entry components,
allowing for faster validation against chunks of the JSON files.
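
As an illustration of the idea, the sketch below splits a bundle's
``entry`` list into groups of at most ``chunk_size`` items. It assumes
the bundle has already been loaded into a ``dict``; ``iter_entry_chunks``
is a hypothetical helper and may differ from how ``fhir-validator``
chunks internally.

.. code:: python

   from typing import Iterator


   def iter_entry_chunks(bundle: dict, chunk_size: int = 100) -> Iterator[list]:
       """Yield successive lists of at most chunk_size entries from a bundle."""
       if chunk_size < 1:
           raise ValueError("chunk_size must be at least 1")
       entries = bundle.get("entry", [])
       for start in range(0, len(entries), chunk_size):
           yield entries[start:start + chunk_size]


   # A synthetic bundle with 250 entries, split with the default chunk size.
   bundle = {
       "resourceType": "Bundle",
       "entry": [{"resource": {"id": str(i)}} for i in range(250)],
   }
   print([len(chunk) for chunk in iter_entry_chunks(bundle)])  # → [100, 100, 50]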

Integration
-----------

You can also use ``fhir-validator`` directly in your Python code. Here’s
an example of how to integrate the validation or content structure
identification into a Python project:

Example: Validate a FHIR File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code:: python

   import json

   from fhir_validator import (
       BUNDLE_PRETTY,
       compile_fhir_schema,
       identify_content_structure,
       load_consolidated_fhir_schema,
       validate_fhir_bundle_in_chunks,
   )

   file_path = "data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json"

   # Determine which content-structure flag applies to the file
   content_structure = identify_content_structure(file_path)
   print(f"Content structure: {content_structure}")

   # By default loads the r4 schema
   schema_json = load_consolidated_fhir_schema('schemas/r4/fhir.schema.json')
   compiled_validator = compile_fhir_schema(schema_json)

   # If the content structure is a pretty-printed bundle, validate it
   if content_structure == BUNDLE_PRETTY:
       with open(file_path, 'r') as f:
           bundle = json.load(f)
       is_valid = validate_fhir_bundle_in_chunks(bundle, compiled_validator)
       print(f"File : {file_path} is valid ? {is_valid}")

This simple Python snippet demonstrates how to check the content
structure of a FHIR file and, if it’s a ``BUNDLE_PRETTY``, how to
validate its content.

--------------

Development
-----------

To contribute to the ``fhir-validator`` project, you’ll need to install
the necessary dependencies, including the ``dev`` and ``test`` groups
for development tools and testing. The ``pre-commit`` hooks are part of
the ``dev`` group, and ``pytest`` is part of the ``test`` group.

Setting Up Your Development Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. **Clone the repository**:

   .. code:: bash

      git clone https://github.com/thevgergroup/fhir-validator.git
      cd fhir-validator

2. **Install dependencies using Poetry**:

   Install both the ``dev`` and ``test`` groups to ensure you have all
   the necessary tools for development and testing:

   .. code:: bash

      poetry install --with dev,test

   This command installs the base dependencies along with the ``dev``
   group (which includes tools like ``pre-commit``) and the ``test``
   group (which includes tools like ``pytest``).

   We use pandoc to generate ``README.rst`` for PyPI so that links are
   correctly structured; see `Installing
   Pandoc <https://pandoc.org/installing.html>`__. Make any necessary
   changes in ``README.md`` and the pre-commit hook will perform the
   conversion.

3. **Install the Pre-commit Hooks**:

   The project uses ``pre-commit`` to automate tasks such as converting
   ``README.md`` to ``README.rst`` before commits. To set up the
   pre-commit hooks locally, run:

   .. code:: bash

      poetry run pre-commit install

   This will configure the Git hooks to automatically run when you make
   a commit.

Tests
~~~~~

We use ``pytest``; the unit tests are in the ``tests`` directory. To run
them:

.. code:: sh

   poetry run pytest


It took us an\nentire day to figure out the issue.\n\nThis got us thinking: what happens when you have an ETL process dealing\nwith hundreds of thousands of files?\n\nWe explored existing FHIR validation tools, including those from HL7.\nHowever, we found that even for a small 2MB patient file, some\nvalidators took up to 6 minutes and produced over 1,000 warnings and\nerrors, most of which were related to external terminologies and content\nthat was valid and parsable by the FHIR store.\n\nThis led us to develop a simple validator designed to quickly check if\nyour FHIR files conform to the FHIR R4 schema. The goal is to quickly\nreject problematic files before they clutter your logs and overwhelm\nyour monitoring systems.\n\n.. _objective-1:\n\nObjective\n---------\n\nThe objective of ``fhir-validator`` is to quickly and efficiently\nvalidate the structure of FHIR (Fast Healthcare Interoperability\nResources) files against the FHIR schema.\n\nMost validators are rule-based, delving deep into the contents of the\nFHIR messages. They are often embedded directly into FHIR stores or\nsoftware used to process FHIR messages, and they are highly verbose.\n\nThis tool is meant to be a lightweight, fast validator that ensures\nconformity with the FHIR structure.\n\nThis script also identifies the content structure of FHIR messages, used\nprimarily by the Google FHIR Store (e.g., ``BUNDLE``, ``RESOURCE``,\n``BUNDLE_PRETTY``, ``RESOURCE_PRETTY``), allowing you to determine the\nappropriate switch for import.\n\n.. _example-cli-validation-usage-1:\n\nExample: CLI validation usage\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: sh\n\n   $ fhir-validator --path data/samples/fhir --action identify\n   Content structure of data/samples/fhir/practitionerInformation1728333795898.json: BUNDLE_PRETTY\n   Content structure of data/samples/fhir/hospitalInformation1728333795898.json: BUNDLE_PRETTY\n   Content structure of data/samples/fhir/Maricela194_Heidenreich818_9a998c27-9e98-29c2-8878-e214c9cef5ed.json: BUNDLE_PRETTY\n   Content structure of data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json: BUNDLE_PRETTY\n\n   # Performing a google import\n   $ gcloud healthcare fhir-stores import gcs fhir-store \\\n       --dataset=fhir-dataset \\\n       --gcs-uri=gs://$BUCKET_NAME/*.json \\\n       --content-structure=bundle-pretty\n\n.. _installation-1:\n\nInstallation\n------------\n\nYou can install ``fhir-validator`` using either ``pip`` or ``Poetry``.\n\n.. _using-pip-1:\n\nUsing pip\n~~~~~~~~~\n\n.. code:: bash\n\n   pip install fhir-validator\n\n.. _using-poetry-1:\n\nUsing Poetry\n~~~~~~~~~~~~\n\n.. code:: bash\n\n   poetry add fhir-validator\n\n.. _cli-usage-1:\n\nCLI Usage\n---------\n\nOnce installed, you can use the ``fhir-validator`` CLI to validate FHIR\nfiles or identify their content structure.\n\n.. code:: sh\n\n   $ fhir-validator --help\n   usage: fhir-validator [-h] [--path PATH] [--action {validate,identify}] [--chunk-size CHUNK_SIZE]\n\n   FHIR Bundle Validator and Content Structure Identifier\n\n   optional arguments:\n     -h, --help            show this help message and exit\n     --path PATH           File or directory path to validate or identify content structure\n     --action {validate,identify}\n                           Action to perform: validate the FHIR bundles or identify the content structure\n     --chunk-size CHUNK_SIZE\n                           Number of entries per chunk for validation (default: 100)\n\n.. _validate-a-fhir-file-1:\n\nValidate a FHIR File:\n~~~~~~~~~~~~~~~~~~~~~\n\n.. 
code:: bash\n\n   fhir-validator --path path/to/fhir_file.json --action validate\n\n.. _identify-the-content-structure-1:\n\nIdentify the Content Structure:\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: bash\n\n   fhir-validator --path path/to/fhir_file.json --action identify\n\nThis will return one of the following flags:\n\n+---------------------+------------------------------------------------------------+\n| FLAG                | Description                                                |\n+=====================+============================================================+\n| ``BUNDLE``          | The source file contains one or more lines of              |\n|                     | newline-delimited JSON (ndjson). Each line is a bundle,    |\n|                     | which contains one or more resources. If you don\u2019t specify |\n|                     | ContentStructure, it defaults to BUNDLE.                   |\n+---------------------+------------------------------------------------------------+\n| ``RESOURCE``        | The source file contains one or more lines of              |\n|                     | newline-delimited JSON (ndjson). Each line is a single     |\n|                     | resource.                                                  |\n+---------------------+------------------------------------------------------------+\n| ``RESOURCE_PRETTY`` | The entire source file is one JSON resource. The JSON can  |\n|                     | span multiple lines.                                       |\n+---------------------+------------------------------------------------------------+\n| ``BUNDLE_PRETTY``   | The entire source file is one JSON bundle. The JSON can    |\n|                     | span multiple lines.                                       |\n+---------------------+------------------------------------------------------------+\n\n.. 
_options-1:\n\nOptions:\n~~~~~~~~\n\n-  ``--path``: Specify the file or directory path to validate or\n   identify.\n-  ``--action``: Choose ``validate`` to validate the file or\n   ``identify`` to determine the content structure.\n-  ``--chunk-size``: (Optional) Number of entries per chunk for\n   validation, defaults to 100.\n\n.. _chunk-size-1:\n\nChunk size\n~~~~~~~~~~\n\nBreaks the file into its entry components, allowing for faster\nvalidation against chunks of the JSON file.\n\n.. _integration-1:\n\nIntegration\n-----------\n\nYou can also use ``fhir-validator`` directly in your Python code. Here\u2019s\nan example of how to integrate the validation or content structure\nidentification into a Python project:\n\n.. _example-validate-a-fhir-file-1:\n\nExample: Validate a FHIR File\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n.. code:: python\n\n   from fhir_validator import (compile_fhir_schema,\n                               identify_content_structure,\n                               load_consolidated_fhir_schema,\n                               validate_fhir_bundle_in_chunks,\n                               BUNDLE_PRETTY)\n   import json\n\n   file_path = \"data/samples/fhir/Laquanda221_Haag279_84a90023-0c6b-0eb9-95d6-50861e13f9b3.json\"\n   content_structure = identify_content_structure(file_path)\n\n   print(f\"Content structure: {content_structure}\")\n\n   # By default loads the r4 schema\n   schema_json = load_consolidated_fhir_schema('schemas/r4/fhir.schema.json')\n   compiled_validator = compile_fhir_schema(schema_json)\n\n   # If content structure is a bundle, validate it\n   if content_structure == BUNDLE_PRETTY:\n       with open(file_path, 'r') as f:\n           bundle = json.load(f)\n       is_valid = validate_fhir_bundle_in_chunks(bundle, compiled_validator)\n       print(f\"File : {file_path} is valid ? 
{is_valid}\")\n\nThis simple Python snippet demonstrates how to check the content\nstructure of a FHIR file and, if it\u2019s a ``BUNDLE_PRETTY``, how to\nvalidate its content.\n\n--------------\n\n.. _development-1:\n\nDevelopment\n-----------\n\nTo contribute to the ``fhir-validator`` project, you\u2019ll need to install\nthe necessary dependencies, including the ``dev`` and ``test`` groups\nfor development tools and testing. The ``pre-commit`` hooks are part of\nthe ``dev`` group, and ``pytest`` is part of the ``test`` group.\n\n.. _setting-up-your-development-environment-1:\n\nSetting Up Your Development Environment\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\n1. **Clone the repository**:\n\n   .. code:: bash\n\n      git clone https://github.com/thevgergroup/fhir-validator.git\n      cd fhir-validator\n\n2. **Install dependencies using Poetry**:\n\n   Install both the ``dev`` and ``test`` groups to ensure you have all\n   the necessary tools for development and testing:\n\n   .. code:: bash\n\n      poetry install --with dev,test\n\n   This command installs the base dependencies along with the ``dev``\n   group (which includes tools like ``pre-commit``) and the ``test``\n   group (which includes tools like ``pytest``).\n\n   We use pandoc to generate ``README.rst`` for PyPI so that links are\n   correctly structured; see `Installing\n   Pandoc <https://pandoc.org/installing.html>`__. Make any necessary\n   changes in ``README.md`` and the pre-commit hook will perform the\n   conversion.\n\n3. **Install the Pre-commit Hooks**:\n\n   The project uses ``pre-commit`` to automate tasks such as converting\n   ``README.md`` to ``README.rst`` before commits. To set up the\n   pre-commit hooks locally, run:\n\n   .. code:: bash\n\n      poetry run pre-commit install\n\n   This will configure the Git hooks to automatically run when you make\n   a commit.\n\n.. _tests-1:\n\nTests\n~~~~~\n\nWe use pytest; see the unit tests in ``tests``.\n\n.. 
code:: sh\n\n   poetry run pytest\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "FHIR Validator and Identifier for resource vs bundle type",
    "version": "0.2.2",
    "project_urls": {
        "Homepage": "https://github.com/thevgergroup/fhir-validator",
        "Repository": "https://github.com/thevgergroup/fhir-validator.git"
    },
    "split_keywords": [
        "fhir",
        " hl7",
        " validator",
        " identifier"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8a6606c9e7e39f30b1d98bad6b96e181941781819206fd0e62a41fcc9b812a4d",
                "md5": "7c4f2ddec4197b7c49f1b456047caccb",
                "sha256": "805e026e4d9504bb9df34372f32a4b5f326c17f39046979f24b520b830076f52"
            },
            "downloads": -1,
            "filename": "fhir_validator-0.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7c4f2ddec4197b7c49f1b456047caccb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.12,>=3.9",
            "size": 9464,
            "upload_time": "2024-10-09T19:07:51",
            "upload_time_iso_8601": "2024-10-09T19:07:51.712476Z",
            "url": "https://files.pythonhosted.org/packages/8a/66/06c9e7e39f30b1d98bad6b96e181941781819206fd0e62a41fcc9b812a4d/fhir_validator-0.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3c6642ac644f81553ff229e0aee31eca22c74cfbfd511a69f46d26e4577da015",
                "md5": "4e43f3f48858ff7d62e27bada090a140",
                "sha256": "747171da268104cd8f2a874b03fc1c6bc84ab0c1054b758b19e0372114ce0bfa"
            },
            "downloads": -1,
            "filename": "fhir_validator-0.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "4e43f3f48858ff7d62e27bada090a140",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12,>=3.9",
            "size": 8672,
            "upload_time": "2024-10-09T19:07:52",
            "upload_time_iso_8601": "2024-10-09T19:07:52.873964Z",
            "url": "https://files.pythonhosted.org/packages/3c/66/42ac644f81553ff229e0aee31eca22c74cfbfd511a69f46d26e4577da015/fhir_validator-0.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-09 19:07:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "thevgergroup",
    "github_project": "fhir-validator",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "fhir-validator"
}

patrick o'leary