healthchain


| Field | Value |
| --- | --- |
| Name | healthchain |
| Version | 0.6.1 |
| Summary | Remarkably simple testing and validation of AI/NLP applications in healthcare context. |
| Upload time | 2024-11-27 15:12:14 |
| Author | Jennifer Jiang-Kells |
| Requires Python | <3.12,>=3.8 |
| License | Apache-2.0 |
| Keywords | nlp, ai, llm, healthcare, ehr, mlops |

<div align="center" style="margin-bottom: 1em;">

# HealthChain 💫 🏥

<img src="https://raw.githubusercontent.com/dotimplement/HealthChain/main/docs/assets/images/healthchain_logo.png" alt="HealthChain Logo" width=300></img>

![GitHub License](https://img.shields.io/github/license/dotimplement/HealthChain)
![PyPI Version](https://img.shields.io/pypi/v/healthchain) ![Python Versions](https://img.shields.io/pypi/pyversions/healthchain)
![Downloads](https://img.shields.io/pypi/dm/healthchain)

</div>

Simplify developing, testing and validating AI and NLP applications in a healthcare context 💫 🏥.

Building applications that integrate with electronic health record systems (EHRs) is complex, and so is designing reliable, reactive algorithms involving unstructured data. Let's try to change that.

```bash
pip install healthchain
```
First time here? Check out our [Docs](https://dotimplement.github.io/HealthChain/) page!

Came here from NHS RPySOC 2024 ✨? [CDS sandbox walkthrough](https://dotimplement.github.io/HealthChain/cookbook/cds_sandbox/)

## Features
- [x] 🛠️ Build custom pipelines or use [pre-built ones](https://dotimplement.github.io/HealthChain/reference/pipeline/pipeline/#prebuilt) for your healthcare NLP and ML tasks
- [x] 🏗️ Add built-in [CDA and FHIR parsers](https://dotimplement.github.io/HealthChain/reference/utilities/cda_parser/) to connect your pipeline to interoperability standards
- [x] 🧪 Test your pipelines in fully healthcare-context-aware [sandbox](https://dotimplement.github.io/HealthChain/reference/sandbox/sandbox/) environments
- [x] 🗃️ Generate [synthetic healthcare data](https://dotimplement.github.io/HealthChain/reference/utilities/data_generator/) for testing and development
- [x] 🚀 Deploy sandbox servers locally with [FastAPI](https://fastapi.tiangolo.com/)

## Why use HealthChain?
-  **EHR integrations are manual and time-consuming** - HealthChain abstracts away complexities so you can focus on AI development, not EHR configurations.
-  **It's difficult to track and evaluate multiple integration instances** - HealthChain provides a framework to test the real-world resilience of your whole system, not just your models.
-  [**Most healthcare data is unstructured**](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372467/) - HealthChain is optimized for real-time AI and NLP applications that deal with realistic healthcare data.
- **Built by health tech developers, for health tech developers** - HealthChain is tech stack agnostic, modular, and easily extensible.

## Pipeline
Pipelines provide a flexible way to build and manage processing workflows for NLP and ML tasks that integrate easily with complex healthcare systems.

### Building a pipeline

```python
from healthchain.io.containers import Document
from healthchain.pipeline import Pipeline
from healthchain.pipeline.components import TextPreProcessor, SpacyNLP, TextPostProcessor

# Initialize the pipeline
nlp_pipeline = Pipeline[Document]()

# Add TextPreProcessor component
preprocessor = TextPreProcessor(tokenizer="spacy")
nlp_pipeline.add_node(preprocessor)

# Add Model component (assuming we have a pre-trained model)
spacy_nlp = SpacyNLP.from_model_id("en_core_sci_md", source="spacy")
nlp_pipeline.add_node(spacy_nlp)

# Add TextPostProcessor component
postprocessor = TextPostProcessor(
    postcoordination_lookup={
        "heart attack": "myocardial infarction",
        "high blood pressure": "hypertension"
    }
)
nlp_pipeline.add_node(postprocessor)

# Build the pipeline
nlp = nlp_pipeline.build()

# Use the pipeline
result = nlp(Document("Patient has a history of heart attack and high blood pressure."))

print(f"Entities: {result.nlp.spacy_doc.ents}")
```
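
To make the post-processing step more concrete, here is a minimal plain-Python sketch of what a lookup-based normalization does conceptually. It does not use HealthChain and is not the library's implementation: the `normalize_entities` helper is hypothetical, and the case-insensitive matching is an assumption made for illustration.

```python
from typing import Dict, List


def normalize_entities(entities: List[str], lookup: Dict[str, str]) -> List[str]:
    """Replace each entity with its preferred term when the lookup has one.

    Hypothetical helper for illustration only. Matching is case-insensitive,
    and entities without a lookup entry are returned unchanged.
    """
    lowered = {key.lower(): value for key, value in lookup.items()}
    return [lowered.get(entity.lower(), entity) for entity in entities]


lookup = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}

normalized = normalize_entities(
    ["Heart attack", "high blood pressure", "asthma"], lookup
)
print(normalized)  # ['myocardial infarction', 'hypertension', 'asthma']
```

Keeping the lookup as a plain dictionary means the mapping can be loaded from a file or a terminology service without changing the pipeline code.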

#### Adding connectors
Connectors give your pipelines the ability to interface with EHRs.

```python
from healthchain.io import CdaConnector
from healthchain.models import CdaRequest

cda_connector = CdaConnector()

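# `pipeline` is any Pipeline instance that has not been built yet,
# e.g. `nlp_pipeline` from the previous example
pipeline.add_input(cda_connector)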
pipeline.add_output(cda_connector)

pipe = pipeline.build()

cda_data = CdaRequest(document="<CDA XML content>")
output = pipe(cda_data)
```

### Using pre-built pipelines
Pre-built pipelines are use-case-specific, end-to-end workflows that already have connectors and models built in.

```python
from healthchain.pipeline import MedicalCodingPipeline
from healthchain.models import CdaRequest

# Load from model ID
pipeline = MedicalCodingPipeline.from_model_id(
    model="blaze999/Medical-NER", task="token-classification", source="huggingface"
)

# Or load from local model
pipeline = MedicalCodingPipeline.from_local_model("./path/to/model", source="spacy")

cda_data = CdaRequest(document="<CDA XML content>")
output = pipeline(cda_data)
```


## Sandbox

Sandboxes provide a staging environment for testing and validating your pipeline in a realistic healthcare context.

### Clinical Decision Support (CDS)
[CDS Hooks](https://cds-hooks.org/) is an [HL7](https://cds-hooks.hl7.org)-published specification for clinical decision support.

**When is this used?** CDS hooks are triggered at certain events during a clinician's workflow in an electronic health record (EHR), e.g. when a patient record is opened or when an order is selected.

**What information is sent**: the context of the event and [FHIR](https://hl7.org/fhir/) resources that are requested by your service, for example, the patient ID and information on the encounter and conditions they are being seen for.

**What information is returned**: “cards” displaying text, actionable suggestions, or links to launch a [SMART](https://smarthealthit.org/) app from within the workflow.
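
As a rough illustration of the exchange described above, the sketch below builds a request and a card response as plain Python dictionaries, using field names from the CDS Hooks specification. It does not use HealthChain; every identifier and value is made up, and the `build_card` helper is hypothetical.

```python
import json
import uuid

# Illustrative request an EHR might POST to a CDS service for the
# `encounter-discharge` hook. All identifiers below are made up.
cds_request = {
    "hook": "encounter-discharge",
    "hookInstance": str(uuid.uuid4()),
    "context": {
        "userId": "Practitioner/example-practitioner",
        "patientId": "example-patient",
        "encounterId": "example-encounter",
    },
    "prefetch": {
        "patient": {"resourceType": "Patient", "id": "example-patient"},
    },
}


def build_card(summary: str, detail: str) -> dict:
    """Build one informational card (hypothetical helper)."""
    return {
        # The spec asks for a summary of fewer than 140 characters.
        "summary": summary[:139],
        "indicator": "info",  # one of: info, warning, critical
        "source": {"label": "Example CDS service"},
        "detail": detail,
    }


cds_response = {
    "cards": [
        build_card(
            "Discharge summary available",
            f"Generated for patient {cds_request['context']['patientId']}.",
        )
    ]
}

print(json.dumps(cds_response, indent=2))
```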


```python
import healthchain as hc

from healthchain.pipeline import SummarizationPipeline
from healthchain.use_cases import ClinicalDecisionSupport
from healthchain.models import Card, CdsFhirData, CDSRequest, CDSResponse
from healthchain.data_generator import CdsDataGenerator
from typing import List

@hc.sandbox
class MyCDS(ClinicalDecisionSupport):
    def __init__(self) -> None:
        self.pipeline = SummarizationPipeline.from_model_id(
            "facebook/bart-large-cnn", source="huggingface"
        )
        self.data_generator = CdsDataGenerator()

    # Sets up an instance of a mock EHR client of the specified workflow
    @hc.ehr(workflow="encounter-discharge")
    def ehr_database_client(self) -> CdsFhirData:
        return self.data_generator.generate()

    # Define your application logic here
    @hc.api
    def my_service(self, data: CDSRequest) -> CDSResponse:
        result = self.pipeline(data)
        return result
```

### Clinical Documentation

The `ClinicalDocumentation` use case implements a real-time Clinical Documentation Improvement (CDI) service. It helps convert free-text medical documentation into coded information that can be used for billing, quality reporting, and clinical decision support.

**When is this used?** Triggered when a clinician opts in to CDI functionality (e.g. Epic NoteReader) and signs or pends a note after writing it.

**What information is sent**: A [CDA (Clinical Document Architecture)](https://www.hl7.org.uk/standards/hl7-standards/cda-clinical-document-architecture/) document which contains continuity of care data and free-text data, e.g. a patient's problem list and the progress note that the clinician has entered in the EHR.
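
To give a feel for the shape of such a document, the sketch below parses a tiny, hand-written CDA-like fragment with the Python standard library and pulls out the narrative text of each section. The XML is illustrative only (real CDA documents are much larger and more deeply nested), and the `extract_sections` helper is hypothetical rather than part of HealthChain.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written fragment for illustration; section codes are LOINC codes.
CDA_XML = """<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component>
    <structuredBody>
      <component>
        <section>
          <code code="11450-4" codeSystem="2.16.840.1.113883.6.1"/>
          <title>Problem List</title>
          <text>Hypertension</text>
        </section>
      </component>
      <component>
        <section>
          <code code="11506-3" codeSystem="2.16.840.1.113883.6.1"/>
          <title>Progress Note</title>
          <text>Patient reports chest pain on exertion.</text>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>"""

# CDA elements live in the HL7 v3 namespace.
NS = {"hl7": "urn:hl7-org:v3"}


def extract_sections(cda_xml: str) -> dict:
    """Map each section title to its narrative text (hypothetical helper)."""
    root = ET.fromstring(cda_xml)
    sections = {}
    for section in root.iterfind(".//hl7:section", NS):
        title = section.findtext("hl7:title", default="", namespaces=NS)
        text = section.findtext("hl7:text", default="", namespaces=NS)
        sections[title] = text.strip()
    return sections


sections = extract_sections(CDA_XML)
print(sections)
```

The free-text sections are what an NLP pipeline would process, while the coded sections carry the structured continuity of care data.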

```python
import healthchain as hc

from healthchain.pipeline import MedicalCodingPipeline
from healthchain.use_cases import ClinicalDocumentation
from healthchain.models import CcdData, CdaRequest, CdaResponse

@hc.sandbox
class NotereaderSandbox(ClinicalDocumentation):
    def __init__(self):
        self.pipeline = MedicalCodingPipeline.from_model_id(
            "en_core_sci_md", source="spacy"
        )

    # Load an existing CDA file
    @hc.ehr(workflow="sign-note-inpatient")
    def load_data_in_client(self) -> CcdData:
        with open("/path/to/cda/data.xml", "r") as file:
            xml_string = file.read()

        return CcdData(cda_xml=xml_string)

    @hc.api
    def my_service(self, data: CdaRequest) -> CdaResponse:
        annotated_ccd = self.pipeline(data)
        return annotated_ccd
```
### Running a sandbox

Add the following lines to your `mycds.py` file:

```python
cds = MyCDS()
cds.run_sandbox()
```
This will populate your EHR client using the data generation method you have defined, send requests to your server for processing, and save the data in the `./output` directory.

Then run:
```bash
healthchain run mycds.py
```
By default, the server runs at `http://127.0.0.1:8000`, and you can interact with the exposed endpoints at `/docs`.
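
Once the server is up, a quick way to see which endpoints are exposed is to read the OpenAPI schema. The sketch below uses only the standard library and assumes the sandbox server keeps FastAPI's default `/openapi.json` route enabled; the `docs_url` and `list_endpoints` helpers are hypothetical and not part of HealthChain.

```python
import json
import urllib.request
from urllib.parse import urljoin

BASE_URL = "http://127.0.0.1:8000"  # default address mentioned above


def docs_url(base_url: str = BASE_URL) -> str:
    """Return the URL of the interactive API docs page."""
    return urljoin(base_url + "/", "docs")


def list_endpoints(base_url: str = BASE_URL, timeout: float = 5.0) -> list:
    """Fetch the OpenAPI schema and return the sorted endpoint paths.

    Assumes FastAPI's default `/openapi.json` route has not been disabled.
    """
    schema_url = urljoin(base_url + "/", "openapi.json")
    with urllib.request.urlopen(schema_url, timeout=timeout) as response:
        schema = json.load(response)
    return sorted(schema.get("paths", {}))


if __name__ == "__main__":
    print(f"Interactive docs: {docs_url()}")
    try:
        print(list_endpoints())
    except OSError as error:  # URLError and timeouts are OSError subclasses
        print(f"Sandbox server not reachable: {error}")
```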

## Road Map
- [ ] 🎛️ Versioning and artifact management for pipelines, sandboxes, and EHR configurations
- [ ] ❓ Testing and evaluation framework for pipelines and use cases
- [ ] 🧠 Multi-modal pipelines that have built-in NLP to utilize unstructured data
- [ ] ✨ Improvements to synthetic data generator methods
- [ ] 👾 Frontend UI for EHR client and visualization features
- [ ] 🚀 Production deployment options

## Contribute
We are always eager to hear feedback and suggestions, especially if you are a developer or researcher working with healthcare systems!
- 💡 Let's chat! [Discord](https://discord.gg/UQC6uAepUz)
- 🛠️ [Contribution Guidelines](CONTRIBUTING.md)

## Acknowledgement
This repository makes use of CDS Hooks developed by Boston Children’s Hospital.


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "healthchain",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.12,>=3.8",
    "maintainer_email": null,
    "keywords": "nlp, ai, llm, healthcare, ehr, mlops",
    "author": "Jennifer Jiang-Kells",
    "author_email": "jenniferjiangkells@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/cb/a1/23c5fdff971beda78b6a69f76a893ea46ad38dd8c1bacf86773a7150b9cc/healthchain-0.6.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Remarkably simple testing and validation of AI/NLP applications in healthcare context.",
    "version": "0.6.1",
    "project_urls": {
        "Documentation": "https://dotimplement.github.io/HealthChain/"
    },
    "split_keywords": [
        "nlp",
        " ai",
        " llm",
        " healthcare",
        " ehr",
        " mlops"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3f81149f317c7814df847d8d33405ef5fc2478b8a447661dab40afbf0fcbcaeb",
                "md5": "6592dc343640e1623db5594c2ee192a1",
                "sha256": "2e6f8a66b8328a17324c4fd4de12e80edc405e861ffe5461306b81b939224b32"
            },
            "downloads": -1,
            "filename": "healthchain-0.6.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6592dc343640e1623db5594c2ee192a1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.12,>=3.8",
            "size": 151544,
            "upload_time": "2024-11-27T15:12:13",
            "upload_time_iso_8601": "2024-11-27T15:12:13.124587Z",
            "url": "https://files.pythonhosted.org/packages/3f/81/149f317c7814df847d8d33405ef5fc2478b8a447661dab40afbf0fcbcaeb/healthchain-0.6.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cba123c5fdff971beda78b6a69f76a893ea46ad38dd8c1bacf86773a7150b9cc",
                "md5": "31f96ba0f7a50af7de0bbdad074aba21",
                "sha256": "2d664066e4205e12e98175e83d4eecce0c685c93e013ee608967121a67b62d4e"
            },
            "downloads": -1,
            "filename": "healthchain-0.6.1.tar.gz",
            "has_sig": false,
            "md5_digest": "31f96ba0f7a50af7de0bbdad074aba21",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12,>=3.8",
            "size": 107523,
            "upload_time": "2024-11-27T15:12:14",
            "upload_time_iso_8601": "2024-11-27T15:12:14.681113Z",
            "url": "https://files.pythonhosted.org/packages/cb/a1/23c5fdff971beda78b6a69f76a893ea46ad38dd8c1bacf86773a7150b9cc/healthchain-0.6.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-27 15:12:14",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "healthchain"
}
        