<div align="center" style="margin-bottom: 1em;">

# HealthChain 💫 🏥

<img src="https://raw.githubusercontent.com/dotimplement/HealthChain/main/docs/assets/images/healthchain_logo.png" alt="HealthChain Logo" width=300></img>

![GitHub License](https://img.shields.io/github/license/dotimplement/HealthChain)
![PyPI Version](https://img.shields.io/pypi/v/healthchain) ![Python Versions](https://img.shields.io/pypi/pyversions/healthchain)
![Downloads](https://img.shields.io/pypi/dm/healthchain)

</div>

Simplify developing, testing and validating AI and NLP applications in a healthcare context 💫 🏥.

Building applications that integrate with electronic health record systems (EHRs) is complex, and so is designing reliable, reactive algorithms involving unstructured data. Let's try to change that.

```bash
pip install healthchain
```
First time here? Check out our [Docs](https://dotimplement.github.io/HealthChain/) page!
## Features
- [x] 🛠️ Build custom pipelines or use [pre-built ones](https://dotimplement.github.io/HealthChain/reference/pipeline/pipeline/#prebuilt) for your healthcare NLP and ML tasks
- [x] 🏗️ Add built-in CDA and FHIR parsers to connect your pipeline to interoperability standards
- [x] 🧪 Test your pipelines in full healthcare-context aware [sandbox](https://dotimplement.github.io/HealthChain/reference/sandbox/sandbox/) environments
- [x] 🗃️ Generate [synthetic healthcare data](https://dotimplement.github.io/HealthChain/reference/utilities/data_generator/) for testing and development
- [x] 🚀 Deploy sandbox servers locally with [FastAPI](https://fastapi.tiangolo.com/)
## Why use HealthChain?
- **EHR integrations are manual and time-consuming** - HealthChain abstracts away complexities so you can focus on AI development, not EHR configurations.
- **It's difficult to track and evaluate multiple integration instances** - HealthChain provides a framework to test the real-world resilience of your whole system, not just your models.
- [**Most healthcare data is unstructured**](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6372467/) - HealthChain is optimized for real-time AI and NLP applications that deal with realistic healthcare data.
- **Built by health tech developers, for health tech developers** - HealthChain is tech stack agnostic, modular, and easily extensible.
## Pipeline
Pipelines provide a flexible way to build and manage the processing steps of NLP and ML tasks, and they interface with parsers and connectors to integrate with EHRs.
### Building a pipeline
```python
from healthchain.io.containers import Document
from healthchain.pipeline import Pipeline
from healthchain.pipeline.components import TextPreProcessor, Model, TextPostProcessor
# Initialize the pipeline
nlp_pipeline = Pipeline[Document]()
# Add TextPreProcessor component
preprocessor = TextPreProcessor(tokenizer="spacy")
nlp_pipeline.add(preprocessor)
# Add Model component (assuming we have a pre-trained model)
model = Model(model_path="path/to/pretrained/model")
nlp_pipeline.add(model)
# Add TextPostProcessor component
postprocessor = TextPostProcessor(
    postcoordination_lookup={
        "heart attack": "myocardial infarction",
        "high blood pressure": "hypertension"
    }
)
nlp_pipeline.add(postprocessor)
# Build the pipeline
nlp = nlp_pipeline.build()
# Use the pipeline
result = nlp(Document("Patient has a history of heart attack and high blood pressure."))
print(f"Entities: {result.entities}")
```
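
To make the flow above concrete, here is a minimal plain-Python sketch of the general pattern: components are callables applied in order, and the last step maps entity text to a preferred term through a lookup. It does not use HealthChain. `Doc`, `build`, `make_lookup_tagger`, and `make_normalizer` are hypothetical names for illustration, and the substring-matching "model" is a toy stand-in for a trained one, so HealthChain's own components may behave differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a document container (not healthchain's Document)."""
    text: str
    entities: List[str] = field(default_factory=list)


Component = Callable[[Doc], Doc]


def lowercase(doc: Doc) -> Doc:
    """Toy pre-processing step: normalize case."""
    doc.text = doc.text.lower()
    return doc


def make_lookup_tagger(terms: List[str]) -> Component:
    """Return a toy 'model' that tags every known term found in the text."""
    def tag(doc: Doc) -> Doc:
        doc.entities = [term for term in terms if term in doc.text]
        return doc
    return tag


def make_normalizer(lookup: Dict[str, str]) -> Component:
    """Return a post-processing step that maps entities to preferred terms."""
    def normalize(doc: Doc) -> Doc:
        doc.entities = [lookup.get(entity, entity) for entity in doc.entities]
        return doc
    return normalize


def build(components: List[Component]) -> Component:
    """Compose the components into one callable that runs them in order."""
    def run(doc: Doc) -> Doc:
        for component in components:
            doc = component(doc)
        return doc
    return run


lookup = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
}
toy_nlp = build([lowercase, make_lookup_tagger(list(lookup)), make_normalizer(lookup)])
result = toy_nlp(Doc("Patient has a history of heart attack and high blood pressure."))
print(f"Entities: {result.entities}")
```

Keeping each step a plain callable with the same input and output type is what makes the steps easy to reorder or swap.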
### Using pre-built pipelines
```python
from healthchain.io.containers import Document
from healthchain.pipeline import MedicalCodingPipeline
# Load the pre-built MedicalCodingPipeline
pipeline = MedicalCodingPipeline.load("./path/to/model")
# Create a document and run it through the pipeline
result = pipeline(Document("Patient has a history of myocardial infarction and hypertension."))
print(f"Entities: {result.entities}")
```
## Sandbox
Sandboxes provide a staging environment for testing and validating your pipeline in a realistic healthcare context.
### Clinical Decision Support (CDS)
[CDS Hooks](https://cds-hooks.org/) is an [HL7](https://cds-hooks.hl7.org) published specification for clinical decision support.

**When is this used?** CDS hooks are triggered at certain events during a clinician's workflow in an electronic health record (EHR), e.g. when a patient record is opened or when an order is selected.

**What information is sent**: the context of the event and the [FHIR](https://hl7.org/fhir/) resources requested by your service, for example the patient ID and information on the encounter and the conditions the patient is being seen for.

**What information is returned**: “cards” displaying text, actionable suggestions, or links to launch a [SMART](https://smarthealthit.org/) app from within the workflow.

```python
import healthchain as hc
from healthchain.pipeline import Pipeline
from healthchain.use_cases import ClinicalDecisionSupport
from healthchain.models import Card, CdsFhirData, CDSRequest
from healthchain.data_generator import CdsDataGenerator
from typing import List

@hc.sandbox
class MyCDS(ClinicalDecisionSupport):
    def __init__(self) -> None:
        self.pipeline = Pipeline.load("./path/to/model")
        self.data_generator = CdsDataGenerator()

    # Sets up an instance of a mock EHR client of the specified workflow
    @hc.ehr(workflow="patient-view")
    def ehr_database_client(self) -> CdsFhirData:
        return self.data_generator.generate()

    # Define your application logic here
    @hc.api
    def my_service(self, data: CDSRequest) -> List[Card]:
        result = self.pipeline(data)
        return [
            Card(
                summary="Welcome to our Clinical Decision Support service.",
                detail=result.summary,
                indicator="info"
            )
        ]
```
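
For orientation, the sketch below shows roughly what travels over the wire in a `patient-view` exchange, using plain dictionaries rather than HealthChain's models. The field names follow the CDS Hooks specification (`hook`, `hookInstance`, `context`, and a card's `summary`, `indicator`, `source`). The helper names `make_patient_view_request` and `make_card` and the sample IDs are hypothetical, chosen only for illustration.

```python
import json
import uuid
from typing import Optional

# Indicator values defined by the CDS Hooks specification.
ALLOWED_INDICATORS = {"info", "warning", "critical"}


def make_patient_view_request(patient_id: str, user_id: str) -> dict:
    """Build a minimal patient-view request body (hypothetical helper)."""
    return {
        "hook": "patient-view",
        "hookInstance": str(uuid.uuid4()),  # unique ID for this hook invocation
        "context": {"userId": user_id, "patientId": patient_id},
    }


def make_card(summary: str, indicator: str, source_label: str,
              detail: Optional[str] = None) -> dict:
    """Build a card with the required fields, checking basic constraints."""
    if indicator not in ALLOWED_INDICATORS:
        raise ValueError(f"indicator must be one of {sorted(ALLOWED_INDICATORS)}")
    if len(summary) >= 140:
        raise ValueError("summary should be shorter than 140 characters")
    card = {
        "summary": summary,
        "indicator": indicator,
        "source": {"label": source_label},
    }
    if detail is not None:
        card["detail"] = detail
    return card


request_body = make_patient_view_request("example-patient-id", "Practitioner/example")
response_body = {
    "cards": [
        make_card(
            summary="Welcome to our Clinical Decision Support service.",
            indicator="info",
            source_label="Example CDS service",
            detail="Free-text detail produced by your pipeline goes here.",
        )
    ]
}
print(json.dumps(request_body, indent=2))
print(json.dumps(response_body, indent=2))
```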
### Clinical Documentation
The `ClinicalDocumentation` use case implements a real-time Clinical Documentation Improvement (CDI) service. It helps convert free-text medical documentation into coded information that can be used for billing, quality reporting, and clinical decision support.

**When is this used?** Triggered when a clinician opts in to CDI functionality (e.g. Epic NoteReader) and signs or pends a note after writing it.

**What information is sent**: A [CDA (Clinical Document Architecture)](https://www.hl7.org.uk/standards/hl7-standards/cda-clinical-document-architecture/) document which contains continuity of care data and free-text data, e.g. a patient's problem list and the progress note that the clinician has entered in the EHR.

```python
import healthchain as hc
from healthchain.pipeline import MedicalCodingPipeline
from healthchain.use_cases import ClinicalDocumentation
from healthchain.models import CcdData, ProblemConcept, Quantity


@hc.sandbox
class NotereaderSandbox(ClinicalDocumentation):
    def __init__(self):
        self.pipeline = MedicalCodingPipeline.load("./path/to/model")

    # Load an existing CDA file
    @hc.ehr(workflow="sign-note-inpatient")
    def load_data_in_client(self) -> CcdData:
        with open("/path/to/cda/data.xml", "r") as file:
            xml_string = file.read()

        return CcdData(cda_xml=xml_string)

    @hc.api
    def my_service(self, ccd_data: CcdData) -> CcdData:
        annotated_ccd = self.pipeline(ccd_data)
        return annotated_ccd
```
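
If you have not worked with CDA before, the sketch below shows how the sections of such a document (for example a problem list and a progress note) can be read with Python's standard library alone. It is not HealthChain's parser. The sample XML is a heavily trimmed, hypothetical document that keeps only the `urn:hl7-org:v3` namespace and the `section`/`title`/`text` structure, and `extract_sections` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# CDA documents use the HL7 v3 XML namespace.
CDA_NS = {"cda": "urn:hl7-org:v3"}

# Hypothetical, heavily trimmed CDA document used only for illustration.
SAMPLE_CDA = """
<ClinicalDocument xmlns="urn:hl7-org:v3">
  <component>
    <structuredBody>
      <component>
        <section>
          <title>Problems</title>
          <text>Hypertension</text>
        </section>
      </component>
      <component>
        <section>
          <title>Progress Notes</title>
          <text>Patient reports chest pain on exertion.</text>
        </section>
      </component>
    </structuredBody>
  </component>
</ClinicalDocument>
""".strip()


def extract_sections(cda_xml: str) -> Dict[str, str]:
    """Map each section title to the plain text of its narrative block."""
    root = ET.fromstring(cda_xml)
    sections: Dict[str, str] = {}
    for section in root.iterfind(".//cda:section", CDA_NS):
        title = section.findtext("cda:title", default="", namespaces=CDA_NS)
        text_element = section.find("cda:text", CDA_NS)
        # itertext() also collects text nested inside narrative markup.
        text = "".join(text_element.itertext()).strip() if text_element is not None else ""
        sections[title] = text
    return sections


for title, text in extract_sections(SAMPLE_CDA).items():
    print(f"{title}: {text}")
```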
### Running a sandbox
Make sure your `mycds.py` file includes the following lines:
```python
cds = MyCDS()
cds.run_sandbox()
```
This populates your EHR client using the data generation method you defined, sends requests to your server for processing, and saves the data in the `./output` directory.

Then run:
```bash
healthchain run mycds.py
```
By default, the server runs at `http://127.0.0.1:8000`, and you can interact with the exposed endpoints at `/docs`.
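
Besides the interactive `/docs` page, FastAPI applications serve their OpenAPI schema at `/openapi.json` by default. Assuming the sandbox server keeps that default, a sketch like the one below can list the exposed routes from a script. `fetch_openapi` and `list_routes` are hypothetical helper names, and the paths in `EXAMPLE_SCHEMA` are illustrative only (they mirror the CDS Hooks discovery and service URLs), not a statement of what your server exposes.

```python
import json
from typing import List
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_openapi(base_url: str = "http://127.0.0.1:8000") -> dict:
    """Download the OpenAPI schema from a running server (assumes FastAPI's default URL)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_routes(schema: dict) -> List[str]:
    """Return 'METHOD /path' strings for every operation in the schema."""
    routes = []
    for path, operations in sorted(schema.get("paths", {}).items()):
        for method in sorted(operations):
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                routes.append(f"{method.upper()} {path}")
    return routes


# Illustrative schema fragment; replace with fetch_openapi() once the server is running.
EXAMPLE_SCHEMA = {
    "paths": {
        "/cds-services": {"get": {}},
        "/cds-services/{id}": {"post": {}},
    }
}
for route in list_routes(EXAMPLE_SCHEMA):
    print(route)
```

With the server running, `list_routes(fetch_openapi())` prints the routes it actually exposes.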
## Road Map
- [ ] 🎛️ Versioning and artifact management for pipelines sandbox EHR configurations
- [ ] 🤖 Integrations with other pipeline libraries such as spaCy, HuggingFace, LangChain etc.
- [ ] ❓ Testing and evaluation framework for pipelines and use cases
- [ ] 🧠 Multi-modal pipelines that have built-in NLP to utilize unstructured data
- [ ] ✨ Improvements to synthetic data generator methods
- [ ] 👾 Frontend UI for EHR client and visualization features
- [ ] 🚀 Production deployment options
## Contribute
We are always eager to hear feedback and suggestions, especially if you are a developer or researcher working with healthcare systems!
- 💡 Let's chat! [Discord](https://discord.gg/UQC6uAepUz)
- 🛠️ [Contribution Guidelines](CONTRIBUTING.md)
## Acknowledgement
This repository makes use of CDS Hooks developed by Boston Children’s Hospital.
Raw data
{
"_id": null,
"home_page": null,
"name": "healthchain",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.8",
"maintainer_email": null,
"keywords": "nlp, ai, llm, healthcare, ehr, mlops",
"author": "Jennifer Jiang-Kells",
"author_email": "jenniferjiangkells@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/0c/7d/e3643fc8972b1caf796ee3628c2bfcd01cfda59943e10bf77829b49b34a5/healthchain-0.4.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Remarkably simple testing and validation of AI/NLP applications in healthcare context.",
"version": "0.4.0",
"project_urls": {
"Documentation": "https://dotimplement.github.io/HealthChain/"
},
"split_keywords": [
"nlp",
" ai",
" llm",
" healthcare",
" ehr",
" mlops"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f0e542afb6af8a22d75e2a77eed693a1eca77250395f825d075efd87a7cbb893",
"md5": "3a43dff8b7a6928d62f4690e2f2587f7",
"sha256": "3845849aea0f915b09a7c25611ed556794a5073e83303521c748b8e6ea770ebf"
},
"downloads": -1,
"filename": "healthchain-0.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3a43dff8b7a6928d62f4690e2f2587f7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.8",
"size": 128311,
"upload_time": "2024-10-04T15:40:20",
"upload_time_iso_8601": "2024-10-04T15:40:20.770063Z",
"url": "https://files.pythonhosted.org/packages/f0/e5/42afb6af8a22d75e2a77eed693a1eca77250395f825d075efd87a7cbb893/healthchain-0.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0c7de3643fc8972b1caf796ee3628c2bfcd01cfda59943e10bf77829b49b34a5",
"md5": "e44d448b1381258ac85e2ed1b35d1fd3",
"sha256": "257a5e2f42341fddf02e7271b84e3c10eb1fb3a2175b7e2d9a91e92c6f7b6d2d"
},
"downloads": -1,
"filename": "healthchain-0.4.0.tar.gz",
"has_sig": false,
"md5_digest": "e44d448b1381258ac85e2ed1b35d1fd3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.8",
"size": 89892,
"upload_time": "2024-10-04T15:40:22",
"upload_time_iso_8601": "2024-10-04T15:40:22.826251Z",
"url": "https://files.pythonhosted.org/packages/0c/7d/e3643fc8972b1caf796ee3628c2bfcd01cfda59943e10bf77829b49b34a5/healthchain-0.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-04 15:40:22",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "healthchain"
}