# MIMIC-IV Analysis Toolkit
<img src="https://img.shields.io/github/last-commit/artinmajdi/mimic_iv_analysis?style=flat&logo=git&logoColor=white&color=0080ff" alt="last-commit">
<img src="https://img.shields.io/github/languages/top/artinmajdi/mimic_iv_analysis?style=flat&color=0080ff" alt="repo-top-language">
*Unlock Insights from Healthcare Data Effortlessly*
A comprehensive analytical toolkit for exploring and modeling data from the MIMIC-IV clinical database. This project provides tools for data loading, preprocessing, feature engineering, clustering, and visualization, primarily focusing on provider order pattern analysis.
## Table of Contents
- [About MIMIC-IV Data](#about-mimic-iv-data)
- [Features](#features)
- [Project Structure](#project-structure)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Core Modules Overview](#core-modules-overview)
- [Development](#development)
- [Documentation](#documentation)
- [Streamlit Cloud Deployment](#streamlit-cloud-deployment)
- [Contributing](#contributing)
- [License](#license)
- [Author](#author)
## About MIMIC-IV Data
This toolkit analyzes data from the [MIMIC-IV (Medical Information Mart for Intensive Care IV)](https://mimic.mit.edu/docs/iv/) clinical database. MIMIC-IV is a large, freely available database of de-identified health data for patients who stayed in critical care units at the Beth Israel Deaconess Medical Center.
For detailed information on the MIMIC-IV data structure used by this project, please refer to the documentation:
* [MIMIC-IV Data Structure Overview](documentations/mimic_iv_data_structure.md)
* [Detailed Table Structures](documentations/DATA_STRUCTURE.md)
## Features
* **Comprehensive Data Loader:** Utilities for loading and preprocessing MIMIC-IV tables. Supports both CSV and Parquet formats, with optional Dask integration for large datasets.
* **Interactive Visualization:** A Streamlit application for exploring data, cluster results, and analyses interactively.
* **Feature Engineering Tools:** Tools for creating features from temporal clinical data, including order frequency matrices, temporal order sequences, and order timing features.
* **Clustering Analysis Capabilities:** Implementations of K-Means, hierarchical, and DBSCAN clustering, plus LDA topic modeling, to identify patterns in clinical data.
* **Predictive Modeling Support:** Designed to prepare data for various predictive tasks.
* **Configuration Management:** YAML configuration for managing data paths and application settings.
* **MIMIC-IV Data Focus:** Specifically designed to work with the MIMIC-IV clinical database structure.
* **Modular Architecture:** Separate core, visualization, and configuration packages keep updates and maintenance localized.
* **Exploratory Data Analysis**
* **Patient Trajectory Visualization**
* **Order Pattern Analysis**
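
To illustrate the order frequency matrices mentioned in the feature list, here is a minimal sketch that uses only the standard library. The record layout (pairs of admission ID and order type) and the function name `order_frequency_matrix` are assumptions made for illustration; they are not the package's API.

```python
from collections import Counter


def order_frequency_matrix(orders):
    """Count how often each order type occurs per admission.

    `orders` is an iterable of (admission_id, order_type) pairs.
    Returns (admission_ids, order_types, matrix), where matrix[i][j] is the
    number of times order_types[j] appears for admission_ids[i].
    """
    counts = Counter(orders)  # (admission_id, order_type) -> occurrences
    admission_ids = sorted({adm for adm, _ in counts})
    order_types = sorted({otype for _, otype in counts})
    matrix = [
        [counts[(adm, otype)] for otype in order_types]
        for adm in admission_ids
    ]
    return admission_ids, order_types, matrix


# Toy data; the identifiers and order types are made up.
toy_orders = [
    (101, "Lab"), (101, "Lab"), (101, "Medications"),
    (102, "Medications"), (102, "Radiology"),
]
admission_ids, order_types, matrix = order_frequency_matrix(toy_orders)
```

Each row of `matrix` describes one admission and each column one order type, which is the shape a clustering step would typically consume.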
## Project Structure
The repository is organized as follows:
```
mimic_iv_analysis/
├── mimic_iv_analysis/          # Main package source code
│   ├── __init__.py             # Package initialization
│   ├── configurations/         # Configuration files (e.g., config.yaml)
│   ├── core/                   # Core functionalities (data loading, clustering, feature engineering)
│   │   ├── __init__.py
│   │   ├── clustering.py
│   │   ├── data_loader.py
│   │   ├── feature_engineering.py
│   │   └── filtering.py
│   ├── examples/               # Example scripts and notebooks
│   └── visualization/          # Streamlit dashboard application and utilities
│       ├── __init__.py
│       ├── app.py
│       └── app_components/
├── documentations/             # Project documentation
├── scripts/                    # Utility and helper scripts (install, run dashboard)
├── setup_config/               # Configuration for setup and testing (e.g., pytest.ini)
├── tests/                      # Test suite for the project
├── .streamlit/                 # Configuration for Streamlit Cloud deployment
├── README.md                   # This file
├── requirements.txt            # Python package dependencies
└── setup.py                    # Package setup script
```
(Note: Older READMEs refer to a `src/` directory; the source code now lives in the top-level `mimic_iv_analysis/` package directory.)
## Installation
### Prerequisites
* Python 3.10, 3.11, or 3.12 (the published package metadata declares `requires_python` as `<3.13,>=3.10`)
* pip or conda package manager
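
The published package metadata declares `requires_python` as `<3.13,>=3.10`. A quick way to confirm that the active interpreter falls in that range is sketched below; the helper name `is_supported` is an assumption for illustration and is not part of the package.

```python
import sys

# Range taken from the package metadata: requires_python "<3.13,>=3.10".
MIN_VERSION = (3, 10)
MAX_VERSION_EXCLUSIVE = (3, 13)


def is_supported(version_info=None):
    """Return True if the (major, minor) version falls in the declared range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if not is_supported():
    print(
        f"Warning: Python {sys.version_info.major}.{sys.version_info.minor} "
        "is outside the declared range >=3.10,<3.13"
    )
```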
### Installation Steps
1. **Clone the repository:**
   ```bash
   git clone https://github.com/artinmajdi/mimic_iv_analysis.git
   cd mimic_iv_analysis
   ```
2. **Create a virtual environment (recommended):**
   ```bash
   python -m venv .venv
   source .venv/bin/activate # On Windows: .venv\Scripts\activate
   ```
3. **Install dependencies:**
   The `requirements.txt` file lists all necessary Python packages.
   ```bash
   pip install -r requirements.txt
   ```
   To install the package in editable mode along with development dependencies:
   ```bash
   pip install -e ".[dev]"
   ```
   Alternatively, you can use the provided installation script which offers environment choices (venv, conda, docker):
   ```bash
   bash scripts/install.sh
   ```
## Configuration
The main configuration for the application is located in `mimic_iv_analysis/configurations/config.yaml`.
You **must** update the `mimic_data_path` in this file to point to the root directory of your local MIMIC-IV dataset (version 3.1 or compatible).
Example `config.yaml` structure:
```yaml
data:
  mimic_data_path: "/path/to/your/mimic-iv-data" # <-- IMPORTANT: Update this path

app:
  port: 8501
  theme: "light"
  debug: false

# ... other configurations
```
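
Before launching the dashboard, it can help to confirm that `mimic_data_path` is set and points to an existing directory. The sketch below assumes the `data.mimic_data_path` layout shown in the example config; the function names `load_config` and `validate_config` are hypothetical and do not describe how the package itself reads its configuration. Parsing the YAML file requires the PyYAML package.

```python
from pathlib import Path


def load_config(config_path):
    """Parse a YAML config file into a dict (requires the PyYAML package)."""
    import yaml  # imported lazily so validate_config works without PyYAML

    with open(config_path, "r", encoding="utf-8") as fh:
        return yaml.safe_load(fh) or {}


def validate_config(cfg):
    """Return the MIMIC-IV root as a Path, or raise ValueError if unusable."""
    raw_path = (cfg.get("data") or {}).get("mimic_data_path")
    if not raw_path:
        raise ValueError("data.mimic_data_path is missing from the config")
    root = Path(raw_path).expanduser()
    if not root.is_dir():
        raise ValueError(f"mimic_data_path does not point to a directory: {root}")
    return root
```

A typical call would be `validate_config(load_config("mimic_iv_analysis/configurations/config.yaml"))`, which fails early with a readable message if the placeholder path was never updated.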
## Usage
### Running the Streamlit Dashboard
1. Ensure your virtual environment is activated (if you created one).
2. Make sure you have configured the `mimic_data_path` in `config.yaml`.
3. Run the application using:
   ```bash
   streamlit run mimic_iv_analysis/visualization/app.py
   ```
   Alternatively, if the package was installed using pip (e.g., via `pip install -e .` or from PyPI), you might be able to use a command like:
   ```bash
   mimic-iv
   ```
The dashboard should open in your web browser, typically at `http://localhost:8501` (or the port specified in `config.yaml`).
### Install the package from TestPyPI (Example for version 0.5.8)
If a version is available on TestPyPI, you can install it using:
```bash
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ mimic_iv_analysis==0.5.8
```
(Replace `0.5.8` with the desired version if applicable.)
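
After installing, the version that pip actually resolved can be checked from Python with the standard library's `importlib.metadata`. This is a small sketch; the helper name `installed_version` is an assumption for illustration, and the distribution name `mimic-iv-analysis` is the one listed in the package metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="mimic-iv-analysis"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version())
```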
## Core Modules Overview
* **`mimic_iv_analysis.core`**: Contains the fundamental logic for data handling and analysis.
    * `data_loader.py`: Utilities for loading MIMIC-IV tables efficiently, supporting both CSV and Parquet formats, with options for Dask integration for large datasets.
    * `feature_engineering.py`: Tools to create meaningful features from raw clinical data, such as order frequencies and temporal sequences.
    * `clustering.py`: Implements various clustering algorithms (K-Means, Hierarchical, DBSCAN) and LDA topic modeling.
    * `filtering.py`: Enables applying inclusion and exclusion criteria to the dataset.
* **`mimic_iv_analysis.visualization`**: Houses the Streamlit application.
    * `app.py`: The main entry point for the interactive dashboard.
    * `app_components/`: Contains different tabs and UI elements of the dashboard.
* **`mimic_iv_analysis.configurations`**: Manages application settings.
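
To give a feel for what K-Means does with order-frequency rows, here is a dependency-free sketch of Lloyd's algorithm with a deterministic farthest-first initialization. It is not the code in `clustering.py`, whose internals are not documented here; the function names and toy data are assumptions for illustration, and a production pipeline would more likely rely on an established library such as scikit-learn.

```python
import math


def farthest_first_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from every centroid chosen so far (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        next_point = max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)
        )
        centroids.append(next_point)
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


# Toy order-frequency rows (columns could be counts of two order types).
rows = [(5, 0), (4, 1), (5, 1), (0, 5), (1, 4), (0, 4)]
labels, centroids = kmeans(rows, k=2)
```

On this toy input the first three rows end up in one cluster and the last three in the other, mirroring how admissions with similar order mixes would be grouped.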
## Development
### Code Style
This project uses the following tools to maintain code quality:
* **Black:** For code formatting.
* **isort:** For import sorting.
* **Flake8:** For style guide enforcement (PEP 8).
* **MyPy:** For static type checking.
To format your code:
```bash
black .
isort .
```
To check your code:
```bash
flake8 .
mypy .
```
### Running Tests
Tests are located in the `tests/` directory. To run the test suite:
```bash
pytest tests/
```
To run tests with coverage:
```bash
pytest --cov=mimic_iv_analysis tests/
```
Test configuration can be found in `setup_config/pytest.ini` (or `pytest.ini` / `pyproject.toml` depending on project setup).
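
For contributors new to pytest, a test file under `tests/` might look roughly like the sketch below. The `apply_criteria` helper is hypothetical and defined inline only so the example is self-contained; it does not reflect the actual API of `filtering.py`, and the record fields are made up.

```python
def apply_criteria(records, min_age=18, exclude_admission_types=()):
    """Hypothetical filter: keep adults whose admission type is not excluded."""
    return [
        r for r in records
        if r["age"] >= min_age
        and r["admission_type"] not in exclude_admission_types
    ]


def test_apply_criteria_keeps_only_eligible_records():
    # Made-up records: one eligible, one too young, one with an excluded type.
    records = [
        {"subject_id": 1, "age": 70, "admission_type": "URGENT"},
        {"subject_id": 2, "age": 16, "admission_type": "URGENT"},
        {"subject_id": 3, "age": 45, "admission_type": "ELECTIVE"},
    ]
    kept = apply_criteria(records, exclude_admission_types=("ELECTIVE",))
    assert [r["subject_id"] for r in kept] == [1]
```

pytest collects any function whose name starts with `test_` in files matching its discovery pattern, so no extra registration is needed.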
## Documentation
Further documentation can be found in the `documentations/` directory:
* [`DATA_STRUCTURE.md`](documentations/DATA_STRUCTURE.md): Describes the expected structure of the MIMIC-IV data.
* [`mimic_iv_data_structure.md`](documentations/mimic_iv_data_structure.md): Provides an overview of MIMIC-IV tables and identifiers.
* [`.streamlit/README.md`](.streamlit/README.md): Guide for deploying the Streamlit application to Streamlit Cloud.
* The `documentations/pyhealth/` directory contains documentation for the PyHealth library, which might be a dependency or a related project.
## Streamlit Cloud Deployment
For deploying the dashboard to Streamlit Cloud, refer to the guide in [`.streamlit/README.md`](.streamlit/README.md). This includes steps for repository preparation, secret management, and dependency configuration.
## Contributing
Contributions are welcome! Please follow these general steps:
1. Fork the repository.
2. Create a new feature branch (`git checkout -b feature/your-feature-name`).
3. Make your changes.
4. Ensure all tests pass (`pytest tests/`).
5. Format your code (`black .` and `isort .`).
6. Submit a pull request with a clear description of your changes.
## License
This project is licensed under the MIT License. See the `LICENSE.md` file for details.
## Author
* Artin Majdi ([msm2024@gmail.com](mailto:msm2024@gmail.com))
Raw data
{
"_id": null,
"home_page": null,
"name": "mimic-iv-analysis",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.10",
"maintainer_email": null,
"keywords": "nursing research, healthcare, AI, medical analysis",
"author": null,
"author_email": "Artin Majdi <msm2024@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/8e/3d/3c3edaedb6a42893472ce56241b8f1735f2068cedc67def2213ad3af56c7/mimic_iv_analysis-1.13.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A data science and machine learning framework for nursing research",
"version": "1.13.0",
"project_urls": {
"Documentation": "https://github.com/artinmajdi/mimic_iv_analysis/docs",
"Homepage": "https://github.com/artinmajdi/mimic_iv_analysis",
"Issues": "https://github.com/artinmajdi/mimic_iv_analysis/issues",
"Repository": "https://github.com/artinmajdi/mimic_iv_analysis.git"
},
"split_keywords": [
"nursing research",
" healthcare",
" ai",
" medical analysis"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "6bc35c32527ed5237123d3d6f70a0362378dfa9dbb8d7e734f580fcda2ef3164",
"md5": "b066c0fe6b4eecf976b3809f685ed697",
"sha256": "eaa1dde06159fdc471cdba14cb59a2533590eae9815b5572a3b9addc522319ae"
},
"downloads": -1,
"filename": "mimic_iv_analysis-1.13.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b066c0fe6b4eecf976b3809f685ed697",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.10",
"size": 238495,
"upload_time": "2025-10-07T16:54:01",
"upload_time_iso_8601": "2025-10-07T16:54:01.548210Z",
"url": "https://files.pythonhosted.org/packages/6b/c3/5c32527ed5237123d3d6f70a0362378dfa9dbb8d7e734f580fcda2ef3164/mimic_iv_analysis-1.13.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8e3d3c3edaedb6a42893472ce56241b8f1735f2068cedc67def2213ad3af56c7",
"md5": "2af9f262f3b993d432d389897f613886",
"sha256": "c39c4e2fe31c0322d96d525bfca37641f08d2196136d7d3fcb2be553b184b11e"
},
"downloads": -1,
"filename": "mimic_iv_analysis-1.13.0.tar.gz",
"has_sig": false,
"md5_digest": "2af9f262f3b993d432d389897f613886",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.10",
"size": 225520,
"upload_time": "2025-10-07T16:54:02",
"upload_time_iso_8601": "2025-10-07T16:54:02.938533Z",
"url": "https://files.pythonhosted.org/packages/8e/3d/3c3edaedb6a42893472ce56241b8f1735f2068cedc67def2213ad3af56c7/mimic_iv_analysis-1.13.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-07 16:54:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "artinmajdi",
"github_project": "mimic_iv_analysis",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "mimic-iv-analysis"
}