yaib

Name	yaib
Version	0.3.1
home_page	https://github.com/rvandewater/YAIB
Summary	Yet Another ICU Benchmark is a holistic framework for the automation of the development of clinical prediction models on ICU data. Users can create custom datasets, cohorts, prediction tasks, endpoints, and models.
upload_time	2023-06-13 11:38:03
maintainer
docs_url	None
author	Robin van de Water
requires_python
license	MIT license
keywords	benchmark mimic-iii eicu hirid clinical-ml machine-learning benchmark time-series mimic-iv patient-monitoring amsterdamumcdb clinical-data ehr icu ricu pyicu

![YAIB logo](https://github.com/rvandewater/YAIB/blob/development/docs/figures/yaib_logo.png?raw=true)

# 🧪 Yet Another ICU Benchmark

[![CI](https://github.com/rvandewater/YAIB/actions/workflows/ci.yml/badge.svg?branch=development)](https://github.com/rvandewater/YAIB/actions/workflows/ci.yml)
[![Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![Platform](https://img.shields.io/badge/platform-linux--64%20|%20win--64%20|%20osx--64-lightgrey)
[![arXiv](https://img.shields.io/badge/arXiv-2306.05109-b31b1b.svg)](http://arxiv.org/abs/2306.05109)
[![PyPI version shields.io](https://img.shields.io/pypi/v/yaib.svg)](https://pypi.python.org/pypi/yaib/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

[//]: # (TODO: add coverage once we have some tests )

Yet Another ICU Benchmark (YAIB) provides a framework for clinical machine learning experiments on Intensive Care Unit (ICU)
electronic health record (EHR) data.

We support the following datasets out of the box:

| **Dataset**                 | [MIMIC-III](https://physionet.org/content/mimiciii/) / [IV](https://physionet.org/content/mimiciv/) | [eICU-CRD](https://physionet.org/content/eicu-crd/) | [HiRID](https://physionet.org/content/hirid/1.1.1/) | [AUMCdb](https://doi.org/10.17026/dans-22u-f8vd) |
|-----------------------------|-----------------------------------------------------------------------------------------------------|-----------------------------------------------------|-----------------------------------------------------|--------------------------------------------------|
| **Admissions**              | 40k / 73k                                                                                           | 200k                                                | 33k                                                 | 23k                                              |
| **Version**                 | v1.4 / v2.2                                                                                         | v2.0                                                | v1.1.1                                              | v1.0.2                                           |  
| **Frequency** (time-series) | 1 hour                                                                                              | 5 minutes                                           | 2 / 5 minutes                                       | up to 1 minute                                   |
| **Originally published**    | 2015  / 2020                                                                                        | 2017                                                | 2020                                                | 2019                                             | 
| **Origin**                  | USA                                                                                                 | USA                                                 | Switzerland                                         | Netherlands                                      |

New datasets can also be added. We are currently working on a package to make this process as smooth as possible.
The benchmark is designed for operating on preprocessed parquet files.
<!-- We refer to  PyICU (in development)
or [ricu package](https://github.com/eth-mds/ricu) for generating these parquet files for particular cohorts and endpoints. -->

We provide five common tasks for clinical prediction by default:

| No  | Task                      | Frequency                 | Type                  | 
|-----|---------------------------|---------------------------|-----------------------|
| 1   | ICU Mortality             | Once per stay (after 24H) | Binary Classification |
| 2   | Acute Kidney Injury (AKI) | Hourly (within 6H)        | Binary Classification |
| 3   | Sepsis                    | Hourly (within 6H)        | Binary Classification |
| 4   | Kidney Function (KF)      | Once per stay             | Regression            |
| 5   | Length of Stay (LoS)      | Hourly (within 7D)        | Regression            |

New tasks can be easily added.
To get started right away, we include the eICU and MIMIC-III demo datasets in our repository.

The following repositories may be relevant as well:

- [YAIB-cohorts](https://github.com/rvandewater/YAIB-cohorts): Cohort generation for YAIB.
- [YAIB-models](https://github.com/rvandewater/YAIB-models): Pretrained models for YAIB.
- [ReciPys](https://github.com/rvandewater/ReciPys): Preprocessing package for YAIB pipelines.

For all YAIB-related repositories, please see: https://github.com/stars/rvandewater/lists/yaib.

# 📄 Paper

To reproduce the benchmarks in our paper, see the [ML reproducibility document](PAPER.md).
If you use this code in your research, please cite the following publication:

```
@article{vandewaterYetAnotherICUBenchmark2023,
	title = {Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML},
	shorttitle = {Yet Another ICU Benchmark},
	url = {http://arxiv.org/abs/2306.05109},
	language = {en},
	urldate = {2023-06-09},
	publisher = {arXiv},
	author = {Robin van de Water and Hendrik Schmidt and Paul Elbers and Patrick Thoral and Bert Arnrich and Patrick Rockenschaub},
	month = jun,
	year = {2023},
	note = {arXiv:2306.05109 [cs]},
	keywords = {Computer Science - Machine Learning},
}
```

The paper is also available on arXiv: [2306.05109](https://arxiv.org/abs/2306.05109).

# 💿 Installation

YAIB is currently best installed from source; an early PyPI release is also available.

## Installation from source
First, clone this repository using git and change into its directory:

```
git clone https://github.com/rvandewater/YAIB.git
cd YAIB
```

Please note the branch: the newest features and fixes are available on the `development` branch:

```
git checkout development
```

YAIB can be installed using a conda environment (preferred) or pip. Below are the three CLI commands to install YAIB 
using **conda**.

The first command will install an environment based on Python 3.10.

```
conda env update -f <environment.yml|environment_mps.yml>
```

> Use `environment.yml` on x86 hardware and `environment_mps.yml` on Macs with Metal Performance Shaders.

We then activate the environment and install a package called `icu-benchmarks`, after which YAIB should be operational.

```
conda activate yaib
pip install -e .
```

If you want to install the icu-benchmarks package with **pip**, execute the command below:

```
pip install torch numpy && pip install -e .
```
After installation, please check that your PyTorch version works with CUDA (if available) to ensure the best performance.
YAIB automatically lists the available processors in its log files at initialization.
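
As a quick manual check before training, the sketch below reports which accelerators PyTorch can see. It relies only on
`torch.cuda.is_available()` and `torch.backends.mps.is_available()`; the helper name `available_devices` is ours for
illustration and is not part of YAIB.

```python
def available_devices():
    """Return the names of compute devices PyTorch can use, CPU first."""
    devices = ["cpu"]
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment, so only the CPU is reported.
        return devices
    if torch.cuda.is_available():
        devices.append("cuda")
    # The MPS backend only exists in newer PyTorch builds, hence the getattr guard.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        devices.append("mps")
    return devices


if __name__ == "__main__":
    print(available_devices())
```

If the list contains only `cpu` on a machine with a GPU, the installed PyTorch build most likely does not match the local
CUDA setup.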

# 👩‍💻 Usage

Please refer to [our wiki](https://github.com/rvandewater/YAIB/wiki) for detailed information on how to use YAIB.

## Quickstart 🚀 (demo data)

In the folder `demo_data` we provide processed publicly available demo datasets from eICU and MIMIC with the necessary labels
for `Mortality at 24h`, `Sepsis`, `Acute Kidney Injury`, `Kidney Function`, and `Length of Stay`.

If you do not yet have access to the ICU datasets, you can run the following commands to train models for the included demo
cohorts:

```
wandb sweep --verbose experiments/demo_benchmark_classification.yml
wandb sweep --verbose experiments/demo_benchmark_regression.yml
```

```train
wandb agent <sweep_id>
```

> Tip: You can choose to run each of the configurations on a SLURM cluster instance by `wandb agent --count 1 <sweep_id>`

> Note: You will need to have a wandb account and be logged in to run the above commands.

## Getting the datasets

HiRID, eICU, and MIMIC-IV can be accessed through [PhysioNet](https://physionet.org/). A guide to this process can be
found [here](https://eicu-crd.mit.edu/gettingstarted/access/).
AUMCdb can be accessed through a separate access [procedure](https://github.com/AmsterdamUMC/AmsterdamUMCdb). We are not
involved in the access procedure and cannot answer any requests for data access.

## Cohort creation

Since the datasets were created independently of each other, they do not share the same data structure or data identifiers.
To make them interoperable, use the preprocessing utilities provided by the [ricu package](https://github.com/eth-mds/ricu).
ricu pre-defines a large number of clinical concepts and how to load them from a given dataset, providing a common interface
to the data, which this benchmark uses. Please refer to our [cohort definition](https://github.com/rvandewater/YAIB-cohorts)
code for generating the cohorts using our Python interface for ricu.
Once you have gained access to the datasets and generated the cohorts, you can run the benchmark.

# 👟 Running YAIB

## Preprocessing and Training

The following command runs training and evaluation on the MIMIC demo dataset for (binary) mortality prediction at 24h with
the LGBMClassifier. `min_child_samples` is reduced because of the small amount of training data. We generate cache files and,
if available, load existing cache files.

```
icu-benchmarks train \
    -d demo_data/mortality24/mimic_demo \
    -n mimic_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    -hp LGBMClassifier.min_child_samples=10 \
    --generate_cache \
    --load_cache \
    --seed 2222 \
    -s 2222 \
    -l ../yaib_logs/ \
    --tune
```

> For a list of available flags, run `icu-benchmarks train -h`.

> Run with `PYTORCH_ENABLE_MPS_FALLBACK=1` on Macs with Metal Performance Shaders.

[//]: # (> Please note that, for Windows based systems, paths need to be formatted differently, e.g: ` r"\..\data\mortality_seq\hirid"`.)
> On Windows systems, the line-continuation character (\\) needs to be replaced by (^) in Command Prompt or (`) in PowerShell.
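
To sidestep shell-specific line continuation altogether, the same call can be assembled as an argument list in Python. This
is only a sketch: the flags are copied verbatim from the command above, while the names `TRAIN_ARGS` and `run` are ours and
not part of YAIB. By default it prints the command without executing it.

```python
import shlex
import subprocess

# Arguments copied verbatim from the training command above.
TRAIN_ARGS = [
    "icu-benchmarks", "train",
    "-d", "demo_data/mortality24/mimic_demo",
    "-n", "mimic_demo",
    "-t", "BinaryClassification",
    "-tn", "Mortality24",
    "-m", "LGBMClassifier",
    "-hp", "LGBMClassifier.min_child_samples=10",
    "--generate_cache",
    "--load_cache",
    "--seed", "2222",
    "-s", "2222",
    "-l", "../yaib_logs/",
    "--tune",
]


def run(args, dry_run=True):
    """Print the command as one line; execute it only when dry_run is False."""
    printable = shlex.join(args)
    print(printable)
    if not dry_run:
        # Passing a list avoids shell quoting and continuation characters.
        subprocess.run(args, check=True)
    return printable


if __name__ == "__main__":
    run(TRAIN_ARGS)
```

Call `run(TRAIN_ARGS, dry_run=False)` from the repository root, inside the activated environment, to actually start training.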


Alternatively, the easiest way to train all the models in the paper is to run these commands from the repository root:

```train
wandb sweep --verbose experiments/benchmark_classification.yml
wandb sweep --verbose experiments/benchmark_regression.yml
```

This creates two WandB hyperparameter sweeps, one for the classification tasks and one for the regression tasks, which
together cover all the models in the paper. You can then run the following command for each sweep to train the models:

```train
wandb agent <sweep_id>
```

> Tip: You can choose to run each of the configurations on a SLURM cluster instance by `wandb agent --count 1 <sweep_id>`

> Note: You will need to have a wandb account and be logged in to run the above commands.

## Evaluate

It is possible to evaluate a model trained on another dataset. In this case, the source dataset is the demo data from MIMIC and
the target is the eICU demo:

```
icu-benchmarks evaluate \
    -d demo_data/mortality24/eicu_demo \
    -n eicu_demo \
    -t BinaryClassification \
    -tn Mortality24 \
    -m LGBMClassifier \
    --generate_cache \
    --load_cache \
    -s 2222 \
    -l ../yaib_logs \
    -sn mimic \
    --source-dir ../yaib_logs/mimic_demo/Mortality24/LGBMClassifier/2022-12-12T15-24-46/fold_0
```

## Models

We provide several existing machine learning models that are commonly used for multivariate time-series data.
`pytorch` is used for the deep learning models, `lightgbm` for the boosted tree approaches, and `sklearn` for other classical
machine learning models.
The benchmark provides (among others) the following built-in models:

- [Logistic Regression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html?highlight=logistic+regression):
  Standard linear model for classification.
- [Elastic Net](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html): Linear regression with
  combined L1 and L2 priors as regularizer.
- [LightGBM](https://proceedings.neurips.cc/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf): Efficient gradient
  boosting trees.
- [Long Short-term Memory (LSTM)](https://ieeexplore.ieee.org/document/818041): The most commonly used type of recurrent neural
  network for long sequences.
- [Gated Recurrent Unit (GRU)](https://arxiv.org/abs/1406.1078): An extension to LSTM which showed
  improvements ([paper](https://arxiv.org/abs/1412.3555)).
- [Temporal Convolutional Networks (TCN)](https://arxiv.org/pdf/1803.01271): 1D convolution approach to sequence data. By
  using dilated convolutions to extend the receptive field of the network, it has shown great performance on long-term
  dependencies.
- [Transformers](https://papers.nips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf): The most common
  attention-based approach.
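
To make the TCN point about dilation concrete, the sketch below computes how many past time steps influence one output
step in a stack of dilated 1D convolutions. The kernel size and number of levels are illustrative values, not YAIB
defaults, and `receptive_field` is our own helper.

```python
def receptive_field(kernel_size, dilations, convs_per_level=1):
    """Number of input time steps that can influence a single output step.

    Each convolution with dilation d widens the receptive field by
    (kernel_size - 1) * d time steps.
    """
    if kernel_size < 1 or convs_per_level < 1:
        raise ValueError("kernel_size and convs_per_level must be at least 1")
    return 1 + convs_per_level * (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    doubling = [2 ** i for i in range(6)]  # dilations 1, 2, 4, 8, 16, 32
    constant = [1] * 6                     # six undilated layers for comparison
    print(receptive_field(3, doubling))    # 1 + 2 * 63 = 127
    print(receptive_field(3, constant))    # 1 + 2 * 6 = 13
```

With the same six layers and kernel size, doubling the dilation at every level covers 127 time steps instead of 13, which
is why dilation helps with long-term dependencies at little extra cost.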

# 🛠️ Development

To adapt YAIB to your own use case, you can use
the [development information](https://github.com/rvandewater/YAIB/wiki/Contribution-and-development) page as a reference.
We appreciate contributions to the project. Please read the [contribution guidelines](CONTRIBUTING.MD) before submitting a pull
request.

# Acknowledgements

We do not own any of the datasets used in this benchmark. This project uses heavily adapted components of
the [HiRID benchmark](https://github.com/ratschlab/HIRID-ICU-Benchmark/). We thank the authors for providing this codebase and
encourage further development to benefit the scientific community. The demo datasets have been released under
an [Open Data Commons Open Database License (ODbL)](https://opendatacommons.org/licenses/odbl/1-0/).

# License

This source code is released under the MIT license, included [here](LICENSE). We do not own any of the datasets used or
included in this repository.

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/rvandewater/YAIB",
    "name": "yaib",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "benchmark mimic-iii eicu hirid clinical-ml machine-learning benchmark time-series mimic-iv patient-monitoring amsterdamumcdb clinical-data ehr icu ricu pyicu",
    "author": "Robin van de Water",
    "author_email": "robin.vandewater@hpi.de",
    "download_url": "https://files.pythonhosted.org/packages/a6/1a/8e0894a3d58857e526cb5f92051a511617036234809f61136896dffc7257/yaib-0.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT license",
    "summary": "Yet Another ICU Benchmark is a holistic framework for the automation of the development of clinical prediction models on ICU data. Users can create custom datasets, cohorts, prediction tasks, endpoints, and models.",
    "version": "0.3.1",
    "project_urls": {
        "Homepage": "https://github.com/rvandewater/YAIB"
    },
    "split_keywords": [
        "benchmark",
        "mimic-iii",
        "eicu",
        "hirid",
        "clinical-ml",
        "machine-learning",
        "benchmark",
        "time-series",
        "mimic-iv",
        "patient-monitoring",
        "amsterdamumcdb",
        "clinical-data",
        "ehr",
        "icu",
        "ricu",
        "pyicu"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dcd3d8991c19beba26f80cb688466361f3b051a3fe6b50cb431c9e6577882825",
                "md5": "95a49d12d3c982ed63fda5d25afa961b",
                "sha256": "79a81c6359ebac87b588e61dd4d54006a51ca63969e7784b92d8415e1ce6e53b"
            },
            "downloads": -1,
            "filename": "yaib-0.3.1-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "95a49d12d3c982ed63fda5d25afa961b",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 16301,
            "upload_time": "2023-06-13T11:38:01",
            "upload_time_iso_8601": "2023-06-13T11:38:01.171118Z",
            "url": "https://files.pythonhosted.org/packages/dc/d3/d8991c19beba26f80cb688466361f3b051a3fe6b50cb431c9e6577882825/yaib-0.3.1-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a61a8e0894a3d58857e526cb5f92051a511617036234809f61136896dffc7257",
                "md5": "d120880599ea7ffe67e14fffdc9e877e",
                "sha256": "ede46c1c338092549367518b53e95c6b186f1daecc70b0960e701fc0413adb83"
            },
            "downloads": -1,
            "filename": "yaib-0.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "d120880599ea7ffe67e14fffdc9e877e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 475617,
            "upload_time": "2023-06-13T11:38:03",
            "upload_time_iso_8601": "2023-06-13T11:38:03.864305Z",
            "url": "https://files.pythonhosted.org/packages/a6/1a/8e0894a3d58857e526cb5f92051a511617036234809f61136896dffc7257/yaib-0.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-13 11:38:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "rvandewater",
    "github_project": "YAIB",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "yaib"
}
