# Ontology-based rare disease common data model
Welcome to the repo of the ontology-based rare disease common data model (RD-CDM) harmonising international registry use, HL7® FHIR®, and the GA4GH Phenopacket Schema.
**Latest docs:** https://rd-cdm.readthedocs.io/en/latest/
### Manuscript
The corresponding paper for RD-CDM v2.0.0 has been published in *Scientific Data*:
https://www.nature.com/articles/s41597-025-04558-z
---
## Table of Contents
- [Project Description](#project-description)
- [What you get from PyPI](#what-you-get-from-pypi)
- [Features](#features)
- [Installation](#installation)
- [Development install](#development-install)
- [CLI tools](#cli-tools)
- [Contributing and Contact](#contributing-and-contact)
- [RareLink](#rarelink)
- [Resources](#resources)
- [License](#license)
- [Citing](#citing)
- [Acknowledgements](#acknowledgements)
---
## Project Description
The ontology-based RD-CDM harmonizes rare disease data capture across registries. It integrates ERDRI-CDS, HL7 FHIR, and GA4GH Phenopacket Schema to support interoperable data for research and care. RD-CDM v2.0.x comprises 78 data elements covering formal criteria, personal information, patient status, disease, genetic findings, phenotypic findings, and family history.
---
## What you get from PyPI
Installing `rd-cdm` from PyPI provides:
- **Schema**
  - `src/rd_cdm/schema/rd_cdm.yaml`
- **Versioned instances (data packs)**
  - `src/rd_cdm/instances/v2_0_1/*.yaml` (e.g., `code_systems.yaml`, `data_elements.yaml`, `value_sets.yaml`)
  - merged file: `src/rd_cdm/instances/v2_0_1/rd_cdm_v2_0_1.yaml`
  - exports (if present or generated locally):
    - `src/rd_cdm/instances/v2_0_1/jsons/*.json`
    - `src/rd_cdm/instances/v2_0_1/csvs/*.csv`
- **Generated Python & Pydantic classes (LinkML)**
  - `src/rd_cdm/python_classes/rd_cdm.py` (LinkML runtime dataclasses)
  - `src/rd_cdm/python_classes/rd_cdm_pydantic.py` (generated from the schema via LinkML’s Pydantic generator)
- **Utilities / CLI entry points**
  - `rdcdm-merge` – merge instance parts into `rd_cdm_vX_Y_Z.yaml`
  - `rdcdm-json` – per-file JSON export + combined `rd_cdm_vX_Y_Z.json`
  - `rdcdm-csv` – per-file CSV export + combined `rd_cdm_vX_Y_Z.csv`
  - `rdcdm-validate` – validate ontology codes via BioPortal
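The file layout above can be navigated with a few lines of `pathlib`. The following is a minimal sketch, not part of the package: the helper names `version_dir_name` and `data_pack_paths` and the `package_root` argument are hypothetical, and the code only assumes the directory naming shown in the list (`instances/vX_Y_Z/`, `rd_cdm_vX_Y_Z.yaml`, `jsons/`, `csvs/`).

```python
from pathlib import Path


def version_dir_name(version: str) -> str:
    """Turn a version string such as '2.0.1' into the directory name 'v2_0_1'."""
    return "v" + version.strip().lstrip("v").replace(".", "_")


def data_pack_paths(package_root: Path, version: str) -> dict:
    """Return the expected locations of one versioned data pack.

    `package_root` is a hypothetical argument: the folder that contains
    `instances/` (for example `src/rd_cdm` in a clone of the repository).
    """
    tag = version_dir_name(version)
    vdir = package_root / "instances" / tag
    return {
        # Per-topic parts, excluding the merged rd_cdm_vX_Y_Z.yaml file.
        "parts": sorted(
            p for p in vdir.glob("*.yaml") if not p.name.startswith("rd_cdm_")
        ),
        "merged": vdir / f"rd_cdm_{tag}.yaml",
        "jsons": vdir / "jsons",
        "csvs": vdir / "csvs",
    }
```

In a clone of the repository, `data_pack_paths(Path("src/rd_cdm"), "2.0.1")` would point at the files listed above; whether the export folders exist depends on whether the exports were shipped or generated locally.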
---
## Features
- **Interoperability**: Aligns with HL7 FHIR v4.0.1 and GA4GH Phenopacket v2.0
- **Ontology-driven**: Uses SNOMED CT, LOINC, NCIT, MONDO, OMIM, HPO, and more
- **Modular**: Clear separation of schema, instances, and exports
- **Versioned data**: Instances shipped and resolved per version (e.g., `v2_0_1`)
- **Tooling**: Merge, export, and validation utilities with simple CLIs
- **(Optional) Pydantic models**: Strict runtime validation generated from LinkML
---
## Installation
From PyPI:
```bash
pip install rd-cdm
```
Optional extras for testing/docs:
```bash
# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install "rd-cdm[test]"   # pytest, etc.
# or
pip install "rd-cdm[docs]"
```
### Development install
```bash
git clone https://github.com/BIH-CEI/rd-cdm.git
cd rd-cdm
# (Recommended) create a venv
python -m venv .venv && source .venv/bin/activate
pip install -U pip
pip install -e ".[test]"
pytest -q
```
> We use a **src/** layout. If you run tools directly, ensure `PYTHONPATH=src` or use the installed CLI entry points shown below.
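To confirm that the entry points were actually installed into the active environment, a quick check against the `PATH` can help. This is a small sketch using only the standard library; the function name `missing_commands` is illustrative, and the command names are the ones listed in this README.

```python
import shutil

# Command names as listed in this README.
RDCDM_COMMANDS = ("rdcdm-merge", "rdcdm-json", "rdcdm-csv", "rdcdm-validate")


def missing_commands(commands=RDCDM_COMMANDS):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All RD-CDM entry points are available.")
```

If commands are reported as missing, the virtual environment is probably not activated or the package was not installed into it.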
---
## CLI tools
After installation you should have these commands:
```bash
# Merge the versioned parts into rd_cdm_vX_Y_Z.yaml (auto-resolves latest if not given)
rdcdm-merge # or: rdcdm-merge --version 2.0.1
# Export JSON (per-file .json + combined rd_cdm_vX_Y_Z.json)
rdcdm-json # or: rdcdm-json -v 2.0.1
# Export CSV (per-file .csv + combined rd_cdm_vX_Y_Z.csv)
rdcdm-csv # or: rdcdm-csv -v 2.0.1
# Validate merged instance file against ontologies via BioPortal
rdcdm-validate # or: rdcdm-validate -v 2.0.1 (Note: set up BioPortal API key for this)
```
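The comments above mention that the latest version is resolved automatically when none is given. How the CLI does this internally is not described here; one plausible approach, sketched below under that assumption, is to scan the `instances/` folder for `vX_Y_Z` directories and pick the highest version numerically. The function name `latest_version_dir` is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Matches directory names such as v2_0_1.
_VERSION_DIR = re.compile(r"^v(\d+)_(\d+)_(\d+)$")


def latest_version_dir(instances_root: Path) -> Optional[Path]:
    """Pick the vX_Y_Z directory with the highest numeric version, or None."""
    candidates = []
    for child in instances_root.iterdir():
        match = _VERSION_DIR.match(child.name)
        if child.is_dir() and match:
            # Compare as integer tuples so v2_0_10 sorts after v2_0_9.
            version = tuple(int(part) for part in match.groups())
            candidates.append((version, child))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where `v2_0_10` would sort before `v2_0_9` lexicographically.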
### BioPortal API Key Setup for Validation
The `rdcdm-validate` command uses the BioPortal API to check ontology term validity. This requires an API key to be set as an environment variable.
#### Get an API key:
- Sign up (or log in) at https://bioportal.bioontology.org/accounts/new
- Go to your account settings and copy your API key.
- Set the API key in your environment:
#### macOS / Linux (bash/zsh):
```bash
export BIOPORTAL_API_KEY="your-key-here"
```
#### Windows (PowerShell):
```powershell
# setx stores the variable for future sessions; open a new terminal afterwards
setx BIOPORTAL_API_KEY "your-key-here"
```
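To verify that the key is visible to Python and to see how it is typically passed to BioPortal, the sketch below reads `BIOPORTAL_API_KEY` and builds (but does not send) a request to the public BioPortal search endpoint. This illustrates the BioPortal REST API in general; it is an assumption that `rdcdm-validate` works along similar lines, and the function names `get_api_key` and `build_search_request` are hypothetical.

```python
import os
import urllib.parse
import urllib.request

# Public BioPortal REST search endpoint.
BIOPORTAL_SEARCH_URL = "https://data.bioontology.org/search"


def get_api_key(env=os.environ) -> str:
    """Read the API key from the environment and fail early if it is missing."""
    key = env.get("BIOPORTAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "BIOPORTAL_API_KEY is not set; export it before running validation."
        )
    return key


def build_search_request(term: str, ontology: str, api_key: str) -> urllib.request.Request:
    """Build a search request for `term` restricted to one ontology acronym."""
    query = urllib.parse.urlencode({"q": term, "ontologies": ontology})
    return urllib.request.Request(
        f"{BIOPORTAL_SEARCH_URL}?{query}",
        # BioPortal accepts the key in an Authorization header of this form.
        headers={"Authorization": f"apikey token={api_key}"},
    )


if __name__ == "__main__":
    request = build_search_request("Seizure", "HP", get_api_key())
    print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` requires network access and a valid key, so it is left out of the sketch.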
---
## Contributing and Contact
The RD-CDM is a community-driven effort and we invite open and international
collaboration. Please feel free to create issues, discuss features,
or submit pull requests to help enhance this project. For larger contributions,
consider reaching out to discuss collaboration opportunities.
Please find more information on how to contact us and contribute
in the [`Contribution` section of our documentation](https://rd-cdm.readthedocs.io/en/latest/contributing.html).
## RareLink
RareLink is a novel rare disease framework in REDCap linking international
registries, FHIR, and Phenopackets based on the RD-CDM. It is designed to
support the collection of harmonized data for rare disease research
across any REDCap project worldwide and allows for the preconfigured export of
the RD-CDM data in FHIR and Phenopackets formats.
For more information on RareLink, please see:
- [RareLink Documentation](https://rarelink.readthedocs.io/en/latest/index.html)
- [RareLink GitHub](https://github.com/BIH-CEI/rarelink)
## Resources
### Ontologies
- Human Phenotype Ontology [🔗](http://www.human-phenotype-ontology.org)
- Monarch Initiative Disease Ontology [🔗](https://mondo.monarchinitiative.org/)
- Online Mendelian Inheritance in Man [🔗](https://www.omim.org/)
- Orphanet Rare Disease Ontology [🔗](https://www.orpha.net/)
- SNOMED CT [🔗](https://www.snomed.org/snomed-ct)
- ICD-11 [🔗](https://icd.who.int/en)
- ICD-10-CM [🔗](https://www.cdc.gov/nchs/icd/icd10cm.htm)
- National Center for Biotechnology Information Taxonomy [🔗](https://www.ncbi.nlm.nih.gov/taxonomy)
- Logical Observation Identifiers Names and Codes [🔗](https://loinc.org/)
- HUGO Gene Nomenclature Committee [🔗](https://www.genenames.org/)
- Gene Ontology [🔗](https://geneontology.org/)
- NCI Thesaurus OBO Edition [🔗](https://obofoundry.org/ontology/ncit.html)
For the versions used in a specific RD-CDM version, please see the
[resources in our documentation](https://rd-cdm.readthedocs.io/en/latest/resources/resources_file.html).
### Submodules
- [RareLink](https://github.com/BIH-CEI/RareLink)
## License
This project is licensed under the terms of the [MIT License](https://github.com/BIH-CEI/rd-cdm/blob/develop/LICENSE).
## Citing
If you use the model in your research, please cite our article, and do not hesitate to reach out:
> Graefe, A.S.L., Hübner, M.R., Rehburg, F. et al. An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets. Sci Data 12, 234 (2025). https://doi.org/10.1038/s41597-025-04558-z
## Acknowledgements
We would like to thank all the authors involved in the development of the RD-CDM.
---
- Authors:
  - [Adam SL Graefe](https://github.com/aslgraefe)
  - [Filip Rehburg](https://github.com/frehburg)
  - [Samer Alkarkoukly](https://github.com/alkarkoukly)
  - [Daniel Danis](https://github.com/ielis)
  - [Peter N. Robinson](https://github.com/pnrobinson)
  - Oya Beyan
  - Sylvia Thun