[![CI](https://github.com/bihealth/cada-prio/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/bihealth/cada-prio/actions/workflows/main.yml)
[![codecov](https://codecov.io/gh/bihealth/cada-prio/graph/badge.svg?token=HIBwaG4eYM)](https://codecov.io/gh/bihealth/cada-prio)
[![Documentation Status](https://readthedocs.org/projects/cada-prio/badge/?version=latest)](https://cada-prio.readthedocs.io/en/latest/?badge=latest)
[![Pypi](https://img.shields.io/pypi/pyversions/cada-prio.svg)](https://pypi.org/project/cada-prio)
# CADA: The Next Generation
This is a re-implementation of the [CADA](https://github.com/Chengyao-Peng/CADA) method for phenotype-similarity prioritization.
- Free software: MIT license
- Documentation: https://cada-prio.readthedocs.io/en/latest/
- Discussion Forum: https://github.com/bihealth/cada-prio/discussions
- Bug Reports: https://github.com/bihealth/cada-prio/issues
## Running Hyperparameter Tuning
Install with the `tune` extra enabled:
```
pip install "cada-prio[tune]"
```
Run the tuning, for example on the "classic" model.
Because [optuna](https://optuna.org/) keeps its trials in a database, you can start several runs in parallel as long as they all use the same database.
In the example below, each run uses 4 CPUs and performs 1 trial.
```
cada-prio tune run-optuna \
sqlite:///local_data/cada-tune.sqlite \
--path-hgnc-json data/classic/hgnc_complete_set.json \
--path-hpo-genes-to-phenotype data/classic/genes_to_phenotype.all_source_all_freqs_etc.txt \
--path-hpo-obo data/classic/hp.obo \
--path-clinvar-phenotype-links data/classic/cases_train.jsonl \
--path-validation-links data/classic/cases_validate.jsonl \
--n-trials 1 \
--cpus=4
```
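The parallel setup described above can be sketched as a small wrapper script that starts several workers against the same database and waits for all of them. This is only a sketch: `N_WORKERS` and `DRY_RUN` are hypothetical variables introduced here for illustration and are not part of `cada-prio`. By default the script only prints the commands it would start; set `DRY_RUN=0` to actually launch them.

```shell
#!/bin/sh
# Sketch: launch several tuning workers that share one optuna database.
set -eu

# Hypothetical knobs for this wrapper (not cada-prio options).
N_WORKERS="${N_WORKERS:-3}"   # number of parallel workers
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = run them

run_worker() {
  # Same command line as in the example above.
  set -- cada-prio tune run-optuna \
    sqlite:///local_data/cada-tune.sqlite \
    --path-hgnc-json data/classic/hgnc_complete_set.json \
    --path-hpo-genes-to-phenotype data/classic/genes_to_phenotype.all_source_all_freqs_etc.txt \
    --path-hpo-obo data/classic/hp.obo \
    --path-clinvar-phenotype-links data/classic/cases_train.jsonl \
    --path-validation-links data/classic/cases_validate.jsonl \
    --n-trials 1 \
    --cpus=4
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

launch_workers() {
  pids=""
  i=0
  # Start every worker in the background and remember its process ID.
  while [ "$i" -lt "$N_WORKERS" ]; do
    run_worker &
    pids="$pids $!"
    i=$((i + 1))
  done
  # Wait for all workers; report failure if any of them failed.
  status=0
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}

launch_workers
```

With 3 workers at 4 CPUs each, the machine should have about 12 cores available. SQLite allows only one writer at a time, so for many concurrent workers a server-backed database may be the more robust choice for the shared storage.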
## Managing GitHub Project with Terraform
```
# export GITHUB_OWNER=bihealth
# export GITHUB_TOKEN=ghp_<thetoken>
# cd utils/terraform
# terraform init
# terraform import github_repository.cada-prio cada-prio
# terraform validate
# terraform fmt
# terraform plan
# terraform apply
```
# Changelog
### [0.6.1](https://www.github.com/bihealth/cada-prio/compare/v0.6.0...v0.6.1) (2023-11-16)
### Bug Fixes
* pinning python to 3.11 for build so we have setuptools ([#36](https://www.github.com/bihealth/cada-prio/issues/36)) ([54d4e8c](https://www.github.com/bihealth/cada-prio/commit/54d4e8c4be8bdd7ad6b9f839b22a798b5f527d27))
## [0.6.0](https://www.github.com/bihealth/cada-prio/compare/v0.5.0...v0.6.0) (2023-11-16)
### Features
* adding API prefix, OpenAPI and docs to REST server ([#35](https://www.github.com/bihealth/cada-prio/issues/35)) ([1a2f605](https://www.github.com/bihealth/cada-prio/commit/1a2f605bbaa28ed511b117efa04de256dcff149d))
* adding classic and current model ([#25](https://www.github.com/bihealth/cada-prio/issues/25)) ([44ddf24](https://www.github.com/bihealth/cada-prio/commit/44ddf24abf939eed8ad56b80cb1e90f60846a390))
## [0.5.0](https://www.github.com/bihealth/cada-prio/compare/v0.4.0...v0.5.0) (2023-09-18)
### Features
* adding "tune run-optuna" command ([#23](https://www.github.com/bihealth/cada-prio/issues/23)) ([6cc753b](https://www.github.com/bihealth/cada-prio/commit/6cc753b3b4f92aa75d961c3cf314e097d174ede0))
* reusable implementation of "tune train-eval" ([#21](https://www.github.com/bihealth/cada-prio/issues/21)) ([c80c4bf](https://www.github.com/bihealth/cada-prio/commit/c80c4bf1d69ff83bcb84b949cf3383746580a12d))
## [0.4.0](https://www.github.com/bihealth/cada-prio/compare/v0.3.1...v0.4.0) (2023-09-14)
### Features
* adding dump-graph to cli ([#18](https://www.github.com/bihealth/cada-prio/issues/18)) ([3aace31](https://www.github.com/bihealth/cada-prio/commit/3aace31166ddbd4357ae32283b6514a21404e0ef))
* adding param-opt command with single parameter evaluation ([#20](https://www.github.com/bihealth/cada-prio/issues/20)) ([83141c6](https://www.github.com/bihealth/cada-prio/commit/83141c6c4afe6efffc51fcde1ebdc92b5b3d0fbf))
* allow running with legacy model/graph data ([#16](https://www.github.com/bihealth/cada-prio/issues/16)) ([9d3cc7c](https://www.github.com/bihealth/cada-prio/commit/9d3cc7cea6efeac82b41fe11dfc9527ab4fe2913))
* embedding parameters can be provided via CLI and contain seeds ([#19](https://www.github.com/bihealth/cada-prio/issues/19)) ([bbd5d86](https://www.github.com/bihealth/cada-prio/commit/bbd5d86e879db94240093c20145b1c4c45edc69e))
### [0.3.1](https://www.github.com/bihealth/cada-prio/compare/v0.3.0...v0.3.1) (2023-09-13)
### Bug Fixes
* add missing line endings to hgnc_info.jsonl ([#13](https://www.github.com/bihealth/cada-prio/issues/13)) ([aa14b9b](https://www.github.com/bihealth/cada-prio/commit/aa14b9b948a0e9512c57567de2acaa65e9b132bc))
* properly parsing comma-separated list on REST API ([#14](https://www.github.com/bihealth/cada-prio/issues/14)) ([97fdfee](https://www.github.com/bihealth/cada-prio/commit/97fdfeee118d2e4985ca71433617fd9c470d0b49))
## [0.3.0](https://www.github.com/bihealth/cada-prio/compare/v0.2.1...v0.3.0) (2023-09-11)
### Features
* also adding gene-to-phen edges from HPO ([#9](https://www.github.com/bihealth/cada-prio/issues/9)) ([d5a8337](https://www.github.com/bihealth/cada-prio/commit/d5a833774b1488fb7e1f0650692aab2c3f753144))
### [0.2.1](https://www.github.com/bihealth/cada-prio/compare/v0.2.0...v0.2.1) (2023-09-08)
### Bug Fixes
* removing spurious debug print statement ([#7](https://www.github.com/bihealth/cada-prio/issues/7)) ([98e7443](https://www.github.com/bihealth/cada-prio/commit/98e74433001872517a4904bbe85fd021cc4ad613))
## [0.2.0](https://www.github.com/bihealth/cada-prio/compare/v0.1.0...v0.2.0) (2023-09-08)
### Features
* gene to phenotype links file can be gzipped ([#5](https://www.github.com/bihealth/cada-prio/issues/5)) ([66c48bf](https://www.github.com/bihealth/cada-prio/commit/66c48bf98c8bd73f8227c7cbd5687b4e74577ef8))
## 0.1.0 (2023-09-07)
### Features
* adding REST API server for prediction ([#4](https://www.github.com/bihealth/cada-prio/issues/4)) ([8bb7516](https://www.github.com/bihealth/cada-prio/commit/8bb75161097529932f371925fe860290098f0885))
* initial training implementation ([#1](https://www.github.com/bihealth/cada-prio/issues/1)) ([10d3a7c](https://www.github.com/bihealth/cada-prio/commit/10d3a7cb356b50a89fd8b1226ad66932dd5542f3))
* prioritization prediction with model ([#3](https://www.github.com/bihealth/cada-prio/issues/3)) ([48d504c](https://www.github.com/bihealth/cada-prio/commit/48d504c0bc373e1ae312773fa70a5a2e04d8dbed))
Raw data
{
"_id": null,
"home_page": "https://github.com/bihealth/cada-prio",
"name": "cada-prio",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "cada",
"author": "Manuel Holtgrewe",
"author_email": "manuel.holtgrewe@bih-charite.de",
"download_url": "https://files.pythonhosted.org/packages/e5/b3/ffa12d2d5b0b72d8ad174b7da68e012b1ee257234391fb50b99eafcf9685/cada-prio-0.6.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT license",
"summary": "Phenotype-based prioritization of variants with CADA",
"version": "0.6.1",
"project_urls": {
"Homepage": "https://github.com/bihealth/cada-prio"
},
"split_keywords": [
"cada"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e5b3ffa12d2d5b0b72d8ad174b7da68e012b1ee257234391fb50b99eafcf9685",
"md5": "03c33cc4fbb1a24ab06b5ba32fd6ea2d",
"sha256": "4b928f0257559378a1be6734654b9b03067ddbc9e7db9458b71fe1931c8d58d0"
},
"downloads": -1,
"filename": "cada-prio-0.6.1.tar.gz",
"has_sig": false,
"md5_digest": "03c33cc4fbb1a24ab06b5ba32fd6ea2d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 24002,
"upload_time": "2023-11-16T07:38:28",
"upload_time_iso_8601": "2023-11-16T07:38:28.051551Z",
"url": "https://files.pythonhosted.org/packages/e5/b3/ffa12d2d5b0b72d8ad174b7da68e012b1ee257234391fb50b99eafcf9685/cada-prio-0.6.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-11-16 07:38:28",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "bihealth",
"github_project": "cada-prio",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "cada-prio"
}