# Cyrius: WGS-based CYP2D6 genotyper
Cyrius is a tool that genotypes CYP2D6 from a whole-genome sequencing (WGS) BAM file. It uses a novel method to overcome the high sequence similarity between CYP2D6 and its pseudogene paralog CYP2D7, and can therefore accurately detect all star alleles, particularly those that contain structural variants. Please refer to our [paper](https://www.nature.com/articles/s41397-020-00205-5) for details about the method.
Cyrius has been integrated into [Illumina DRAGEN Bio-IT Platform since v3.7](https://support.illumina.com/content/dam/illumina-support/help/Illumina_DRAGEN_Bio_IT_Platform_v3_7_1000000141465/Content/SW/Informatics/Dragen/CYP2D6_Caller_fDG.htm).
## Running the program
This Python3 program can be run as follows:
```bash
python3 star_caller.py --manifest MANIFEST_FILE \
--genome [19/37/38] \
--prefix OUTPUT_FILE_PREFIX \
--outDir OUTPUT_DIRECTORY \
--threads NUMBER_THREADS
```
The manifest is a text file in which each line lists the absolute path to one input BAM/CRAM file.
For CRAM input, we suggest providing the path to the reference FASTA file with `--reference` in the command.
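As an illustration of the manifest format described above, the following minimal sketch collects the BAM/CRAM files in one directory and writes their absolute paths, one per line. The function name `write_manifest` and its parameters are hypothetical helpers, not part of Cyrius.

```python
from pathlib import Path


def write_manifest(input_dir, manifest_path, suffixes=(".bam", ".cram")):
    """Write one absolute BAM/CRAM path per line; return the paths written.

    Hypothetical helper (not part of Cyrius). Index files such as
    sample.bam.bai are skipped because their suffix is ".bai".
    """
    paths = sorted(
        p.resolve()
        for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
    Path(manifest_path).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

The resulting file can then be passed to `--manifest`.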
## Interpreting the output
The program produces a `.tsv` file in the directory specified by `--outDir`.
The fields are explained below:
| Fields in tsv | Explanation |
|:------------------|:---------------------------------------------------------------|
| Sample | Sample name |
| Genotype | Genotype call |
| Filter | Filters on the genotype call |
A genotype of "None" indicates a no-call.
There are currently four possible values for the Filter column:
- `PASS`: a passing, confident call.
- `More_than_one_possible_genotype`: In rare cases, Cyrius reports two possible genotypes that it cannot distinguish from each other. These are different sets of star alleles that produce the same set of variants, which cannot be phased with short reads, e.g. \*1/\*46 and \*43/\*45. The two possible genotypes are reported together, separated by a semicolon.
- `Not_assigned_to_haplotypes`: In a very small proportion of samples with more than two copies of CYP2D6, Cyrius calls a set of star alleles that can be assigned to haplotypes in more than one way. In this case, Cyrius reports the star alleles joined by underscores. For example, \*1_\*2_\*68 is reported when the actual genotype could be \*1+\*68/\*2, \*2+\*68/\*1 or \*1+\*2/\*68.
- `LowQ_high_CN`: In rare cases, at high copy number (>=6 copies of CYP2D6), Cyrius uses a less strict approximation when calling copy numbers to account for the higher noise in depth, so the genotype call may have lower confidence than usual.
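The Filter values above can be handled programmatically. The sketch below is one possible way to read the .tsv, assuming the file starts with a header row naming the `Sample`, `Genotype` and `Filter` columns (an assumption based on the field table, not confirmed here). The function `read_calls`, the output keys, and the example rows are all made up for illustration; in particular, the Filter value shown for the no-call row is a placeholder.

```python
import csv
import io


def read_calls(handle):
    """Parse a Cyrius-style TSV into one dict per sample (hypothetical helper)."""
    calls = []
    for row in csv.DictReader(handle, delimiter="\t"):
        genotype, filt = row["Genotype"], row["Filter"]
        if genotype == "None":
            candidates = []                    # no-call
        elif filt == "More_than_one_possible_genotype":
            candidates = genotype.split(";")   # two genotypes in one field
        else:
            candidates = [genotype]
        unphased = genotype.split("_") if filt == "Not_assigned_to_haplotypes" else []
        calls.append({
            "sample": row["Sample"],
            "filter": filt,
            "candidates": candidates,
            # Star alleles that could not be assigned to haplotypes.
            "unphased_alleles": unphased,
            "confident": filt == "PASS" and bool(candidates),
        })
    return calls


# Made-up rows for illustration only.
example = (
    "Sample\tGenotype\tFilter\n"
    "S1\t*1/*2\tPASS\n"
    "S2\t*1/*46;*43/*45\tMore_than_one_possible_genotype\n"
    "S3\tNone\tNone\n"
    "S4\t*1_*2_*68\tNot_assigned_to_haplotypes\n"
)
for call in read_calls(io.StringIO(example)):
    print(call["sample"], call["filter"], call["candidates"])
```

Keeping the ambiguous genotypes as a list rather than picking one leaves the choice to downstream analysis.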
A .json file is also produced that contains more information about each sample.
| Fields in json | Explanation |
|:------------------|:---------------------------------------------------------------|
| Coverage_MAD | Median absolute deviation of depth, measure of sample quality |
| Median_depth | Sample median depth |
| Total_CN | Total copy number of CYP2D6+CYP2D7 |
| Total_CN_raw | Raw normalized depth of CYP2D6+CYP2D7 |
| Spacer_CN | Copy number of CYP2D7 spacer region |
| Spacer_CN_raw | Raw normalized depth of CYP2D7 spacer region |
| Variants_called | Targeted variants called in CYP2D6 |
| CNV_group | An identifier for the sample's CNV/fusion status |
| Variant_raw_count | Supporting reads for each variant |
| Raw_star_allele | Raw star allele call |
| d67_snp_call | CYP2D6 copy number call at CYP2D6/7 differentiating sites |
| d67_snp_raw | Raw CYP2D6 copy number at CYP2D6/7 differentiating sites |
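The fields above can be used for a quick quality review. The sketch below assumes the top-level JSON object maps each sample name to a dictionary of the fields in the table; that layout is an assumption and should be checked against an actual output file. The function `summarize` and the `low_depth` flag are hypothetical, and the default threshold of 30 simply reuses the 30x depth suggested in the troubleshooting notes.

```python
import json


def summarize(records, min_depth=30.0):
    """Return depth and copy-number fields per sample, flagging low depth.

    `records` is assumed to map sample name -> dict of JSON fields
    (an assumed layout, not confirmed by the documentation).
    """
    summary = {}
    for sample, fields in records.items():
        depth = fields.get("Median_depth")
        summary[sample] = {
            "Median_depth": depth,
            "Coverage_MAD": fields.get("Coverage_MAD"),
            "Total_CN": fields.get("Total_CN"),
            # True only when a depth value exists and is below the threshold.
            "low_depth": depth is not None and depth < min_depth,
        }
    return summary


# Usage with a real file (the path is a placeholder):
# with open("OUTPUT_DIRECTORY/OUTPUT_FILE_PREFIX.json") as fh:
#     print(summarize(json.load(fh)))
```

Samples flagged as `low_depth` are candidates for the first no-call cause listed under Troubleshooting.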
## Troubleshooting
Common causes for Cyrius to produce no-calls are:
- Low sequencing depth. We suggest a sequencing depth of 30x, which is standard practice for clinical genome sequencing.
- The depth of the CYP2D6/CYP2D7 region is much lower than the rest of the genome, most likely because reads are aligned to alternative contigs. If your reference genome includes alternative contigs, we suggest alt-aware alignment so that alignments to the primary assembly take precedence over alternative contigs.
- The majority of reads in the CYP2D6/CYP2D7 region have a mapping quality of zero. This is probably caused by post-processing tools such as bwa-postalt that modify the mapQ in the BAM. We recommend using the BAM file from before such post-processing steps as input to Cyrius.
Raw data
{
"_id": null,
"home_page": "https://github.com/illumina/Cyrius",
"name": "cyrius",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "CYP2D6",
"author": "Xiao Chen",
"author_email": "xchen2@illumina.com",
"download_url": "https://files.pythonhosted.org/packages/00/ce/2ad108c212e4676926f2420f84111cc4f7f286158148a044bcd6c2f264aa/cyrius-1.1.1.tar.gz",
"platform": null,
"description": "# Cyrius: WGS-based CYP2D6 genotyper\nCyrius is a tool to genotype CYP2D6 from a whole-genome sequencing (WGS) BAM file. Cyrius uses a novel method to solve the problems caused by the high sequence similarity with the pseudogene paralog CYP2D7 and thus is able to detect all star alleles, particularly those that contain structural variants, accurately. Please refer to our [paper](https://www.nature.com/articles/s41397-020-00205-5) for details about the method. \n\nCyrius has been integrated into [Illumina DRAGEN Bio-IT Platform since v3.7](https://support.illumina.com/content/dam/illumina-support/help/Illumina_DRAGEN_Bio_IT_Platform_v3_7_1000000141465/Content/SW/Informatics/Dragen/CYP2D6_Caller_fDG.htm).\n\n## Running the program\n\nThis Python3 program can be run as follows:\n```bash\npython3 star_caller.py --manifest MANIFEST_FILE \\\n --genome [19/37/38] \\\n --prefix OUTPUT_FILE_PREFIX \\\n --outDir OUTPUT_DIRECTORY \\\n --threads NUMBER_THREADS\n```\nThe manifest is a text file in which each line should list the absolute path to an input BAM/CRAM file.\nFor CRAM input, it\u2019s suggested to provide the path to the reference fasta file with `--reference` in the command. \n\n## Interpreting the output \n\nThe program produces a .tsv file in the directory specified by --outDir. \nThe fields are explained below: \n\n| Fields in tsv | Explanation |\n|:------------------|:---------------------------------------------------------------|\n| Sample | Sample name |\n| Genotype | Genotype call | \n| Filter | Filters on the genotype call | \n\nA genotype of \"None\" indicates a no-call. \nThere are currently four possible values for the Filter column: \n-`PASS`: a passing, confident call. \n-`More_than_one_possible_genotype`: In rare cases, Cyrius reports two possible genotypes for which it cannot distinguish one from the other. These are different sets of star alleles that result in the same set of variants that cannot be phased with short reads, e.g. 
\\*1/\\*46 and \\*43/\\*45. The two possible genotypes are reported together, separated by a semicolon. \n-`Not_assigned_to_haplotypes`: In a very small portion of samples with more than two copies of CYP2D6, Cyrius calls a set of star alleles but they can be assigned to haplotypes in more than one way. Cyrius reports the star alleles joined by underscores. For example, \\*1_\\*2_\\*68 is reported and the actual genotype could be \\*1+\\*68/\\*2, \\*2+\\*68/\\*1 or \\*1+\\*2/\\*68. \n-`LowQ_high_CN`: In rare cases, at high copy number (>=6 copies of CYP2D6), Cyrius uses less strict approximation in calling copy numbers to account for higher noise in depth and thus the genotype call could be lower confidence than usual. \n\nA .json file is also produced that contains more information about each sample. \n\n| Fields in json | Explanation |\n|:------------------|:---------------------------------------------------------------|\n| Coverage_MAD | Median absolute deviation of depth, measure of sample quality |\n| Median_depth | Sample median depth |\n| Total_CN | Total copy number of CYP2D6+CYP2D7 |\n| Total_CN_raw | Raw normalized depth of CYP2D6+CYP2D7 |\n| Spacer_CN | Copy number of CYP2D7 spacer region |\n| Spacer_CN_raw | Raw normalized depth of CYP2D7 spacer region |\n| Variants_called | Targeted variants called in CYP2D6 |\n| CNV_group | An identifier for the sample's CNV/fusion status |\n| Variant_raw_count | Supporting reads for each variant |\n| Raw_star_allele | Raw star allele call |\n| d67_snp_call | CYP2D6 copy number call at CYP2D6/7 differentiating sites |\n| d67_snp_raw | Raw CYP2D6 copy number at CYP2D6/7 differentiating sites |\n\n## Troubleshooting \n\nCommon causes for Cyrius to produce no-calls are: \n-Low sequencing depth. We suggest a sequencing depth of 30x, which is the standard practice recommended by clinical genome sequencing. 
\n-The depth of the CYP2D6/CYP2D7 region is much lower than the rest of the genome, most likely because reads are aligned to alternative contigs. If your reference genome includes alternative contigs, we suggest alt-aware alignment so that alignments to the primary assembly take precedence over alternative contigs. \n-The majority of reads in CYP2D6/CYP2D7 region have a mapping quality of zero. This is probably due to some post-processing tools like bwa-postalt that modifies the mapQ in the BAM. We recommend using the BAM file before such post-processing steps as input to Cyrius. \n\n\n",
"bugtrack_url": null,
"license": "PolyFormStrict",
"summary": "WGS-based CYP2D6 caller",
"version": "1.1.1",
"project_urls": {
"Homepage": "https://github.com/illumina/Cyrius"
},
"split_keywords": [
"cyp2d6"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "55b883b8fc9ad78718b417905a34380269886e1cb4a5ce7b725a08b45c9990c7",
"md5": "b334cae407cb7c534b475393595ea5c9",
"sha256": "8ed069fc21df511ef7daa55fc9ddbb7a739ddaaf6f03aa0d2fd9ca2a2eae472a"
},
"downloads": -1,
"filename": "cyrius-1.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b334cae407cb7c534b475393595ea5c9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 221153,
"upload_time": "2023-10-14T17:29:10",
"upload_time_iso_8601": "2023-10-14T17:29:10.084309Z",
"url": "https://files.pythonhosted.org/packages/55/b8/83b8fc9ad78718b417905a34380269886e1cb4a5ce7b725a08b45c9990c7/cyrius-1.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "00ce2ad108c212e4676926f2420f84111cc4f7f286158148a044bcd6c2f264aa",
"md5": "603c4aa162b0a36792a197acd3ce035a",
"sha256": "18fe5ac94f0cbf0641ca76f40439d7186542bef1e9e10f0530d43c7e7fd9bb94"
},
"downloads": -1,
"filename": "cyrius-1.1.1.tar.gz",
"has_sig": false,
"md5_digest": "603c4aa162b0a36792a197acd3ce035a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 27290,
"upload_time": "2023-10-14T17:29:11",
"upload_time_iso_8601": "2023-10-14T17:29:11.675364Z",
"url": "https://files.pythonhosted.org/packages/00/ce/2ad108c212e4676926f2420f84111cc4f7f286158148a044bcd6c2f264aa/cyrius-1.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-10-14 17:29:11",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "illumina",
"github_project": "Cyrius",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "cyrius"
}