ramifi


- Name: ramifi
- Version: 0.3.0
- Home page: https://github.com/chienchi/ramifi
- Summary: Script to do recombinant read analysis
- Upload time: 2023-01-25 17:24:48
- Author: Chienchi Lo
- License: LICENSE
- Keywords: recombinant, mix-infection
- Requirements: No requirements were recorded.

[![Python](https://img.shields.io/badge/python-3.8+-green.svg)](https://www.python.org/)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

# RAMIFI

<ins>R</ins>ecombinant <ins>A</ins>nd <ins>M</ins>ixed-<ins>I</ins>nfection <ins>Fi</ins>nder for SARS-CoV-2 samples. RAMIFI takes a BAM file of reads aligned to [NC_045512](https://github.com/chienchi/ramifi/blob/main/ramifi/data/NC_045512.fasta) and, using the [defined mutation list json file](https://github.com/chienchi/ramifi/blob/main/ramifi/data/variant_mutation.json) provided in the repo, writes the recombinant and parent reads to .bam and .tsv files along with an associated stats file.

## Design Diagram
<img width="2339" alt="Ramifi_design_diagram" src="https://user-images.githubusercontent.com/737589/214627513-7848eae0-3ebd-4864-97cf-dd8e8b3ed416.png">

## Dependencies

### Programming/Scripting languages
- [Python >=v3.8](https://www.python.org/)
    - The pipeline has been tested in v3.8.10

### Python packages
- [pandas >=1.2.4](https://pandas.pydata.org/) 
- [pysam >= 0.16.0.1](https://github.com/pysam-developers/pysam)
- [importlib-resources>=5.7.1](https://pypi.org/project/importlib-resources/)

#### Optional packages
- [plotly >=4.7.1](https://plotly.com/python/)
- [kaleido >= 0.2.1](https://github.com/plotly/Kaleido)
- [biopython >= 1.78](https://biopython.org/)
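
To confirm that an environment meets the minimum versions listed above, a small check along these lines may help. It is a sketch rather than part of RAMIFI: the helper names `as_tuple` and `check` are made up for illustration, and the version comparison is deliberately simple (numeric parts only).

```python
from importlib import metadata

# Minimum versions copied from the dependency lists above.
MINIMUMS = {
    "pandas": "1.2.4",
    "pysam": "0.16.0.1",
    "importlib-resources": "5.7.1",
    "plotly": "4.7.1",    # optional
    "kaleido": "0.2.1",   # optional
    "biopython": "1.78",  # optional
}


def as_tuple(version):
    """Turn '0.16.0.1' into (0, 16, 0, 1); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check(minimums=MINIMUMS):
    """Return {package: status}; status is 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = as_tuple(installed) >= as_tuple(minimum)
        report[name] = "ok" if ok else "too old"
    return report


if __name__ == "__main__":
    for name, status in check().items():
        print(f"{name}: {status}")
```

A "missing" status for one of the optional packages only matters if you need the features that depend on it.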


## Installation

### Install from source
Clone the `ramifi` repository.

```
git clone https://github.com/LANL-Bioinformatics/ramifi
```

Then change directory to `ramifi` and install.

```
cd ramifi
pip install .
```

If the installation was successful, you should be able to type `ramifi -h` and get a help message on how to use the tool.

```
ramifi -h
```


## Usage
```
usage: ramifi.py [-h] [--refacc [STR]] [--minMixAF [FLOAT]] [--maxMixAF [FLOAT]] [--minMixed_n [INT]] [--minReadCount [INT]]
                 [--lineageMutation [FILE]] [--variantMutation [FILE]] [--mutations_af_plot] [--verbose] [--version] --bam [FILE]
                 [--vcf [File]] [--tsv [FILE]] [--outbam [File]] [-eo [PATH]] [--igv [PATH]] [--igv_variants]

Script to do recombinant read analysis

optional arguments:
  -h, --help            show this help message and exit
  --refacc [STR]        reference accession used in bam [default: NC_045512.2]
  --minMixAF [FLOAT]    minimum alleic frequency for checking mixed mutations on vcf [default:0.2]
  --maxMixAF [FLOAT]    maximum alleic frequency for checking mixed mutations on vcf [default:0.8]
  --minMixed_n [INT]    threshold of mixed mutations count for vcf.
  --minReadCount [INT]  threshold of read with variant count when no vcf provided.
  --lineageMutation [FILE]
                        lineage mutation json file [default: variant_mutation.json]
  --variantMutation [FILE]
                        variant mutation json file [default: lineage_mutation.json]
  --mutations_af_plot   generate mutations_af_plot (when --vcf provided)
  --verbose             Show more infomration in log
  --version             show program's version number and exit

Input:
  --bam [FILE]          <Required> bam file
  --vcf [File]          <Optional> vcf file which will infer the two parents of recombinant_variants

Output:
  --tsv [FILE]          output file name [default: recombinant_reads.tsv]
  --outbam [File]       output recombinant reads in bam file [default: recombinant_reads.bam]

EDGE COVID-19 Options:
  options specific used for EDGE COVID-19

  -eo [PATH], --ec19_projdir [PATH]
                        ec-19 project directory
  --igv [PATH]          igv.html relative path
  --igv_variants        add variants igv track
```
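
When RAMIFI is driven from a Python workflow, the command line can be assembled from the flags shown in the help text above. The following is a minimal sketch: `build_ramifi_command` and the file names are hypothetical, and only flags that appear in the usage message are used.

```python
import shlex


def build_ramifi_command(bam, vcf=None, tsv=None, outbam=None,
                         min_mix_af=None, max_mix_af=None):
    """Build an argument list for the `ramifi` executable.

    Only --bam is required; every other flag is added only when a value
    is given, so RAMIFI's own defaults apply otherwise.
    """
    cmd = ["ramifi", "--bam", str(bam)]
    optional = [
        ("--vcf", vcf),
        ("--tsv", tsv),
        ("--outbam", outbam),
        ("--minMixAF", min_mix_af),
        ("--maxMixAF", max_mix_af),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    # "sample.bam" and "sample.vcf" are placeholder file names.
    cmd = build_ramifi_command("sample.bam", vcf="sample.vcf",
                               min_mix_af=0.2, max_mix_af=0.8)
    # Print the shell-quoted command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps paths with spaces intact if the command is later handed to `subprocess.run`.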

## Test

```
cd tests
./runTest.sh
```

## Outputs 

-- recombinant_reads.stats:  counts

| total  | mapped | unmapped | mutation_reads | parents     | recomb1_reads | recomb2_reads | recombx_reads | parent1_reads | parent2_reads | recomb1_perc| recomb2_perc | recombx_perc |
|--------|--------|----------|----------------|-------------|---------------|---------------|---------------|---------------|---------------|-------------|--------------|--------------|
| 64355  | 64355  |   0      |  5203          |Omicron,Delta|   162         | 175           |     18        |  489          |     730       | 10.29       | 11.11        | 1.14         |


-- recombinant_reads.tsv

|    read_name                | start | end | mutaions_json                                                                                                                                                                                                                                 |  note            |
|-----------------------------|-------|-----|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|
|HMVN7DRXY:2:2153:21802:16078 |  21566|21859| {21618: ['Delta'], 21846: ['Iota', 'Mu', 'Omicron']}                                                                                                                                                                                          |  recombinant 2   |
|HMVN7DRXY:2:2166:28574:36229 |  21732|21883| {21762: ['Eta', 'Omicron'], 21846: ['Iota', 'Mu', 'Omicron']}                                                                                                                                                                                 |  parent Omicron  |
|HMVN7DRXY:2:2215:29749:15217 |  22867|22994| {22917: ['Delta', 'Epsilon', 'Kappa'], 22992: ['rev of Omicron']}                                                                                                                                                                             |  parent Delta    |
|HMVN7DRXY:2:2105:30572:25160 |  22865|23023| {22917: ['rev of Delta Epsilon Kappa'], 22992: ['rev of Omicron'], 22995: ['Delta', 'Omicron'], 23013: ['rev of Omicron']}                                                                                                                    |  recombinant 1   | 
|HMVN7DRXY:2:2127:18304:18850 |  24058|24518| {24130: ['Omicron'], 24469: ['rev of Omicron'], 24503: ['Omicron']}                                                                                                                                                                           |  recombinant x   |
|etc ...                      |       |     |
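
The mutation column shown above looks like a Python dict literal (integer keys, single-quoted strings) rather than strict JSON, so `json.loads` would reject it. A sketch of reading the file with `ast.literal_eval` follows. It assumes the file is tab-separated with a header row and the five columns shown in the table; `read_recombinant_tsv` is a hypothetical helper and the sample rows are copied from the table.

```python
import ast
import csv
import io
from collections import Counter

# Three rows from the table above, assumed to be tab-separated on disk.
SAMPLE = "\n".join([
    "read_name\tstart\tend\tmutaions_json\tnote",
    "HMVN7DRXY:2:2153:21802:16078\t21566\t21859\t"
    "{21618: ['Delta'], 21846: ['Iota', 'Mu', 'Omicron']}\trecombinant 2",
    "HMVN7DRXY:2:2166:28574:36229\t21732\t21883\t"
    "{21762: ['Eta', 'Omicron'], 21846: ['Iota', 'Mu', 'Omicron']}\tparent Omicron",
    "HMVN7DRXY:2:2127:18304:18850\t24058\t24518\t"
    "{24130: ['Omicron'], 24469: ['rev of Omicron'], 24503: ['Omicron']}\trecombinant x",
])


def read_recombinant_tsv(handle):
    """Parse rows into dicts; the mutation column becomes {position: [labels]}."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header row
    rows = []
    for row in reader:
        if not row:
            continue
        read_name, start, end, mutations, note = row
        rows.append({
            "read_name": read_name,
            "start": int(start),
            "end": int(end),
            # literal_eval only accepts Python literals, so it is safe on text input
            "mutations": ast.literal_eval(mutations),
            "note": note.strip(),
        })
    return rows


if __name__ == "__main__":
    rows = read_recombinant_tsv(io.StringIO(SAMPLE))
    print(Counter(row["note"] for row in rows))
```

For a real run, replace `io.StringIO(SAMPLE)` with an open handle on `recombinant_reads.tsv`.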

-- recombinant_reads_by_cross_region.tsv

| Cross_region  | Reads                                                                                                                                                    |
|---------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
|11201-11283    |{"recomb1": ["HMVN7DRXY:2:2150:13015:23750", "HMVN7DRXY:2:2124:23746:28776", "HMVN7DRXY:2:2232:6216:33395"], "recomb2": ["HMVN7DRXY:2:2122:27624:23062"]} |
|11283-11537    |{"recomb2": ["HMVN7DRXY:2:2126:12825:30154", "HMVN7DRXY:2:2126:15302:29121"]}                                                                             |
|21618-21846    |{"recomb2": ["HMVN7DRXY:2:2153:21802:16078", "HMVN7DRXY:2:2105:22996:5682"]}                                                                              |
|etc ...        |
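
Unlike the per-read table, the `Reads` column here is shown as valid JSON, so it can be loaded with `json.loads`. The sketch below counts reads per category for each cross region. It assumes a tab-separated file with the two columns shown; `count_reads_by_region` is a hypothetical helper and the sample rows are copied from the table.

```python
import csv
import io
import json

# Two rows from the table above, assumed to be tab-separated on disk.
SAMPLE = "\n".join([
    "Cross_region\tReads",
    '11201-11283\t{"recomb1": ["HMVN7DRXY:2:2150:13015:23750", '
    '"HMVN7DRXY:2:2124:23746:28776", "HMVN7DRXY:2:2232:6216:33395"], '
    '"recomb2": ["HMVN7DRXY:2:2122:27624:23062"]}',
    '11283-11537\t{"recomb2": ["HMVN7DRXY:2:2126:12825:30154", '
    '"HMVN7DRXY:2:2126:15302:29121"]}',
])


def count_reads_by_region(handle):
    """Return {(start, end): {category: number_of_reads}}."""
    # QUOTE_NONE keeps the double quotes inside the JSON column untouched.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader)  # skip the header row
    counts = {}
    for row in reader:
        if not row:
            continue
        region, reads_json = row
        start, end = (int(part) for part in region.split("-"))
        reads = json.loads(reads_json)
        counts[(start, end)] = {cat: len(names) for cat, names in reads.items()}
    return counts


if __name__ == "__main__":
    for region, per_category in count_reads_by_region(io.StringIO(SAMPLE)).items():
        print(region, per_category)
```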

-- recombinant_reads.parent1.bam

-- recombinant_reads.parent1.bam.bai

-- recombinant_reads.parent2.bam

-- recombinant_reads.parent2.bam.bai

-- recombinant_reads.recomb1.bam

-- recombinant_reads.recomb1.bam.bai

-- recombinant_reads.recomb2.bam

-- recombinant_reads.recomb2.bam.bai

-- recombinant_reads.recombx.bam

-- recombinant_reads.recombx.bam.bai

-- [recombinant_reads.mutations_af_plot.html](https://chienchi.github.io/ramifi/recombinant_reads.mutations_af_plot.html)

-- [recombinant_reads.mutations_af_plot_genomeview.html](https://chienchi.github.io/ramifi/recombinant_reads.mutations_af_plot_genomeview.html)
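
The per-category BAM files listed above follow a regular naming pattern, which makes it easy to check that a run produced all of them. This sketch assumes the category is inserted before the `.bam` suffix of the `--outbam` name, as the default names suggest; whether other `--outbam` values follow the same pattern is an assumption, and `expected_bam_outputs` is a hypothetical helper.

```python
from pathlib import Path

# Categories taken from the output file names listed above.
CATEGORIES = ["parent1", "parent2", "recomb1", "recomb2", "recombx"]


def expected_bam_outputs(outbam="recombinant_reads.bam"):
    """List the per-category BAM and index file names derived from `outbam`."""
    prefix = outbam[:-len(".bam")] if outbam.endswith(".bam") else outbam
    names = []
    for category in CATEGORIES:
        bam = f"{prefix}.{category}.bam"
        names += [bam, bam + ".bai"]
    return names


if __name__ == "__main__":
    missing = [name for name in expected_bam_outputs() if not Path(name).exists()]
    print("missing files:", missing if missing else "none")
```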

## Data visualization

The `recombinant_reads.bam`, `ramifi/data/variants_mutation.gff` and `ramifi/data/NC_045512.fasta` can be loaded into [IGV](https://software.broadinstitute.org/software/igv/).

Example:
IGV Link: [https://chienchi.github.io/ramifi/igv-webapp](https://chienchi.github.io/ramifi/igv-webapp)

![Screen Shot 2022-06-13 at 9 51 08 PM](https://user-images.githubusercontent.com/737589/173489713-18150a0d-176b-4526-a751-5a03d2047096.png)

## Custom mutation list

Users can supply a custom mutation list, formatted like the [defined mutation list json file](https://github.com/chienchi/ramifi/blob/main/ramifi/data/variant_mutation.json) provided in the repo, to check for co-infection or recombination among other variants/lineages. When running ramifi, pass the custom mutation list with the `--variantMutation` option.

For example:
```
{
    "Alpha": {
        "A:23063:T": "S:N501Y",
        "A:23403:G": "S:D614G",
        ...
        "del:21991:3": "S:Y144*"
        ...
    },
    "Beta": {
        "A:10323:G": "ORF1a:K3353R",
        "A:21801:C": "S:D80A",
        "A:22206:G": "S:D215G",
        "A:23063:T": "S:N501Y"
        ...
    },
    "BA.2": {
        ...
    }
}
```
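
Before passing a custom list to `--variantMutation`, it can be worth checking that its keys look like the ones in the example. The sketch below recognizes only the two key shapes that appear above (`REF:POS:ALT` and `del:POS:N`); the bundled file may use other shapes, so a flagged key is a prompt to double-check, not proof of an error. `unrecognized_keys` is a hypothetical helper and the entries in `custom` are copied from the example.

```python
import json
import re

# The two key shapes seen in the example above.
SUBSTITUTION = re.compile(r"^[ACGT]:\d+:[ACGT]$")  # e.g. "A:23063:T"
DELETION = re.compile(r"^del:\d+:\d+$")            # e.g. "del:21991:3"


def unrecognized_keys(mutation_list):
    """Return (variant, key) pairs whose key matches neither known shape."""
    problems = []
    for variant, mutations in mutation_list.items():
        for key in mutations:
            if not (SUBSTITUTION.match(key) or DELETION.match(key)):
                problems.append((variant, key))
    return problems


custom = {
    "Alpha": {
        "A:23063:T": "S:N501Y",
        "A:23403:G": "S:D614G",
        "del:21991:3": "S:Y144*",
    },
    "Beta": {
        "A:10323:G": "ORF1a:K3353R",
        "A:21801:C": "S:D80A",
    },
}

if __name__ == "__main__":
    print("keys to double-check:", unrecognized_keys(custom))
    # Save this text to a file and pass that file to --variantMutation.
    print(json.dumps(custom, indent=4))
```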

NCBI TRACE Lineage Definitions Weekly Update Site: [https://ftp.ncbi.nlm.nih.gov/pub/ACTIV-TRACE/](https://ftp.ncbi.nlm.nih.gov/pub/ACTIV-TRACE/)

## Remove package

```
pip uninstall ramifi
```

## Citing RAMIFI

This work is currently unpublished. If you use this package, we would appreciate it if you credited our repository.


## License

RAMIFI is distributed as open-source software under the [GPLv3 LICENSE](https://github.com/chienchi/ramifi/blob/main/LICENSE); see the license file included in the RAMIFI distribution.

LANL open source approval reference C22090.

© 2023. Triad National Security, LLC. All rights reserved.
This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos
National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S.
Department of Energy/National Nuclear Security Administration. All rights in the program are
reserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear
Security Administration. The Government is granted for itself and others acting on its behalf a
nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare
derivative works, distribute copies to the public, perform publicly and display publicly, and to permit
others to do so.



            
