- Name: nanomix
- Version: 0.2.0
- Summary: Methods for cell type deconvolution from Oxford Nanopore methylation calling
- Upload time: 2023-04-21 17:06:45
- Requires Python: >=3.7
- Keywords: nanopore, methylation, deconvolution

# Nanomix: Methylation Deconvolution of Nanopore Sequencing Data
Methylation deconvolution is the process of determining the proportions of distinct cell types in a complex (heterogeneous) mixture of cellular or cell-free DNA.
This tool provides models suited to deconvolution of Nanopore sequencing data. In particular, our new models account for the non-uniform coverage distribution and the high error rate of modified base calling. We also include more conventional models for deconvolving bisulfite sequencing or bead chip array data.


## Installation
This package is available on PyPI:
```
pip install nanomix
```
Alternatively, you can install from source, which requires [maturin](https://github.com/PyO3/maturin):
```
pip install maturin
git clone https://github.com/Jonbroad15/nanomix.git
cd nanomix
maturin develop
```

## Usage
### Reference Atlas
Deconvolution estimates the mixture proportions from the methylation propensities, across the genome, of purified reference cell types that were sequenced previously. This information is collated into an *atlas*. We suggest using the atlas from [Loyfer et al.](https://www.biorxiv.org/content/10.1101/2022.01.24.477547v1.full), which we have curated and labelled as `39Bisulfite.tsv`; it is used by default. Their tool [wgbstools](https://github.com/nloyfer/wgbs_tools) also provides the means to create an atlas suited to the cell types you are interested in.

### Deconvolution
To deconvolute a sequencing run, simply provide `nanomix` with a methylome. We define a methylome as a `tsv` file with columns `{chr, start, end, total_calls, modified_calls}`. Such a file can be created from a `.bam` file using our associated program, [mbtools](https://github.com/jts/mbtools):
```
mbtools region-frequency -r ATLAS.tsv SAMPLE.bam > METHYLOME.tsv
```
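For a quick sanity check of a methylome file, the sketch below parses the five columns named above and computes a methylation fraction per region. It assumes a header row with exactly those column names, which may not match what `mbtools` actually emits. The helper name `read_methylome` is hypothetical and is not part of `nanomix`.

```python
import csv
import io

def read_methylome(handle):
    """Parse a tab-separated methylome and add a methylation fraction per region.

    Assumes a header row naming chr, start, end, total_calls, modified_calls.
    """
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        total = int(rec["total_calls"])
        modified = int(rec["modified_calls"])
        rows.append({
            "chr": rec["chr"],
            "start": int(rec["start"]),
            "end": int(rec["end"]),
            "total_calls": total,
            "modified_calls": modified,
            # Regions without coverage have no defined fraction.
            "fraction": modified / total if total > 0 else None,
        })
    return rows

# Made-up example data, for illustration only.
example = io.StringIO(
    "chr\tstart\tend\ttotal_calls\tmodified_calls\n"
    "chr1\t100\t200\t10\t4\n"
    "chr1\t500\t600\t0\t0\n"
)
methylome = read_methylome(example)
```

Regions with zero coverage are kept with an undefined fraction rather than dropped, so that row counts still line up with the atlas.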
Then the mixture proportion can be found by calling:
```
nanomix deconvolute -a ATLAS.tsv METHYLOME.tsv
```

### Model
We provide four deconvolution models:

- **llse (default):** log-likelihood with sequencing errors. Maximizes the likelihood of sigma (the vector of cell-type proportions)
  by assuming modification calls follow a binomial distribution. Good for sequencing data with a high error
  rate (Oxford Nanopore).
- **nnls:** non-negative least squares. Minimizes the squared error between the observed methylome and the methylome
  expected given sigma and the atlas. Recommended for fast deconvolution of methylomes with high
  coverage (methylation arrays).
- **llsp:** log-likelihood with sequencing perfect. Same as llse, but without error modelling. Useful for isolating the
  effect of sequencing errors on deconvolution loss and accuracy.
- **mmse:** mixture model with sequencing errors. Also assumes a binomial distribution, but softly assigns fragments
  to cell types. Optimization uses expectation maximization (slower than the models above). Recommended for high-resolution
  deconvolution (many cell types) and an atlas with large regions of grouped CpGs.

Select a model with:
```
nanomix deconvolute -m MODEL METHYLOME.tsv 
```
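To make the difference between these objectives concrete, here is a minimal sketch of an nnls-style squared error and an llse-style binomial log-likelihood for a candidate sigma. The symmetric error rate `eps` and the way it is mixed into the expected methylation are assumptions for illustration; this README only says that llse models sequencing errors with a binomial likelihood, not what exact form it takes. All function names are hypothetical.

```python
import math

def expected_methylation(sigma, atlas_row):
    """Mixture-weighted methylation propensity for one region."""
    return sum(s * a for s, a in zip(sigma, atlas_row))

def squared_error(sigma, atlas, methylome):
    """nnls-style loss: squared gap between observed and expected fractions."""
    loss = 0.0
    for atlas_row, (total, modified) in zip(atlas, methylome):
        if total == 0:
            continue  # no coverage, nothing to compare
        loss += (modified / total - expected_methylation(sigma, atlas_row)) ** 2
    return loss

def binomial_log_likelihood(sigma, atlas, methylome, eps=0.0):
    """llse-style objective; eps=0 corresponds to error-free calls (llsp-style).

    eps is a hypothetical symmetric miscall rate. The binomial coefficient is
    omitted because it does not depend on sigma.
    """
    ll = 0.0
    for atlas_row, (total, modified) in zip(atlas, methylome):
        p = expected_methylation(sigma, atlas_row)
        p_obs = p * (1 - eps) + (1 - p) * eps
        # Clamp away from 0 and 1 so the logarithms stay finite.
        p_obs = min(max(p_obs, 1e-12), 1 - 1e-12)
        ll += modified * math.log(p_obs) + (total - modified) * math.log(1 - p_obs)
    return ll

# Toy data: 2 regions x 2 cell types, (total_calls, modified_calls) per region.
atlas = [[0.9, 0.1], [0.2, 0.8]]
methylome = [(20, 10), (20, 10)]
balanced = [0.5, 0.5]
skewed = [0.9, 0.1]
```

On this toy data both objectives prefer `balanced` over `skewed`, since half of the calls in each region are modified. A real solver would additionally constrain sigma to be non-negative and to sum to 1.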
The **mmse model is distinct** in that it works by assigning reads to cell types. It therefore needs a methylome in which every row represents a read, with columns `{read_id, chr, start, end, total_calls, modified_calls}`. This can also be constructed from a `.bam` file with [mbtools](https://github.com/jts/mbtools):
```
mbtools read-frequency SAMPLE.bam > METHYLOME.tsv
nanomix deconvolute -m mmse METHYLOME.tsv
```
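The idea of softly assigning reads and optimizing with expectation maximization can be sketched as follows. This is a generic EM loop for a binomial mixture, not the `nanomix` implementation: it leaves out error modelling, assumes each read overlaps exactly one atlas region, and works in plain probabilities, so it is only suitable for small toy inputs. The function name `em_mixture` and the input layout are hypothetical.

```python
import math

def em_mixture(reads, atlas, n_iter=50):
    """Estimate sigma by EM over reads.

    reads: list of (region_index, total_calls, modified_calls)
    atlas: atlas[region][cell_type] = methylation propensity, strictly in (0, 1)
    """
    n_types = len(atlas[0])
    sigma = [1.0 / n_types] * n_types  # uniform starting point
    for _ in range(n_iter):
        counts = [0.0] * n_types
        for region, total, modified in reads:
            # E-step: posterior responsibility of each cell type for this read.
            weights = []
            for k in range(n_types):
                p = atlas[region][k]
                log_lik = modified * math.log(p) + (total - modified) * math.log(1 - p)
                weights.append(sigma[k] * math.exp(log_lik))
            norm = sum(weights)
            for k in range(n_types):
                counts[k] += weights[k] / norm
        # M-step: sigma becomes the average responsibility over all reads.
        sigma = [c / len(reads) for c in counts]
    return sigma

# Toy data: one region, two cell types, three mostly methylated reads and one not.
toy_atlas = [[0.9, 0.1]]
toy_reads = [(0, 10, 9), (0, 10, 9), (0, 10, 9), (0, 10, 1)]
toy_sigma = em_mixture(toy_reads, toy_atlas)
```

With these toy reads the estimate settles close to a 3:1 mixture, because each read is strongly attributable to one of the two cell types.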
For more information on other options and hyperparameters, run:
```
nanomix deconvolute -h
```

### Assign fragments
Our tool also allows you to assign fragments in the methylome to cell types in the atlas, based on the deconvoluted sigma vector.
```
nanomix assign -s SIGMA.tsv METHYLOME.tsv 
```
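One plausible way to turn a sigma vector into fragment assignments is to pick the cell type with the highest posterior probability and leave the fragment unassigned when that posterior is below a confidence threshold. The sketch below illustrates this under a plain binomial likelihood; whether `nanomix assign` works exactly this way is an assumption, and `assign_read` and its `threshold` parameter are hypothetical names.

```python
import math

def assign_read(sigma, propensities, total, modified, threshold=0.5):
    """Return (cell_type_index, posterior); the index is None below threshold.

    propensities[k] is the atlas methylation propensity of cell type k in the
    region covered by the read, strictly in (0, 1).
    """
    weights = [
        s * math.exp(modified * math.log(p) + (total - modified) * math.log(1 - p))
        for s, p in zip(sigma, propensities)
    ]
    norm = sum(weights)
    posteriors = [w / norm for w in weights]
    best = max(range(len(posteriors)), key=posteriors.__getitem__)
    if posteriors[best] < threshold:
        return None, posteriors[best]
    return best, posteriors[best]

# A read with 9 of 10 calls modified clearly favours the first cell type,
# while a read with 1 of 2 calls modified is ambiguous.
confident = assign_read([0.5, 0.5], [0.9, 0.1], total=10, modified=9, threshold=0.9)
ambiguous = assign_read([0.5, 0.5], [0.9, 0.1], total=2, modified=1, threshold=0.9)
```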
### Simulate 
We provide functionality to simulate methylomes of complex cell mixtures given a `sigma.tsv` file that lists the cell type in the first column and the corresponding proportion in the second column. The proportions must sum to 1, and the cell types must match those in the supplied reference atlas. To simulate a methylome:
```
nanomix simulate -a ATLAS.tsv SIGMA.tsv
```
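Before simulating, it can help to check a sigma file against the two constraints stated above. The following sketch does that with the standard library only. It assumes the file has no header row and uses a small tolerance on the sum; both are assumptions, and `load_sigma` is a hypothetical helper rather than a `nanomix` function.

```python
import csv
import io
import math

def load_sigma(handle, atlas_cell_types, tol=1e-6):
    """Read a two-column sigma file and check it against the atlas.

    Assumes no header row; tol is a hypothetical tolerance on the sum.
    """
    sigma = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        cell_type, proportion = row[0], float(row[1])
        if proportion < 0:
            raise ValueError(f"negative proportion for {cell_type}")
        sigma[cell_type] = proportion
    if set(sigma) != set(atlas_cell_types):
        raise ValueError("cell types differ from those in the atlas")
    if not math.isclose(sum(sigma.values()), 1.0, abs_tol=tol):
        raise ValueError("proportions do not sum to 1")
    return sigma

# Made-up cell types and proportions, for illustration only.
atlas_cell_types = ["monocyte", "hepatocyte", "neuron"]
sigma = load_sigma(
    io.StringIO("monocyte\t0.7\nhepatocyte\t0.2\nneuron\t0.1\n"),
    atlas_cell_types,
)
```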

### Evaluate
Simulated data includes the true cell-type assignments in the last column of the methylome, so we can evaluate the performance of a model's deconvolution on it. This outputs the deconvolution loss (Euclidean distance between the true and predicted sigma vectors) and the read assignment accuracy at confidence levels from 0.5 to 0.9.
```
nanomix evaluate -a ATLAS.tsv METHYLOME.tsv
```
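The deconvolution loss described above is a plain Euclidean distance, which can be reproduced by hand as in this sketch. Treating a cell type that is missing from one vector as having proportion 0 is an assumption, and `deconvolution_loss` is a hypothetical name.

```python
import math

def deconvolution_loss(true_sigma, predicted_sigma):
    """Euclidean distance between two sigma vectors keyed by cell type.

    Cell types missing from one vector are treated as proportion 0.
    """
    cell_types = set(true_sigma) | set(predicted_sigma)
    return math.sqrt(sum(
        (true_sigma.get(c, 0.0) - predicted_sigma.get(c, 0.0)) ** 2
        for c in cell_types
    ))

# Made-up proportions, for illustration only.
true_sigma = {"monocyte": 0.6, "hepatocyte": 0.4}
predicted_sigma = {"monocyte": 0.9, "hepatocyte": 0.0}
loss = deconvolution_loss(true_sigma, predicted_sigma)
```

Here the two proportions differ by 0.3 and 0.4, so the loss is 0.5.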

### Plot
You can plot a list of deconvolution mixtures by providing them to the plot function. This will produce a stacked bar plot.
```
nanomix plot -o NAME.png *sigma.tsv
```
![exampledeconvplot](Images/example_deconvolution_plot.png)






            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "nanomix",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "Jonathan Broadbent <jonbroad15@gmail.com>",
    "keywords": "nanopore,methylation,deconvolution",
    "author": null,
    "author_email": "Jonathan Broadbent <jonbroad15@gmail.com>, Jared Simpson <jsimpson@oicr.on.ca>",
    "download_url": "https://files.pythonhosted.org/packages/d2/75/97b7c5506321c11521fedfd9ccb7a9d55197155b8f30f51e5642958ee5ed/nanomix-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Methods for cell type deconvolution from Oxford Nanopore methylation calling",
    "version": "0.2.0",
    "split_keywords": [
        "nanopore",
        "methylation",
        "deconvolution"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "85a5f6f9a6ef093f2fa11c36e88aca398fdfe20ef77579751e1fbae94d42756b",
                "md5": "51e93fad35a82d78d396a63225204078",
                "sha256": "572f4193449b9833318f681d6235234136d0708fb61714a8a037efedf5d1c633"
            },
            "downloads": -1,
            "filename": "nanomix-0.2.0-cp39-cp39-manylinux_2_31_x86_64.whl",
            "has_sig": false,
            "md5_digest": "51e93fad35a82d78d396a63225204078",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.7",
            "size": 1100267,
            "upload_time": "2023-04-21T17:06:42",
            "upload_time_iso_8601": "2023-04-21T17:06:42.907900Z",
            "url": "https://files.pythonhosted.org/packages/85/a5/f6f9a6ef093f2fa11c36e88aca398fdfe20ef77579751e1fbae94d42756b/nanomix-0.2.0-cp39-cp39-manylinux_2_31_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d27597b7c5506321c11521fedfd9ccb7a9d55197155b8f30f51e5642958ee5ed",
                "md5": "e0c246dd87b58f12497847b2ef659e64",
                "sha256": "adfd7a40a286cc93eba7f0bbffb433153b404ea488c703dd6bc375735bff9d53"
            },
            "downloads": -1,
            "filename": "nanomix-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e0c246dd87b58f12497847b2ef659e64",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 3323736,
            "upload_time": "2023-04-21T17:06:45",
            "upload_time_iso_8601": "2023-04-21T17:06:45.172346Z",
            "url": "https://files.pythonhosted.org/packages/d2/75/97b7c5506321c11521fedfd9ccb7a9d55197155b8f30f51e5642958ee5ed/nanomix-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-21 17:06:45",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "nanomix"
}
        