inmoose


- Name: inmoose
- Version: 0.7.3
- Summary: InMoose: the Integrated Multi Omic Open Source Environment
- Upload time: 2024-11-07 10:12:56
- Requires Python: >=3.9
- Requirements: none recorded
<img src="docs/source/inmoose.png" width="600">

[![pypi version](https://img.shields.io/pypi/v/inmoose)](https://pypi.org/project/inmoose)
[![pypiDownloads](https://static.pepy.tech/badge/inmoose)](https://pepy.tech/project/inmoose)
[![coverage](https://img.shields.io/coverallsCoverage/github/epigenelabs/inmoose.svg)](https://coveralls.io/github/epigenelabs/inmoose)
[![Documentation Status](https://readthedocs.org/projects/inmoose/badge/?version=latest)](https://inmoose.readthedocs.io/en/latest/?badge=latest)
[![license](https://img.shields.io/pypi/l/inmoose)](LICENSE)

# InMoose

InMoose is the **In**tegrated **M**ulti **O**mic **O**pen **S**ource **E**nvironment.
It is a collection of tools for the analysis of omic data.

# Installation

You can install InMoose directly with:

```
pip install inmoose
```

# Documentation

Documentation is hosted on [readthedocs.org](https://inmoose.readthedocs.io/en/latest/).

# Batch Effect Correction

InMoose provides features to correct technical biases, also called batch
effects, in transcriptomic data:
- for microarray data, InMoose supersedes pyCombat [1], a Python3
  implementation of ComBat [2], one of the most widely used tools for batch effect
  correction on microarray data.
- for RNASeq data, InMoose features a port to Python3 of ComBat-Seq [3], one of the
  most widely used tools for batch effect correction on RNASeq data.

To use these functions, simply import them and call them with default
parameters:
```python
from inmoose.pycombat import pycombat_norm, pycombat_seq

microarray_corrected = pycombat_norm(microarray_data, microarray_batches)
rnaseq_corrected = pycombat_seq(rnaseq_data, rnaseq_batches)
```

* `microarray_data`, `rnaseq_data`: the expression matrices, with one row per
  gene and one column per sample.
* `microarray_batches`, `rnaseq_batches`: lists of batch indices, giving the
  batch of each sample. Each list must contain exactly as many elements as
  there are samples (columns) in the corresponding expression matrix.
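
The shape requirement above can be checked before calling the correction functions. The following is a minimal sketch, not part of InMoose: `check_batches` is a hypothetical helper, and plain Python lists stand in for whatever array type you actually pass to the library.

```python
from collections import Counter


def check_batches(expression, batches):
    """Validate a genes-by-samples matrix against its batch labels.

    `expression` is a list of rows (one per gene); each row holds one
    value per sample. Returns the number of samples in each batch.
    This helper is hypothetical and only illustrates the expected shapes.
    """
    if not expression:
        raise ValueError("expression matrix is empty")
    n_samples = len(expression[0])
    if any(len(row) != n_samples for row in expression):
        raise ValueError("all gene rows must have the same number of samples")
    if len(batches) != n_samples:
        raise ValueError(
            f"expected {n_samples} batch labels (one per sample), got {len(batches)}"
        )
    return Counter(batches)


# Toy data: 3 genes (rows) x 5 samples (columns), split over two batches.
expression = [
    [5.1, 4.9, 5.3, 7.2, 7.0],
    [2.0, 2.2, 1.9, 3.1, 3.3],
    [8.4, 8.1, 8.6, 9.9, 10.2],
]
batches = [0, 0, 0, 1, 1]

batch_sizes = check_batches(expression, batches)
print(dict(batch_sizes))
```

A batch list that is shorter or longer than the number of columns raises a `ValueError` here, which is easier to diagnose than a shape error raised deep inside a correction routine.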


# Cohort QC
InMoose provides the classes `CohortMetric` and `QCReport` to help perform quality control (QC) on cohort datasets after batch effect correction.

`CohortMetric`: This class handles the analysis and provides methods for performing quality control on cohort datasets.

**Description**
The `CohortMetric` class performs a range of quality control analyses, including:
- Principal Component Analysis (PCA) to assess data variation.
- Comparison of sample distributions across different datasets or batches.
- Quantification of the effect of batch correction.
- Silhouette Score calculation to assess how well batches are separated.
- Entropy calculation to evaluate the mixing of samples from different batches.
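
To give an intuition for the last two metrics, here is a small pure-Python sketch. It is an illustration under simplifying assumptions (one-dimensional values, textbook definitions), not the code `CohortMetric` runs; the function names `batch_entropy` and `silhouette_1d` are hypothetical, and InMoose's exact definitions may differ.

```python
import math
from collections import Counter


def batch_entropy(labels):
    """Shannon entropy (in bits) of the batch labels within a group of samples."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())


def silhouette_1d(values, labels):
    """Mean silhouette score of 1-D points, using batch labels as clusters."""
    groups = set(labels)
    if len(groups) < 2:
        raise ValueError("need at least two batches")
    scores = []
    for i, x in enumerate(values):
        # Distances from point i to every other point, grouped by batch.
        dists = {g: [] for g in groups}
        for j, y in enumerate(values):
            if j != i:
                dists[labels[j]].append(abs(x - y))
        own = dists.pop(labels[i])
        if not own:  # a batch with a single sample scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                            # cohesion
        b = min(sum(d) / len(d) for d in dists.values())   # nearest other batch
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


values = [0.0, 1.0, 10.0, 11.0]
separated = silhouette_1d(values, ["A", "A", "B", "B"])  # batches form distinct groups
mixed = silhouette_1d(values, ["A", "B", "A", "B"])      # batches are interleaved
print(round(separated, 3), round(mixed, 3))
print(batch_entropy(["A", "A", "B", "B"]), batch_entropy(["A", "A", "A"]))
```

With batch labels as clusters, a silhouette score close to 1 means the batches are still clearly separated, while a score near or below 0 means they overlap. Entropy reads the other way around: a neighbourhood containing a single batch has entropy 0, and an evenly mixed one has the maximum entropy.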

**Usage Example**
```python
from inmoose.cohort_qc.cohort_metric import CohortMetric

cohort_quality_control = CohortMetric(
    clinical_df=clinical_data,
    batch_column="batch",
    data_expression_df=gene_expression_after_correction,
    data_expression_df_before=gene_expression_before_correction,
    covariates=["biopsy_site", "sample_type"]
)
```

`QCReport`: This class takes a `CohortMetric` instance and generates an HTML report summarizing the QC results.

**Description**
The `QCReport` class builds on a `CohortMetric` instance and generates a comprehensive HTML report based on the quality control analysis. The report includes visualizations and summaries of PCA, batch correction, Silhouette Scores, entropy, and more.

**Usage Example**
```python
from inmoose.cohort_qc.qc_report import QCReport

# Generate and save the QC report
qc_report = QCReport(cohort_quality_control)
qc_report.save_html_report_local(output_path='reports')
```

# Differential Expression Analysis

InMoose provides features to analyse differentially expressed genes in bulk
transcriptomic data:
- for microarray data, InMoose features a port of limma [4], the *de
  facto* standard tool for differential expression analysis on microarray data.
- for RNASeq data, InMoose features ports to Python3 of edgeR [5] and DESeq2
  [6], two of the most widely used tools for differential expression analysis on
  RNASeq data.

See the dedicated sections of the
[documentation](https://inmoose.readthedocs.io/en/latest/).

# Consensus clustering
InMoose provides features to compute consensus clustering, a resampling-based algorithm compatible with any clustering algorithm whose class is instantiated with an `n_clusters` parameter and has a `fit_predict` method, which is invoked on the data.
Consensus clustering helps determine the best number of clusters to use, and outputs confidence metrics and plots.

To use these functions, import the `consensusClustering` class and a clustering algorithm class:
```python
from inmoose.consensus_clustering.consensus_clustering import consensusClustering
from sklearn.cluster import AgglomerativeClustering

CC = consensusClustering(AgglomerativeClustering)
CC.compute_consensus_clustering(numpy_ndarray)
```
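
The interface requirement (a class instantiated with `n_clusters` and exposing `fit_predict`) and the resampling idea can be sketched in pure Python. This is an illustration only, not InMoose's implementation: `MidpointClusterer` and `consensus_matrix` are hypothetical names, the data is one-dimensional, and the resampling parameters are arbitrary choices.

```python
import random


class MidpointClusterer:
    """Hypothetical toy clusterer with the interface described above."""

    def __init__(self, n_clusters=2):
        self.n_clusters = n_clusters

    def fit_predict(self, data):
        # Split the observed value range into n_clusters equal-width bins.
        lo, hi = min(data), max(data)
        width = (hi - lo) / self.n_clusters or 1.0
        return [min(int((x - lo) / width), self.n_clusters - 1) for x in data]


def consensus_matrix(data, clusterer_cls, n_clusters, n_resamples=50, frac=0.7, seed=0):
    """Fraction of resamples in which two items land in the same cluster,
    counted over the resamples that contain both items."""
    rng = random.Random(seed)
    n = len(data)
    size = max(2, round(frac * n))
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), size)  # subsample without replacement
        labels = clusterer_cls(n_clusters=n_clusters).fit_predict([data[i] for i in idx])
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


points = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]  # two well-separated groups
cm = consensus_matrix(points, MidpointClusterer, n_clusters=2)
print(cm[0][1], cm[0][3])
```

Entries close to 1 or 0 indicate pairs that are consistently grouped together or apart across resamples; intermediate values signal an unstable assignment, which is one way to compare candidate numbers of clusters.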

# How to contribute

Please refer to [CONTRIBUTING.md](https://github.com/epigenelabs/inmoose/blob/master/CONTRIBUTING.md) to learn more about the contribution guidelines.

# References

[1] Behdenna A, Colange M, Haziza J, Gema A, Appé G, Azencot CA and Nordor A. (2023) pyComBat, a Python tool for batch effects correction in high-throughput molecular data using empirical Bayes methods. BMC Bioinformatics, 24(1), 459. https://doi.org/10.1186/s12859-023-05578-5

[2] Johnson W E, et al. (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics, 8, 118–127. https://doi.org/10.1093/biostatistics/kxj037

[3] Zhang Y, et al. (2020) ComBat-Seq: batch effect adjustment for RNASeq count data. NAR Genomics and Bioinformatics, 2(3). https://doi.org/10.1093/nargab/lqaa078


            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "inmoose",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": "Guillaume App\u00e9 <guillaume@epigenelabs.com>, Maximilien Colange <maximilien@epigenelabs.com>, L\u00e9a Meunier <lea@epigenelabs.com>, Sol\u00e8ne Weill <solene@epigenelabs.com>",
    "download_url": "https://files.pythonhosted.org/packages/2b/81/7ecfae2ad7a95045466eb971c7d64f2b9313153c57195c49bdf1aa2a0ada/inmoose-0.7.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "InMoose: the Integrated Multi Omic Open Source Environment",
    "version": "0.7.3",
    "project_urls": {
        "Documentation": "https://inmoose.readthedocs.io/en/latest/",
        "Source": "https://github.com/epigenelabs/inmoose"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2b817ecfae2ad7a95045466eb971c7d64f2b9313153c57195c49bdf1aa2a0ada",
                "md5": "be0260a5cecd297ac06e7e5f8a004edf",
                "sha256": "c8924c94245d49f8990ed05d962409cf9949a6124180fdc2f6784778a29a6c0e"
            },
            "downloads": -1,
            "filename": "inmoose-0.7.3.tar.gz",
            "has_sig": false,
            "md5_digest": "be0260a5cecd297ac06e7e5f8a004edf",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 7286042,
            "upload_time": "2024-11-07T10:12:56",
            "upload_time_iso_8601": "2024-11-07T10:12:56.075055Z",
            "url": "https://files.pythonhosted.org/packages/2b/81/7ecfae2ad7a95045466eb971c7d64f2b9313153c57195c49bdf1aa2a0ada/inmoose-0.7.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-07 10:12:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "epigenelabs",
    "github_project": "inmoose",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "tox": true,
    "lcname": "inmoose"
}
        