mimic-da

Name	mimic-da
Version	1.0.1
upload_time	2024-06-03 17:24:45
requires_python	>=3.8
keywords	diffrential analysis cladogram mann-whitney microbiome

[![DOI](https://zenodo.org/badge/711493727.svg)](https://zenodo.org/doi/10.5281/zenodo.10958274)
# miMic (Mann-Whitney image microbiome) 

This repository accompanies the paper "mi-Mic: a novel multi-layer statistical test for microbiota-disease associations".    
miMic is a straightforward, versatile, and scalable approach for differential abundance analysis.

miMic consists of three main steps:

- Data preprocessing and translation to a cladogram of means.

- An a priori nested ANOVA (or nested GLM for continuous labels) to detect overall microbiome-label relations.

- A post hoc test along the cladogram trajectories.
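
To make the first step more concrete, here is a minimal pure-Python sketch of one way to build a "cladogram of means": every node of the taxonomy tree receives the mean abundance of the ASVs beneath it. This illustrates the idea only and is not the package's implementation. The function name `cladogram_of_means`, the `;`-separated lineage strings, and the toy values are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def cladogram_of_means(abundances, taxonomy, sep=";"):
    """Average ASV abundances under every node of the taxonomy tree.

    abundances: dict mapping ASV id -> abundance in one sample
    taxonomy:   dict mapping ASV id -> lineage string, e.g. "k__A;p__B;c__C"
    Returns a dict mapping each lineage prefix (a tuple of ranks) to the mean
    abundance of all ASVs whose lineage starts with that prefix.
    """
    members = defaultdict(list)
    for asv, value in abundances.items():
        ranks = tuple(part.strip() for part in taxonomy[asv].split(sep))
        # Register the ASV under every ancestor: (k,), (k, p), (k, p, c), ...
        for depth in range(1, len(ranks) + 1):
            members[ranks[:depth]].append(value)
    return {node: mean(values) for node, values in members.items()}


# Hypothetical toy sample with three ASVs.
sample = {"asv1": 4.0, "asv2": 2.0, "asv3": 9.0}
lineages = {
    "asv1": "k__Bacteria;p__Firmicutes;c__Clostridia",
    "asv2": "k__Bacteria;p__Firmicutes;c__Bacilli",
    "asv3": "k__Bacteria;p__Bacteroidota;c__Bacteroidia",
}
tree = cladogram_of_means(sample, lineages)
print(tree[("k__Bacteria",)])                  # mean of all three ASVs
print(tree[("k__Bacteria", "p__Firmicutes")])  # mean of asv1 and asv2
```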


##  miMic

miMic is available through the following platforms:
- [GitHub](https://github.com/oshritshtossel/miMic) 
- [PyPi](https://pypi.org/project/mimic-da/)
- [Website](https://micros.math.biu.ac.il/)

### Install the package
```bash
pip install mimic-da
```
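
After installing, you may want to confirm which version is present. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8, which matches the package's `requires_python`). The helper name `installed_version` is hypothetical and not part of miMic.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("mimic-da")
print("mimic-da version:", found if found is not None else "not installed")
```
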
##  How to apply miMic 
See `example_use.py` for an example of how to use miMic.  
The example contains the following steps:

1.  Import miMic and additional packages.
    ```python
    from mimic_da import apply_mimic
    import pandas as pd
    ```

2. Load the raw ASV table in the following format:    
   - The first column is named "ID".
   - Each row represents a sample and each column represents an ASV.  
   - The last row contains the taxonomy information and is named "taxonomy".

    ```python
    df = pd.read_csv("example_data/for_process.csv")
    ```
   - <u>Note:</u> `for_process.csv` contains the raw ASV table in the required format; an example file is in the `example_data` folder.


3. Load a tag table as a CSV file in which the tag column is named "Tag".

   ```python
   tag = pd.read_csv("example_data/tag.csv", index_col=0)
   ```
   - <u>Note:</u> `tag.csv` contains the tag table in the required format; an example is in the `example_data` folder.


4. Specify a folder to save the output of the miMic test.   
   ```python
   folder = "example_data/2D_images"
   ```
   - <u>Note:</u> the folder (here `example_data/2D_images`, relative to your current working directory) will be created, and the output of the miMic test will be saved there.


5. Apply MIPMLP.
   - MIPMLP runs with its default parameters; see the note below for more details.
   - `taxnomy_group` (spelled as in the call below): one of ["sub PCA", "mean", "sum"]; "sub PCA" is preferred.

   ```python
   processed = apply_mimic(folder=folder, tag=tag, mode="preprocess", preprocess=True, rawData=df,
                            taxnomy_group='sub PCA')
   ```
   - <u>Note:</u> MIPMLP is a package used to preprocess the raw ASV table; see [MIPMLP PyPi](https://pypi.org/project/MIPMLP/) or [MIPMLP website](https://mip-mlp.math.biu.ac.il/Home) for more details.   
   <u>If you have your own processed data</u>, set `preprocess` to False and pass your processed data as the `processed` parameter in the next step.


6. Apply the miMic test.   
   miMic accepts the following hyperparameters:   
    - **eval**: evaluation method, one of ["man", "corr", "cat"]. Default is <u>"man"</u>.
      - "man" for binary labels.
      - "corr" for continuous labels.
      - "cat" for categorical labels.
    - **sis**: sister correction to apply, one of ["fdr_bh", "bonferroni", "no"]. Default is <u>"fdr_bh"</u>.
    - **correct_first**: whether to apply FDR correction to the starting taxonomy level according to the `sis` parameter, [True, False]. Default is <u>True</u>.
    - **mode**: running mode, one of ["test", "plot"] (step 5 above also uses "preprocess"). Default is <u>"test"</u>.
    - **save**: whether to save the corrs_df of the miMic test to disk, [True, False]. Default is <u>True</u>.
    - **tax**: starting taxonomy level of the post hoc test, one of ["None", 1, 2, 3, "noAnova", "nosignificant"].   
      - In <u>"test"</u> mode the default value is <u>"None"</u>. 
      - In <u>"plot"</u> mode, tax is <u>set automatically</u> to the taxonomy selected in "test" mode [1, 2, 3, "noAnova"].
      - "noAnova": the a priori nested ANOVA test is not significant.
      - "nosignificant": the a priori nested ANOVA test is not significant and miMic did not find any significant taxa in the leaves. In this case, the post hoc test will **not** be applied.
    - **colorful**: whether to apply colorful mode to the plots, [True, False]. Default is <u>True</u>.
    - **threshold_p**: the significance threshold for p-values. Default is <u>0.05</u>.
    - **THRESHOLD_edge**: the threshold for drawing an edge in the "interaction" plot. Default is <u>0.5</u>.
    - **processed**: the processed data from the previous step. Default is <u>None</u>.
    - **apply_samba**: whether to apply SAMBA, [True, False]. Default is <u>True</u>.
    - **samba_output**: if you already have SAMBA outputs, miMic reads them from the folder you specified;
    otherwise, miMic applies SAMBA, with `samba_output` set to None.
   
   
   ```python
   if processed is not None:
       # "test" mode: returns the selected starting taxonomy and the SAMBA output.
       taxonomy_selected, samba_output = apply_mimic(folder, tag, eval="man", threshold_p=0.05,
                                                     processed=processed, apply_samba=True, save=False)
       if taxonomy_selected is not None:
           # "plot" mode: reuse the taxonomy selected in "test" mode.
           apply_mimic(folder, tag, mode="plot", tax=taxonomy_selected, eval="man", sis='fdr_bh',
                       samba_output=samba_output, save=False, threshold_p=0.05, THRESHOLD_edge=0.5)
   ```
   - <u>Note:</u> if `apply_samba` is set to True, miMic applies samba-metric.   
   If `save` is set to True, the output is saved to the folder you specified.   
   See [SAMBA PyPi](https://pypi.org/project/samba-metric/) for more details. 
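
The `sis` options above name two standard multiple-testing corrections. As a reminder of what these procedures compute in general, here is a minimal pure-Python sketch of Bonferroni and Benjamini-Hochberg adjusted p-values for one group of tests. It is not miMic's internal code. The function names and the example p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg (FDR) adjusted p-values, in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for a group of sister taxa.
sister_p = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(sister_p))
print(benjamini_hochberg(sister_p))
```

Bonferroni is the more conservative of the two, so it typically leaves fewer taxa below `threshold_p` than the Benjamini-Hochberg adjustment.
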
##  miMic output
miMic will output the following:

- If `save` is set to True, the SAMBA outputs and the following CSV files will be saved to your specified folder:
  - **corrs_df**: a dataframe containing the results of the miMic test (including Utest results).
  - **just_mimic**: a dataframe containing the results of the miMic test without the Utest results.
  - **u_test_without_mimic**: a dataframe containing the results of the Utest without the miMic results.
  - **miMic&Utest**: a dataframe containing the joint results of the miMic and Utest tests.


- If `mode` is set to "plot", plots will be saved in the folder named <u>'plots'</u> in your current working directory.    
The following plots will be saved:
   1.  **tax_vs_rp_sp_anova_p**: a plot of RP vs SP over the different taxonomy levels, with the background colored up to the taxonomy level selected by the miMic test.  
  ![tax_vs_rp_sp_anova_p](https://github.com/oshritshtossel/miMic/raw/master/plots/tax_vs_rp_sp_anova_p.png)

   2. **rsp_vs_beta**: calculate RSP score for different betas and create the appropriate plot.   
  ![rsp_vs_beta](https://github.com/oshritshtossel/miMic/raw/master/plots/rsp_vs_beta.png)

   3. **hist**: a histogram of the ASVs in each taxonomy level.   
  ![hist](https://github.com/oshritshtossel/miMic/raw/master/plots/hist.png)

   4. **corrs_within_family**: a plot of the correlation between the significant ASVs within the family level. If `colorful` is set to True, each family will be colored.    
  ![corrs_within_family](https://github.com/oshritshtossel/miMic/raw/master/plots/corrs_within_family.png)

   5. **interaction**: a plot of the interaction between the significant ASVs.  
  ![interaction](https://github.com/oshritshtossel/miMic/raw/master/plots/interaction.png)

   6. **correlations_tree**: a correlation cladogram in which the size of each node is set according to -log(p-value), the color of 
       each node represents the sign of the post hoc test, and the shape of the node (circle, square, sphere) indicates whether the result comes from 
       miMic, Utest, or both, respectively. If `colorful` is set to True, the background of each node is colored according to its family color.  
   ![correlations_tree](https://github.com/oshritshtossel/miMic/raw/master/plots/correlations_tree.svg)
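
The node-size rule in **correlations_tree** can be sketched as follows. This is an illustration under stated assumptions, not the plotting code miMic uses: the base-10 logarithm, the direct proportionality, the `scale` factor, and the `min_p` floor are all hypothetical choices.

```python
import math


def node_size(p_value, scale=10.0, min_p=1e-300):
    """Map a p-value to a marker size that grows with -log10(p)."""
    # Clamp to (0, 1] so that log10 is defined and the size is never negative.
    p = min(max(p_value, min_p), 1.0)
    return max(0.0, -scale * math.log10(p))


# Smaller p-values give larger nodes.
for p in (0.05, 0.01, 0.001):
    print(p, round(node_size(p), 2))
```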


# Cite us
If you use our package, miMic, for **any** purpose, please cite us: Shtossel, Oshrit, Shani Finkelstein, and Yoram Louzoun. "mi-Mic: a novel multi-layer statistical test for microbiota-disease associations." Genome Biology 25, no. 1 (2024): 113. https://link.springer.com/article/10.1186/s13059-024-03256-0

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "mimic-da",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "Diffrential analysis, cladogram, Mann-Whitney, microbiome",
    "author": null,
    "author_email": "\"Oshrit Shtossel, Shani Finkelstein\" <oshritvig@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/07/01/6086703a6d1b1cc04b714183ad1663cdf04742aff7b6f94b2df9fdbcb0fb/mimic_da-1.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": null,
    "version": "1.0.1",
    "project_urls": null,
    "split_keywords": [
        "diffrential analysis",
        " cladogram",
        " mann-whitney",
        " microbiome"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6351c7edc60fd9971d5563eaa922d6229c31244ec084614abb1f6994d8f1512d",
                "md5": "e2cfcc1be04b938b0caf3e260f1d0d2c",
                "sha256": "072ce37c3d5304dc3aa7d03cf64138e834ccfea23c3983311f98e7c2dbb8ce7d"
            },
            "downloads": -1,
            "filename": "mimic_da-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e2cfcc1be04b938b0caf3e260f1d0d2c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 23664,
            "upload_time": "2024-06-03T17:24:42",
            "upload_time_iso_8601": "2024-06-03T17:24:42.493873Z",
            "url": "https://files.pythonhosted.org/packages/63/51/c7edc60fd9971d5563eaa922d6229c31244ec084614abb1f6994d8f1512d/mimic_da-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "07016086703a6d1b1cc04b714183ad1663cdf04742aff7b6f94b2df9fdbcb0fb",
                "md5": "3c24255855c7a3274f3fd27210157126",
                "sha256": "30bac741124d24d2e02d6799c36f0c4c0270ecf84bc11d40d42b22650c796a22"
            },
            "downloads": -1,
            "filename": "mimic_da-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3c24255855c7a3274f3fd27210157126",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 1146663,
            "upload_time": "2024-06-03T17:24:45",
            "upload_time_iso_8601": "2024-06-03T17:24:45.981135Z",
            "url": "https://files.pythonhosted.org/packages/07/01/6086703a6d1b1cc04b714183ad1663cdf04742aff7b6f94b2df9fdbcb0fb/mimic_da-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-03 17:24:45",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "mimic-da"
}
