# SiSaNA
Single Sample Network Analysis
SiSaNA is used both before and after creating PANDA and LIONESS networks with the netZooPy package. It first pre-processes the input data so that it can be run in PANDA/LIONESS. It then takes the LIONESS output, processes it for downstream analysis, and calculates in- and out-degree for each of the reconstructed networks. Additionally, it can compare expression or degree between groups of interest, which includes performing statistical tests, visualizing the results (volcano plots, boxplots, violin plots, and heatmaps), and comparing survival between groups.
**Note: The steps below are for the basic use of SiSaNA. Additional functionalities are still under development.**
## Requirements
- Python v3.9.19 (see the installation steps for creating a conda environment with this specific Python version). SiSaNA should work with Python 3.9.0 or greater, but because it was written and tested on 3.9.19, that version is used here.
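
As a quick sanity check of the requirement above, the interpreter version can be compared against the stated minimum. This is a minimal sketch and not part of SiSaNA; the function name `python_is_supported` is hypothetical.

```python
import sys

# Minimum version taken from the requirement stated above (Python 3.9.0 or greater).
MIN_VERSION = (3, 9, 0)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True when the given interpreter version is at least `minimum`."""
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {found}: {status}")
```

Running this inside the activated conda environment confirms that the environment, rather than a system-wide interpreter, is the one being used.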
## Installation
1. Create a conda virtual environment with Python version 3.9.19.
```
conda create --prefix </path/to/env-name> python=3.9.19
```
2. Enter the conda environment
```
conda activate </path/to/env-name>
```
3. Install SiSaNA via the pip package installer
```
pip3 install sisana
```
4. Create a directory for the analysis and move into the analysis directory
```
mkdir sisana
cd sisana
```
## Pipeline overview
![Pipeline overview](docs/sisana_pipeline_overview_v2.png)
## Example input files
Example input files can be obtained using the command
```
sisana -e
```
These files will be copied to a new directory called "example_inputs" inside the current working directory. One of these example files is the params.yml file, which can be used as a template and edited for your own data (see next section). Each user-defined parameter in the params.yml file is documented with a comment that explains its function. The comments do not need to be removed before running SiSaNA. The files in the example_inputs directory can be used in the commands listed below.
## Viewing help documentation on SiSaNA
To view help documentation on which subcommands are available, the following can be used:
```
sisana -h
```
For further information on a subcommand, put its name before the `-h`:
```
sisana <subcommand> -h
```
## Setting up your params.yml file
The most important thing to get right in order to correctly run SiSaNA is the structure of your params.yml file. SiSaNA comes with a params.yml file that is annotated to explain the function of each argument. The params.yml file is separated into 'chunks' that reflect the same subcommands available in SiSaNA on the command line. For each step of SiSaNA, you will need to use the correct subcommand, as well as have the parameters set up in the params.yml file.
In the example below, the user is running the "preprocess" step of SiSaNA. They have specified the paths to the input files, the number of samples a gene must be expressed in (in their case, 5), and the path to the output directory in which to store their results.
![Example params.yml file](docs/params_example.png)
## Pre-processing of data
The "preprocess" subcommand is the first stage of SiSaNA. It preprocesses the input data into a format that the PANDA and LIONESS algorithms can handle. This will likely involve removing genes or transcription factors that are not consistent across files. Information about what was removed is reported at the end of the preprocessing step.
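
To illustrate the kind of filtering described above, here is a minimal sketch that keeps genes expressed in a minimum number of samples and then restricts the inputs to shared genes and transcription factors. It is not SiSaNA's implementation: the function names, the in-memory data layout, and the exact harmonization rule (genes shared by the expression and motif data, TFs shared by the motif and protein-protein interaction data) are assumptions made for illustration.

```python
def filter_by_expression(expression, min_samples):
    """Keep genes with a non-zero value in at least `min_samples` samples.

    `expression` maps gene name -> list of expression values (one per sample).
    """
    return {
        gene: values
        for gene, values in expression.items()
        if sum(1 for value in values if value > 0) >= min_samples
    }


def harmonize(expression, motif, ppi):
    """Restrict the three inputs to genes and TFs they have in common.

    `motif` is a list of (tf, gene) pairs; `ppi` is a list of (tf, tf) pairs.
    Assumed rule: genes must appear in both expression and motif data,
    TFs must appear in both motif and PPI data.
    """
    shared_genes = set(expression) & {gene for _tf, gene in motif}
    shared_tfs = {tf for tf, _gene in motif} & {tf for pair in ppi for tf in pair}

    expression_out = {g: v for g, v in expression.items() if g in shared_genes}
    motif_out = [(tf, g) for tf, g in motif if tf in shared_tfs and g in shared_genes]
    ppi_out = [(a, b) for a, b in ppi if a in shared_tfs and b in shared_tfs]
    return expression_out, motif_out, ppi_out


# Toy data (hypothetical names and values).
toy_expression = {"GENE1": [5, 3, 0, 2], "GENE2": [0, 0, 0, 1], "GENE3": [1, 2, 3, 4]}
toy_motif = [("TF1", "GENE1"), ("TF2", "GENE1"), ("TF2", "GENE4"), ("TF3", "GENE3")]
toy_ppi = [("TF1", "TF2"), ("TF2", "TF5")]

expressed = filter_by_expression(toy_expression, min_samples=2)
expr_out, motif_out, ppi_out = harmonize(expressed, toy_motif, toy_ppi)
print(sorted(expr_out), motif_out, ppi_out)
```

In the toy data, GENE2 is dropped because it is expressed in only one sample, and the edges involving GENE4, TF3, and TF5 are dropped because those names are missing from one of the other inputs.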
#### Example command
```
sisana preprocess ./example_inputs/params.yml
```
#### Outputs
Three files, one for each of the three filtered input files.
<br />
<br />
## Reconstruct and analyze the network
This second SiSaNA stage, "generate", uses the PANDA and LIONESS algorithms of netZooPy to reconstruct gene regulatory networks. Documentation for netZooPy can be found at https://github.com/netZoo/netZooPy/tree/master. It then performs basic analyses of these networks by calculating in-degree of genes (also called gene targeting scores) and out-degree of transcription factors (TFs).
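
The degree calculation can be sketched as follows, assuming the common weighted definition in which a gene's in-degree is the sum of the weights of its incoming edges and a TF's out-degree is the sum of the weights of its outgoing edges. The data layout and the function name `degrees` are hypothetical; this is not SiSaNA's code.

```python
from collections import defaultdict


def degrees(edges):
    """Sum edge weights per gene (in-degree) and per TF (out-degree).

    `edges` maps (tf, gene) -> edge weight for a single sample's network.
    """
    in_degree = defaultdict(float)
    out_degree = defaultdict(float)
    for (tf, gene), weight in edges.items():
        out_degree[tf] += weight   # edge leaves the TF
        in_degree[gene] += weight  # edge enters the gene
    return dict(in_degree), dict(out_degree)


# Toy network for one sample (hypothetical names and weights).
sample_edges = {
    ("TF1", "GENE1"): 1.5,
    ("TF1", "GENE2"): -0.5,
    ("TF2", "GENE1"): 2.0,
    ("TF2", "GENE2"): 0.25,
}

in_degree, out_degree = degrees(sample_edges)
print(in_degree)   # in-degree (gene targeting score) per gene
print(out_degree)  # out-degree per TF
```

Repeating this for every sample-specific LIONESS network gives one in-degree per gene and one out-degree per TF for each sample, which is the form of data the later comparison steps work with.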
#### Example command
```
sisana generate ./example_inputs/params.yml
```
#### Outputs
1. lioness.npy, which contains all calculated edges for each sample
2. lioness.pickle, which contains the same data, serialized to make reading into Python quicker
3. Two degree files: one containing the calculated in-degree of each gene and one containing the out-degree of each transcription factor.
<br />
## Comparing two experimental groups
The next stage in SiSaNA, "compare", is used to find out how groups differ from each other. SiSaNA offers multiple ways to do this comparison, including t-tests (and Mann-Whitney tests), paired t-tests (and Wilcoxon signed-rank tests), survival analysis (typically used for cancer data), and gene set enrichment analysis (GSEA).
To compare the in- and out-degrees between two treatment groups, one can use either a Student's t-test (parametric) or a Mann-Whitney test (non-parametric). For paired samples, the corresponding choices are a paired t-test (parametric) or a Wilcoxon signed-rank test (non-parametric).
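
To make the difference between the parametric and non-parametric options concrete, the sketch below computes the two unpaired test statistics by hand for one gene's degree values in two groups. It is illustrative only and is not taken from SiSaNA; the function names are hypothetical, and p-values are omitted because they require the reference distributions (SciPy's `scipy.stats.ttest_ind` and `scipy.stats.mannwhitneyu` provide both statistic and p-value).

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic using the pooled sample variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def mann_whitney_u(a, b):
    """U statistic for group `a`: pairs with a > b count 1, ties count 0.5."""
    return sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x in a
        for y in b
    )


# Toy in-degree values for one gene in two groups (hypothetical numbers).
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]

print(student_t(group_a, group_b))
print(mann_whitney_u(group_a, group_b))
```

The t statistic depends on the actual values through the means and variances, whereas the U statistic depends only on how the values rank against each other, which is why the Mann-Whitney test is less sensitive to outliers and non-normal data.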
#### Example commands
To compare the values between two groups in order to identify differentially expressed genes or differential degrees, you can use the following command:
```
sisana compare means ./example_inputs/params.yml
```
For performing survival analyses, you can use a command like this:
```
sisana compare survival ./example_inputs/params.yml
```
...and for gene set enrichment:
```
sisana compare gsea ./example_inputs/params.yml
```
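
As background for the survival comparison mentioned above, the sketch below implements the Kaplan-Meier estimator, a standard way to estimate a survival curve for one group when some samples are censored. The source does not say which survival method SiSaNA uses, so this is a generic illustration with hypothetical names, not a description of the tool's internals.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    `times` are follow-up times; `events[i]` is True when the event was
    observed for sample i and False when the sample was censored.
    """
    ordered = sorted(zip(times, events))
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        current_time = ordered[i][0]
        observed = 0
        leaving = 0
        # Group all samples sharing the same follow-up time.
        while i < len(ordered) and ordered[i][0] == current_time:
            observed += int(ordered[i][1])
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((current_time, survival))
        # Both events and censored samples leave the risk set.
        at_risk -= leaving
    return curve


# Toy group: the sample with time 2 is censored (hypothetical data).
print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
```

Comparing two groups then amounts to estimating one curve per group and testing whether they differ, for example with a log-rank test.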
<br />
## Visualization of results
The final stage of SiSaNA, "visualize", lets you present the results of your analysis as publication-ready figures. Several types of visualization are available, including volcano plots...
```
sisana visualize volcano ./example_inputs/params.yml
```
...making boxplots or violin plots of expression/degrees...
```
sisana visualize quantity ./example_inputs/params.yml
```
...and creating heatmaps
```
sisana visualize heatmap ./example_inputs/params.yml
```
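
For readers unfamiliar with volcano plots, the sketch below shows how comparison results are typically turned into plot coordinates: the effect size goes on the x-axis and the negative log10 of the p-value on the y-axis, with points labeled by significance. The function name, the input layout, and the cutoff values are assumptions for illustration and do not reflect SiSaNA's parameters.

```python
from math import log10


def volcano_points(results, p_cutoff=0.05, effect_cutoff=1.0):
    """Convert {name: (effect, p_value)} into (name, x, y, label) tuples.

    x is the effect size, y is -log10(p_value). A point is labeled "up" or
    "down" when it passes both cutoffs, otherwise "ns" (not significant).
    p-values must be greater than zero, since log10(0) is undefined.
    """
    points = []
    for name, (effect, p_value) in results.items():
        y = -log10(p_value)
        if p_value < p_cutoff and abs(effect) >= effect_cutoff:
            label = "up" if effect > 0 else "down"
        else:
            label = "ns"
        points.append((name, effect, y, label))
    return points


# Toy comparison results (hypothetical values).
toy_results = {
    "GENE1": (2.0, 0.001),
    "GENE2": (-1.5, 0.01),
    "GENE3": (0.2, 0.5),
}

for point in volcano_points(toy_results):
    print(point)
```

Points far from zero on the x-axis and high on the y-axis are the ones with both a large difference between groups and strong statistical support.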
Raw data
{
"_id": null,
"home_page": null,
"name": "sisana",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9.0",
"maintainer_email": null,
"keywords": "transcription-factors, gene-regulatory-network, panda, lioness",
"author": null,
"author_email": "Nolan Newman <nolan.newman@ncmm.uio.no>",
"download_url": "https://files.pythonhosted.org/packages/18/3a/4f1d79bc2392a07a885f7a3d996542242c15ccccd1e2953687b09d8c6772/sisana-0.0.6.tar.gz",
"platform": null,
"description": "# SiSaNA\nSingle Sample Network Analysis\n\nSiSaNA is used both before and after creating both PANDA and LIONESS networks from the package netZooPy. SiSaNA first needs to pre-process the data to be ran in PANDA/LIONESS. SiSaNA takes the LIONESS output, processes it to be analyzed downstream, and then calculates in- and out-degree for each of the reconstructed networks. Additionally, it can compare the expression/degree between groups of interest, including performing statistical tests, visualizing the results (volcano plots, boxplots, violin plots, and heatmaps), and compare the survival between groups.\n\n**Note: The steps below are for the basic use of SiSaNA. Additional functionalities are still under development.**\n\n## Requirements\n - python v3.9.19 (see installation steps for creating a conda environment with this specific Python version). SiSaNA should work with versions of Python 3.9.0 or greater, but as it has been written and tested on this version, we will use 3.9.19.\n \n## Installation can be performed by running the following steps\n\n1. Create a conda virtual environment with python version 3.9.19. \n```\nconda create --prefix </path/to/env-name> python=3.9.19\n```\n\n2. Enter the conda environment\n```\nconda activate </path/to/env-name>\n```\n\n3. Install SiSaNA via the pip package installer\n```\npip3 install sisana\n```\n\n4. Create a directory for the analysis and move into the analysis directory\n```\nmkdir sisana\ncd sisana\n```\n\n## Pipeline overview\n![Pipeline overview](docs/sisana_pipeline_overview_v2.png)\n\n## Example input files\nExample input files can be obtained using the command\n```\nsisana -e\n```\nThese files will be copied to a new directory in the current working directory, called \"example_inputs\". One of these example files is the params.yml file, which can be used as a template and edited for your own data (see next section). 
Each user-defined parameter in the params.yml file is documented with a comment to explain the function of the parameter. The comments do not need to be removed prior to running SiSaNA. The files in this example_inputs directory can be used in the commands listed down below.\n\n## Viewing help documentation on SiSaNA\nTo view help documentation on which subcommands are available, the following can be used:\n```\nsisana -h\n```\n\nFor further information on these subcommands, simply put the name of the subcommand before the `-h`\n```\nsisana <subcommand> -h\n```\n\n## Setting up your params.yml file\nThe most important thing to get right in order to correctly run SiSaNA is the structure of your params.yml file. SiSaNA comes with a params.yml file that is annotated to explain the function of each argument. The params.yml file is separated into 'chunks' that reflect the same subcommands available in SiSaNA on the command line. For each step of SiSaNA, you will need to use the correct subcommand, as well as have the parameters set up in the params.yml file.\n\nIn the below example, the user is running the \"preprocess\" step of SiSaNA. They have specified the paths to the input files as well as the value for the number of samples a gene must be expressed in (in their case, 5), along with the path to the output directory in which to store their results.\n![Pipeline overview](docs/params_example.png)\n\n## Pre-processing of data\nThe \"preprocess\" subcommand is the first stage of SiSaNA, where it preprocess the input data to get it in a format that the PANDA and LIONESS algorithms can handle. This will likely involve the removal of genes or transcription factors that are not consistent across files. Information regarding the removal of these factors is given at the end of the preprocessing step.\n\n#### Example command\n```\nsisana preprocess ./example_inputs/params.yml\n```\n\n#### Outputs\nThree files, one for each of the three filtered input files. 
\n<br />\n<br />\n\n\n\n## Reconstruct and analyze the network\nThis second SiSaNA stage, \"generate\", uses the PANDA and LIONESS algorithms of netZooPy to reconstruct gene regulatory networks. Documentation for netZooPy can be found at https://github.com/netZoo/netZooPy/tree/master. It then performs basic analyses of these networks by calculating in-degree of genes (also called gene targeting scores) and out-degree of transcription factors (TFs).\n\n#### Example command\n```\nsisana generate ./example_inputs/params.yml\n```\n\n#### Outputs\n1. lioness.npy, which contains all calculated edges for each sample\n2. lioness.pickle, which is the same thing, just serialized to make reading into python quicker\n3. A file containing the calculated indegree and another file with the outdegree of each gene and transcription factor, respectively.\n<br />\n\n\n## Comparing two experimental groups\nThe next stage in SiSaNA, \"compare\", is used to find out how groups differ between each other. SiSaNA offers multiple ways to do this comparison, including t-tests (and Mann-Whitney tests), paired t-tests (and Wilcoxon paired t-tests), survival analysis (typically used for cancer data), and gene set enrichment analysis (GSEA).\n\nTo compare the in- and out-degrees between two treatment groups, one can use either a Student's t-test (parametric) or a Mann-Whitney (non-parametric) test. 
Or for paired samples, one can use either a paired t-test or Wilcoxon signed-rank test, respectively.\n\n#### Example commands\nTo compare the values between two groups in order to identify differentially expressed genes are differential degrees, you can use the following command:\n```\nsisana compare means ./example_inputs/params.yml\n```\n\nFor performing survival analyses, you can use a command like this:\n```\nsisana compare survival ./example_inputs/params.yml\n```\n\n...and for gene set enrichment:\n```\nsisana compare gsea ./example_inputs/params.yml\n```\n<br />\n\n\n## Visualization of results\nThe final stage of SiSaNA, \"visualize\" allows you to visualize the results of your analysis on publication-ready figures. There are multiple types of visualization you can perform, including generating volcano plots...\n```\nsisana visualize volcano ./example_inputs/params.yml\n```\n\n...making boxplots or violin plots of expression/degrees...\n```\nsisana visualize quantity ./example_inputs/params.yml\n```\n\n...and creating heatmaps\n```\nsisana visualize heatmap ./example_inputs/params.yml\n```\n",
"bugtrack_url": null,
"license": null,
"summary": "A command line interface tool to reconstruct and analyze single sample networks.",
"version": "0.0.6",
"project_urls": {
"Repository": "https://github.com/newmanno/sisana"
},
"split_keywords": [
"transcription-factors",
" gene-regulatory-network",
" panda",
" lioness"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9c035ae448571cf3183ce45c3f162627d2db91b0b2df59430de3e055be0b340b",
"md5": "7416e73bcfc3c13173b449e7537bf07c",
"sha256": "4cf3995b65fc9b47d74ea95449abec8db7b33ba48755818f118a4f5cbf3b8346"
},
"downloads": -1,
"filename": "sisana-0.0.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7416e73bcfc3c13173b449e7537bf07c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9.0",
"size": 11220741,
"upload_time": "2025-01-07T13:38:03",
"upload_time_iso_8601": "2025-01-07T13:38:03.791154Z",
"url": "https://files.pythonhosted.org/packages/9c/03/5ae448571cf3183ce45c3f162627d2db91b0b2df59430de3e055be0b340b/sisana-0.0.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "183a4f1d79bc2392a07a885f7a3d996542242c15ccccd1e2953687b09d8c6772",
"md5": "4402e18d95c431058fa7316def5a4f46",
"sha256": "b163463170690c994880324ddd72c32a657c8b0fde3c0741f55de0a34e15ed7d"
},
"downloads": -1,
"filename": "sisana-0.0.6.tar.gz",
"has_sig": false,
"md5_digest": "4402e18d95c431058fa7316def5a4f46",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9.0",
"size": 11210446,
"upload_time": "2025-01-07T13:38:09",
"upload_time_iso_8601": "2025-01-07T13:38:09.369270Z",
"url": "https://files.pythonhosted.org/packages/18/3a/4f1d79bc2392a07a885f7a3d996542242c15ccccd1e2953687b09d8c6772/sisana-0.0.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-07 13:38:09",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "newmanno",
"github_project": "sisana",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "scikit-learn",
"specs": []
},
{
"name": "scikit-survival",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "matplotlib",
"specs": []
},
{
"name": "pathlib",
"specs": []
},
{
"name": "gseapy",
"specs": []
},
{
"name": "seaborn",
"specs": []
},
{
"name": "adjustText",
"specs": []
},
{
"name": "pyyaml",
"specs": []
},
{
"name": "netZooPy",
"specs": []
}
],
"lcname": "sisana"
}