iCypress


| Field | Value |
|-------|-------|
| Name | iCypress |
| Version | 0.13 |
| Home page | https://github.com/luciancahil/iCYPRESS |
| Summary | iCYPRESS: identifying CYtokine PREdictors of diSeaSe. A library that analyzes cytokines using Graph Neural Networks |
| Upload time | 2023-10-08 04:27:54 |
| Author | royhe |
| Requires Python | >=3.6 |
| Requirements | yacs, tensorboardx, torch, torch-geometric, deepsnap, ogb, numpy, pandas, scipy, scikit-learn, matplotlib, seaborn, notebook |

# iCYPRESS

## What is iCYPRESS?
iCYPRESS stands for identifying CYtokine PREdictors of diSeaSE.

It is a graph neural network library that analyzes gene expression data in the context of cytokine cellular networks.

## Install
To install iCYPRESS, make sure that you are in a conda environment with Python 3.9. Then, install the following libraries using these exact commands.


````
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cpuonly -c pytorch
conda install pyg -c pyg
conda install pytorch-scatter -c pyg
conda install pytorch-sparse -c pyg
pip install pytorch_lightning
pip install iCypress
````
## Testing

To make sure that all packages are installed properly, try the following program:

````
from iCYPRESS import Cypress
from iCYPRESS.CytokinesDataSet import CytokinesDataSet

cyp = Cypress.Cypress()
cyp.train()
````


## Custom Data

To use this library on your own custom data, you will need two CSV files: a patients file and an eset file.

The eset file should be structured like so:

|"             "|  "gene_sym" |"GSM989153"  |"GSM989154" |"GSM989155" |
| ------------- |-------------| -----       | ---        | ---        |
| "1"           | "A1CF"      | 3.967147246 |3.967147248 |3.96714725  |
| "2"           | "A2M"       | 4.669213864 |4.669213567 |4.669213628 |
| "3"           | "A2ML1"     | 4.140074251 |4.140074246 |4.140074286 |


The top row should include every patient whose data you wish to analyze, the second column should contain the name of every gene you have data on, and the numbers represent the gene readings.
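
As a rough illustration of this layout, the sketch below parses a small eset-style CSV with Python's standard `csv` module. The inline sample data and the helper name `read_eset` are assumptions made for this example and are not part of iCYPRESS.

```python
import csv
import io

def read_eset(text):
    """Parse eset-style CSV text into (patient_ids, {gene: [readings]})."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    # Assumed layout: column 0 is a row index, column 1 is "gene_sym",
    # and the remaining header cells are patient IDs.
    patient_ids = header[2:]
    readings = {row[1]: [float(value) for value in row[2:]] for row in body if row}
    return patient_ids, readings

# Hypothetical sample mirroring the first rows of the table above.
sample = (
    '"","gene_sym","GSM989153","GSM989154","GSM989155"\n'
    '"1","A1CF",3.967147246,3.967147248,3.96714725\n'
    '"2","A2M",4.669213864,4.669213567,4.669213628\n'
)
patient_ids, readings = read_eset(sample)
print(patient_ids)       # ['GSM989153', 'GSM989154', 'GSM989155']
print(sorted(readings))  # ['A1CF', 'A2M']
```

Each gene maps to one reading per patient, in the same order as the patient IDs in the header row.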

Meanwhile, the patients file should be structured like so:

|         |   |
|---------|---|
|GSM989161|0  |
|GSM989162|0  |
|GSM989163|0  |
|GSM989164|0  |
|GSM989165|0  |
|GSM989166|0  |
|GSM989167|1  |
|GSM989168|1  |
|GSM989169|1  |
|GSM989170|1  |
|GSM989171|1  |
|GSM989172|1  |
|GSM989173|1  |
|GSM989174|1  |
|GSM989175|1  |

The left column contains the names of your patients, and the right column contains their classification.

It is very important that every patient that appears in your patients file also appears in the top row of your eset file and vice versa. Otherwise, the library will raise an error.
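
Before training, it can help to check this yourself. The following is a minimal sketch using only the standard library; the helper name `patient_mismatches` and the inline sample data are assumptions for illustration, and the check iCYPRESS performs internally may differ.

```python
import csv
import io

def patient_mismatches(eset_text, patients_text):
    """Return (missing_from_eset, missing_from_patients) as sorted lists."""
    eset_header = next(csv.reader(io.StringIO(eset_text)))
    # Assumed layout: the first two eset columns are a row index and
    # "gene_sym"; every remaining header cell is a patient ID.
    eset_ids = set(eset_header[2:])
    # Rows with an empty first cell (such as a blank header row) are skipped.
    patient_ids = {
        row[0] for row in csv.reader(io.StringIO(patients_text)) if row and row[0]
    }
    return sorted(patient_ids - eset_ids), sorted(eset_ids - patient_ids)

# Hypothetical sample data with one mismatch in each direction.
eset_text = '"","gene_sym","GSM989153","GSM989154","GSM989155"\n"1","A1CF",3.96,3.96,3.96\n'
patients_text = "GSM989153,0\nGSM989154,0\nGSM989156,1\n"

missing_from_eset, missing_from_patients = patient_mismatches(eset_text, patients_text)
print(missing_from_eset)      # ['GSM989156']
print(missing_from_patients)  # ['GSM989155']
```

Both lists should be empty before the files are passed to the library.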

Once you've prepared both files and placed them into the same directory as your main Python file, run the following program, replacing "GSE40240_eset.csv" and "patients.csv" with the actual names of your eset and patients files, respectively.


````
from iCYPRESS import Cypress
from iCYPRESS.CytokinesDataSet import CytokinesDataSet

cyp = Cypress.Cypress(patients = "patients.csv", eset="GSE40240_eset.csv")
cyp.train('CCL1')
````


## Customization

There are two main ways you can change the way the library analyzes your data: cytokine choice and hyperparameters.

### Cytokines

There are 70 different cytokines that this network can use to build the Graph in its Graph Neural Network.

The example above uses CCL1, but you can also use CCL2, CD70, or many others.

You can get a list of supported cytokines by calling the method Cypress.Cypress.get_cyto_list() in your main file. To try running the model with every available cytokine, use the code below. Be warned, the program will take a while to execute.

```
from iCYPRESS import Cypress
from iCYPRESS.CytokinesDataSet import CytokinesDataSet

cyto_list = Cypress.Cypress.get_cyto_list()
cyp = Cypress.Cypress(patients = "patients.csv", eset="GSE40240_eset.csv", active_cyto_list = cyto_list)

for cyto in cyto_list:
  cyp.train(cyto)
```

If you want to run it with just a subset of cytokines, make cyto_list a list of strings containing only the cytokines you want to run on.
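
One way to build such a subset safely is sketched below. The helper `select_cytokines` is hypothetical, and the three-element `supported` list is only a stand-in for the result of `Cypress.Cypress.get_cyto_list()`, so that the snippet runs without iCYPRESS installed.

```python
def select_cytokines(requested, supported):
    """Keep the requested names in order, rejecting any that are not supported."""
    unknown = [name for name in requested if name not in supported]
    if unknown:
        raise ValueError(f"Unsupported cytokines: {unknown}")
    return list(requested)

# Stand-in for Cypress.Cypress.get_cyto_list(); only these three names
# appear in this README, and the real list is longer.
supported = ["CCL1", "CCL2", "CD70"]

subset = select_cytokines(["CCL1", "CD70"], supported)
print(subset)  # ['CCL1', 'CD70']

# With iCYPRESS installed, the subset would be used as in the example above:
# cyp = Cypress.Cypress(patients="patients.csv", eset="GSE40240_eset.csv",
#                       active_cyto_list=subset)
# for cyto in subset:
#     cyp.train(cyto)
```

Checking the names first means a typo is caught immediately rather than partway through a long run.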

### Hyperparameterization

Hyperparameters are the parameters that control the structure of the network and the training regimen. A brief breakdown of each of them is given below.

| Hyperparameter      | Description  |
|---------------------|--------------|
| batch_size          | How many elements are in each training batch. If batch_size is set to 80, each batch will contain at most 80 elements. |
| eval_period         | How often the neural network evaluates itself on the training data. If set to 20, it will evaluate itself every 20 epochs. |
| layers_pre_mp       | How many layers the network will run before the message passing stage. |
| layers_mp           | How many rounds of message passing the network will do. |
| layers_post_mp      | How many layers after the message passing stage will exist in the neural network. |
| dim_inner           | The number of neurons in each hidden layer. |
| max_epoch           | How many epochs the training stage will go through. |
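
To make the arithmetic behind batch_size, eval_period, and max_epoch concrete, here is a small sketch. The function `training_schedule` and the sample count of 200 are assumptions for illustration; the exact epochs at which iCYPRESS evaluates are not stated in this README, so the evaluation count is only an estimate.

```python
import math

def training_schedule(num_samples, batch_size, max_epoch, eval_period):
    """Return (batches per epoch, size of the last batch, evaluations over the run)."""
    if num_samples <= 0 or batch_size <= 0 or eval_period <= 0:
        raise ValueError("num_samples, batch_size and eval_period must be positive")
    batches_per_epoch = math.ceil(num_samples / batch_size)
    # Every batch holds at most batch_size elements; the last one may be smaller.
    last_batch = num_samples - (batches_per_epoch - 1) * batch_size
    # Assumes one evaluation per full eval_period of epochs.
    evaluations = max_epoch // eval_period
    return batches_per_epoch, last_batch, evaluations

print(training_schedule(num_samples=200, batch_size=80, max_epoch=500, eval_period=20))
# (3, 40, 25)
```

With 200 samples and a batch_size of 80, each epoch has two full batches and one batch of 40.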


To use a value other than the default, pass the hyperparameter and its new value to the constructor as a keyword argument. For example, to set max_epoch to 500, use the following code:

```
from iCYPRESS import Cypress
from iCYPRESS.CytokinesDataSet import CytokinesDataSet

cyto_list = Cypress.Cypress.get_cyto_list()
cyp = Cypress.Cypress(patients = "patients.csv", eset="GSE40240_eset.csv", max_epoch = 500)
```
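
If you change several hyperparameters, it may be convenient to collect them in a dictionary and check the names before use. This is a sketch under assumptions: `check_overrides` is a hypothetical helper, and passing several overrides at once by unpacking the dictionary into the constructor is assumed rather than confirmed here.

```python
# Hyperparameter names taken from the table above.
KNOWN_HYPERPARAMETERS = {
    "batch_size", "eval_period", "layers_pre_mp", "layers_mp",
    "layers_post_mp", "dim_inner", "max_epoch",
}

def check_overrides(overrides):
    """Return a copy of overrides, raising if a name is not in the table."""
    unknown = sorted(set(overrides) - KNOWN_HYPERPARAMETERS)
    if unknown:
        raise ValueError(f"Unknown hyperparameters: {unknown}")
    return dict(overrides)

overrides = check_overrides({"max_epoch": 500, "batch_size": 80})
print(overrides)  # {'max_epoch': 500, 'batch_size': 80}

# With iCYPRESS installed, the checked dictionary could then be unpacked:
# cyp = Cypress.Cypress(patients="patients.csv", eset="GSE40240_eset.csv", **overrides)
```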

## Quick start options

### Repo setup on HPC (UBC ARC Sockeye)
* clone repo to project and scratch folders
```
module load git
export ALLOC=st-allocation-code
mkdir /arc/project/$ALLOC/$USER/
cd /arc/project/$ALLOC/$USER/

mkdir /scratch/$ALLOC/$USER
cd /scratch/$ALLOC/$USER
mkdir cyp
cd cyp/
```
* $ALLOC: Sockeye allocation code
* $USER: UBC campus-wide login (should already be set)
  
Then, follow all instructions above to create the necessary conda environment. Run the code by typing "python $FILENAME.py" into the Sockeye console.
* $FILENAME: Name of the file you copied the above code into. Ex: main.


            
