oncoboxlib


- Name: oncoboxlib
- Version: 1.3.0
- Summary: Oncobox collections of libraries
- Upload time: 2023-12-04 08:44:54
- Requires Python: >=3.10
- License: Released under MIT License. Copyright (c) 2018 Oncobox. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Keywords: bioinformatics, transcriptomics, pathways

# oncoboxlib

The Oncobox library calculates Pathway Activation Levels (PAL) according to
Sorokin et al. (doi: 10.3389/fgene.2021.617059).
It takes a file that contains gene symbols in HGNC format (see genenames.org),
their expression levels for one or more samples (cases and/or controls)
and calculates PAL values for each pathway in each sample.

An online service is available at https://open.oncobox.com


## Installation

```sh
pip install oncoboxlib
```

## How to run the example

1. Create any directory that will be used as a sandbox. Let's assume it is named `sandbox`.


2. Extract `resources/databases.zip` into `sandbox/databases/`.
  <br> (You may download the archive from 
  `https://gitlab.com/oncobox/oncoboxlib/-/blob/master/resources/databases.zip`)
  

3. Extract example data `resources/cyramza_normalized_counts.txt.zip` into `sandbox`.
  <br> (You may download the archive from 
  `https://gitlab.com/oncobox/oncoboxlib/-/blob/master/resources/cyramza_normalized_counts.txt.zip`)
  

The sandbox should now look like this:
```
   - sandbox
       - databases
           - Balanced 1.123
           - KEGG Adjusted 1.123
           ...
       - cyramza_normalized_counts.txt  
```

4. Change directory to `sandbox` and execute the command:
```sh
oncoboxlib_calculate_scores --databases-dir=databases/ --samples-file=cyramza_normalized_counts.txt
```
This creates the result file `sandbox/pal.csv`.


Alternatively, you can use it as a library in your own source code.
For details, please see the `examples` directory.
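
If you would rather drive the command-line tool from a Python script, one option is to call it through `subprocess`. The sketch below is not the library API shown in the `examples` directory; it only wraps the command documented above. The helper names `build_command` and `run_example` are hypothetical.

```python
import subprocess


def build_command(databases_dir, samples_file, results_file=None):
    """Assemble the argument list for the oncoboxlib_calculate_scores tool.

    Only flags that appear in the tool's own help output are used.
    """
    cmd = [
        "oncoboxlib_calculate_scores",
        f"--databases-dir={databases_dir}",
        f"--samples-file={samples_file}",
    ]
    if results_file is not None:
        cmd.append(f"--results-file={results_file}")
    return cmd


def run_example(sandbox):
    """Run the example inside the sandbox directory (hypothetical helper)."""
    cmd = build_command("databases/", "cyramza_normalized_counts.txt")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(cmd, cwd=sandbox, check=True)
```

Calling `run_example("sandbox")` should be equivalent to step 4, assuming the package is installed and the archives have been extracted as described.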


## Input file format

A table that contains gene expression values.
Allowed separators: comma, semicolon, tab, space.
Compressed (zipped) files are supported as well.

- First column - gene symbol in HGNC format, see genenames.org.
- Other columns - gene expression data for cases or controls.
- Names of case columns should contain "Case", "Tumour", or "Tumor", case insensitive.
- Names of control columns should contain "Control" or "Norm", case insensitive.
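
The naming rules above can be checked before running the tool. The following is a minimal sketch of such a check, not the library's own code. How the library treats a name that matches both groups, or neither, is not stated here, so the `"ambiguous"` and `"unknown"` labels are assumptions made for illustration.

```python
CASE_MARKERS = ("case", "tumour", "tumor")
CONTROL_MARKERS = ("control", "norm")


def classify_column(name):
    """Label a column name using the case-insensitive substring rules."""
    lowered = name.lower()
    is_case = any(marker in lowered for marker in CASE_MARKERS)
    is_control = any(marker in lowered for marker in CONTROL_MARKERS)
    if is_case and is_control:
        return "ambiguous"  # assumption: flag it rather than guess
    if is_case:
        return "case"
    if is_control:
        return "control"
    return "unknown"  # assumption: matches neither rule


# Hypothetical header; the first column holds the gene symbols.
header = ["SYMBOL", "Tumor_1", "TUMOUR_2", "Norm_1", "control_2", "Sample_5"]
labels = {name: classify_column(name) for name in header[1:]}
```

Columns labelled `"ambiguous"` or `"unknown"` are worth renaming before the file is passed to the tool.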

The data is assumed to be already normalized by DESeq2, quantile normalization, or another method.
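
For readers unfamiliar with quantile normalization, here is a minimal pure-Python sketch of the textbook procedure. It is an illustration only and is not taken from oncoboxlib. Ties are broken by position rather than averaged, which is a simplification.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists (one per sample)."""
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all samples must have the same number of genes")

    # Mean of the values that share the same rank across samples.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns) for i in range(n_rows)
    ]

    normalized = []
    for col in columns:
        # Gene indices ordered from the smallest to the largest value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        new_col = [0.0] * n_rows
        for rank, idx in enumerate(order):
            new_col[idx] = rank_means[rank]
        normalized.append(new_col)
    return normalized


# Two hypothetical samples with three genes each.
sample_a = [5.0, 2.0, 3.0]
sample_b = [4.0, 1.0, 6.0]
norm_a, norm_b = quantile_normalize([sample_a, sample_b])
```

After normalization both samples contain the same set of values (1.5, 3.5, 5.5), each assigned according to the gene's rank within its own sample.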


## Command line tool help

To read the complete help, run the tool with the `--help` argument:
```sh
oncoboxlib_calculate_scores --help
```

Here is the output (for convenience):
```
usage: calculate_scores.py [-h] --samples-file SAMPLES_FILE
                           [--controls-file CONTROLS_FILE] [--ttest]
                           [--fdr-bh] --databases-dir DATABASES_DIR
                           [--databases-names DATABASES_NAMES]
                           [--results-file RESULTS_FILE]

Command line tool for calculation of pathway activation level according to
doi: 10.3389/fgene.2021.617059

optional arguments:
  -h, --help            show this help message and exit
                        
  --samples-file SAMPLES_FILE
                        Table that contains gene expression for cases (or
                        cases and controls). Allowed separators: comma,
                        semicolon, tab, space. Compressed (zipped) files are
                        supported as well. First column - gene symbol in HGNC
                        format, see genenames.org. Others columns - gene
                        expression data for cases or controls. Names of case
                        columns should contain "Case", "Tumour", or "Tumor",
                        case insensitive. Names of control columns should
                        contain "Control" or "Norm", case insensitive. It is
                        supposed that data is already normalized by DESeq2,
                        quantile normalization or other methods.
                        
  --controls-file CONTROLS_FILE
                        Optional file that contains controls. If provided,
                        cases and controls will be increased by one and
                        normalized by quantile normalization.
                        
  --ttest               Include to result a column for unequal variance t-test
                        two-tailed p-values (aka Welch's t-test). It is
                        assumed that cases and norms are independent. t-test
                        will be performed between all cases and all controls.
                        
  --fdr-bh              Include to result a column for p-values corrected for
                        FDR using Benjamini/Hochberg method
                        
  --databases-dir DATABASES_DIR
                        Directory that contains pathway databases. Databases
                        can be downloaded from https://gitlab.com/oncobox/onco
                        boxlib/-/blob/master/resources/databases.zip (Biocarta
                        1.123, KEGG Adjusted 1.123, Metabolism 1.123, NCI
                        1.123, Qiagen 1.123, Reactome 1.123)
                        
  --databases-names DATABASES_NAMES
                        Names of databases that are used to calculate PALs.
                        "all" means that all database from --databases-dir
                        will be used.
                        
  --results-file RESULTS_FILE
                        Output file that will contain results, "pal.csv" by
                        default            
```
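
The `--fdr-bh` option refers to the Benjamini/Hochberg correction. As a rough illustration of what that correction does to a list of p-values, here is a pure-Python sketch of the standard procedure. It is not the tool's implementation, and the input p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini/Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
```

Each adjusted value is at least as large as the raw p-value it came from, and the ordering of the pathways by significance is preserved.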

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "oncoboxlib",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "",
    "keywords": "bioinformatics,transcriptomics,pathways",
    "author": "",
    "author_email": "Victor Tkachev <victor.tkachev@yandex.com>, Alexander Simonov <registsys@mail.ru>",
    "download_url": "https://files.pythonhosted.org/packages/dc/1d/83ba6d12c5cbcaa24dc3fb1d36ade6b4e13f3ca0c898b667ba3bda6a3592/oncoboxlib-1.3.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "# Released under MIT License  Copyright (c) 2018 Oncobox.  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Oncobox collections of libraries",
    "version": "1.3.0",
    "project_urls": {
        "Bug Tracker": "https://gitlab.com/oncobox/oncoboxlib/issues",
        "Homepage": "https://gitlab.com/oncobox/oncoboxlib"
    },
    "split_keywords": [
        "bioinformatics",
        "transcriptomics",
        "pathways"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b4ca545f22dd3989a91acd206f4bca7681cf3070807fc6850f20ca0bfe1f972d",
                "md5": "d544a4ed78662ccf385cb14dc1a04523",
                "sha256": "3052e0722a411603351553f1b66e4c3b01cac1937579e9cdcdb2719cc4290983"
            },
            "downloads": -1,
            "filename": "oncoboxlib-1.3.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d544a4ed78662ccf385cb14dc1a04523",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 13641,
            "upload_time": "2023-12-04T08:44:52",
            "upload_time_iso_8601": "2023-12-04T08:44:52.542597Z",
            "url": "https://files.pythonhosted.org/packages/b4/ca/545f22dd3989a91acd206f4bca7681cf3070807fc6850f20ca0bfe1f972d/oncoboxlib-1.3.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dc1d83ba6d12c5cbcaa24dc3fb1d36ade6b4e13f3ca0c898b667ba3bda6a3592",
                "md5": "d0c9a67051751c8e1b642d285a9cbf53",
                "sha256": "98a4df4f96c4164bb1aa79be949584b95fa7fbe00e4551afe88eb38d92454076"
            },
            "downloads": -1,
            "filename": "oncoboxlib-1.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "d0c9a67051751c8e1b642d285a9cbf53",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 13571,
            "upload_time": "2023-12-04T08:44:54",
            "upload_time_iso_8601": "2023-12-04T08:44:54.532226Z",
            "url": "https://files.pythonhosted.org/packages/dc/1d/83ba6d12c5cbcaa24dc3fb1d36ade6b4e13f3ca0c898b667ba3bda6a3592/oncoboxlib-1.3.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-04 08:44:54",
    "github": false,
    "gitlab": true,
    "bitbucket": false,
    "codeberg": false,
    "gitlab_user": "oncobox",
    "gitlab_project": "oncoboxlib",
    "lcname": "oncoboxlib"
}
        