GOcats
======

:Name: GOcats
:Version: 1.2.1
:Home page: https://github.com/MoseleyBioinformaticsLab/GOcats
:Summary: A tool for categorizing Gene Ontology into subgraphs of user-defined emergent concepts
:Upload time: 2023-06-16 21:56:00
:Author: Eugene W. Hinderer III
:License: The Clear BSD License
:Keywords: biomedical ontology analysis
:Requirements: docopt, jsonpickle

`GOcats` is an Open Biomedical Ontology (OBO) parser and categorizing utility, currently specialized for the Gene
Ontology (GO), which helps scientists interpret large-scale experimental results by organizing redundant and highly
specific annotations into customizable, biologically relevant concept categories. Concept subgraphs are defined by
user-created lists of keywords.

Currently, the `GOcats` package can be used to:

   * Create subgraphs of GO, each representing a user-specified concept.
   * Map specific, or fine-grained, GO terms in a Gene Annotation File (GAF) to an arbitrary number of concept
     categories.
   * Remap ancestor Gene Ontology term relationships and gene annotations using a set of user-defined relationships.
   * Explore the Gene Ontology graph within a Python interpreter.
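
The idea of defining a concept by a keyword list can be illustrated with a toy sketch. This is not the `GOcats`
implementation: the function name ``terms_matching_keywords`` and the small term dictionary are assumptions made for
illustration, and the sketch only picks candidate terms by name, without building any subgraph.

.. code:: python

   # Toy sketch only: NOT how GOcats selects terms internally.
   def terms_matching_keywords(term_names, keywords):
       """Return the sorted IDs of terms whose name contains any keyword (case-insensitive)."""
       lowered = [keyword.lower() for keyword in keywords]
       return sorted(
           term_id
           for term_id, name in term_names.items()
           if any(keyword in name.lower() for keyword in lowered)
       )


   # A handful of GO cellular_component terms, used here as illustrative input.
   toy_terms = {
       "GO:0005634": "nucleus",
       "GO:0005730": "nucleolus",
       "GO:0005739": "mitochondrion",
       "GO:0031965": "nuclear membrane",
   }

   # Candidate terms for a hypothetical "nucleus" concept.
   nucleus_seed_terms = terms_matching_keywords(toy_terms, ["nucleus", "nuclear"])
   print(nucleus_seed_terms)

Note that plain substring matching misses "nucleolus", which is one reason a real categorization has to use the
ontology graph and not names alone.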

Citation
~~~~~~~~
Please cite the following papers when using GOcats:

Hinderer EW, Moseley HNB. GOcats: A tool for categorizing Gene Ontology into subgraphs of user-defined concepts. PLoS One. 2020;15(6):1-29.

Hinderer EW, Flight RM, Dubey R, Macleod JN, Moseley HNB. Advances in Gene Ontology utilization improve statistical power of annotation enrichment. PLoS One. 2019;14(8):1-20.

Installation
~~~~~~~~~~~~

`GOcats` runs under Python 3.4+ and is available through python3-pip. Either install it with pip, or clone the git
repository and install the dependencies listed below.

Install on Linux
----------------

Pip installation
................

Installing with pip is strongly recommended. Dependencies should be installed automatically with this method.

.. code:: bash

   pip3 install gocats

GitHub Package installation
...........................

Make sure you have git_ installed, then clone the repository:

.. code:: bash

   cd ~/
   git clone https://github.com/MoseleyBioinformaticsLab/GOcats.git

Dependencies
............

`GOcats` requires the following Python libraries:

   * docopt_ for creating the ``gocats`` command-line interface.
   * jsonpickle_ for saving Python objects in a JSON-serializable form and writing them to a file.

To install dependencies manually:

.. code:: bash

   pip3 install docopt
   pip3 install jsonpickle

Install on Windows
------------------
`GOcats` can also be installed on Windows through pip.

Quickstart
~~~~~~~~~~

For instructions on how to format your keyword list and advanced argument usage, consult the tutorial, guide, and API documentation at readthedocs_.

Subgraphs can be created from the command line:

.. code:: bash

   python3 -m gocats create_subgraphs /path_to_ontology_file ~/GOcats/gocats/exampledata/examplecategories.csv ~/Output --supergraph_namespace=cellular_component --subgraph_namespace=cellular_component --output_termlist

Mapping files can be found in the output directory:

   - ``GC_content_mapping.json_pickle``: a Python dictionary with category-defining GO terms as keys and a list of all subgraph contents as values.
   - ``GC_id_mapping.json_pickle``: a Python dictionary with every GO term of the specified namespace as keys and a list of category root terms as values.
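
The two mappings describe the same categorization from opposite directions. The following minimal sketch shows how a
content mapping relates to an ID mapping by inverting a dictionary. It works on toy data and does not read the
``.json_pickle`` files; the function name ``invert_content_mapping`` and the example contents are assumptions, not
part of the `GOcats` API.

.. code:: python

   import json


   def invert_content_mapping(content_mapping):
       """Turn {category_term: [member_terms]} into {member_term: [category_terms]}."""
       id_mapping = {}
       for category_term, member_terms in content_mapping.items():
           for member_term in member_terms:
               id_mapping.setdefault(member_term, []).append(category_term)
       # Sort the category lists so the result is deterministic.
       return {term: sorted(categories) for term, categories in id_mapping.items()}


   # Toy data: two category root terms and made-up subgraph contents.
   toy_content_mapping = {
       "GO:0005634": ["GO:0005730", "GO:0031965"],
       "GO:0016020": ["GO:0031965"],
   }

   toy_id_mapping = invert_content_mapping(toy_content_mapping)
   print(json.dumps(toy_id_mapping, indent=2, sort_keys=True))

In this toy example ``GO:0031965`` belongs to both categories, so it maps to two category root terms.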

GAF mappings can also be made from the command line:

.. code:: bash

   python3 -m gocats categorize_dataset YOUR_GAF.goa YOUR_OUTPUT_DIRECTORY/GC_id_mapping.json_pickle YOUR_OUTPUT_DIRECTORY MAPPED_DATASET_NAME.goa
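
Conceptually, categorizing a dataset replaces each annotated GO term with the category terms it maps to. The sketch
below illustrates that idea on made-up, shortened GAF rows. It is not the `GOcats` implementation: the function name
``categorize_gaf_lines``, the ``ToyDB`` rows, and the choice to drop unmapped annotations are all assumptions. The
only format details relied on are that GAF files are tab-separated, that comment lines start with ``!``, and that the
GO ID is in column 5.

.. code:: python

   def categorize_gaf_lines(gaf_lines, id_mapping):
       """Yield GAF rows with the GO ID (column 5) replaced by each mapped category term."""
       for line in gaf_lines:
           if line.startswith("!"):  # skip GAF header and comment lines
               continue
           columns = line.rstrip("\n").split("\t")
           # Assumption of this sketch: annotations without a category are dropped.
           for category_term in id_mapping.get(columns[4], []):
               yield "\t".join(columns[:4] + [category_term] + columns[5:])


   # Made-up rows, truncated to the first five GAF columns for brevity.
   toy_gaf = [
       "!gaf-version: 2.1\n",
       "ToyDB\tTOY0001\tgeneA\t\tGO:0005730\n",
       "ToyDB\tTOY0002\tgeneB\t\tGO:0031965\n",
       "ToyDB\tTOY0003\tgeneC\t\tGO:0005739\n",
   ]
   toy_id_mapping = {
       "GO:0005730": ["GO:0005634"],
       "GO:0031965": ["GO:0005634", "GO:0016020"],
   }

   mapped_rows = list(categorize_gaf_lines(toy_gaf, toy_id_mapping))
   for row in mapped_rows:
       print(row)

A term that belongs to several categories produces one output row per category, so the mapped dataset can contain
more rows per gene than the input.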

Gene-to-GO-term remappings that take ``has_part`` relationships into account can be created from the command line:

.. code:: bash

   python3 -m gocats remap_goterms /path_to_ontology_file.obo /path_to_gaf.goa ancestors_output.json namespace_output.json --allowed_relationships=is_a,part_of,has_part --identifier_column=1

The gene-to-GO-term mappings are written in JSON format to ``ancestors_output.json``, and the new GO-term-to-namespace mappings to ``namespace_output.json``.
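
The effect of restricting traversal to a set of allowed relationships can be sketched with a small graph walk. This
is a simplified illustration, not the `GOcats` algorithm: the function name ``collect_ancestors``, the ``TOY:``
identifiers, and the edge layout are hypothetical, and every edge is treated as pointing from child to parent, so
the sketch does not model how ``has_part`` edges are oriented in the ontology.

.. code:: python

   def collect_ancestors(term, edges, allowed_relationships):
       """Return every term reachable from `term` through edges of an allowed type."""
       ancestors = set()
       stack = [term]
       while stack:
           current = stack.pop()
           for relationship, parent in edges.get(current, []):
               # Follow an edge only if its type is allowed and the parent is new.
               if relationship in allowed_relationships and parent not in ancestors:
                   ancestors.add(parent)
                   stack.append(parent)
       return ancestors


   # Hypothetical graph: each term lists (relationship, parent) pairs.
   toy_edges = {
       "TOY:leaf": [("is_a", "TOY:mid"), ("regulates", "TOY:other")],
       "TOY:mid": [("part_of", "TOY:root")],
   }

   with_part_of = collect_ancestors("TOY:leaf", toy_edges, {"is_a", "part_of"})
   is_a_only = collect_ancestors("TOY:leaf", toy_edges, {"is_a"})
   print(sorted(with_part_of))
   print(sorted(is_a_only))

Allowing more relationship types lets annotations propagate to more ancestor terms, which is the purpose of the
``--allowed_relationships`` option shown above.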

License
~~~~~~~

Made available under the terms of The Clear BSD License. See full license in LICENSE.

The Clear BSD License

Copyright (c) 2017, Eugene W. Hinderer III, Hunter N.B. Moseley
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted (subject to the limitations in the disclaimer
below) provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its contributors may be used
  to endorse or promote products derived from this software without specific
  prior written permission.

NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE GRANTED BY THIS
LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
DAMAGE.

Authors
~~~~~~~

* **Eugene W. Hinderer III** - ehinderer_
* **Hunter N.B. Moseley** - hunter-moseley_

.. _readthedocs: http://gocats.readthedocs.io/
.. _jsonpickle: https://github.com/jsonpickle/jsonpickle
.. _docopt: https://github.com/docopt/docopt
.. _git: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git/
.. _ehinderer: https://github.com/ehinderer
.. _hunter-moseley: https://github.com/hunter-moseley

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/MoseleyBioinformaticsLab/GOcats",
    "name": "GOcats",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "biomedical ontology analysis",
    "author": "Eugene W. Hinderer III",
    "author_email": "ehinderer01@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/7a/54/0321691174cb2c00c8005419b45b57e8e5f5bf8085051ce298c073cc6561/GOcats-1.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "The Clear BSD License",
    "summary": "A tool for categorizing Gene Ontology into subgraphs of user-defined emergent concepts",
    "version": "1.2.1",
    "project_urls": {
        "Homepage": "https://github.com/MoseleyBioinformaticsLab/GOcats"
    },
    "split_keywords": [
        "biomedical",
        "ontology",
        "analysis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "639e80f538c03266a55d271e0a9c284ab580cc80ff68f1a74e46167a655f7b0a",
                "md5": "2d454fe7203f019b34840b9fa2390df5",
                "sha256": "0c3d83485ca8ce104f630bfdb9b37f5d89708f750b80b74f174cd02e504316bb"
            },
            "downloads": -1,
            "filename": "GOcats-1.2.1-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2d454fe7203f019b34840b9fa2390df5",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 28261,
            "upload_time": "2023-06-16T21:55:59",
            "upload_time_iso_8601": "2023-06-16T21:55:59.466435Z",
            "url": "https://files.pythonhosted.org/packages/63/9e/80f538c03266a55d271e0a9c284ab580cc80ff68f1a74e46167a655f7b0a/GOcats-1.2.1-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7a540321691174cb2c00c8005419b45b57e8e5f5bf8085051ce298c073cc6561",
                "md5": "debf43b53e7eeb33cfa7c970c88fe3c3",
                "sha256": "a4f3982c137c1ac9bb863acccce66426a09cbf4d1a489308368377954e3cd8a8"
            },
            "downloads": -1,
            "filename": "GOcats-1.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "debf43b53e7eeb33cfa7c970c88fe3c3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 26529,
            "upload_time": "2023-06-16T21:56:00",
            "upload_time_iso_8601": "2023-06-16T21:56:00.899533Z",
            "url": "https://files.pythonhosted.org/packages/7a/54/0321691174cb2c00c8005419b45b57e8e5f5bf8085051ce298c073cc6561/GOcats-1.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-16 21:56:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MoseleyBioinformaticsLab",
    "github_project": "GOcats",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "docopt",
            "specs": [
                [
                    "==",
                    "0.6.2"
                ]
            ]
        },
        {
            "name": "jsonpickle",
            "specs": [
                [
                    "==",
                    "0.9.4"
                ]
            ]
        }
    ],
    "lcname": "gocats"
}
        