scie2g


- Name: scie2g
- Version: 1.0.3
- home_page: https://github.com/ArianeMora/sciepi2gene
- upload_time: 2022-12-24 02:45:32
- docs_url: None
- author: Ariane Mora
- requires_python: >=3.6
- license: GPL3
- keywords: epigenetics, bioinformatics
- requirements: No requirements were recorded.
- coveralls test coverage: No coveralls.

# sci-Epi2Gene
[![codecov.io](https://codecov.io/github/ArianeMora/sciepi2gene/coverage.svg?branch=master)](https://codecov.io/github/ArianeMora/sciepi2gene?branch=master)
[![PyPI](https://img.shields.io/pypi/v/scie2g)](https://pypi.org/project/scie2g/)
[![DOI](https://zenodo.org/badge/316410924.svg)](https://zenodo.org/badge/latestdoi/316410924)

[Link to docs](https://arianemora.github.io/sciepi2gene/)

## Warning!!
If your data include non-standard chromosomes (chr's), please remove them: they will make the program extremely slow.

Another warning: if you have duplicates (i.e. multiple features with the same start and end), it will be extremely slow!
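
As a rough illustration of the clean-up described in these warnings, here is a minimal sketch that is not part of scie2g. It assumes human UCSC-style names (`chr1` to `chr22`, `chrX`, `chrY`, `chrM`) count as "normal" chromosomes, and that features are `(chrom, start, end)` tuples; the helper name `clean_features` is hypothetical.

```python
import re

# Assumption: "normal" chromosomes are chr1..chr22, chrX, chrY and chrM.
STANDARD_CHR = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def clean_features(rows):
    """Keep rows on standard chromosomes and drop repeated intervals.

    `rows` is an iterable of (chrom, start, end) tuples; input order is kept.
    """
    seen = set()
    cleaned = []
    for chrom, start, end in rows:
        if not STANDARD_CHR.match(chrom):
            continue  # skip contigs such as chrUn_* or *_random
        key = (chrom, int(start), int(end))
        if key in seen:
            continue  # an identical interval was already kept
        seen.add(key)
        cleaned.append(key)
    return cleaned


peaks = [
    ("chr1", 100, 200),
    ("chr1", 100, 200),           # duplicate interval
    ("chrUn_KI270742v1", 5, 50),  # non-standard contig
    ("chrX", 10, 90),
]
print(clean_features(peaks))  # [('chr1', 100, 200), ('chrX', 10, 90)]
```

Adjust the pattern if your genome uses different chromosome names (for example names without the `chr` prefix).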


Sci-epi2gene maps events annotated to a genome location to nearby genes, for example peaks from histone modification
ChIP-seq experiments stored as bed data, or DNA methylation data in csv format (e.g. output from DMRseq, methylKit or methylSig).

The user provides a SORTED gene annotation file with the start, end, and direction of each gene (we recommend using
[sci-biomart](https://github.com/ArianeMora/scibiomart); see the examples for details).
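
If your annotation is not sorted yet, something like the sketch below may help. The exact sort order the package expects is not stated here, so this assumes sorting by chromosome name and then by start position; the record keys (`chr`, `start`, `end`, `direction`) and the helper name are hypothetical.

```python
def sort_annotation(genes):
    """Return gene records sorted by chromosome name, then start, then end.

    Chromosome names are compared as plain strings, so "chr10" sorts
    before "chr2"; this is an assumption, not a documented requirement.
    """
    return sorted(genes, key=lambda g: (g["chr"], g["start"], g["end"]))


genes = [
    {"chr": "chr2", "start": 500, "end": 900, "direction": "+"},
    {"chr": "chr1", "start": 3000, "end": 4000, "direction": "-"},
    {"chr": "chr1", "start": 100, "end": 800, "direction": "+"},
]
sorted_genes = sort_annotation(genes)
for gene in sorted_genes:
    print(gene["chr"], gene["start"], gene["end"], gene["direction"])
```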

The user then selects how to annotate, i.e. whether it is in the promoter region, or overlaps the gene body. Finally,
the parameters for overlap on each side are chosen.

It is available under the [GNU General Public License (Version 3)](https://www.gnu.org/licenses/gpl-3.0.en.html).

This package is a wrapper that allows various epigenetic data types to be annotated to genes. [Examples are in the docs](https://arianemora.github.io/sciepi2gene/)

I also wanted to have different upper flanking and lower flanking distances that took into account the directionality of the strand
and also an easy output csv file that can be filtered and used in downstream analyses. This is why I keep all features
that fall within the annotation region of a gene (example below):

The overlapping methods are as follows:

1) overlaps: does ANY part of the peak/feature overlap the gene body, plus some buffer before the TSS and some buffer on the non-TSS side?
2) promoter: does ANY part of the peak/feature overlap the TSS of the gene, taking into account buffers on either side of the TSS?
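
The two methods can be sketched roughly as follows. This is an illustration of the idea, not the package's implementation: the function and parameter names (`annotation_region`, `upstream`, `downstream`) are hypothetical, intervals are treated as closed, and the strand is assumed to be given as `"+"` or `"-"`.

```python
def annotation_region(gene_start, gene_end, strand, method, upstream, downstream):
    """Return the (start, end) window a feature must touch to be assigned to a gene.

    upstream is the buffer before the TSS, downstream the buffer on the other
    side; both are in base pairs and are applied relative to the strand.
    """
    tss = gene_start if strand == "+" else gene_end
    if method == "promoter":
        low, high = tss, tss  # window is centred on the TSS only
    elif method == "overlaps":
        low, high = gene_start, gene_end  # window covers the whole gene body
    else:
        raise ValueError(f"unknown method: {method}")
    if strand == "+":
        return low - upstream, high + downstream
    # On the reverse strand, "upstream" lies at higher coordinates.
    return low - downstream, high + upstream


def any_overlap(feature_start, feature_end, region):
    """True if ANY part of the feature falls inside the region (closed intervals)."""
    region_start, region_end = region
    return feature_start <= region_end and feature_end >= region_start


# Gene at 1000-2000 on the reverse strand, 500 bp upstream and 100 bp downstream.
promoter = annotation_region(1000, 2000, "-", "promoter", 500, 100)
body = annotation_region(1000, 2000, "-", "overlaps", 500, 100)
print(promoter)  # (1900, 2500)
print(body)      # (900, 2500)
print(any_overlap(2400, 2600, promoter))  # True
print(any_overlap(850, 880, body))        # False
```

Keeping the window calculation separate from the overlap test makes it easy to see how the strand changes which side each buffer is applied to.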

![Example overlaps](_static/example_overlaps.png)

As the IGV screenshot above shows, the input peaks are in purple and the output peaks, as annotated to genes, are in
green. The function *convert_to_bed* converts the output csv to bed files for viewing. This example
shows that a peak/feature can be annotated to multiple genes. Peaks/features outside of the regions of genes (e.g.
the first peak) are dropped from the output.
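
To make the csv-to-bed step concrete, here is a stand-alone sketch using only the standard library. It is not the package's *convert_to_bed* function, whose signature is not shown here; the helper name and the column names (`chr`, `start`, `end`, `gene_name`) are assumptions, and coordinates are written out unchanged.

```python
import csv
import io


def csv_to_bed_lines(csv_text, chrom_col="chr", start_col="start",
                     end_col="end", name_col="gene_name"):
    """Turn csv text into tab-separated BED lines (chrom, start, end, name).

    Coordinates are copied as-is, so they are assumed to already follow
    the BED convention.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        fields = [row[chrom_col], row[start_col], row[end_col], row[name_col]]
        lines.append("\t".join(fields))
    return lines


example_csv = (
    "chr,start,end,gene_name\n"
    "chr1,100,200,GENE_A\n"
    "chr1,150,400,GENE_B\n"
)
for line in csv_to_bed_lines(example_csv):
    print(line)
```

Each printed line can be written to a `.bed` file and loaded as a track in IGV.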

We show this example in the notebook (see examples folder), where we use [IGV](https://github.com/igvteam/igv-jupyter#igvjs-jupyter-extension)
to view the tracks (see image below).

![IGV in Jupyter](_static/igv_jupyter.png)

Lastly, annotations sometimes differ (e.g. the TSS in the annotation shown in IGV may differ from the one in the
annotation you input to sciepi2gene). How your genes/features are annotated depends on the input file, so if you see differences, check this first!

Please post questions and issues related to sci-epi2gene in the [Issues](https://github.com/ArianeMora/sciepi2gene/issues) section of the GitHub repository.




            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ArianeMora/sciepi2gene",
    "name": "scie2g",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "epigenetics,bioinformatics",
    "author": "Ariane Mora",
    "author_email": "ariane.n.mora@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/fe/13/192dd640632e9aeba46a7ed89c91a605de8b13c6ecce3973865186b0c451/scie2g-1.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL3",
    "summary": "",
    "version": "1.0.3",
    "split_keywords": [
        "epigenetics",
        "bioinformatics"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "39abfbbf9a9a90f6409fe00cc55c158e",
                "sha256": "33fa93e6cefac6813ad7b857b05c17e3ff1282e1dc49cc2800178daa61853e66"
            },
            "downloads": -1,
            "filename": "scie2g-1.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "39abfbbf9a9a90f6409fe00cc55c158e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 44194,
            "upload_time": "2022-12-24T02:45:31",
            "upload_time_iso_8601": "2022-12-24T02:45:31.169327Z",
            "url": "https://files.pythonhosted.org/packages/40/5b/c3d3b7af7db3715eb1526c45d1512456c7bca20cd7685748bd12fd631618/scie2g-1.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "80bc94530079784c9788663cdcec4d6f",
                "sha256": "97be591dafc186198967af77b5a7afc985b75392516c77fb7498e4ea3dc74921"
            },
            "downloads": -1,
            "filename": "scie2g-1.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "80bc94530079784c9788663cdcec4d6f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 29128,
            "upload_time": "2022-12-24T02:45:32",
            "upload_time_iso_8601": "2022-12-24T02:45:32.986567Z",
            "url": "https://files.pythonhosted.org/packages/fe/13/192dd640632e9aeba46a7ed89c91a605de8b13c6ecce3973865186b0c451/scie2g-1.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-24 02:45:32",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "ArianeMora",
    "github_project": "sciepi2gene",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": false,
    "lcname": "scie2g"
}
        