genotations


* Name: genotations
* Version: 0.1.7
* Summary: Genotations - python library to work with genomes and primers
* Upload time: 2023-09-06 00:03:39
* Author: antonkulaga (Anton Kulaga)
* Keywords: python, utils, files, genetics, ensembl, genomes, annotations
            
Genotations
===========

Python library for working with genomes and annotations, mostly from Ensembl. It also supports visualization of transcript and gene features and primer selection.
Because pandas and polars are everyday tools for many Python developers, this library focuses on representing annotations as dataframes.


The library allows:
* downloading Ensembl annotations and genomes (uses genomepy under the hood)
* working with genomic annotations as polars dataframes
* getting sequences for selected genes
* visualizing gene features
* designing primers for selected transcripts with the Primer3 Python wrapper
 
Usage
=====

Install with pip:
```bash
pip install genotations
```
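
To confirm that the install succeeded, one option is to query the installed version through the Python standard library. This is a minimal sketch; the helper name `installed_version` is chosen for illustration and is not part of genotations.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version string, or None when the package is missing.
    print(installed_version("genotations"))
```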
In some cases you may also need to install the UCSC annotation tools. They are available from the bioconda channel, so you can add them to your micromamba/conda environment.
Here is how this may look in your environment file:
```yaml
name: genotations
channels:
  - conda-forge
  - BjornFJohansson
  - bioconda
  - defaults
dependencies:
  - python=3.10
  - ucsc-bedtogenepred
  - ucsc-genepredtobed
  - ucsc-genepredtogtf
  - ucsc-gff3togenepred
  - ucsc-gtftogenepred
  - pip
  - pip:
      - genotations
```
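
After creating the environment, it can help to check that the UCSC tools are actually on your `PATH`. The sketch below uses only the standard library. The executable names are an assumption based on how the bioconda packages listed above are usually named (for example, `ucsc-gtftogenepred` providing `gtfToGenePred`), and `locate_tools` is a hypothetical helper, not part of genotations.

```python
import shutil
from typing import Dict, Iterable, Optional

# Executable names assumed to be provided by the bioconda packages listed above.
UCSC_TOOLS = (
    "bedToGenePred",
    "genePredToBed",
    "genePredToGtf",
    "gff3ToGenePred",
    "gtfToGenePred",
)


def locate_tools(names: Iterable[str] = UCSC_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its full path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Run it inside the activated environment; any tool reported as `NOT FOUND` is missing from that environment.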

Now you can start using it, for example:
```python
from genotations import ensembl
human = ensembl.human # getting human genome
mouse = ensembl.mouse # getting mouse genome
mouse.annotations.exons().annotations_df # getting exons as DataFrame
mouse.annotations.protein_coding().exons().annotations_df # getting exons of protein coding genes
mouse.annotations.transcript_gene_names_df # getting transcript gene names
mouse.annotations.with_gene_name_contains("Foxo1").protein_coding().transcripts() # getting only coding Foxo1 transcripts
mouse.annotations.with_gene_name_contains("Foxo1").genes_visual(mouse.genome)[0].plot() # plotting features of the Foxo1 gene
cow_assemblies = ensembl.search_assemblies("Bos taurus") # you can also search genomes by species name if it exists in Ensembl
cow1 = ensembl.SpeciesInfo("Cow", cow_assemblies[-1][0]) # selecting one of several cow assemblies
cow1.annotations.annotations_df # getting annotations as dataframe
```
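
The calls above return annotation records as dataframes. To make concrete what a single record holds, here is a minimal standard-library sketch that parses one line in GTF format, the nine-column tab-separated format Ensembl uses for annotations. It is not how genotations reads annotations internally. The `GtfRecord` type, the `parse_gtf_line` helper, and the identifiers and coordinates in the toy line are all made up for illustration. The parser is deliberately simplified: it assumes no semicolons inside quoted values and keeps only the last value when a key repeats.

```python
from typing import Dict, NamedTuple


class GtfRecord(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    score: str
    strand: str
    frame: str
    attributes: Dict[str, str]


def parse_gtf_line(line: str) -> GtfRecord:
    """Parse one tab-separated GTF line into a typed record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 GTF columns, got {len(fields)}")
    attributes: Dict[str, str] = {}
    # The ninth column holds `key "value";` pairs.
    for pair in fields[8].split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition(" ")
        attributes[key] = value.strip().strip('"')
    return GtfRecord(
        fields[0], fields[1], fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], fields[7], attributes,
    )


# Toy line with placeholder identifiers and coordinates (not real Ensembl data).
toy_line = '3\texample\texon\t100\t250\t.\t-\t.\tgene_id "GENE_A"; transcript_id "TX_A"; gene_name "Foxo1";'
record = parse_gtf_line(toy_line)
# Feature type, feature length in bases, and gene name.
print(record.feature, record.end - record.start + 1, record.attributes["gene_name"])
```

Filtering a table of such records on the `feature` column is the kind of operation that a call like `exons()` expresses at the dataframe level.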

You can also use the library to annotate existing gene expression data with gene and transcript symbols and features.
For example:
```python
from pathlib import Path

import polars as pl

from genotations.quantification import *
from genotations import ensembl

base = Path(".")  # a Path (not a str) is needed for the "/" joins below
examples = base / "examples"
data = examples / "data"
expressions = pl.read_parquet(str(data / "PRJNA543661_transcripts.parquet"))  # load transcript expression data
with_expressions_summaries(expressions, min_avg_value = 1)
expressions_ext = ensembl.mouse.annotations.extend_with_annotations_and_sequences(expressions, ensembl.mouse.genome) # extend expression data with annotations and sequences
```
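
Conceptually, annotating expression data is a left join between expression rows and an annotation table keyed by transcript ID. The sketch below shows that idea in plain Python so it stays dependency-free; it does not reproduce what `extend_with_annotations_and_sequences` does internally. The `annotate` helper, the column names, the IDs and the numbers are all invented for illustration.

```python
from typing import Dict, List, Optional

# Toy annotation table keyed by transcript ID (placeholder IDs, not real Ensembl data).
annotations: Dict[str, Dict[str, Optional[str]]] = {
    "TX_A": {"gene_name": "GeneA", "biotype": "protein_coding"},
    "TX_B": {"gene_name": "GeneB", "biotype": "lncRNA"},
}

# Toy expression rows; the numbers are invented for illustration.
expressions: List[Dict[str, object]] = [
    {"transcript": "TX_A", "avg_expression": 12.5},
    {"transcript": "TX_B", "avg_expression": 0.4},
    {"transcript": "TX_C", "avg_expression": 3.0},  # has no annotation entry
]


def annotate(rows, table, key="transcript"):
    """Left-join rows with the annotation table; unmatched rows get None values."""
    missing = {"gene_name": None, "biotype": None}
    return [{**row, **table.get(row[key], missing)} for row in rows]


annotated = annotate(expressions, annotations)
for row in annotated:
    print(row)
```

Every input row is kept, and rows without a matching annotation carry `None` in the added columns. With polars dataframes the same idea corresponds to `DataFrame.join(..., how="left")`.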

For more examples of the usage and API, see the [example notebook](https://github.com/antonkulaga/genotations/blob/main/examples/explore_mouse.ipynb).


Working with the library code
=====

Use micromamba (or conda) and environment.yaml to install the dependencies:
```bash
micromamba create -f environment.yaml
micromamba activate genotations
```

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "genotations",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "python,utils,files,genetics,ensembl,genomes,annotations",
    "author": "antonkulaga (Anton Kulaga)",
    "author_email": "<antonkulaga@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/9a/53/090ae886f5795ee3be6e6ee99cea7936ba50cff0afd5b24e34c237f137bd/genotations-0.1.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Genotations - python library to work with genomes and primers",
    "version": "0.1.7",
    "project_urls": null,
    "split_keywords": [
        "python",
        "utils",
        "files",
        "genetics",
        "ensembl",
        "genomes",
        "annotations"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "182342a15a87f99f558aa54bb3701ebb16fa606ea4e78467c84d104438dcde4b",
                "md5": "493995eac7cfdc5f38d009433a7993e1",
                "sha256": "7ff004840ae7949b3c6893e8309431d9559c3c66e6c38b1f13dbe067a0e16461"
            },
            "downloads": -1,
            "filename": "genotations-0.1.7-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "493995eac7cfdc5f38d009433a7993e1",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 24822,
            "upload_time": "2023-09-06T00:03:37",
            "upload_time_iso_8601": "2023-09-06T00:03:37.778723Z",
            "url": "https://files.pythonhosted.org/packages/18/23/42a15a87f99f558aa54bb3701ebb16fa606ea4e78467c84d104438dcde4b/genotations-0.1.7-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9a53090ae886f5795ee3be6e6ee99cea7936ba50cff0afd5b24e34c237f137bd",
                "md5": "96954f49e1870685592accab175c168d",
                "sha256": "9f84455057cccd4eea8206ccd5d49dd7219612fad476043647d198ff993ece85"
            },
            "downloads": -1,
            "filename": "genotations-0.1.7.tar.gz",
            "has_sig": false,
            "md5_digest": "96954f49e1870685592accab175c168d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 24781,
            "upload_time": "2023-09-06T00:03:39",
            "upload_time_iso_8601": "2023-09-06T00:03:39.472507Z",
            "url": "https://files.pythonhosted.org/packages/9a/53/090ae886f5795ee3be6e6ee99cea7936ba50cff0afd5b24e34c237f137bd/genotations-0.1.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-09-06 00:03:39",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "genotations"
}
        