Genotations
===========
Python library for working with genomes and annotations, mostly Ensembl genomes. It also supports visualization of transcript and gene features and primer selection.
Because pandas and polars are everyday tools for many Python developers, this library focuses on representing annotations as dataframes.
The library allows:
* downloading Ensembl annotations and genomes (uses genomepy under the hood)
* working with genomic annotations as polars dataframes
* getting sequences for selected genes
* visualizing gene features
* designing primers for selected transcripts with the Primer3 Python wrapper
Usage
=====
Install with pip:
```bash
pip install genotations
```
In some cases you may also need the UCSC annotation tools. They are distributed through the bioconda channel, so you can add them to your micromamba/conda environment.
Here is how this may look in your environment file:
```yaml
name: genotations
channels:
  - conda-forge
  - BjornFJohansson
  - bioconda
  - defaults
dependencies:
  - python=3.10
  - ucsc-bedtogenepred
  - ucsc-genepredtobed
  - ucsc-genepredtogtf
  - ucsc-gff3togenepred
  - ucsc-gtftogenepred
  - pip
  - pip:
    - genotations
```
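If you are unsure whether the UCSC tools are already available in your environment, a quick check can be done from Python. The sketch below assumes the bioconda packages from the environment file install executables with the usual UCSC names (`bedToGenePred`, `genePredToBed`, `genePredToGtf`, `gff3ToGenePred`, `gtfToGenePred`); `find_ucsc_tools` is a hypothetical helper and not part of genotations.

```python
import shutil

# Executable names assumed to be provided by the ucsc-* bioconda packages
UCSC_TOOLS = [
    "bedToGenePred",
    "genePredToBed",
    "genePredToGtf",
    "gff3ToGenePred",
    "gtfToGenePred",
]


def find_ucsc_tools(tools=UCSC_TOOLS):
    """Map each tool name to its path on PATH, or None if it is not found."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in find_ucsc_tools().items():
    status = path if path is not None else "not found on PATH"
    print(f"{tool}: {status}")
```

Any tool reported as not found can be added to the environment through the corresponding `ucsc-*` package.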
Now you can start using it, for example:
```python
from genotations import ensembl
human = ensembl.human # getting human genome
mouse = ensembl.mouse # getting mouse genome
mouse.annotations.exons().annotations_df # getting exons as DataFrame
mouse.annotations.protein_coding().exons().annotations_df # getting exons of protein coding genes
mouse.annotations.transcript_gene_names_df # getting transcript gene names
mouse.annotations.with_gene_name_contains("Foxo1").protein_coding().transcripts() # getting only coding Foxo1 transcripts
mouse.annotations.with_gene_name_contains("Foxo1").genes_visual(mouse.genome)[0].plot() # plotting features of the Foxo1 gene
cow_assemblies = ensembl.search_assemblies("Bos taurus") # you can also search genomes by species name if it exists in Ensembl
cow1 = ensembl.SpeciesInfo("Cow", cow_assemblies[-1][0]) # selecting one of several cow assemblies
cow1.annotations.annotations_df # getting annotations as dataframe
```
You can also use the library to annotate existing gene expression data with gene and transcript symbols and features.
For example:
```python
from pathlib import Path

import polars as pl

from genotations.quantification import *
from genotations import ensembl

base = Path(".") # a Path rather than a str, so that the / operator can join path segments
examples = base / "examples"
data = examples / "data"
expressions = pl.read_parquet(str(data / "PRJNA543661_transcripts.parquet"))
with_expressions_summaries(expressions, min_avg_value = 1)
expressions_ext = ensembl.mouse.annotations.extend_with_annotations_and_sequences(expressions, ensembl.mouse.genome) # extend expression data with annotations and sequences
```
For more examples of the usage and API, see the [example notebook](https://github.com/antonkulaga/genotations/blob/main/examples/explore_mouse.ipynb).
Working with the library code
=====
Use micromamba (or conda) and environment.yaml to install the dependencies:
```bash
micromamba create -f environment.yaml
micromamba activate genotations
```