[![Tests](https://github.com/oxfordmmm/gumpy/actions/workflows/tests.yaml/badge.svg)](https://github.com/oxfordmmm/gumpy/actions/workflows/tests.yaml)
[![codecov](https://codecov.io/gh/oxfordmmm/gumpy/branch/master/graph/badge.svg)](https://codecov.io/gh/oxfordmmm/gumpy)
[![Docs](https://github.com/oxfordmmm/gumpy/actions/workflows/docs.yaml/badge.svg)](https://oxfordmmm.github.io/gumpy/)
[![PyPI version](https://badge.fury.io/py/gumpy.svg)](https://badge.fury.io/py/gumpy)
# gumpy
Genetics with Numpy
## Installation
```
git clone https://github.com/oxfordmmm/gumpy
cd gumpy
pip install .
```
## Documentation
https://oxfordmmm.github.io/gumpy/
## Testing
A suite of tests can be run from a terminal:
```
python -m pytest --cov=gumpy -vv
```
## Usage
### Parse a genbank file
Genome objects can be created by passing the filename of a GenBank file:
```
from gumpy import Genome
g = Genome("filename.gbk")
```
### Parse a VCF file
VCFFile objects can be created by passing the filename of a VCF file:
```
from gumpy import VCFFile
vcf = VCFFile("filename.vcf")
```
### Apply a VCF file to a reference genome
The mutations defined in a VCF file can be applied to a reference genome to produce a new Genome object containing the changes detailed in the VCF.
If a contig is set within the VCF, the length of the contig should match the length of the genome. Otherwise, any changes the VCF details that fall within the genome's range will be made.
```
from gumpy import Genome, VCFFile
reference_genome = Genome("reference.gbk")
vcf = VCFFile("filename.vcf")
resultant_genome = reference_genome + vcf
```
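To make the idea concrete, here is a minimal pure-Python sketch of applying simple SNP records to a reference sequence. It does not use gumpy and is not how gumpy implements `+`; the `apply_snps` helper and the `(position, ref, alt)` tuples are hypothetical, with 1-based positions as in VCF.

```python
def apply_snps(reference, snps):
    """Return a copy of `reference` with each (pos, ref, alt) SNP applied.

    Positions are 1-based, as in VCF. Hypothetical helper, not part of gumpy.
    """
    bases = list(reference)
    for pos, ref, alt in snps:
        # Reject changes that fall outside the genome's range
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the genome")
        # The record's reference base must agree with the genome
        if bases[pos - 1] != ref:
            raise ValueError(f"reference base mismatch at position {pos}")
        bases[pos - 1] = alt
    return "".join(bases)


reference = "acgtacgt"
mutated = apply_snps(reference, [(2, "c", "t"), (8, "t", "a")])
print(mutated)  # atgtacga
```

The reference string is left untouched and a new sequence is returned, mirroring the way `reference_genome + vcf` produces a new Genome object.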
### Genome level comparisons
There are two ways to compare changes. The first quickly checks for changes caused by a given VCF file. The second checks for changes between two genomes. The second is therefore best suited to comparisons in which either both genomes are mutated, or the VCF file(s) are not available. The first is best suited to determining the changes caused by a VCF, but finding gene-level differences will then require rebuilding the Gene objects, which can be time-consuming.
#### Compare genomes
Two genomes of the same length can be compared, both for equality and for the changes between them.
This is best suited to cases where two mutated genomes are to be compared.
```
from gumpy import Genome, GenomeDifference
g1 = Genome("filename1.gbk")
g2 = Genome("filename2.gbk")
diff = g2 - g1  # Genome.difference returns a GenomeDifference object
print(diff.snp_distance)  # SNP distance between the two genomes
print(diff.variants)  # Array of variants (SNPs/INDELs) of the differences between g2 and g1
```
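As an illustration of what a SNP distance measures, the following pure-Python sketch counts the positions at which two equal-length sequences differ. It is not gumpy's implementation; the `snp_distance` and `snp_variants` helpers and the `<position><base1>><base2>` labels are assumptions made for this example.

```python
def snp_distance(seq1, seq2):
    """Count positions where two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be the same length")
    return sum(1 for a, b in zip(seq1, seq2) if a != b)


def snp_variants(seq1, seq2):
    """Label each differing position as '<position><base1>><base2>' (1-based)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be the same length")
    return [
        f"{i}{a}>{b}"
        for i, (a, b) in enumerate(zip(seq1, seq2), start=1)
        if a != b
    ]


s1 = "acgtacgt"
s2 = "atgtacga"
print(snp_distance(s1, s2))  # 2
print(snp_variants(s1, s2))  # ['2c>t', '8t>a']
```

This sketch handles substitutions only; insertions and deletions would need an alignment step that is out of scope here.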
### Gene level comparisons
When a Genome object is instantiated, it is populated with Gene objects for each gene detailed in the GenBank file.
These genes can also be compared.
Gene differences can be found through direct comparison of Gene objects, or systematically through the `gene_differences()` method of `GenomeDifference`.
```
from gumpy import Genome, Gene
g1 = Genome("filename1.gbk")
g2 = Genome("filename2.gbk")
# Get the Gene objects for the gene "gene1_name" from both Genomes
# (build_gene is a method, so it is called rather than indexed)
g1_gene1 = g1.build_gene("gene1_name")
g2_gene1 = g2.build_gene("gene1_name")
g1_gene1 == g2_gene1  # Equality check of the two genes
diff = g1_gene1 - g2_gene1  # Returns a GeneDifference object
diff.mutations  # List of mutations in GARC describing the variation between the two genes
```
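For intuition about what a list of gene-level mutations can look like, here is a small pure-Python sketch that labels amino acid substitutions between two protein sequences as reference residue, 1-based position, then alternative residue. The `amino_acid_changes` helper is hypothetical; it only illustrates this labelling pattern for substitutions and does not attempt to reproduce GARC or gumpy's `GeneDifference`.

```python
def amino_acid_changes(ref_protein, alt_protein):
    """Label substitutions as '<ref><position><alt>' with 1-based positions."""
    if len(ref_protein) != len(alt_protein):
        raise ValueError("protein sequences must be the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(ref_protein, alt_protein), start=1)
        if ref != alt
    ]


changes = amino_acid_changes("MSTLK", "MSALR")
print(changes)  # ['T3A', 'K5R']
```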
### Save and load Genome objects
Creating a Genome object can take a long time, so it may be beneficial to save the object to disk. The recommendation is to use the `pickle` module to do so. Be aware of the security implications: unpickling data from an untrusted source can execute arbitrary code, so do so at your own risk! An example is below:
```
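import pickle
import gumpy
# Build the genome from the GenBank file
g = gumpy.Genome("filename.gbk")
# Save the genome, closing the file once it is written
with open("filename.pkl", "wb") as f:
    pickle.dump(g, f)
# Load the genome back (only unpickle files you trust)
with open("filename.pkl", "rb") as f:
    g2 = pickle.load(f)
g == g2  # True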
```
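The save-then-load pattern can be wrapped in a small caching helper so that the slow build runs only when no pickle exists yet. This is a sketch under stated assumptions: `load_or_build` and `build_stand_in` are hypothetical names, and a plain dictionary stands in for a Genome object so the example runs without gumpy.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_build(cache_path, build):
    """Return the pickled object at `cache_path`, building and caching it if missing."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as handle:
            return pickle.load(handle)  # only unpickle files you trust
    obj = build()
    with cache_path.open("wb") as handle:
        pickle.dump(obj, handle)
    return obj


calls = []


def build_stand_in():
    """Stand-in for an expensive constructor; records each time it runs."""
    calls.append(1)
    return {"name": "stand-in genome", "length": 8}


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "genome.pkl"
    first = load_or_build(path, build_stand_in)   # builds and saves
    second = load_or_build(path, build_stand_in)  # loads from disk

print(first == second, len(calls))  # True 1
```

With gumpy installed, the same helper could presumably be called as `load_or_build("filename.pkl", lambda: gumpy.Genome("filename.gbk"))`.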
Raw data
{
"_id": null,
"home_page": "https://github.com/oxfordmmm/gumpy",
"name": "gumpy",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": "Philip W Fowler",
"author_email": "philip.fowler@ndm.ox.ac.uk",
"download_url": "https://files.pythonhosted.org/packages/72/28/269f48fddf2c231a3b9504f8407dd6728aaebef5dea7425fbc3fc623290c/gumpy-1.3.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "University of Oxford, see LICENSE.md",
"summary": "Genetics with Numpy",
"version": "1.3.8",
"project_urls": {
"Homepage": "https://github.com/oxfordmmm/gumpy"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e716fdc182ceb1a107565ac8ba8e67c020c2a3dd6eb535785b3e027eb687ae5c",
"md5": "09f692dc878cf11eae91728985d1cc3f",
"sha256": "2de1cb9a2bb18c1880cd0621828d48ec76e75922d60c14a367f83724140075c3"
},
"downloads": -1,
"filename": "gumpy-1.3.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "09f692dc878cf11eae91728985d1cc3f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 49087,
"upload_time": "2024-07-24T16:41:21",
"upload_time_iso_8601": "2024-07-24T16:41:21.595926Z",
"url": "https://files.pythonhosted.org/packages/e7/16/fdc182ceb1a107565ac8ba8e67c020c2a3dd6eb535785b3e027eb687ae5c/gumpy-1.3.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7228269f48fddf2c231a3b9504f8407dd6728aaebef5dea7425fbc3fc623290c",
"md5": "1705e1ee78389e2ef57f0f164007b73c",
"sha256": "cab7f7af29d62a1d40b51ccdb2d180433e8fd998020191dff5ea81fb0ec7f189"
},
"downloads": -1,
"filename": "gumpy-1.3.8.tar.gz",
"has_sig": false,
"md5_digest": "1705e1ee78389e2ef57f0f164007b73c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 47352,
"upload_time": "2024-07-24T16:41:23",
"upload_time_iso_8601": "2024-07-24T16:41:23.485142Z",
"url": "https://files.pythonhosted.org/packages/72/28/269f48fddf2c231a3b9504f8407dd6728aaebef5dea7425fbc3fc623290c/gumpy-1.3.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-07-24 16:41:23",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "oxfordmmm",
"github_project": "gumpy",
"travis_ci": true,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "pysam",
"specs": []
},
{
"name": "biopython",
"specs": []
},
{
"name": "tqdm",
"specs": []
},
{
"name": "pytest",
"specs": []
},
{
"name": "pytest-cov",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "scipy",
"specs": []
}
],
"lcname": "gumpy"
}