Name | vcfparser |
Version | 0.2.2 |
home_page | https://github.com/everestial/vcfparser |
Summary | Python (version <=3.12) package for parsing the genomics and transcriptomics VCF data. |
upload_time | 2024-09-11 11:05:15 |
maintainer | None |
docs_url | None |
author | Kiran Bishwa |
requires_python | None |
license | MIT License Copyright (c) 2019 Kiran N' Bishwa Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | vcfparser |
requirements | No requirements were recorded. |
coveralls test coverage | No coveralls. |
# vcfparser
![PyPI version](https://img.shields.io/pypi/v/vcfparser.svg)
[![Travis Build Status](https://img.shields.io/travis/everestial/vcfparser.svg)](https://travis-ci.org/everestial/vcfparser)
[![Read the Docs](https://readthedocs.org/projects/vcfparser/badge/?version=latest)](https://vcfparser.readthedocs.io/en/latest/?badge=latest)
Python (version >=3.6) package for parsing genomics and transcriptomics VCF data.
- Free software: MIT license
- Documentation: https://vcfparser.readthedocs.io.
## Features
- No external dependencies except Python (version >=3.6).
- Minimalistic in nature.
- Provides a lot of features to API users.
- Cython compiling is provided to optimize performance.
## Installation
**Method A:**
[VCFsimplify](https://github.com/everestial/VCF-Simplify) uses the vcfparser API, so the package is readily available if VCFsimplify is already installed.
This method is preferred only when developing or optimizing **VcfSimplify** together with **vcfparser**.
Navigate to the VCFsimplify directory, start Python, and import the `vcfparser` package:
```console
C:\Users\>cd VCF-Simplify
C:\Users\VCF-Simplify>dir
Volume in drive C is StorageDrive
Volume Serial Number is .........
Directory of C:\Users\VCF-Simplify
07/12/2020 10:14 AM <DIR> .
07/12/2020 10:14 AM <DIR> ..
07/12/2020 08:55 AM <DIR> .github
............................
............................
07/12/2020 10:42 AM <DIR> vcfparser
07/12/2020 08:55 AM 1,494 VcfSimplify.py
11 File(s) 20,873,992 bytes
13 Dir(s) 241,211,793,408 bytes free
C:\Users\VCF-Simplify>python
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24) [MSC v.1916 (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from vcfparser import VcfParser
>>>
```
**Method B (preferred method):**
Installing with pip is the preferred method when developing custom Python scripts or applications that use the **vcfparser** API.
```console
$ pip install vcfparser
```
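To confirm that the install succeeded, the installed version can be queried from Python. The following is a minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8); the helper name `installed_version` is hypothetical and not part of vcfparser.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if vcfparser is installed, otherwise None.
    print(installed_version("vcfparser"))
```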
**Method C:**
For an offline install, or to build from the source code, follow the advanced install instructions in the documentation.
## Cythonize (optional but helpful)
The installed "vcfparser" package can be cythonized to optimize performance.
Cythonizing the package can increase the speed of the parser by about x.x - y.y (?) times.
TODO: Bhuwan - add required cython method in here
## Usage
```python
from vcfparser import VcfParser
vcf_obj = VcfParser('input_test.vcf')
```
### Get metadata information from the vcf file
```python
metainfo = vcf_obj.parse_metadata()
metainfo.fileformat
# Output: 'VCFv4.2'
metainfo.filters
# Output: [{'ID': 'LowQual', 'Description': 'Low quality'}, {'ID': 'my_indel_filter', 'Description': 'QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0'}, {'ID': 'my_snp_filter', 'Description': 'QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0'}]
metainfo.alt_
# Output: [{'ID': 'NON_REF', 'Description': 'Represents any possible alternative allele at this location'}]
metainfo.sample_names
# Output: ['ms01e', 'ms02g', 'ms03g', 'ms04h', 'MA611', 'MA605', 'MA622']
metainfo.record_keys
# Output: ['CHROM', 'POS', 'ID', 'REF', 'ALT', 'QUAL', 'FILTER', 'INFO', 'FORMAT', 'ms01e', 'ms02g', 'ms03g', 'ms04h', 'MA611', 'MA605', 'MA622']
```
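Attributes such as `metainfo.filters` are shown above as lists of dictionaries, so looking up one entry by its `ID` means scanning the list. A minimal sketch of indexing such a list by `ID` follows; it uses a literal copy of the output above rather than a real file, and the helper name `index_by_id` is hypothetical, not part of the vcfparser API.

```python
# Literal copy of the first two entries of the `metainfo.filters` output shown above.
filters = [
    {'ID': 'LowQual', 'Description': 'Low quality'},
    {'ID': 'my_indel_filter',
     'Description': 'QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0'},
]


def index_by_id(entries):
    """Map each entry's 'ID' value to the entry itself (hypothetical helper)."""
    return {entry['ID']: entry for entry in entries}


filters_by_id = index_by_id(filters)
print(filters_by_id['LowQual']['Description'])
```

The same helper would apply to any metadata attribute that has the same list-of-dictionaries shape, such as `metainfo.alt_` in the output above.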
### Get Records from the vcf file
```python
records = vcf_obj.parse_records()
# Note: Records are returned as a generator.
first_record = next(records)
first_record.CHROM
# Output: '2'
first_record.POS
# Output: '15881018'
first_record.REF
# Output: 'G'
first_record.ALT
# Output: 'A,C'
first_record.QUAL
# Output: '5082.45'
first_record.FILTER
# Output: ['PASS']
first_record.get_mapped_samples()
# Output: {'ms01e': {'GT': './.', 'PI': '.', 'GQ': '.', 'PG': './.', 'PM': '.', 'PW': './.', 'AD': '0,0', 'PL': '0,0,0,.,.,.', 'DP': '0', 'PB': '.', 'PC': '.'},
# 'ms02g': {'GT': './.', 'PI': '.', 'GQ': '.', 'PG': './.', 'PM': '.', 'PW': './.', 'AD': '0,0', 'PL': '0,0,0,.,.,.', 'DP': '0', 'PB': '.', 'PC': '.'},
# 'ms03g': {'GT': './.', 'PI': '.', 'GQ': '.', 'PG': './.', 'PM': '.', 'PW': './.', 'AD': '0,0', 'PL': '0,0,0,.,.,.', 'DP': '0', 'PB': '.', 'PC': '.'},
# 'ms04h': {'GT': '1/1', 'PI': '.', 'GQ': '6', 'PG': '1/1', 'PM': '.', 'PW': '1/1', 'AD': '0,2', 'PL': '49,6,0,.,.,.', 'DP': '2', 'PB': '.', 'PC': '.'},
# 'MA611': {'GT': '0/0', 'PI': '.', 'GQ': '78', 'PG': '0/0', 'PM': '.', 'PW': '0/0', 'AD': '29,0,0', 'PL': '0,78,1170,78,1170,1170', 'DP': '29', 'PB': '.', 'PC': '.'},
# 'MA605': {'GT': '0/0', 'PI': '.', 'GQ': '9', 'PG': '0/0', 'PM': '.', 'PW': '0/0', 'AD': '3,0,0', 'PL': '0,9,112,9,112,112', 'DP': '3', 'PB': '.', 'PC': '.'},
# 'MA622': {'GT': '0/0', 'PI': '.', 'GQ': '99', 'PG': '0/0', 'PM': '.', 'PW': '0/0', 'AD': '40,0,0', 'PL': '0,105,1575,105,1575,1575', 'DP': '40', 'PB': '.', 'PC': '.\n'}}
```
TODO: Bhuwan (priority - high)
The last example, `first_record.get_mapped_samples()`, returns the value of the last key of the last sample with a trailing "\n", i.e. `'PC': '.\n'`.
Please fix this by stripping the trailing newline from the line before parsing it.
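Until the parser itself strips the newline, callers can clean the returned dictionary themselves. This is a minimal sketch of such a workaround, assuming the nested `{sample: {key: value}}` shape of string values shown in the output above; `strip_sample_values` is a hypothetical helper, not part of vcfparser.

```python
def strip_sample_values(mapped_samples):
    """Return a copy of the nested dict with trailing newlines removed from values."""
    return {
        sample: {key: value.rstrip("\n") for key, value in fields.items()}
        for sample, fields in mapped_samples.items()
    }


# Shortened literal modelled on the output above; the last value carries the stray newline.
mapped = {
    'MA605': {'GT': '0/0', 'DP': '3', 'PC': '.'},
    'MA622': {'GT': '0/0', 'DP': '40', 'PC': '.\n'},
}

cleaned = strip_sample_values(mapped)
print(cleaned['MA622']['PC'])
```

The helper builds new dictionaries instead of modifying its input, so the original return value stays available for comparison.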
Alternatively, we can loop over each record with a for-loop:
```python
for record in records:
    chrom = record.CHROM
    pos = record.POS
    # Trailing underscores avoid shadowing the built-ins `id` and `filter`.
    id_ = record.ID
    ref = record.REF
    alt = record.ALT
    qual = record.QUAL
    filter_ = record.FILTER
    format_ = record.format_
    infos = record.get_info_dict()
    mapped_sample = record.get_mapped_samples()
```
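In the outputs shown above, fields such as `POS`, `QUAL`, and `ALT` are returned as strings. The sketch below converts them to typed values when numeric comparisons are needed. It works on the literal values from the examples above; `to_typed` is a hypothetical helper, and treating `.` as a missing `QUAL` follows the VCF convention for missing values.

```python
def to_typed(chrom, pos, alt, qual):
    """Convert string record fields to typed values (hypothetical helper)."""
    return {
        "chrom": chrom,
        "pos": int(pos),
        # ALT alleles are comma-separated in the string form shown above.
        "alt": alt.split(","),
        # '.' marks a missing value in VCF, so map it to None.
        "qual": None if qual == "." else float(qual),
    }


# Values copied from the first_record example above.
typed = to_typed('2', '15881018', 'A,C', '5082.45')
print(typed)
```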
- For more specific use cases, see the tutorials in the documentation:
- For metadata, see the Metadata Tutorial.
- For the record parser, see the Record Parser Tutorial.
Raw data
{
"_id": null,
"home_page": "https://github.com/everestial/vcfparser",
"name": "vcfparser",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "vcfparser",
"author": "Kiran Bishwa",
"author_email": "Kiran Bishwa <kirannbishwa01@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/4b/b8/3e2746566a07cb11ceec1015edbb8b353c0d8e4132e2742c6b7580027b90/vcfparser-0.2.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2019 Kiran N' Bishwa Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
"summary": "Python (version <=3.12) package for parsing the genomics and transcriptomics VCF data.",
"version": "0.2.2",
"project_urls": {
"Homepage": "https://github.com/everestial/vcfparser"
},
"split_keywords": [
"vcfparser"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9b259a25f4c345497f4d0029f84a128b994ace8e3ca4b6594fa4c2816a7560d7",
"md5": "286c596fbca93167e05001d1262f66aa",
"sha256": "81cfa15b41e8d7ebacc8745fea4b524a2cf721d4593c0ca87c8fd2e8a327a829"
},
"downloads": -1,
"filename": "vcfparser-0.2.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "286c596fbca93167e05001d1262f66aa",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 17837,
"upload_time": "2024-09-11T11:05:14",
"upload_time_iso_8601": "2024-09-11T11:05:14.046144Z",
"url": "https://files.pythonhosted.org/packages/9b/25/9a25f4c345497f4d0029f84a128b994ace8e3ca4b6594fa4c2816a7560d7/vcfparser-0.2.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4bb83e2746566a07cb11ceec1015edbb8b353c0d8e4132e2742c6b7580027b90",
"md5": "a921c81355660ee7e3fbc99034fb6eec",
"sha256": "476db6e7601675c94f5450dadf83dabc5e9b75062712ed72abfab85dd7c727e3"
},
"downloads": -1,
"filename": "vcfparser-0.2.2.tar.gz",
"has_sig": false,
"md5_digest": "a921c81355660ee7e3fbc99034fb6eec",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 19572,
"upload_time": "2024-09-11T11:05:15",
"upload_time_iso_8601": "2024-09-11T11:05:15.237473Z",
"url": "https://files.pythonhosted.org/packages/4b/b8/3e2746566a07cb11ceec1015edbb8b353c0d8e4132e2742c6b7580027b90/vcfparser-0.2.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-11 11:05:15",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "everestial",
"github_project": "vcfparser",
"travis_ci": true,
"coveralls": false,
"github_actions": true,
"requirements": [],
"tox": true,
"lcname": "vcfparser"
}