Overview
--------

.. image:: https://badge.fury.io/py/pybedtools.svg?style=flat
    :target: https://badge.fury.io/py/pybedtools

.. image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg
    :target: https://bioconda.github.io

The `BEDTools suite of programs <http://bedtools.readthedocs.org/>`_ is widely
used for genomic interval manipulation, or "genome algebra". `pybedtools` wraps
and extends BEDTools and offers feature-level manipulations from within
Python.

See the full online documentation, including installation instructions, at
https://daler.github.io/pybedtools/.

The GitHub repo is at https://github.com/daler/pybedtools.

Why `pybedtools`?
-----------------

Here is an example that prints the names of genes located less than 5 kb from
an intergenic SNP:

.. code-block:: python

    from pybedtools import BedTool

    snps = BedTool('snps.bed.gz')  # [1]
    genes = BedTool('hg19.gff')    # [1]

    intergenic_snps = snps.subtract(genes)                        # [2]
    nearby = genes.closest(intergenic_snps, d=True, stream=True)  # [2, 3]

    for gene in nearby:             # [4]
        if int(gene[-1]) < 5000:    # [4]
            print(gene.name)        # [4]

Useful features shown here include:

* `[1]` support for all BEDTools-supported formats (here, gzipped BED and GFF);
* `[2]` wrapping of all BEDTools programs and arguments (here, `subtract` and
  `closest`, with the `-d` flag passed to `closest` as `d=True`);
* `[3]` streaming of results (like Unix pipes; here specified by `stream=True`);
* `[4]` iteration over results, with feature data accessible by index or by
  attribute (here, `[-1]` and `.name`).

In contrast, here is the same analysis using shell scripting. Note that this
requires knowledge of Perl, bash, and awk. The run time is identical to the
`pybedtools` version above:

.. code-block:: bash

    snps=snps.bed.gz
    genes=hg19.gff
    intergenic_snps=/tmp/intergenic_snps

    # Count the columns in the SNP file to locate the distance column later
    snp_fields=$(zcat "$snps" | awk '(NR == 2){print NF; exit;}')
    gene_fields=9
    distance_field=$((gene_fields + snp_fields + 1))

    # Keep only the SNPs that do not overlap any gene
    intersectBed -a "$snps" -b "$genes" -v > "$intergenic_snps"

    closestBed -a "$genes" -b "$intergenic_snps" -d \
        | awk '($'$distance_field' < 5000){print $9;}' \
        | perl -ne 'print "$1\n" if m/(?:ID|Name|gene_id)=(.*?);/'

    rm "$intergenic_snps"

See the `Shell script comparison <http://daler.github.io/pybedtools/sh-comparison.html>`_ in the docs
for more details on this comparison, or keep reading the full documentation at
http://daler.github.io/pybedtools.
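
To make the field arithmetic in the shell version easier to follow, here is a minimal pure-Python sketch of what its awk and Perl steps do to a single line of ``closestBed -d`` output. It does not use `pybedtools`; the function name ``nearby_gene_name``, the sample line, and the assumption that the attributes column holds semicolon-terminated ``key=value`` pairs are all illustrative and not part of either example above.

```python
import re

# Assumption: attributes look like "ID=geneA;Name=A1;" (semicolon-terminated pairs).
NAME_RE = re.compile(r"(?:ID|Name|gene_id)=(.*?);")

GENE_FIELDS = 9  # a GFF record has nine tab-separated columns


def nearby_gene_name(line, max_distance=5000):
    """Return the gene identifier if the reported distance is below max_distance.

    Returns None when the gene is too far away or no identifier is found.
    """
    fields = line.rstrip("\n").split("\t")
    # With -d, the distance is appended as the last column. The shell script
    # computes its position as gene_fields + snp_fields + 1; in Python the
    # same column is simply fields[-1].
    distance = int(fields[-1])
    if distance >= max_distance:
        return None
    # Column 9 of the GFF record (index 8) holds the attributes.
    match = NAME_RE.search(fields[GENE_FIELDS - 1])
    return match.group(1) if match else None


# Hypothetical output line: 9 GFF columns, 3 BED columns, then the distance.
sample = "\t".join([
    "chr1", "example", "gene", "1000", "2000", ".", "+", ".",
    "ID=geneA;Name=A1;",
    "chr1", "2500", "2501",
    "500",
])
print(nearby_gene_name(sample))
```

In the `pybedtools` version, none of this bookkeeping is needed: the distance is read with ``gene[-1]`` and the identifier with ``gene.name``.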
Raw data
{
"_id": null,
"home_page": "https://github.com/daler/pybedtools",
"name": "pybedtools",
"maintainer": "Ryan Dale",
"docs_url": "https://pythonhosted.org/pybedtools/",
"requires_python": null,
"maintainer_email": "ryan.dale@nih.gov",
"keywords": null,
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/93/c0/593dadfc238f1980cc7e612b9035f0f2890bea2b9a745c8dabadfe9d4da0/pybedtools-0.11.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Wrapper around BEDTools for bioinformatics work",
"version": "0.11.0",
"project_urls": {
"Homepage": "https://github.com/daler/pybedtools"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "93c0593dadfc238f1980cc7e612b9035f0f2890bea2b9a745c8dabadfe9d4da0",
"md5": "e45ef213f0729bb8df0f197b917ef72c",
"sha256": "73b67cdfcccf84f37b3c444db8a4b22025edd6edcb45ce5725697eeb5b510d60"
},
"downloads": -1,
"filename": "pybedtools-0.11.0.tar.gz",
"has_sig": false,
"md5_digest": "e45ef213f0729bb8df0f197b917ef72c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 12543619,
"upload_time": "2025-01-02T15:56:45",
"upload_time_iso_8601": "2025-01-02T15:56:45.172064Z",
"url": "https://files.pythonhosted.org/packages/93/c0/593dadfc238f1980cc7e612b9035f0f2890bea2b9a745c8dabadfe9d4da0/pybedtools-0.11.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-02 15:56:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "daler",
"github_project": "pybedtools",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "pysam",
"specs": []
}
],
"lcname": "pybedtools"
}