===============================================
Wormtable
===============================================
Wormtable is a write-once read-many table for large-scale datasets.
It provides Python programmers with a simple and efficient method of
storing, processing and searching datasets of essentially unlimited
size. A wormtable consists of a set of rows, each of which contains
values belonging to a fixed number of columns. Rows are encoded
in a custom binary format, designed to be flexible, compact and
portable. Rows are stored in a data file, and the offsets and lengths
of these rows are stored in a Berkeley DB database
to support efficient random access. Wormtable also
supports efficient searching and retrieval of rows with particular
values through the use of indexes, also based on Berkeley DB.
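
The storage layout described above can be illustrated with a minimal sketch. This is not wormtable's actual file format or API: the fixed two-column schema (``ROW_FORMAT``), the helper names ``write_rows`` and ``read_row``, and the in-memory list standing in for the Berkeley DB offset database are all assumptions made for illustration.

```python
import os
import struct
import tempfile

# Hypothetical fixed schema for illustration only:
# one unsigned 32-bit position and one 64-bit float quality per row.
ROW_FORMAT = "<Id"


def write_rows(data_path, rows):
    """Write packed rows to a data file once; return [(offset, length), ...]."""
    index = []
    with open(data_path, "wb") as data_file:
        for row in rows:
            encoded = struct.pack(ROW_FORMAT, *row)
            # Record where this row starts and how long it is.
            index.append((data_file.tell(), len(encoded)))
            data_file.write(encoded)
    return index


def read_row(data_path, index, row_id):
    """Random access: seek directly to the recorded offset and decode."""
    offset, length = index[row_id]
    with open(data_path, "rb") as data_file:
        data_file.seek(offset)
        return struct.unpack(ROW_FORMAT, data_file.read(length))


if __name__ == "__main__":
    rows = [(100, 30.5), (250, 12.0), (975, 48.25)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "rows.dat")
        index = write_rows(path, rows)
        print(read_row(path, index, 1))  # prints (250, 12.0)
```

Keeping the offsets separate from the row data is what makes it possible to fetch any row without scanning the file; in wormtable that lookup table lives in Berkeley DB rather than in memory.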
The Variant Call Format (VCF) is supported directly by wormtable
through a command line conversion program, vcf2wt. There is also a
command line utility wtadmin to manage wormtables, including the ability to
dump values and add, remove and view indexes.
If you use wormtable in your work, please cite the BMC Bioinformatics
`article <http://www.biomedcentral.com/1471-2105/14/356>`_. See
the ``CITATION.txt`` file for details.
-------------
Documentation
-------------
Full documentation for ``wormtable`` is available at
`<http://pythonhosted.org/wormtable>`_.
------------
Installation
------------
*******************************
Quick install for Debian/Ubuntu
*******************************
If you are running Debian or Ubuntu, this should get you up and running quickly::
$ sudo apt-get install python-dev libdb-dev
$ sudo pip install wormtable
For Python 3, use ``python3-dev`` and ``pip3``.
********************
General instructions
********************
Once Berkeley DB has been installed (see below) we can build the ``wormtable`` module using the
standard Python `methods <http://docs.python.org/install/index.html>`_. For
example, using pip we have ::
$ sudo pip install wormtable
Or, we can manually download the package, unpack it and then run::
$ python setup.py build
$ sudo python setup.py install
Most of the time this will compile and install the module without difficulty.
It is also possible to download the latest development version of
``wormtable`` from `github <https://github.com/wormtable/wormtable>`_.
**************
Python 2.6/3.1
**************
Wormtable requires the ``argparse`` module, which was added to the
standard library in Python 3.2 (it is also included in 2.7). For users
of older Python versions, ``argparse`` must be installed separately for
the command line utilities to work::
$ sudo pip install argparse
This is not necessary for recent versions of Python.
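
If it is unclear whether a given interpreter already provides ``argparse``, a quick check along the following lines may help. The helper name ``has_module`` is hypothetical and is not part of wormtable.

```python
def has_module(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if has_module("argparse"):
        print("argparse is available; nothing more to install")
    else:
        print("argparse is missing; install it with pip")
```

Run this with the same interpreter that will run the command line utilities, since each Python installation has its own set of modules.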
----------------------
Installing Berkeley DB
----------------------
Wormtable requires Berkeley DB (version 4.8 or later),
which is available for all major platforms.
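
To check whether an installed Berkeley DB meets the 4.8 requirement, one option is to read the version macros from its ``db.h`` header. The sketch below assumes the header defines ``DB_VERSION_MAJOR`` and ``DB_VERSION_MINOR``; the function names are hypothetical, and the sample header text is made up for the demonstration. To inspect a real installation, read the contents of your system's ``db.h`` instead.

```python
import re

MINIMUM_VERSION = (4, 8)


def parse_db_version(header_text):
    """Extract (major, minor) from the text of a Berkeley DB db.h header."""
    found = {}
    for key in ("MAJOR", "MINOR"):
        pattern = r"^\s*#\s*define\s+DB_VERSION_%s\s+(\d+)" % key
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError("DB_VERSION_%s not found in header" % key)
        found[key] = int(match.group(1))
    return found["MAJOR"], found["MINOR"]


def is_supported(version):
    """Return True if version is at least the minimum wormtable needs."""
    return tuple(version) >= MINIMUM_VERSION


if __name__ == "__main__":
    # Sample text standing in for the contents of a real db.h.
    sample = "#define\tDB_VERSION_MAJOR\t5\n#define\tDB_VERSION_MINOR\t3\n"
    version = parse_db_version(sample)
    print(version, is_supported(version))
```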
*****
Linux
*****
Installing Berkeley DB is very easy on Linux distributions.
On Debian/Ubuntu use::
$ sudo apt-get install libdb-dev
and on Red Hat/Fedora use::
# yum install libdb-devel
Other distributions and package managers should provide a similarly easy
option to install the DB development files.
********
Mac OS X
********
Berkeley DB can be installed on a Mac from source, or via
`MacPorts <https://www.macports.org/>`_ or
`Homebrew <http://mxcl.github.io/homebrew/>`_.
For MacPorts, to install e.g. v5.3 ::
$ sudo port install db53
Then, to build/install wormtable, we need to set the CFLAGS and LDFLAGS environment
variables to use the headers and libraries in /opt::
$ CFLAGS=-I/opt/local/include/db53 LDFLAGS=-L/opt/local/lib/db53/ python setup.py build
$ sudo python setup.py install
For Homebrew, get the current Berkeley DB version and again build wormtable
after setting CFLAGS and LDFLAGS appropriately::
$ brew install berkeley-db
$ CFLAGS=-I/usr/local/Cellar/berkeley-db/5.3.21/include/ LDFLAGS=-L/usr/local/Cellar/berkeley-db/5.3.21/lib/ python setup.py build
$ sudo python setup.py install
For more details of Berkeley DB versions, see here: https://www.macports.org/ports.php?by=category&substr=databases
***************
Other Platforms
***************
On platforms where Berkeley DB is not available as part of the native packaging
system (or where DB was installed locally because of non-root access)
there can be issues with finding the correct headers and libraries
when compiling ``wormtable``. For example,
if we add the DB 4.8 package on FreeBSD using::
# pkg_add -r db48
we get the following errors when we try to install wormtable::
$ python setup.py build
... [Messages cut for brevity] ...
_wormtablemodule.c:3727: error: 'DB_NEXT_NODUP' undeclared (first use in this function)
_wormtablemodule.c:3733: error: 'DB_NOTFOUND' undeclared (first use in this function)
_wormtablemodule.c:3739: error: 'DistinctValueIterator' has no member named 'cursor'
_wormtablemodule.c:3739: error: 'DistinctValueIterator' has no member named 'cursor'
_wormtablemodule.c:3740: error: 'DistinctValueIterator' has no member named 'cursor'
error: command 'cc' failed with exit status 1
This is because the compiler does not know where to find the headers and library
files for Berkeley DB.
To remedy this we must set the
``LDFLAGS`` and ``CFLAGS`` environment variables to
their correct values. Unfortunately there is no simple method to do this
and some knowledge of where your system keeps headers and libraries
is needed. To complete the installation for the FreeBSD example above,
we can do the following::
$ CFLAGS=-I/usr/local/include/db48 LDFLAGS=-L/usr/local/lib/db48 python setup.py build
$ sudo python setup.py install
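
Finding the right directories can be partly automated. The sketch below looks for ``db.h`` in a list of candidate directory pairs and prints matching ``CFLAGS`` and ``LDFLAGS`` values. The candidate paths are the ones used in the examples in this document; the function names are hypothetical, and your system may keep its headers and libraries elsewhere.

```python
import os

# (include_dir, lib_dir) pairs taken from the examples in this document;
# extend this list to suit your own system.
CANDIDATES = [
    ("/usr/local/include/db48", "/usr/local/lib/db48"),
    ("/opt/local/include/db53", "/opt/local/lib/db53"),
]


def find_db_dirs(candidates):
    """Return the first (include_dir, lib_dir) whose include_dir has db.h."""
    for include_dir, lib_dir in candidates:
        if os.path.isfile(os.path.join(include_dir, "db.h")):
            return include_dir, lib_dir
    return None


def build_flags(include_dir, lib_dir):
    """Build the environment variable values for the compiler and linker."""
    return {"CFLAGS": "-I" + include_dir, "LDFLAGS": "-L" + lib_dir}


if __name__ == "__main__":
    dirs = find_db_dirs(CANDIDATES)
    if dirs is None:
        print("db.h not found in any candidate directory")
    else:
        flags = build_flags(*dirs)
        print("CFLAGS=%s LDFLAGS=%s python setup.py build"
              % (flags["CFLAGS"], flags["LDFLAGS"]))
```

The printed command has the same form as the FreeBSD example above and can be pasted into a shell.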
--------------------------------
Installation without root access
--------------------------------
If you need to install wormtable on a system where Berkeley DB is not
installed (and your system administrator refuses to install it, for
some reason), you can still compile and install both locally.
Here is a recipe that worked on a Debian squeeze machine; however, this is not guaranteed
to work on any given system and you may need to tweak things a little to suit
your environment::
$ mkdir -p $HOME/.local
$ wget http://download.oracle.com/berkeley-db/db-4.8.30.tar.gz
$ tar -zxf db-4.8.30.tar.gz
$ cd db-4.8.30/build_unix/
$ ../dist/configure --prefix=$HOME/.local
$ make install
This downloads a version of Berkeley DB from Oracle, compiles it and
then installs it to the directory $HOME/.local. (The version of Berkeley DB
you use doesn't really matter, as long as it is at least 4.8.)
Now, download
the latest version of wormtable, untar it and ``cd`` to the new directory.
We can then install it locally::
$ CFLAGS=-I$HOME/.local/include LDFLAGS=-L$HOME/.local/lib/ python setup.py install --user
Now we need to set up some paths so that we can use this at run time. Put the following
lines into your $HOME/.bashrc (or equivalent if you use another shell)::
export LD_LIBRARY_PATH=$HOME/.local/lib:$LD_LIBRARY_PATH
export PATH=$HOME/.local/bin:$PATH
Then, log out, log back in, and you should be able to use wormtable.
----------
Test suite
----------
Wormtable has an extensive suite of tests to ensure that data
is stored correctly.
It is a good idea to run these immediately after installation::
$ python tests.py
****************
Tested platforms
****************
Wormtable is highly portable, and
has been successfully built and tested
on the following platforms:
==================== ======== ====== ===========
Operating system     Platform Python Compiler
==================== ======== ====== ===========
Ubuntu 13.04         x86-64   2.7.4  gcc 4.7.3
Ubuntu 13.04         x86-64   3.3.1  gcc 4.7.3
Ubuntu 13.04         x86-64   2.7.4  clang 3.2.1
Debian squeeze       x86-64   2.6.6  gcc 4.4.5
Debian squeeze       x86-64   3.1.3  gcc 4.4.5
Debian squeeze       x86-64   3.1.3  clang 1.1
Debian squeeze       ppc64    2.6.6  gcc 4.4.5
Debian squeeze       ppc64    3.1.3  gcc 4.4.5
Debian wheezy        armv6l   2.7.3  gcc 4.6.3
Fedora 17            i386     2.7.3  gcc 4.7.2
Fedora 17            i386     3.2.3  gcc 4.7.2
FreeBSD 9.0          i386     3.2.2  gcc 4.2.2
FreeBSD 9.0          i386     2.7.2  gcc 4.2.2
FreeBSD 9.0          i386     3.1.4  clang 3.0
OS X 10.8.4          x86-64   2.7.2  clang 4.2
Solaris 10           SPARC    3.3.2  gcc 4.8.0
Solaris 11.1         SPARC    2.6.8  gcc 4.5.2
Solaris 11.1         SPARC    2.6.8  Sun C 5.12
Scientific Linux 6.2 x86-64   2.6.6  icc 12.0.0
==================== ======== ====== ===========