# omilayers
[![Documentation Status](https://readthedocs.org/projects/pip/badge/?version=stable)](https://pip.pypa.io/en/stable/?badge=stable) [![Downloads](https://static.pepy.tech/badge/omilayers)](https://pepy.tech/project/omilayers)
``omilayers`` is a Python data management library designed for multi-omic data analysis (hence the `omi` prefix), which involves handling diverse datasets usually referred to as omic layers. `omilayers` wraps the APIs of `SQLite` and `DuckDB` and provides a high-level interface for frequent, repetitive tasks that involve fast storage, processing and retrieval of data, without the need to constantly write SQL queries.
The rationale behind `omilayers` is the following:
* User stores **layers** of omic data (tables in SQL lingo).
* User creates new layers by processing and restructuring existing layers.
* User can group layers using **tags**.
* User can store a brief description for each layer.
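To make the layer/tag/description idea concrete, here is a minimal sketch written against the standard-library `sqlite3` module. It is not the `omilayers` API and does not claim to mirror its internals: the `layer_info` bookkeeping table, the table names and the `layers_with_tag` helper are all hypothetical and exist only for illustration.

```python
import sqlite3

# Illustration only: plain sqlite3, not the omilayers API.
con = sqlite3.connect(":memory:")

# A "layer" is an ordinary SQL table.
con.execute("CREATE TABLE proteomics (sample TEXT, abundance REAL)")
con.executemany(
    "INSERT INTO proteomics VALUES (?, ?)",
    [("s1", 0.5), ("s2", 1.5)],
)

# Hypothetical bookkeeping table holding one tag and one description per layer.
con.execute(
    "CREATE TABLE layer_info (name TEXT PRIMARY KEY, tag TEXT, description TEXT)"
)
con.execute(
    "INSERT INTO layer_info VALUES (?, ?, ?)",
    ("proteomics", "raw", "Protein abundances per sample"),
)

# A new layer created by processing an existing one.
con.execute(
    "CREATE TABLE proteomics_scaled AS "
    "SELECT sample, abundance * 2 AS abundance FROM proteomics"
)
con.execute(
    "INSERT INTO layer_info VALUES (?, ?, ?)",
    ("proteomics_scaled", "processed", "Abundances multiplied by two"),
)


def layers_with_tag(con, tag):
    """Return the names of all layers recorded under the given tag."""
    rows = con.execute(
        "SELECT name FROM layer_info WHERE tag = ? ORDER BY name", (tag,)
    ).fetchall()
    return [row[0] for row in rows]
```

Every step here needs its own SQL statement, which is the kind of repetition a higher-level interface is meant to hide.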
## Why omilayers?
Although SQL is a straightforward language, writing the same queries over and over quickly becomes tedious. Since data analysis involves highly repetitive procedures, a user would otherwise need to write functions that abstract away the SQL. The aim of `omilayers` is to provide this level of abstraction to facilitate bioinformatic data analysis. The `omilayers` API resembles the `pandas` API; for example, the following code retrieves a column named `foo` from a layer called `omicdata`:
With DuckDB (the default database):
```python
from omilayers import Omilayers
omi = Omilayers("dbname.duckdb")
result = omi.layers['omicdata']['foo']
```
With SQLite:
```python
from omilayers import Omilayers
omi = Omilayers("dbname.sqlite", engine="sqlite")
result = omi.layers['omicdata']['foo']
```
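For comparison, this is roughly what the same column lookup looks like when written by hand with the standard-library `sqlite3` module. It is a sketch of the kind of helper a user would otherwise maintain, not a description of how `omilayers` is implemented; the `fetch_column` function and the sample rows are assumptions made for the example.

```python
import sqlite3


def fetch_column(con, table, column):
    """Return every value of one column of a table as a list."""
    # Identifiers cannot be bound as query parameters, so they are quoted
    # explicitly, doubling any embedded double quotes.
    quoted_column = '"{}"'.format(column.replace('"', '""'))
    quoted_table = '"{}"'.format(table.replace('"', '""'))
    query = "SELECT {} FROM {}".format(quoted_column, quoted_table)
    return [row[0] for row in con.execute(query)]


# Small in-memory table standing in for the `omicdata` layer.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE omicdata (foo REAL, bar TEXT)")
con.executemany(
    "INSERT INTO omicdata VALUES (?, ?)",
    [(1.0, "a"), (2.0, "b")],
)

result = fetch_column(con, "omicdata", "foo")
```

A helper like this has to be written, tested and extended for every new access pattern, which is the work the `pandas`-like indexing shown above is meant to replace.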
## Installation
```bash
pip install omilayers
```
## Running the unit tests
The directory `testing` includes predefined unit tests for SQLite and DuckDB. The commands below use bare file names, so run them from inside that directory.
To test the functionality of `omilayers` with SQLite:
```bash
python -m unittest -v tests_sqlite.py
```
To test the functionality of `omilayers` with DuckDB:
```bash
python -m unittest -v tests_duckdb.py
```
## Testing with synthetic omic data
The directory `synthetic_data` includes two Jupyter notebooks (one for SQLite and one for DuckDB) for testing `omilayers` using synthetic multi-omic data. It also includes the Python script `create_synthetic_vcf/synthesize_vcf.py` that was used to create the synthetic VCF hosted on Zenodo [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.12790872.svg)](https://doi.org/10.5281/zenodo.12790872).
The synthetic VCF can be recreated as follows:
```bash
for i in {1..22} {X,Y,M};do python synthesize_vcf.py $i;done
```
To join the generated VCFs into a single VCF:
```bash
for i in {1..22} {X,Y,M};do cat chr${i}.vcf >> simulated.vcf;done
```
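If each per-chromosome file carries its own `#` header lines, plain `cat` repeats those headers inside `simulated.vcf`. Whether that applies depends on what `synthesize_vcf.py` writes, which is not stated here, so treat the following as an optional sketch under that assumption. The `merge_vcfs` function and the `CHROMS` list are hypothetical names introduced for the example.

```python
# Chromosome labels matching the shell loops above: 1..22, X, Y, M.
CHROMS = [str(n) for n in range(1, 23)] + ["X", "Y", "M"]


def merge_vcfs(paths, out_path):
    """Concatenate VCF files, keeping '#' header lines from the first file only."""
    with open(out_path, "w") as out:
        for index, path in enumerate(paths):
            with open(path) as handle:
                for line in handle:
                    # Skip header lines in every file after the first one.
                    if index > 0 and line.startswith("#"):
                        continue
                    out.write(line)
```

Calling `merge_vcfs([f"chr{c}.vcf" for c in CHROMS], "simulated.vcf")` from the directory holding the generated files would mirror the `cat` loop while writing the header once.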
## Documentation
You can read the full documentation here: [https://omilayers.readthedocs.io](https://omilayers.readthedocs.io/en/latest/)
Raw data
{
"_id": null,
"home_page": null,
"name": "omilayers",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "duckdb, sqlite3, omics, bioinformatics, data analysis",
"author": "Dimitrios Kioroglou",
"author_email": "<d.kioroglou@hotmail.com>",
"download_url": "https://files.pythonhosted.org/packages/44/ab/a4114bbabec1af2bcd37d63f5b289e3a9cb9957ebe1df6869bc21866290f/omilayers-0.2.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "CC-BY-4.0",
"summary": "A SQLite and DuckDB wrapper suitable for bioinformatic analysis of multi-omic data.",
"version": "0.2.3",
"project_urls": null,
"split_keywords": [
"duckdb",
" sqlite3",
" omics",
" bioinformatics",
" data analysis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7f207b40a216cedb92dd2455da02918b52e096f45b278df27406ec792feb010e",
"md5": "ec949581bf99a1ae15d0410fcef9954a",
"sha256": "8595e26c43514eba6e43e03b72acf86bd88545b311d8304e43a2047aa8739625"
},
"downloads": -1,
"filename": "omilayers-0.2.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ec949581bf99a1ae15d0410fcef9954a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 22817,
"upload_time": "2024-10-22T08:47:53",
"upload_time_iso_8601": "2024-10-22T08:47:53.909836Z",
"url": "https://files.pythonhosted.org/packages/7f/20/7b40a216cedb92dd2455da02918b52e096f45b278df27406ec792feb010e/omilayers-0.2.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "44aba4114bbabec1af2bcd37d63f5b289e3a9cb9957ebe1df6869bc21866290f",
"md5": "ff3f944d8f8719ff0665cd2394761dd5",
"sha256": "45527d4be1d5725f2bf7c48ebda49db834583ffa8f99a54b2e46316b32d4212c"
},
"downloads": -1,
"filename": "omilayers-0.2.3.tar.gz",
"has_sig": false,
"md5_digest": "ff3f944d8f8719ff0665cd2394761dd5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 16039,
"upload_time": "2024-10-22T08:47:56",
"upload_time_iso_8601": "2024-10-22T08:47:56.450762Z",
"url": "https://files.pythonhosted.org/packages/44/ab/a4114bbabec1af2bcd37d63f5b289e3a9cb9957ebe1df6869bc21866290f/omilayers-0.2.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-22 08:47:56",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "omilayers"
}