omilayers


* Name: omilayers
* Version: 0.2.3
* Summary: A SQLite and DuckDB wrapper suitable for bioinformatic analysis of multi-omic data.
* Upload time: 2024-10-22 08:47:56
* Author: Dimitrios Kioroglou
* License: CC-BY-4.0
* Keywords: duckdb, sqlite3, omics, bioinformatics, data analysis
            
# omilayers

[![Documentation Status](https://readthedocs.org/projects/pip/badge/?version=stable)](https://pip.pypa.io/en/stable/?badge=stable) [![Downloads](https://static.pepy.tech/badge/omilayers)](https://pepy.tech/project/omilayers)

``omilayers`` is a Python data management library for multi-omic data analysis (hence the `omi` prefix), which involves handling diverse datasets usually referred to as omic layers. `omilayers` wraps the APIs of `SQLite` and `DuckDB` and provides a high-level interface for frequent, repetitive tasks that involve fast storage, processing and retrieval of data, without the need to constantly write SQL queries.

The rationale behind `omilayers` is the following:

* The user stores **layers** of omic data (tables in SQL lingo).
* The user creates new layers by processing and restructuring existing layers.
* The user can group layers using **tags**.
* The user can store a brief description for each layer.
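
The ideas in the list above can be illustrated with plain `sqlite3` from the Python standard library. This is only a conceptual sketch and does not show how `omilayers` is implemented or what its API looks like: the `layer_info` table and the `create_layer` and `layers_with_tag` helpers are hypothetical names chosen for this example.

```python
import sqlite3


def create_layer(conn, name, columns, rows, tag="", description=""):
    """Store rows as a table (a "layer") and record its tag and description."""
    # Hypothetical bookkeeping table that maps each layer to a tag and a description
    conn.execute(
        "CREATE TABLE IF NOT EXISTS layer_info "
        "(name TEXT PRIMARY KEY, tag TEXT, description TEXT)"
    )
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE "{name}" ({column_sql})')
    conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', rows)
    conn.execute(
        "INSERT OR REPLACE INTO layer_info VALUES (?, ?, ?)",
        (name, tag, description),
    )
    conn.commit()


def layers_with_tag(conn, tag):
    """Return the names of all layers that share a tag, sorted by name."""
    cursor = conn.execute(
        "SELECT name FROM layer_info WHERE tag = ? ORDER BY name", (tag,)
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")

# Store a first layer
create_layer(
    conn, "rnaseq_raw", ["sample", "gene", "counts"],
    [("s1", "TP53", 120), ("s2", "TP53", 80)],
    tag="rna", description="Raw RNA-seq counts",
)

# Create a new layer by processing an existing one
totals = conn.execute(
    "SELECT gene, SUM(counts) FROM rnaseq_raw GROUP BY gene"
).fetchall()
create_layer(
    conn, "rnaseq_gene_totals", ["gene", "total"], totals,
    tag="rna", description="Counts summed per gene",
)

print(layers_with_tag(conn, "rna"))
```

Every step here needs a hand-written query, which is the kind of repetition the library aims to hide.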


## Why omilayers?

Although SQL is a straightforward language, writing queries becomes tedious when the same task has to be repeated many times. Since data analysis involves highly repetitive procedures, a user would otherwise need to write functions to abstract away the SQL queries. The aim of `omilayers` is to provide this level of abstraction to facilitate bioinformatic data analysis. The `omilayers` API resembles the `pandas` API, and the user needs to write the following code to retrieve a column named `foo` from a layer called `omicdata`:

With DuckDB (the default database):
```python
from omilayers import Omilayers

omi = Omilayers("dbname.duckdb")
result = omi.layers['omicdata']['foo']
```

With SQLite:
```python
from omilayers import Omilayers

omi = Omilayers("dbname.sqlite", engine="sqlite")
result = omi.layers['omicdata']['foo']
```
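
For comparison, the kind of helper a user might otherwise write by hand could look like the sketch below. It uses only the standard-library `sqlite3` module; `quote_identifier` and `fetch_column` are hypothetical names for this example and are not part of `omilayers`.

```python
import sqlite3


def quote_identifier(name):
    """Quote a table or column name for SQLite, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def fetch_column(conn, table, column):
    """Return every value of one column from one table as a list."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted by hand
    query = f"SELECT {quote_identifier(column)} FROM {quote_identifier(table)}"
    return [row[0] for row in conn.execute(query)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE omicdata (foo REAL, bar TEXT)")
conn.executemany("INSERT INTO omicdata VALUES (?, ?)", [(1.5, "a"), (2.5, "b")])

print(fetch_column(conn, "omicdata", "foo"))
```

A real analysis would need a similar helper for each recurring task (filtering rows, adding columns, joining tables), which is the boilerplate a wrapper library removes.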


## Installation

```
pip install omilayers
```

## Running the unit tests
The directory `testing` includes predefined unit tests for SQLite and DuckDB.

To test the functionality of `omilayers` with SQLite:
```bash
python -m unittest -v tests_sqlite.py
```

To test the functionality of `omilayers` with DuckDB:
```bash
python -m unittest -v tests_duckdb.py
```


## Testing with synthetic omic data

The directory `synthetic_data` includes two Jupyter notebooks (one for SQLite and one for DuckDB) for testing `omilayers` with synthetic multi-omic data. It also includes the Python script `create_synthetic_vcf/synthesize_vcf.py`, which was used to create the synthetic VCF hosted on Zenodo [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.12790872.svg)](https://doi.org/10.5281/zenodo.12790872).

The synthetic VCF can be recreated as follows:
```bash
for i in {1..22} {X,Y,M};do python synthesize_vcf.py $i;done
```

To join the generated VCFs into a single VCF:
```bash
for i in {1..22} {X,Y,M};do cat chr${i}.vcf >> simulated.vcf;done
```
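
Where a Bash shell is not available, the same concatenation can be sketched in Python. This is an assumed equivalent of the `cat` loop and is not shipped with the repository; `join_vcfs` and `CHROMOSOMES` are hypothetical names for this example.

```python
import shutil
from pathlib import Path

# Same order as the shell loops: chromosomes 1 to 22, then X, Y and M
CHROMOSOMES = [str(n) for n in range(1, 23)] + ["X", "Y", "M"]


def join_vcfs(directory=".", output="simulated.vcf"):
    """Append every chr<N>.vcf in `directory` to `output`, like the `cat` loop."""
    directory = Path(directory)
    # Mode "ab" appends, mirroring the shell's ">>" redirection
    with open(directory / output, "ab") as merged:
        for chrom in CHROMOSOMES:
            with open(directory / f"chr{chrom}.vcf", "rb") as part:
                # Stream the file in chunks so large VCFs are not loaded into memory
                shutil.copyfileobj(part, merged)
```

Calling `join_vcfs()` from the directory that holds the per-chromosome files writes `simulated.vcf` next to them. As with `cat`, the files are joined byte for byte, so any header lines present in each input file are repeated in the output.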


## Documentation

You can read the full documentation here: [https://omilayers.readthedocs.io](https://omilayers.readthedocs.io/en/latest/)


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "omilayers",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "duckdb, sqlite3, omics, bioinformatics, data analysis",
    "author": "Dimitrios Kioroglou",
    "author_email": "<d.kioroglou@hotmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/44/ab/a4114bbabec1af2bcd37d63f5b289e3a9cb9957ebe1df6869bc21866290f/omilayers-0.2.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "CC-BY-4.0",
    "summary": "A SQLite and DuckDB wrapper suitable for bioinformatic analysis of multi-omic data.",
    "version": "0.2.3",
    "project_urls": null,
    "split_keywords": [
        "duckdb",
        " sqlite3",
        " omics",
        " bioinformatics",
        " data analysis"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7f207b40a216cedb92dd2455da02918b52e096f45b278df27406ec792feb010e",
                "md5": "ec949581bf99a1ae15d0410fcef9954a",
                "sha256": "8595e26c43514eba6e43e03b72acf86bd88545b311d8304e43a2047aa8739625"
            },
            "downloads": -1,
            "filename": "omilayers-0.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ec949581bf99a1ae15d0410fcef9954a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 22817,
            "upload_time": "2024-10-22T08:47:53",
            "upload_time_iso_8601": "2024-10-22T08:47:53.909836Z",
            "url": "https://files.pythonhosted.org/packages/7f/20/7b40a216cedb92dd2455da02918b52e096f45b278df27406ec792feb010e/omilayers-0.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "44aba4114bbabec1af2bcd37d63f5b289e3a9cb9957ebe1df6869bc21866290f",
                "md5": "ff3f944d8f8719ff0665cd2394761dd5",
                "sha256": "45527d4be1d5725f2bf7c48ebda49db834583ffa8f99a54b2e46316b32d4212c"
            },
            "downloads": -1,
            "filename": "omilayers-0.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "ff3f944d8f8719ff0665cd2394761dd5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 16039,
            "upload_time": "2024-10-22T08:47:56",
            "upload_time_iso_8601": "2024-10-22T08:47:56.450762Z",
            "url": "https://files.pythonhosted.org/packages/44/ab/a4114bbabec1af2bcd37d63f5b289e3a9cb9957ebe1df6869bc21866290f/omilayers-0.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-22 08:47:56",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "omilayers"
}
        