sidewinder-db


| Field | Value |
|---|---|
| Name | sidewinder-db |
| Version | 0.0.62 |
| Home page | None |
| Summary | A Python-based Distributed Database |
| Upload time | 2024-05-17 16:19:25 |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| Requires Python | >=3.10 |
| License | MIT License Copyright (c) 2022 prmoore77 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| Keywords | sidwinder, sidewinder-db, database, distributed, shard |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# Sidewinder

[<img src="https://img.shields.io/badge/GitHub-prmoore77%2Fsidewinder-blue.svg?logo=Github">](https://github.com/prmoore77/sidewinder)
[<img src="https://img.shields.io/badge/dockerhub-image-green.svg?logo=Docker">](https://hub.docker.com/repository/docker/prmoorevoltron/sidewinder/general)
[![sidewinder-ci](https://github.com/prmoore77/sidewinder/actions/workflows/ci.yml/badge.svg)](https://github.com/prmoore77/sidewinder/actions/workflows/ci.yml)
[![Supported Python Versions](https://img.shields.io/pypi/pyversions/sidewinder-db)](https://pypi.org/project/sidewinder-db/)
[![PyPI version](https://badge.fury.io/py/sidewinder-db.svg)](https://badge.fury.io/py/sidewinder-db)
[![PyPI Downloads](https://img.shields.io/pypi/dm/sidewinder-db.svg)](https://pypi.org/project/sidewinder-db/)

Python-based Distributed Database

### Note: Sidewinder is experimental and is not intended for production workloads.

Sidewinder is a [Python](https://python.org)-based (with [asyncio](https://docs.python.org/3/library/asyncio.html)) Proof-of-Concept Distributed Database that distributes shards of data from the server to a number of workers to "divide and conquer" OLAP database workloads.

It consists of a server, workers, and a client (where you can run interactive SQL commands).

Sidewinder does NOT distribute queries that contain no aggregates; it runs those on the server side.

Sidewinder uses [Apache Arrow](https://arrow.apache.org) with [Websockets](https://websockets.readthedocs.io/en/stable/) with TLS for secure communication between the server, worker(s), and client(s).  
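As a rough illustration of the TLS side of this, the sketch below builds a client-side TLS context with Python's standard `ssl` module. The function name `make_client_tls_context` and its `tls_roots` parameter are hypothetical (chosen to mirror the `--tls-roots` option used later in this README); this is not Sidewinder's actual code.

```python
import ssl


def make_client_tls_context(tls_roots=None):
    """Build a TLS context that verifies the server's certificate.

    tls_roots is an optional path to a PEM file of trusted certificates
    (a hypothetical stand-in for the --tls-roots command-line option).
    """
    # The default context enables certificate verification and hostname checks
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if tls_roots is not None:
        # Trust the given certificate file in addition to the system defaults
        context.load_verify_locations(cafile=tls_roots)
    return context


context = make_client_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

A context like this can be passed as the `ssl=` argument of `websockets.connect(...)` when opening a `wss://` connection. For a self-signed server certificate, you would pass its path (for example `tls/server.crt`) as `tls_roots`.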

It uses [DuckDB](https://duckdb.org) as its SQL execution engine - and the PostgreSQL parser to understand how to combine results from distributed workers.
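The "divide and conquer" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module as a stand-in for DuckDB, the sample rows are made up, and the modulo-based sharding scheme and the helper names (`shard_rows`, `run_on_worker`, `combine`) are hypothetical rather than taken from Sidewinder. Each "worker" computes partial aggregates over its shard, and the "server" combines them.

```python
import sqlite3

SHARD_COUNT = 11

# Made-up sample rows: (order_id, quantity)
rows = [(i, (i % 5) + 1) for i in range(1, 23)]


def shard_rows(rows, shard_count):
    """Assign each row to a shard by order_id modulo shard_count (assumed scheme)."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[row[0] % shard_count].append(row)
    return shards


def run_on_worker(shard):
    """A 'worker' computes partial COUNT and SUM over its own shard only."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE lineitem (order_id INTEGER, quantity INTEGER)")
    con.executemany("INSERT INTO lineitem VALUES (?, ?)", shard)
    partial = con.execute(
        "SELECT COUNT(*), COALESCE(SUM(quantity), 0) FROM lineitem"
    ).fetchone()
    con.close()
    return partial


def combine(partials):
    """The 'server' sums partial counts and sums, then derives the average."""
    count = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    return count, total, total / count


partials = [run_on_worker(shard) for shard in shard_rows(rows, SHARD_COUNT)]
count, total, average = combine(partials)

# The combined answer matches a single-node computation over all rows
assert count == len(rows)
assert total == sum(quantity for _, quantity in rows)
print(count, total, average)
```

Note that the average is derived from the combined sum and count. Averaging the per-shard averages would only be correct if every shard held the same number of rows, which is why a query needs to be rewritten rather than simply forwarded.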

# Setup (to run locally)

## Install package
You can install `sidewinder-db` from PyPI or from source.

### Option 1 - from PyPI
```shell
# Create the virtual environment
python3 -m venv .venv

# Activate the virtual environment
. .venv/bin/activate

pip install sidewinder-db
```

### Option 2 - from source - for development
```shell
git clone https://github.com/prmoore77/sidewinder

cd sidewinder

# Create the virtual environment
python3 -m venv .venv

# Activate the virtual environment
. .venv/bin/activate

# Upgrade pip, setuptools, and wheel
pip install --upgrade pip setuptools wheel

# Install Sidewinder-DB - in editable mode with dev dependencies
# (the quotes keep shells such as zsh from treating the brackets as a glob pattern)
pip install --editable ".[dev]"
```

### Note
For the following commands: if you are running from source in `--editable` mode (for development purposes), you will need to set the PYTHONPATH environment variable as follows:
```shell
export PYTHONPATH=$(pwd)/src
```

## Bootstrap the environment by creating a security user list (password file), TLS certificate keypair, and a sample TPC-H dataset with 11 shards
### (The passwords shown are just examples; we recommend that you use more secure passwords)
```shell
. .venv/bin/activate
sidewinder-bootstrap \
    --client-username=scott \
    --client-password=tiger \
    --worker-password=united \
    --tpch-scale-factor=1 \
    --shard-count=11
```

## Run sidewinder locally - from root of repo (use --help option on the executables below for option details)


### 1) Server:
#### Open a terminal, then:
```bash
. .venv/bin/activate
sidewinder-server
```

### 2) Worker:
#### Open another terminal, then start a single worker (using the same worker password you used in the bootstrap command above) with command:
```bash
. .venv/bin/activate
sidewinder-worker --tls-roots=tls/server.crt --password=united
```
##### Note: you can run up to 11 workers for this example configuration. To do so, run this loop instead of starting a single worker:
```bash
. .venv/bin/activate
for x in {1..11}
do
  sidewinder-worker --tls-roots=tls/server.crt --password=united &
done
```

To kill the workers later - run:
```bash
kill $(jobs -p)
```

### 3) Client:
#### Open another terminal, then connect with the client - using the same client username/password you used in the bootstrap command above:
```bash
. .venv/bin/activate
sidewinder-client --tls-roots=tls/server.crt --username=scott --password=tiger
```

##### Then - while in the client - you can run a sample query that will distribute to the worker(s) (if you have at least one running) - example:
```SELECT COUNT(*) FROM lineitem;```
##### Note: if you are running fewer than 11 workers, your answer will only reflect n/11 of the data (where n is the worker count). We plan to add delta processing at a later point.
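The n/11 effect can be worked through with a little arithmetic. The sketch below assumes that each worker serves exactly one shard and that all shards hold the same number of rows. The per-shard row count is a made-up figure and the function names are hypothetical.

```python
from fractions import Fraction

SHARD_COUNT = 11
ROWS_PER_SHARD = 1000  # made-up figure; assumes equally sized shards


def distributed_count(worker_count):
    """Row count reported when only worker_count shards have a worker."""
    active_shards = min(worker_count, SHARD_COUNT)
    return active_shards * ROWS_PER_SHARD


def coverage(worker_count):
    """Fraction of the data reflected in the answer (n/11)."""
    return Fraction(min(worker_count, SHARD_COUNT), SHARD_COUNT)


for n in (1, 4, 11):
    print(n, distributed_count(n), coverage(n))
```

With 4 workers, for example, a `COUNT(*)` in this model would report 4000 of the 11000 rows, that is 4/11 of the data.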

##### A query that won't distribute (because it does not contain aggregates) - would be:
```SELECT * FROM region;```
##### or:
```SELECT * FROM lineitem LIMIT 5;```

##### Note: there are TPC-H queries in the [tpc-h_queries](tpc-h_queries) folder you can run...

##### To turn distributed mode OFF in the client:
```.set distributed = false;```

##### To turn summarization mode OFF in the client (so that sidewinder does NOT summarize the workers' results - this only applies to distributed mode):
```.set summarize = false;```
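To make the difference between the two summarization settings concrete, here is a small sketch of merging per-worker results for a grouped count. It is an assumption-laden illustration: the function `merge_worker_results`, the row layout, and the sample values are all hypothetical, and the exact shape of Sidewinder's un-summarized output is not described in this README.

```python
from collections import defaultdict


def merge_worker_results(worker_results, summarize=True):
    """Merge per-worker (group_key, count) rows.

    With summarize=False the rows are concatenated as-is, so a group key
    can appear once per worker. With summarize=True counts are added up
    per group key.
    """
    rows = [row for result in worker_results for row in result]
    if not summarize:
        return rows
    totals = defaultdict(int)
    for key, count in rows:
        totals[key] += count
    return sorted(totals.items())


# Made-up results from two workers
worker_results = [[("A", 3), ("R", 2)], [("A", 1), ("N", 4)]]

print(merge_worker_results(worker_results, summarize=False))
print(merge_worker_results(worker_results, summarize=True))
```

In this model, turning summarization off shows key `A` twice (once per worker), while turning it on yields a single combined row per key.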

### Optional DuckDB CLI (use for data QA purposes, etc.)
Install DuckDB CLI version [0.10.2](https://github.com/duckdb/duckdb/releases/tag/v0.10.2) - and make sure the executable is on your PATH.

Platform Downloads:   
[Linux x86-64](https://github.com/duckdb/duckdb/releases/download/v0.10.2/duckdb_cli-linux-amd64.zip)   
[Linux arm64 (aarch64)](https://github.com/duckdb/duckdb/releases/download/v0.10.2/duckdb_cli-linux-aarch64.zip)   
[MacOS Universal](https://github.com/duckdb/duckdb/releases/download/v0.10.2/duckdb_cli-osx-universal.zip)   

### Handy development commands

#### Version management

##### Bump the version of the application - (you must have installed from source with the [dev] extras)
```bash
bumpver update --patch
```

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "sidewinder-db",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "sidwinder, sidewinder-db, database, distributed, shard",
    "author": null,
    "author_email": "Philip Moore <prmoore77@hotmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/aa/89/c0b799f6da48881286a7a75102e24b48429fe62be35e97d8509241daf6db/sidewinder_db-0.0.62.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2022 prmoore77  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "A Python-based Distributed Database",
    "version": "0.0.62",
    "project_urls": {
        "Homepage": "https://github.com/prmoore77/sidewinder"
    },
    "split_keywords": [
        "sidwinder",
        " sidewinder-db",
        " database",
        " distributed",
        " shard"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a171aa0d3499dd43ffe8ca2c2e30f24e19602e57d60717e6f0a6ad301d341363",
                "md5": "84dc01eab464f8e1f636e3a91eb9a6c5",
                "sha256": "55383eebf224597c5486a18fc1a6aad648a33fdd9f40b4ca77f7313c6e365058"
            },
            "downloads": -1,
            "filename": "sidewinder_db-0.0.62-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "84dc01eab464f8e1f636e3a91eb9a6c5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 41137,
            "upload_time": "2024-05-17T16:19:23",
            "upload_time_iso_8601": "2024-05-17T16:19:23.820564Z",
            "url": "https://files.pythonhosted.org/packages/a1/71/aa0d3499dd43ffe8ca2c2e30f24e19602e57d60717e6f0a6ad301d341363/sidewinder_db-0.0.62-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "aa89c0b799f6da48881286a7a75102e24b48429fe62be35e97d8509241daf6db",
                "md5": "002c46b5840a12a4e22908e99f75f485",
                "sha256": "c4295ea48921112ca94040123fb38a2452ea440fcb00e4a7e707728e69bf3082"
            },
            "downloads": -1,
            "filename": "sidewinder_db-0.0.62.tar.gz",
            "has_sig": false,
            "md5_digest": "002c46b5840a12a4e22908e99f75f485",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 36047,
            "upload_time": "2024-05-17T16:19:25",
            "upload_time_iso_8601": "2024-05-17T16:19:25.346972Z",
            "url": "https://files.pythonhosted.org/packages/aa/89/c0b799f6da48881286a7a75102e24b48429fe62be35e97d8509241daf6db/sidewinder_db-0.0.62.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-17 16:19:25",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "prmoore77",
    "github_project": "sidewinder",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sidewinder-db"
}
        