sqeleton


Namesqeleton JSON
Version 0.1.6 PyPI version JSON
download
home_pagehttps://github.com/erezsh/sqeleton
SummaryPython library for querying SQL databases
upload_time2024-08-14 16:50:15
maintainerNone
docs_urlNone
authorErez Shinan
requires_python<4.0,>=3.8
licenseMIT
keywords
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # Sqeleton

Sqeleton is a Python library for querying SQL databases.

It consists of -

- A fast and concise query builder, designed from scratch, but inspired by PyPika and SQLAlchemy

- A modular database interface, with drivers for a long list of SQL databases.

It is comparable to other libraries such as SQLAlchemy or PyPika, in terms of API and intended audience. However, there are several notable ways in which it is different. 

## **Features:**

🏃‍♂️**High-performance**: Sqeleton's API is designed to maximize performance using batch operations

- No ORM! While ORMs are easy and familiar, their granular operations are far too slow.
- Compiles queries 4 times faster than SQLAlchemy

🙌**Parallel**: Seamless multi-threading and multi-processing support

💖**Well-tested**: In addition to having an extensive test-suite, sqeleton is used as the core of [data-diff](https://github.com/datafold/data-diff).

✅**Type-aware**: The schema is used for validation when building expressions, making sure the names are correct, and that the data-types align. (WIP)
    
- The schema can be queried at run-time, if the tables already exist in the database

✨**Multi-database access**: Sqeleton is designed to work with several databases at the same time. Its API abstracts away as many implementation details as possible.

_Databases we fully support_:

- PostgreSQL >=10
- MySQL
- Snowflake
- BigQuery
- Redshift
- Oracle
- Presto
- Databricks
- Trino
- Clickhouse
- Vertica
- DuckDB >=0.6
- SQLite (coming soon)

💻**Built-in SQL client**: Connect to any of the supported databases with just one line.

Example usage: `sqeleton repl snowflake://...`

- Has syntax-highlighting, and autocomplete
- Use `*text` to find all tables like `%text%` (or just `*` to see all tables)
- Use `?name` to see the schema of the table called `name`.

## Documentation

[Read the docs!](https://sqeleton.readthedocs.io)

Or jump straight to the [introduction](https://sqeleton.readthedocs.io/en/latest/intro.html).

### Install

Install using pip:

```bash
pip install sqeleton
```

It is recommended to install the driver dependencies using pip's `[]` syntax:

```bash
pip install 'sqeleton[mysql, postgresql]'
```

Read more in [install / getting started.](https://sqeleton.readthedocs.io/en/latest/install.html)

### Example: Basic usage

We will create a table with the numbers 0..100, and then sum them up.

```python
from sqeleton import connect, table, this

# Create a new database connection
ddb = connect("duckdb://:memory:")

# Define a table with one int column
tbl = table('my_list', schema={'item': int})

# Make a bunch of queries
queries = [
    # Create table 'my_list'
    tbl.create(),

    # Insert 100 numbers
    tbl.insert_rows([x] for x in range(100)),

    # Get the sum of the numbers
    tbl.select(this.item.sum())
]
# Query in order, and return the last result as an int
result = ddb.query(queries, int)    

# Prints: Total sum of 0..100 = 4950
print(f"Total sum of 0..100 = {result}")
```

### Example: Advanced usage

We will define a function that performs outer-join on any database, and adds two extra fields: `only_a` and `only_b`.

```python
from sqeleton.databases import Database
from sqeleton.queries import ITable, leftjoin, rightjoin, outerjoin, and_, Expr

def my_outerjoin(
        db: Database,
        a: ITable, b: ITable,
        keys1: List[str], keys2: List[str],
        select_fields: Dict[str, Expr]
    ) -> ITable:
    """This function accepts two table expressions, and returns an outer-join query.
    
    The resulting rows will include two extra boolean fields:
    "only_a", and "only_b", describing whether there was a match for that row 
    only in the first table, or only in the second table.

    Parameters:
        db - the database connection to use
        a, b - the tables to outer-join
        keys1, keys2 - the names of the columns to join on, for each table respectively
        select_fields - A dictionary of {column_name: expression} to select as a result of the outer-join
    """
    # Predicates to join on
    on = [a[k1] == b[k2] for k1, k2 in zip(keys1, keys2)]

    # Define the new boolean fields
    # If all keys are None, it means there was no match
    # Compiles to "<k1> IS NULL AND <k2> IS NULL AND <k3> IS NULL..." etc.
    only_a = and_(b[k] == None for k in keys2)
    only_b = and_(a[k] == None for k in keys1)

    if isinstance(db, MySQL):
        # MySQL doesn't support "outer join"
        # Instead, we union "left join" and "right join"
        l = leftjoin(a, b).on(*on).select(
                only_a=only_a,
                only_b=False,
                **select_fields
            )
        r = rightjoin(a, b).on(*on).select(
                only_a=False,
                only_b=only_b,
                **select_fields
            )
        return l.union(r)

    # Other databases
    return outerjoin(a, b).on(*on).select(
            only_a=only_a,
            only_b=only_b,
            **select_fields
        )
```



# TODO

- Transactions

- Indexes

- Date/time expressions

- Window functions

## Possible plans for the future (not determined yet)

- Cache the compilation of repetitive queries for even faster query-building

- Compile control flow, functions

- Define tables using type-annotated classes (SQLModel style)

## Alternatives

- [SQLAlchemy](https://www.sqlalchemy.org/)
- [PyPika](https://github.com/kayak/pypika)
- [PonyORM](https://ponyorm.org/)
- [peewee](https://github.com/coleifer/peewee)

# Thanks

Thanks to Datafold for having sponsored Sqeleton in its initial stages. For reference, [the original repo](https://github.com/datafold/sqeleton/).
            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/erezsh/sqeleton",
    "name": "sqeleton",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Erez Shinan",
    "author_email": "erezshin@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/6c/3f/c3d3e33025447d50cbaad517c0c7ce2c9ec4f7f0daeb6d4d6a78a5d9a12c/sqeleton-0.1.6.tar.gz",
    "platform": null,
    "description": "# Sqeleton\n\nSqeleton is a Python library for querying SQL databases.\n\nIt consists of -\n\n- A fast and concise query builder, designed from scratch, but inspired by PyPika and SQLAlchemy\n\n- A modular database interface, with drivers for a long list of SQL databases.\n\nIt is comparable to other libraries such as SQLAlchemy or PyPika, in terms of API and intended audience. However, there are several notable ways in which it is different. \n\n## **Features:**\n\n\ud83c\udfc3\u200d\u2642\ufe0f**High-performance**: Sqeleton's API is designed to maximize performance using batch operations\n\n- No ORM! While ORMs are easy and familiar, their granular operations are far too slow.\n- Compiles queries 4 times faster than SQLAlchemy\n\n\ud83d\ude4c**Parallel**: Seamless multi-threading and multi-processing support\n\n\ud83d\udc96**Well-tested**: In addition to having an extensive test-suite, sqeleton is used as the core of [data-diff](https://github.com/datafold/data-diff).\n\n\u2705**Type-aware**: The schema is used for validation when building expressions, making sure the names are correct, and that the data-types align. (WIP)\n    \n- The schema can be queried at run-time, if the tables already exist in the database\n\n\u2728**Multi-database access**: Sqeleton is designed to work with several databases at the same time. Its API abstracts away as many implementation details as possible.\n\n_Databases we fully support_:\n\n- PostgreSQL >=10\n- MySQL\n- Snowflake\n- BigQuery\n- Redshift\n- Oracle\n- Presto\n- Databricks\n- Trino\n- Clickhouse\n- Vertica\n- DuckDB >=0.6\n- SQLite (coming soon)\n\n\ud83d\udcbb**Built-in SQL client**: Connect to any of the supported databases with just one line.\n\nExample usage: `sqeleton repl snowflake://...`\n\n- Has syntax-highlighting, and autocomplete\n- Use `*text` to find all tables like `%text%` (or just `*` to see all tables)\n- Use `?name` to see the schema of the table called `name`.\n\n## Documentation\n\n[Read the docs!](https://sqeleton.readthedocs.io)\n\nOr jump straight to the [introduction](https://sqeleton.readthedocs.io/en/latest/intro.html).\n\n### Install\n\nInstall using pip:\n\n```bash\npip install sqeleton\n```\n\nIt is recommended to install the driver dependencies using pip's `[]` syntax:\n\n```bash\npip install 'sqeleton[mysql, postgresql]'\n```\n\nRead more in [install / getting started.](https://sqeleton.readthedocs.io/en/latest/install.html)\n\n### Example: Basic usage\n\nWe will create a table with the numbers 0..100, and then sum them up.\n\n```python\nfrom sqeleton import connect, table, this\n\n# Create a new database connection\nddb = connect(\"duckdb://:memory:\")\n\n# Define a table with one int column\ntbl = table('my_list', schema={'item': int})\n\n# Make a bunch of queries\nqueries = [\n    # Create table 'my_list'\n    tbl.create(),\n\n    # Insert 100 numbers\n    tbl.insert_rows([x] for x in range(100)),\n\n    # Get the sum of the numbers\n    tbl.select(this.item.sum())\n]\n# Query in order, and return the last result as an int\nresult = ddb.query(queries, int)    \n\n# Prints: Total sum of 0..100 = 4950\nprint(f\"Total sum of 0..100 = {result}\")\n```\n\n### Example: Advanced usage\n\nWe will define a function that performs outer-join on any database, and adds two extra fields: `only_a` and `only_b`.\n\n```python\nfrom sqeleton.databases import Database\nfrom sqeleton.queries import ITable, leftjoin, rightjoin, outerjoin, and_, Expr\n\ndef my_outerjoin(\n        db: Database,\n        a: ITable, b: ITable,\n        keys1: List[str], keys2: List[str],\n        select_fields: Dict[str, Expr]\n    ) -> ITable:\n    \"\"\"This function accepts two table expressions, and returns an outer-join query.\n    \n    The resulting rows will include two extra boolean fields:\n    \"only_a\", and \"only_b\", describing whether there was a match for that row \n    only in the first table, or only in the second table.\n\n    Parameters:\n        db - the database connection to use\n        a, b - the tables to outer-join\n        keys1, keys2 - the names of the columns to join on, for each table respectively\n        select_fields - A dictionary of {column_name: expression} to select as a result of the outer-join\n    \"\"\"\n    # Predicates to join on\n    on = [a[k1] == b[k2] for k1, k2 in zip(keys1, keys2)]\n\n    # Define the new boolean fields\n    # If all keys are None, it means there was no match\n    # Compiles to \"<k1> IS NULL AND <k2> IS NULL AND <k3> IS NULL...\" etc.\n    only_a = and_(b[k] == None for k in keys2)\n    only_b = and_(a[k] == None for k in keys1)\n\n    if isinstance(db, MySQL):\n        # MySQL doesn't support \"outer join\"\n        # Instead, we union \"left join\" and \"right join\"\n        l = leftjoin(a, b).on(*on).select(\n                only_a=only_a,\n                only_b=False,\n                **select_fields\n            )\n        r = rightjoin(a, b).on(*on).select(\n                only_a=False,\n                only_b=only_b,\n                **select_fields\n            )\n        return l.union(r)\n\n    # Other databases\n    return outerjoin(a, b).on(*on).select(\n            only_a=only_a,\n            only_b=only_b,\n            **select_fields\n        )\n```\n\n\n\n# TODO\n\n- Transactions\n\n- Indexes\n\n- Date/time expressions\n\n- Window functions\n\n## Possible plans for the future (not determined yet)\n\n- Cache the compilation of repetitive queries for even faster query-building\n\n- Compile control flow, functions\n\n- Define tables using type-annotated classes (SQLModel style)\n\n## Alternatives\n\n- [SQLAlchemy](https://www.sqlalchemy.org/)\n- [PyPika](https://github.com/kayak/pypika)\n- [PonyORM](https://ponyorm.org/)\n- [peewee](https://github.com/coleifer/peewee)\n\n# Thanks\n\nThanks to Datafold for having sponsored Sqeleton in its initial stages. For reference, [the original repo](https://github.com/datafold/sqeleton/).",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python library for querying SQL databases",
    "version": "0.1.6",
    "project_urls": {
        "Documentation": "https://sqeleton.readthedocs.io/en/latest/",
        "Homepage": "https://github.com/erezsh/sqeleton",
        "Repository": "https://github.com/erezsh/sqeleton"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2ae1dbff30e671ed06100e2a3bd8ca2552eea29f38de44650b430574c4f75941",
                "md5": "2d3f75e0e7f67fa06f844424c41cf852",
                "sha256": "8a794803ea57efa8d3f8536a3f4f5efdb45f32b976bf4894c3fa14e078a6c45f"
            },
            "downloads": -1,
            "filename": "sqeleton-0.1.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2d3f75e0e7f67fa06f844424c41cf852",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 75742,
            "upload_time": "2024-08-14T16:50:13",
            "upload_time_iso_8601": "2024-08-14T16:50:13.745581Z",
            "url": "https://files.pythonhosted.org/packages/2a/e1/dbff30e671ed06100e2a3bd8ca2552eea29f38de44650b430574c4f75941/sqeleton-0.1.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6c3fc3d3e33025447d50cbaad517c0c7ce2c9ec4f7f0daeb6d4d6a78a5d9a12c",
                "md5": "fe8b076261edcf093d154026471dc21c",
                "sha256": "2b0fd6330129a4280eff3b8f2f6dee73619a6dc13d95045ce72a549c2a04219c"
            },
            "downloads": -1,
            "filename": "sqeleton-0.1.6.tar.gz",
            "has_sig": false,
            "md5_digest": "fe8b076261edcf093d154026471dc21c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 57573,
            "upload_time": "2024-08-14T16:50:15",
            "upload_time_iso_8601": "2024-08-14T16:50:15.372861Z",
            "url": "https://files.pythonhosted.org/packages/6c/3f/c3d3e33025447d50cbaad517c0c7ce2c9ec4f7f0daeb6d4d6a78a5d9a12c/sqeleton-0.1.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-14 16:50:15",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "erezsh",
    "github_project": "sqeleton",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sqeleton"
}
        
Elapsed time: 0.43032s