# pikaQ
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
This library is heavily inspired by
[PyPika](https://github.com/kayak/pypika), so a lot of the syntax is
based on how PyPika works. Just like PyPika, pikaQ replaces handwritten
SQL queries with programmatic query construction. The main
difference is that pikaQ is designed to solve the problem of writing the
same code once and rendering it into different SQL dialects. This is done by
using a common syntax that is then translated to each specific dialect.
This library provides the core components and implementation for
constructing the `Select` query and the core mechanism for parsing it
into different dialects. It does not offer complete coverage of SQL
syntax nor the detailed implementation of all the dialects. However, it
should be easy to extend the library to support more SQL syntax and
other types of queries you want to construct.
Validation of SQL correctness is not an explicit goal of pikaQ. You are
encouraged to validate the inputs you provide to pikaQ or to appropriately handle
errors raised from your SQL database - just as you would have if you
were writing SQL yourself.
## Install
``` sh
pip install pikaQ
```
## How to use
For example, if we want to write the same query in Spark SQL and AWS
Athena, we might encounter this problem: we have `ADD_MONTHS` function
in [Spark SQL](https://spark.apache.org/docs/2.3.0/api/sql/#add_months),
but in AWS Athena
([Presto](https://prestodb.io/docs/current/functions/datetime.html#interval-functions))
we don’t have this function.
We can define an `ADD_MONTHS` function in the following way:
``` python
from pikaQ.terms import custom_func
from pikaQ.queries import Query
@custom_func
def add_months(column, value, dialect='spark'):
    if dialect == 'athena':
        return f"DATE_ADD('month', {value}, {column})"
    elif dialect == 'spark':
        return f'ADD_MONTHS({column}, {value})'
    else:
        raise ValueError(f'Unsupported dialect: {dialect}')
q = (Query.from_('table')
     .select('col1', add_months('col2', 3).as_('col2'))
)
```
Then we can generate the query for Spark SQL and AWS Athena:
``` python
print(q.get_sql(dialect='athena'))
```
    select col1, DATE_ADD('month', 3, col2) AS col2
    from table
``` python
print(q.get_sql(dialect='spark'))
```
    select col1, ADD_MONTHS(col2, 3) AS col2
    from table
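Under the hood, a decorator like `custom_func` only needs to defer rendering until the dialect is known at `get_sql` time. The following is a minimal self-contained sketch of that idea (not pikaQ's actual implementation; `DialectFunc` and its attributes are hypothetical names for illustration):

``` python
# Hypothetical sketch: defer dialect selection until SQL generation time,
# similar in spirit to pikaQ's `custom_func`.
class DialectFunc:
    def __init__(self, render, args):
        self.render = render   # the wrapped rendering function
        self.args = args       # positional arguments captured at call time
        self.alias = None

    def as_(self, alias):
        self.alias = alias
        return self

    def get_sql(self, dialect='spark'):
        sql = self.render(*self.args, dialect=dialect)
        return f'{sql} AS {self.alias}' if self.alias else sql


def custom_func(f):
    """Wrap `f` so calling it captures arguments instead of rendering SQL."""
    def wrapper(*args):
        return DialectFunc(f, args)
    return wrapper


@custom_func
def add_months(column, value, dialect='spark'):
    if dialect == 'athena':
        return f"DATE_ADD('month', {value}, {column})"
    return f'ADD_MONTHS({column}, {value})'


expr = add_months('col2', 3).as_('col2')
print(expr.get_sql(dialect='athena'))  # DATE_ADD('month', 3, col2) AS col2
print(expr.get_sql(dialect='spark'))   # ADD_MONTHS(col2, 3) AS col2
```

Because the wrapper returns an object rather than a string, the same expression can be rendered for any dialect later, which is what lets one query definition serve multiple backends.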
A more complex example to show how the syntax works:
``` python
from pikaQ.queries import Query, Table, Field, AliasedQuery
from pikaQ.terms import Case, Preceding, CURRENT_ROW
import pikaQ.functions as fn
a = Table('tbl1').as_('a')
b = Table('tbl2').as_('b')
s = AliasedQuery('s')
m = AliasedQuery('m')
v = AliasedQuery('v')
q0 = (Query
    .from_(a)
    .select(
        a.col1,
        a.col2,
        fn.Sum(a.col3).over(b.col2).rows(Preceding(3), CURRENT_ROW).as_('total'),
        fn.RowNumber().over(b.col2).orderby(b.col4).as_('row_num')
    ).distinct()
    .where((Field('col2') - 100 > 2) & (Field('col3') / 9 <= 1))
    .orderby(b.col2)
)
q1 = (Query
    .from_(b)
    .select(b.col1, b.col2, fn.Avg(b.col3).as_('avg'))
    .groupby(b.col1, b.col2)
    .having(fn.Count(b.col3) > 2)
)
q = (Query
    .with_(q0, 's')
    .with_(Query
        .from_(s)
        .select(s.star())
        .where(s.row_num == 1), 'm')
    .with_(q1, 'v')
    .from_('v')
    .join('m')
    .on(
        (v.col1 == m.col1) &
        (v.col2 == m.col2)
    )
    .select(
        v.col1, v.col2, v.avg,
        Case().when(m.total > 100, 1).else_(0).as_('flag')
    )
)
print(q.get_sql())
```
    with s as (
    select distinct a.col1, a.col2, SUM(a.col3) OVER (PARTITION BY b.col2 ROWS BETWEEN 3 PRECEDING AND CURRENT_ROW) AS total, ROW_NUMBER() OVER (PARTITION BY b.col2 ORDER BY b.col4) AS row_num
    from tbl1 as a
    where col2 - 100 > 2 and col3 / 9 <= 1
    order by b.col2)

    , m as (
    select s.*
    from s
    where s.row_num = 1)

    , v as (
    select b.col1, b.col2, AVG(b.col3)
    from tbl2 as b
    group by b.col1, b.col2
    having COUNT(b.col3) > 2)

    select v.col1, v.col2, v.avg, CASE
        WHEN m.total > 100 THEN 1
        ELSE 0
    END AS flag
    from v
    join m on v.col1 = m.col1 and v.col2 = m.col2
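Expression syntax such as `Field('col2') - 100 > 2` relies on Python operator overloading: each operator builds a new expression object instead of computing a value. A minimal self-contained sketch of that idea (illustrative only; these are not pikaQ's actual classes):

``` python
# Hypothetical sketch: expression objects that render themselves as SQL text.
class Expr:
    def __init__(self, sql):
        self.sql = sql

    def _bin(self, op, other):
        # Combine with another expression or a literal into a new Expr.
        other = other.sql if isinstance(other, Expr) else other
        return Expr(f'{self.sql} {op} {other}')

    def __sub__(self, other):     return self._bin('-', other)
    def __truediv__(self, other): return self._bin('/', other)
    def __gt__(self, other):      return self._bin('>', other)
    def __le__(self, other):      return self._bin('<=', other)
    def __and__(self, other):     return self._bin('and', other)


class Field(Expr):
    def __init__(self, name):
        super().__init__(name)


cond = (Field('col2') - 100 > 2) & (Field('col3') / 9 <= 1)
print(cond.sql)  # col2 - 100 > 2 and col3 / 9 <= 1
```

Note the explicit parentheses around each comparison: Python's `&` binds more tightly than comparison operators, so they are required in this style of API.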
For more syntax examples, please refer to the
[docs](https://feynlee.github.io/pikaQ/).
## Extension
One can use the core components and logic implemented in this library to
extend the functionality to support more SQL syntax and other types of
queries. For details of how to extend the
[`SelectQuery`](https://feynlee.github.io/pikaQ/queries.html#selectquery)
to support more clauses, please refer to the
[`Query`](https://feynlee.github.io/pikaQ/queries.html#query) section of
the docs.
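As an illustration of the extension pattern (not pikaQ's actual internals; the class and method names below are hypothetical), a new clause amounts to a builder method that stores state plus per-dialect rendering logic in `get_sql`:

``` python
# Hypothetical sketch: a tiny select builder extended with a `limit` clause
# whose rendering differs by dialect.
class MiniSelect:
    def __init__(self, table):
        self.table = table
        self.columns = []
        self._limit = None

    def select(self, *cols):
        self.columns.extend(cols)
        return self

    def limit(self, n):
        # The "new clause": just capture state and return self for chaining.
        self._limit = n
        return self

    def get_sql(self, dialect='spark'):
        sql = f"select {', '.join(self.columns)} from {self.table}"
        if self._limit is not None:
            # Dialects disagree on row limiting, so dispatch per dialect.
            if dialect == 'mssql':
                sql = sql.replace('select', f'select top {self._limit}', 1)
            else:
                sql += f' limit {self._limit}'
        return sql


q = MiniSelect('table').select('col1', 'col2').limit(10)
print(q.get_sql())                 # select col1, col2 from table limit 10
print(q.get_sql(dialect='mssql'))  # select top 10 col1, col2 from table
```

The same shape applies to extending the real `SelectQuery`: add a chainable method that records the clause, then teach the SQL-generation step how each dialect spells it.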