# SQL_Blocks
## _SQL_Blocks_ is useful for building complex SQL commands through smaller query blocks:
---
### 1 - You can assemble a simple object that will then be converted into an SQL command:
> a = Select('Actor') # --> SELECT * FROM Actor act
_Note that an alias "act" has been added._
You can specify your own alias: `a = Select('Actor a')`
---
### 2 - You can also add a field, contains this...
* a = Select('Actor a', **name=Field**)
* Here are another ways to add a field:
- Select('Actor a', name=Distinct )
- Select('Actor a', name=NamedField('actors_name'))
- Select(
'Actor a',
name=NamedField('actors_name', Distinct)
)
2.1 -- Using expression as a field:
```
Select(
'Product',
due_date=NamedField(
'YEAR_ref',
ExpressionField('extract(year from {f})') # <<---
)
)
```
...should return:
**SELECT extract(year from due_date) as YEAR_ref...**
Possible tags in ExpressionField:
* {f} - The field name;
* {af} - The field name preceded by the table alias;
> Can be written as {a.f} or %
* {t} - The table name;
* {a} - Only the table alias.
---
### 3 - To set conditions, use **Where**:
* For example, `a = Select(... age=gt(45) )`
Some possible conditions:
* field=eq(value) - ...the field is EQUAL to the value;
* field=gt(value) - ...the field is GREATER than the value;
* field=lt(value) - ...the field is LESS than the value;
> You may use Where.**eq**, Where.**gt**, Where.**lt** ... or simply **eq**, **gt**, **lt** ... 😉
3.1 -- If you want to filter the field on a range of values:
`a = Select( 'Actor a', age=Between(45, 69) )`
...but if it is a time slot within the same day, you can do it like this:
`Select(..., event_date=SameDay("2024-10-03"))`
This results in
```
SELECT ...
WHERE
event_date >= '2024-10-03 00:00:00' AND
event_date <= '2024-10-03 23:59:59'
```
---
3.2 -- Sub-queries:
```
query = Select('Movie m', title=Field,
id=SelectIN(
'Review r',
rate=gt(4.5),
movie_id=Distinct
)
)
```
**>> print(query)**
SELECT
m.title
FROM
Movie m
WHERE
m.id IN (
SELECT DISTINCT r.movie
FROM Review r WHERE r.rate > 4.5
)
3.3 -- Optional conditions:
```
OR=Options(
genre=eq("Sci-Fi"),
awards=contains("Oscar")
)
AND=Options(
..., name=startswith('Chris')
)
```
3.4 -- Negative conditions use the _Not_ class instead of _Where_
```
based_on_book=Not.is_null()
```
3.5 -- List of values
```
hash_tag=inside(['space', 'monster', 'gore'])
```
3.6 -- Combining ExpressionField with Where condition:
* The **formula** method allows you to write an expression as a condition:
```
query=Select(
'Folks f2',
id=Where.formula('({af} = a.father OR {af} = a.mother)')
)
```
> Results: `WHERE...f2.id = a.father OR f2.id = a.mother`
---
### 4 - A field can be two things at the same time:
* m = Select('Movie m' release_date=[Field, OrderBy])
- This means that the field will appear in the results and also that the query will be ordered by that field.
* Applying **GROUP BY** to item 3.2, it would look contains this:
```
SelectIN(
'Review r', movie=[GroupBy, Distinct],
rate=Having.avg(gt(4.5))
)
```
---
### 5 - Relationships:
```
query = Select('Actor a', name=Field,
cast=Select('Cast c', id=PrimaryKey)
)
```
**>> print(query)**
```
SELECT
a.name
FROM
Actor a
JOIN Cast c ON (a.cast = c.id)
```
---
**5.1 Multiple tables without JOIN**
> Warning: This is **NOT** recommended! ⛔
#### Example:
singer = Select(
"Singer artist", id=PrimaryKey,
name=NamedField('artist_name')
)
album = Select (
"Album album",
name=NamedField('album_name'),
artist_id=Where.join(singer), # <===== 👀
)
**>> print(query)**
SELECT
album.name as album_name,
artist.name as artist_name,
album.year_recorded
FROM
Album album
,Singer artist
WHERE
(album.artist_id = artist.id)
(*) --> For more than one relationship, use the **pairs** parameter.
---
### 6 - The reverse process (parse):
```
text = """
SELECT
cas.role,
m.title,
m.release_date,
a.name as actors_name
FROM
Actor a
LEFT JOIN Cast cas ON (a.cast = cas.id)
LEFT JOIN Movie m ON (cas.movie = m.id)
WHERE
(
m.genre = 'Sci-Fi'
OR
m.awards LIKE '%Oscar%'
)
AND a.age <= 69 AND a.age >= 45
ORDER BY
m.release_date DESC
"""
```
`a, c, m = Select.parse(text)`
**6.1 --- print(a)**
```
SELECT
a.name as actors_name
FROM
Actor a
WHERE
a.age <= 69
AND a.age >= 45
```
**6.2 --- print(c)**
SELECT
c.role
FROM
Cast c
**6.3 --- print(m)**
SELECT
m.title,
m.release_date
FROM
Movie m
WHERE
( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )
ORDER BY
m.release_date DESC
**6.4 --- print(a+c)**
SELECT
a.name as actors_name,
cas.role
FROM
Actor a
JOIN Cast cas ON (a.cast = cas.id)
WHERE
a.age >= 45
AND a.age <= 69
**6.5 --- print(c+m)**
> `... or print(m+c)`
SELECT
cas.role,
m.title,
m.release_date,
m.director
FROM
Cast cas
JOIN Movie m ON (cas.movie = m.id)
WHERE
( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )
AND m.director LIKE '%Coppola%'
ORDER BY
m.release_date,
m.director
---
### 7 - You can add or delete attributes directly in objects:
* a(gender=Field)
* m.delete('director')
---
### 8 - Defining relationship on separate objects:
```
a = Select...
c = Select...
m = Select...
```
`a + c => ERROR: "No relationship found between Actor and Cast"`
8.1 - But...
a( cast=ForeignKey('Cast') )
c(id=PrimaryKey)
**a + c => Ok!**
8.2
c( movie=ForeignKey('Movie') )
m(id=PrimaryKey)
> **c + m => Ok!**
>> **m + c => Ok!**
---
**8.3 Difference between queries**
```
STATUS_DELIVERED_OK = 93
orders = Select('orders',
customer_id=ForeignKey('customers'),
status=eq(STATUS_DELIVERED_OK)
)
customers = Select('customers'
id=PrimaryKey, name=Field
)
gap = orders - customers
```
return _customers without orders_:
SELECT
c.name
FROM
customers c
WHERE
NOT c.id IN (
SELECT o.customer_id FROM orders o
WHERE o.status = 93
)
---
### 9 - Comparing objects
9.1
```
a1 = Select.parse('''
SELECT gender, Max(act.age) FROM Actor act
WHERE act.age <= 69 AND act.age >= 45
GROUP BY gender
''')[0]
a2 = Select('Actor',
age=[ Between(45, 69), Max ],
gender=[GroupBy, Field]
)
```
> **a1 == a2 # --- True!**
9.2
```
m1 = Select.parse("""
SELECT title, release_date FROM Movie m ORDER BY release_date
WHERE m.genre = 'Sci-Fi' AND m.awards LIKE '%Oscar%'
""")[0]
m2 = Select.parse("""
SELECT release_date, title
FROM Movie m
WHERE m.awards LIKE '%Oscar%' AND m.genre = 'Sci-Fi'
ORDER BY release_date
""")[0]
```
**m1 == m2 # --- True!**
9.3
```
best_movies = SelectIN(
Review=Table('role'),
rate=[GroupBy, Having.avg(gt(4.5))]
)
m1 = Select(
Movie=Table('title,release_date'),
id=best_movies
)
sql = "SELECT rev.role FROM Review rev GROUP BY rev.rate HAVING Avg(rev.rate) > 4.5"
m2 = Select(
'Movie', release_date=Field, title=Field,
id=Where(f"IN ({sql})")
)
```
**m1 == m2 # --- True!**
---
### 10 - CASE...WHEN...THEN
Select(
'Product',
label=Case('price').when(
lt(50), 'cheap'
).when(
gt(100), 'expensive'
).else_value(
'normal'
)
)
* 10.1 - If the labels used in the CASE are based on ranges of values in sequence, you can use the **Range class**:
query = Select(
'People p',
age_group=Range('age',{ # <<----------
'adult': 50,
'teenager': 17,
'child': 10,
'elderly': 70,
'young': 21,
})
)
is equivalent to...
```
SELECT
CASE
WHEN p.age BETWEEN 0 AND 10 THEN 'child'
WHEN p.age BETWEEN 11 AND 17 THEN 'teenager'
WHEN p.age BETWEEN 18 AND 21 THEN 'young'
WHEN p.age BETWEEN 22 AND 50 THEN 'adult'
WHEN p.age BETWEEN 51 AND 70 THEN 'elderly'
END AS age_group
FROM
People p
```
* 10.2 `If` class
Usefull to conditional Sum, Avg, Count...
**Example:**
Select('Loan',
penalty=If('days_late', Sum, gt(0))
)
# ...OR... penalty=If('days_late > 0', Sum)
results...
```
SELECT
Sum(CASE
WHEN days_late > 0 THEN penalty
ELSE 0
END)
FROM
Emprestimo
```
---
### 11 - optimize method
p1 = Select.parse("""
SELECT * FROM Product p
WHERE (p.category = 'Gizmo'
OR p.category = 'Gadget'
OR p.category = 'Doohickey')
AND NOT price <= 387.64
AND YEAR(last_sale) = 2024
ORDER BY
category
""")[0]
p1.optimize() # <<===============
p2 = Select.parse("""
SELECT category FROM Product p
WHERE category IN ('Gizmo','Gadget','Doohickey')
and p.price > 387.64
and p.last_sale >= '2024-01-01'
and p.last_sale <= '2024-12-31'
ORDER BY p.category LIMIT 100
""")[0]
p1 == p2 # --- True!
This will...
* Replace `OR` conditions to `SELECT IN ...`
* Put `LIMIT` if no fields or conditions defined;
* Normalizes inverted conditions;
* Auto includes fields present in `ORDER/GROUP BY`;
* Replace `YEAR` function with date range comparison.
> The method allows you to select which rules you want to apply in the optimization...Or define your own rules!
>> NOTE: When a joined table is used only as a filter, it is possible that it can be changed to a sub-query:
query = Select(
'Installments i', due_date=Field, customer=Select(
'Customer c', id=PrimaryKey,
name=endswith('Smith')
)
)
print(query)
print('-----')
query.optimize([RuleReplaceJoinBySubselect])
print(query)
```
SELECT
i.due_date
FROM
Installments i
JOIN Customer c ON (i.customer = c.id)
WHERE
c.name LIKE '%Smith'
-----
SELECT
i.due_date
FROM
Installments i
WHERE
i.customer IN (SELECT c.id FROM Customer c WHERE c.name LIKE '%Smith')
```
---
### 12 - Adding multiple fields at once
```
query = Select('post p')
query.add_fields(
'user_id, created_at',
[OrderBy, GroupBy]
)
```
...is the same as...
```
query = Select(
'post p',
user_id=[Field, GroupBy, OrderBy],
created_at=[Field, GroupBy, OrderBy]
)
```
### 13 - Change parser engine
```
a, c, m = Select.parse(
"""
Actor(name, id ?age = 40)
<- Cast(actor_id, movie_id) ->
Movie(id ^title)
""",
CypherParser
# ^^^ recognizes syntax like Neo4J queries
)
```
**print(a+c+m)**
```
SELECT
act.name,
mov.title
FROM
Cast cas
JOIN Movie mov ON (cas.movie_id = mov.id)
JOIN Actor act ON (cas.actor_id = act.id)
WHERE
act.age = 40
ORDER BY
mov.title
```
---
#### Separators and meaning:
<a id="cypher_separators"></a>
* `( )` Delimits a table and its fields
* `,` Separate fields
* `?` For simple conditions (> < = <>)
* `<-` connects to the table on the left
* `->` connects to the table on the right
* `^` Put the field in the ORDER BY clause
* `@` Immediately after the table/field name, it indicates the grouping field.
* `$` For SQL functions like **avg**$_field_, **sum**$_field_, **count**$_field_...
* `*` Sets the primary key field.
* `:` Allows you to assign an alias to the field or expression.
---
## `detect` function
It is useful to write a query in a few lines, without specifying the script type (cypher, mongoDB, SQL, Neo4J...)
### Examples:
> **13.1 - Relationship**
```
query = detect(
'MATCH(c:Customer)<-[:Order]->(p:Product)RETURN c, p'
)
print(query)
```
##### output:
SELECT * FROM
Order ord
LEFT JOIN Customer cus ON (ord.customer_id = cus.id)
RIGHT JOIN Product pro ON (ord.product_id = pro.id)
> **13.2 - Grouping**
```
query = detect(
'People@gender(avg$age?region="SOUTH"^count$qtde)'
)
print(query)
```
##### output:
SELECT
peo.gender,
Avg(peo.age),
Count(*) as qtde
FROM
People peo
WHERE
peo.region = "SOUTH"
GROUP BY
peo.gender
ORDER BY
peo.qtde
> **13.3 - Many conditions...**
```
print( detect('''
db.people.find({
{
$or: [
{status:{$eq:"B"}},
age:{$lt:50}
]
},
age:{$gte:18}, status:{$eq:"A"}
},{
name: 1, user_id: 1
}).sort({
user_id: -1
})
''') )
```
#### output:
SELECT
peo.name,
peo.user_id
FROM
people peo
WHERE
( peo. = 'B' OR peo.age < 50 ) AND
peo.age >= 18 AND
peo.status = 'A'
ORDER BY
peo.user_id DESC
> **13.4 - Relations with same table twice (or more)**
Automatically assigns aliases to each side of the relationship (In this example, one user invites another to add to their contact list)
```
print( detect(
'User(^name,id) <-Contact(requester,guest)-> User(id,name)'
# ^^^ u1 ^^^ u2
) )
```
SELECT
u1.name,
u2.name
FROM
Contact con
RIGHT JOIN User u2 ON (con.guest = u2.id)
LEFT JOIN User u1 ON (con.requester = u1.id)
ORDER BY
u1.name
---
### `translate_to` method
From a Select object, it returns the text to a script in any of the languages below:
* QueryLanguage - default
* MongoDBLanguage
* Neo4JLanguage
* DatabricksLanguage
* PandasLanguage
---
### 14 - Window Function
Aggregation functions (Avg, Min, Max, Sum, Count) -- or Window functions (Lead, Lag, Row_Number, Rank) -- have the **over** method...
query=Select(
'Enrollment e',
payment=Sum().over(
student_id=Partition, due_date=OrderBy,
# _=Rows(Current(), Following(5)),
# ^^^-------> ROWS BETWEEN CURRENT ROW AND 5 FOLLOWING
# _=Rows(Preceding(3), Following()),
# ^^^-------> ROWS BETWEEN 3 PRECEDING AND UNBOUNDED FOLLOWING
# _=Rows(Preceding(3))
# ^^^-------> ROWS 3 PRECEDING
).As('sum_per_student')
)
...that generates the following query:
```
SELECT
Sum(e.payment) OVER(
PARTITION BY student_id
ORDER BY due_date
) as sum_per_student
FROM
Enrollment e
```
---
### 15 - The `As` method:
query=Select(
'Customers c',
phone=[
Not.is_null(),
SubString(1, 4).As('area_code', GroupBy)
],
customer_id=[
Count().As('customer_count', OrderBy),
Having.count(gt(5))
]
)
You can use the result of a function as a new field -- and optionally use it in ORDER BY and/or GROUP BY clause(s):
```
SELECT
SubString(c.phone, 1, 4) as area_code,
Count(c.customer_id) as customer_count
FROM
Customers c
WHERE
NOT c.phone IS NULL
GROUP BY
area_code HAVING Count(c.customer_id) > 5
ORDER BY
customer_count
```
---
### 16 - Function classes
You may use this functions:
* SubString
* Round
* DateDiff
* Year
* Current_Date
* Avg
* Min
* Max
* Sum
* Count
* Lag
* Lead
* Row_Number
* Rank
* Coalesce
* Cast
> Some of these functions may vary in syntax depending on the database.
For example, if your query is going to run on Oracle, do the following:
`Function.dialect = Dialect.ORACLE`
> Most of this functions you can use nested inside each other.
*Example:*
```
Select(...
event_date=Substring(
Cast("CHAR"), 12, 19
).As('time')
)
```
Results...
```
SELECT ...
SubString(Cast(event_date As char), 12, 19) as time
```
>> `Function.auto_convert` option (default: True)
- Put Cast(...) when there is a difference between the types of the parameter and the return of the nested function
```
birth=Round( DateDiff(Current_Date()) ).As('age')
```
...Returns...
```
SELECT
Round(
Cast(Current_Date() - p.birth As FLOAT)
/* ^^^ */
) as age
...
```
---
### 16.1 - _GroupBy as instance_
Another way to use GroupBy is to pass functions as parameters:
```
Function.dialect = Dialect.ORACLE
query = Select(
'Sales s',
ref_date=GroupBy(
ref_year=Year, qty_sold=Sum('quantity'),
vendor=Select(
'Vendor v',
id=[PrimaryKey, Field], name=Field
)
)
)
print(query)
```
results..
```
SELECT
Extract(Year FROM s.ref_date) as ref_year,
Sum(quantity) as qty_sold,
v.id,
v.name
FROM
Sales s
JOIN Vendor v ON (s.vendor = v.id)
GROUP BY
ref_year,
v.id,
v.name
```
---
### 17 - CTE and Recursive classes
* **17.1 - _CTE class_**
```
query = Select(
'SocialMedia s', post=Count, reaction=Sum, user=GroupBy
)
print( CTE('Metrics', [query]) )
```
The result is...
```
WITH Metrics AS (
SELECT Count(s.post), Sum(s.reaction) FROM SocialMedia s GROUP BY user
)SELECT * FROM Metrics
```
* **17.2 - _Recursive class_**
```
q1 = Select(
'SocialMedia me', name=[ eq(MY_NAME), Field ]
)
q2 = Select(
'SocialMedia you' name=Field, id=Where.formula('{af} = n.friend')
)
print( Recursive('Network', [q1, q2]) )
```
The result is...
```
WITH RECURSIVE Network AS (
SELECT me.name FROM SocialMedia me WHERE
me.name = 'Júlio Cascalles'
UNION ALL
SELECT you.name FROM SocialMedia you , Network n
WHERE you.id = n.friend
)SELECT * FROM Network
```
* **17.2.1 - The `create` method** ... parameters :
<a id="cte_create_method"></a>
- name: The name of the CTE
- pattern: A cypher script that defines the tables used
- formula: The format for `Where.formula` method _(*)_
- init_value: The value for the condition in the first table
- format (optional): If tables are files or internet hiperlinks, you may especify the extension and/or folder...
> Example:
```
R = Recursive.create(
'Route R', 'Flyght(departure, arrival)',
'[2] = R.[1]', 'JFK', format='.csv'
) # ^^^--- Flyghts from JFK airport
```
_...Creates a recursive CTE called Route, using Flyght table, where the recursivity condition is Flyght.arrival equals to Route.departure_
>> (*) -- Note that [1] and [2] refers to first field and second field. 😉
Result:
WITH RECURSIVE Route AS (
SELECT f1.departure, f1.arrival
FROM Flyght.csv f1
WHERE f1.departure = 'JFK'
UNION ALL
SELECT f2.departure, f2.arrival
FROM Flyght.csv f2
, Route R
WHERE f2.arrival = R.departure
)SELECT * FROM Route R
**17.2.2 - The `join` method**
In the previous example, if you add this code...
`R.join('Airport(*id,name)', 'departure, arrival', format='.csv')`
...The result would be:
WITH RECURSIVE Route AS (
SELECT f1.departure, f1.arrival
FROM Flyght.csv f1
WHERE f1.departure = 'JFK'
UNION ALL
SELECT f2.departure, f2.arrival
FROM Flyght.csv f2
, Route R
WHERE f2.arrival = R.departure
)SELECT
a1.name, a2.name
FROM
Route R
JOIN Airport.csv a2 ON (R.arrival = a2.id)
JOIN Airport.csv a1 ON (R.departure = a1.id)
**17.2.3 - The `counter` method**
Adds an increment field in queries inside CTE:
> Examples:
* `R.counter('stops', 0)` # -- counter starts with 0 and increment +1
* `R2.counter('generation', 5, '- 1')` # -- for the code below...
```
R2 = Recursive.create(
'Ancestors a', 'People(id,name,father,mother,birth)',
'(% = a.father OR % = a.mother)', 32630, '.parquet'
)
```
...Results:
WITH RECURSIVE Ancestors AS (
SELECT p1.id, p1.name, p1.father, p1.mother, p1.birth,
5 AS generation /* <<---- Most current generation ------------*/
FROM People.parquet p1 WHERE p1.id = 32630
UNION ALL
SELECT p2.id, p2.name, p2.father, p2.mother, p2.birth,
(generation- 1) AS generation /* <<-- Previous generation -----*/
FROM People.parquet p2 , Ancestors a WHERE (p2.id = a.father OR p2.id = a.mother)
)SELECT * FROM Ancestors a
>> Note: Comments added later.
---
### 17.3 - CTEFactory class
CTEFactory exchanges subqueries for CTEs, simply by passing the text of the "dirty" query:
*Example*:
```
print(
CTEFactory("""
SELECT u001.name, agg_sales.total
FROM (
SELECT * FROM Users u
WHERE u.status = 'active'
) AS u001
JOIN (
SELECT s.user_id, Sum(s.value) as total
FROM Sales s
GROUP BY s.user_id
)
As agg_sales
ON u001.id = agg_sales.user_id
ORDER BY u001.name
""")
)
```
results...
```
WITH u001 AS (
SELECT * FROM Users u
WHERE u.status = 'active'
),
WITH agg_sales AS (
SELECT s.user_id, Sum(s.value) as total
FROM Sales s
GROUP BY s.user_id
)
SELECT
u001.name,
agg_sales.total
FROM
u001 u001
JOIN agg_sales agg_sales ON
(u001.id = agg_sales.user_id)
ORDER BY
u001.name
```
#### 17.3.1 - You can also pass a Cypher script like in the example below:
cte = CTEFactory("""
Annual_Sales_per_Vendor[
Sales(
year$ref_date:ref_year@, sum$quantity:qty_sold,
vendor) <- Vendor(id, name:vendors_name@)
]
[-1](**, ref_year) -> Goal(year, target)
""")
print(cte)
results...
```
WITH Annual_Sales_per_Vendor AS (
SELECT ven.name as vendors_name
, Year(sal.ref_date) as ref_year
, Sum(sal.quantity) as qty_sold
FROM Vendor ven LEFT JOIN Sales sal ON (ven.id = sal.vendor
GROUP BY ven.name, ref_year
)
SELECT
aspv.vendors_name,
aspv.ref_year,
aspv.qty_sold,
goa.target
FROM
Annual_Sales_per_Vendor aspv
RIGHT JOIN Goal goa ON (aspv.ref_year = goa.year)
```
For more details, see the [Cypher syntax](#cypher_separators) or [CTE create method](#cte_create_method)!
---
Raw data
{
"_id": null,
"home_page": "https://github.com/julio-cascalles/sql_blocks",
"name": "sql-blocks",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "J\u00falio Cascalles",
"author_email": "Julio Cascalles <julio.cascalles@outlook.com>",
"download_url": "https://files.pythonhosted.org/packages/0b/3f/1f794f04f4d5ba56665c5a7b37b92834fafc2a1f63734b5c6c2aa34dfb10/sql_blocks-1.20250726.tar.gz",
"platform": null,
"description": "# SQL_Blocks\r\n\r\n## _SQL_Blocks_ is useful for building complex SQL commands through smaller query blocks:\r\n\r\n---\r\n\r\n### 1 - You can assemble a simple object that will then be converted into an SQL command:\r\n\r\n> a = Select('Actor') # --> SELECT * FROM Actor act\r\n\r\n_Note that an alias \"act\" has been added._\r\n\r\nYou can specify your own alias: `a = Select('Actor a')`\r\n\r\n---\r\n### 2 - You can also add a field, contains this...\r\n\r\n* a = Select('Actor a', **name=Field**)\r\n\r\n* Here are another ways to add a field:\r\n - Select('Actor a', name=Distinct )\r\n\r\n - Select('Actor a', name=NamedField('actors_name'))\r\n\r\n - Select(\r\n 'Actor a', \r\n name=NamedField('actors_name', Distinct)\r\n )\r\n\r\n\r\n 2.1 -- Using expression as a field:\r\n```\r\n Select(\r\n 'Product',\r\n due_date=NamedField(\r\n 'YEAR_ref',\r\n ExpressionField('extract(year from {f})') # <<---\r\n )\r\n )\r\n```\r\n...should return: \r\n**SELECT extract(year from due_date) as YEAR_ref...**\r\n\r\nPossible tags in ExpressionField:\r\n* {f} - The field name;\r\n* {af} - The field name preceded by the table alias;\r\n > Can be written as {a.f} or %\r\n* {t} - The table name;\r\n* {a} - Only the table alias.\r\n---\r\n\r\n### 3 - To set conditions, use **Where**:\r\n* For example, `a = Select(... age=gt(45) )`\r\n\r\n Some possible conditions:\r\n * field=eq(value) - ...the field is EQUAL to the value;\r\n * field=gt(value) - ...the field is GREATER than the value;\r\n * field=lt(value) - ...the field is LESS than the value;\r\n\r\n> You may use Where.**eq**, Where.**gt**, Where.**lt** ... or simply **eq**, **gt**, **lt** ... \ud83d\ude09\r\n\r\n3.1 -- If you want to filter the field on a range of values:\r\n\r\n`a = Select( 'Actor a', age=Between(45, 69) )`\r\n...but if it is a time slot within the same day, you can do it like this:\r\n`Select(..., event_date=SameDay(\"2024-10-03\"))`\r\nThis results in\r\n```\r\n SELECT ...\r\n WHERE\r\n event_date >= '2024-10-03 00:00:00' AND\r\n event_date <= '2024-10-03 23:59:59'\r\n```\r\n---\r\n\r\n3.2 -- Sub-queries:\r\n```\r\nquery = Select('Movie m', title=Field,\r\n id=SelectIN(\r\n 'Review r',\r\n rate=gt(4.5),\r\n movie_id=Distinct\r\n )\r\n)\r\n```\r\n\r\n**>> print(query)** \r\n\r\n SELECT\r\n m.title\r\n FROM\r\n Movie m\r\n WHERE\r\n m.id IN (\r\n SELECT DISTINCT r.movie\r\n FROM Review r WHERE r.rate > 4.5\r\n )\r\n \r\n3.3 -- Optional conditions:\r\n```\r\n OR=Options(\r\n genre=eq(\"Sci-Fi\"),\r\n awards=contains(\"Oscar\")\r\n )\r\n AND=Options(\r\n ..., name=startswith('Chris')\r\n )\r\n```\r\n\r\n3.4 -- Negative conditions use the _Not_ class instead of _Where_\r\n```\r\nbased_on_book=Not.is_null()\r\n```\r\n\r\n3.5 -- List of values\r\n```\r\nhash_tag=inside(['space', 'monster', 'gore'])\r\n```\r\n\r\n3.6 -- Combining ExpressionField with Where condition:\r\n* The **formula** method allows you to write an expression as a condition:\r\n```\r\nquery=Select(\r\n 'Folks f2',\r\n id=Where.formula('({af} = a.father OR {af} = a.mother)')\r\n)\r\n```\r\n> Results: `WHERE...f2.id = a.father OR f2.id = a.mother`\r\n\r\n---\r\n### 4 - A field can be two things at the same time:\r\n\r\n* m = Select('Movie m' release_date=[Field, OrderBy])\r\n - This means that the field will appear in the results and also that the query will be ordered by that field.\r\n* Applying **GROUP BY** to item 3.2, it would look contains this:\r\n ``` \r\n SelectIN(\r\n 'Review r', movie=[GroupBy, Distinct],\r\n rate=Having.avg(gt(4.5))\r\n )\r\n ```\r\n---\r\n### 5 - Relationships:\r\n```\r\n query = Select('Actor a', name=Field,\r\n cast=Select('Cast c', id=PrimaryKey)\r\n )\r\n```\r\n**>> print(query)** \r\n```\r\nSELECT\r\n a.name\r\nFROM\r\n Actor a\r\n JOIN Cast c ON (a.cast = c.id) \r\n```\r\n\r\n---\r\n**5.1 Multiple tables without JOIN**\r\n> Warning: This is **NOT** recommended! \u26d4\r\n\r\n\r\n#### Example:\r\n singer = Select(\r\n \"Singer artist\", id=PrimaryKey,\r\n name=NamedField('artist_name')\r\n )\r\n album = Select (\r\n \"Album album\",\r\n name=NamedField('album_name'),\r\n artist_id=Where.join(singer), # <===== \ud83d\udc40\r\n )\r\n**>> print(query)** \r\n\r\n SELECT\r\n album.name as album_name,\r\n artist.name as artist_name,\r\n album.year_recorded\r\n FROM\r\n Album album\r\n ,Singer artist\r\n WHERE\r\n (album.artist_id = artist.id)\r\n\r\n(*) --> For more than one relationship, use the **pairs** parameter.\r\n\r\n\r\n---\r\n### 6 - The reverse process (parse):\r\n```\r\ntext = \"\"\"\r\n SELECT\r\n cas.role,\r\n m.title,\r\n m.release_date,\r\n a.name as actors_name\r\n FROM\r\n Actor a\r\n LEFT JOIN Cast cas ON (a.cast = cas.id)\r\n LEFT JOIN Movie m ON (cas.movie = m.id)\r\n WHERE\r\n (\r\n m.genre = 'Sci-Fi'\r\n OR\r\n m.awards LIKE '%Oscar%'\r\n )\r\n AND a.age <= 69 AND a.age >= 45\r\n ORDER BY\r\n m.release_date DESC\r\n\"\"\"\r\n```\r\n\r\n`a, c, m = Select.parse(text)`\r\n\r\n**6.1 --- print(a)**\r\n```\r\n SELECT\r\n a.name as actors_name\r\n FROM\r\n Actor a\r\n WHERE\r\n a.age <= 69\r\n AND a.age >= 45\r\n```\r\n\r\n**6.2 --- print(c)**\r\n\r\n SELECT\r\n c.role\r\n FROM\r\n Cast c\r\n\r\n**6.3 --- print(m)**\r\n\r\n SELECT\r\n m.title,\r\n m.release_date\r\n FROM\r\n Movie m\r\n WHERE\r\n ( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )\r\n ORDER BY\r\n m.release_date DESC\r\n\r\n\r\n\r\n**6.4 --- print(a+c)**\r\n\r\n SELECT\r\n a.name as actors_name,\r\n cas.role\r\n FROM\r\n Actor a\r\n JOIN Cast cas ON (a.cast = cas.id)\r\n WHERE\r\n a.age >= 45\r\n AND a.age <= 69\r\n\r\n**6.5 --- print(c+m)** \r\n> `... or print(m+c)`\r\n\r\n SELECT\r\n cas.role,\r\n m.title,\r\n m.release_date,\r\n m.director\r\n FROM\r\n Cast cas\r\n JOIN Movie m ON (cas.movie = m.id)\r\n WHERE\r\n ( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )\r\n AND m.director LIKE '%Coppola%'\r\n ORDER BY\r\n m.release_date,\r\n m.director\r\n---\r\n\r\n### 7 - You can add or delete attributes directly in objects:\r\n* a(gender=Field)\r\n* m.delete('director')\r\n\r\n---\r\n\r\n### 8 - Defining relationship on separate objects:\r\n```\r\na = Select...\r\nc = Select...\r\nm = Select...\r\n```\r\n`a + c => ERROR: \"No relationship found between Actor and Cast\"`\r\n\r\n8.1 - But...\r\n\r\n a( cast=ForeignKey('Cast') )\r\n c(id=PrimaryKey)\r\n\r\n**a + c => Ok!**\r\n\r\n8.2\r\n\r\n c( movie=ForeignKey('Movie') )\r\n m(id=PrimaryKey)\r\n\r\n> **c + m => Ok!**\r\n>> **m + c => Ok!**\r\n\r\n\r\n---\r\n**8.3 Difference between queries**\r\n```\r\n STATUS_DELIVERED_OK = 93\r\n orders = Select('orders',\r\n customer_id=ForeignKey('customers'),\r\n status=eq(STATUS_DELIVERED_OK)\r\n )\r\n customers = Select('customers'\r\n id=PrimaryKey, name=Field\r\n )\r\n gap = orders - customers\r\n```\r\nreturn _customers without orders_:\r\n\r\n SELECT\r\n c.name\r\n FROM\r\n customers c\r\n WHERE\r\n NOT c.id IN (\r\n SELECT o.customer_id FROM orders o\r\n WHERE o.status = 93\r\n )\r\n---\r\n\r\n### 9 - Comparing objects\r\n\r\n9.1\r\n```\r\n a1 = Select.parse('''\r\n SELECT gender, Max(act.age) FROM Actor act\r\n WHERE act.age <= 69 AND act.age >= 45\r\n GROUP BY gender\r\n ''')[0]\r\n\r\n a2 = Select('Actor',\r\n age=[ Between(45, 69), Max ],\r\n gender=[GroupBy, Field]\r\n ) \r\n```\r\n> **a1 == a2 # --- True!**\r\n\r\n\r\n\r\n9.2\r\n```\r\n m1 = Select.parse(\"\"\"\r\n SELECT title, release_date FROM Movie m ORDER BY release_date \r\n WHERE m.genre = 'Sci-Fi' AND m.awards LIKE '%Oscar%'\r\n \"\"\")[0]\r\n\r\n m2 = Select.parse(\"\"\"\r\n SELECT release_date, title\r\n FROM Movie m\r\n WHERE m.awards LIKE '%Oscar%' AND m.genre = 'Sci-Fi'\r\n ORDER BY release_date \r\n \"\"\")[0]\r\n```\r\n\r\n**m1 == m2 # --- True!**\r\n\r\n\r\n9.3\r\n```\r\nbest_movies = SelectIN(\r\n Review=Table('role'),\r\n rate=[GroupBy, Having.avg(gt(4.5))]\r\n)\r\nm1 = Select(\r\n Movie=Table('title,release_date'),\r\n id=best_movies\r\n)\r\n\r\nsql = \"SELECT rev.role FROM Review rev GROUP BY rev.rate HAVING Avg(rev.rate) > 4.5\"\r\nm2 = Select(\r\n 'Movie', release_date=Field, title=Field,\r\n id=Where(f\"IN ({sql})\")\r\n)\r\n```\r\n**m1 == m2 # --- True!**\r\n\r\n---\r\n\r\n### 10 - CASE...WHEN...THEN\r\n Select(\r\n 'Product',\r\n label=Case('price').when(\r\n lt(50), 'cheap'\r\n ).when(\r\n gt(100), 'expensive'\r\n ).else_value(\r\n 'normal'\r\n )\r\n )\r\n\r\n* 10.1 - If the labels used in the CASE are based on ranges of values \u200b\u200bin sequence, you can use the **Range class**:\r\n\r\n query = Select(\r\n 'People p',\r\n age_group=Range('age',{ # <<---------- \r\n 'adult': 50,\r\n 'teenager': 17,\r\n 'child': 10,\r\n 'elderly': 70,\r\n 'young': 21,\r\n })\r\n )\r\nis equivalent to...\r\n```\r\n SELECT\r\n CASE\r\n WHEN p.age BETWEEN 0 AND 10 THEN 'child'\r\n WHEN p.age BETWEEN 11 AND 17 THEN 'teenager'\r\n WHEN p.age BETWEEN 18 AND 21 THEN 'young'\r\n WHEN p.age BETWEEN 22 AND 50 THEN 'adult'\r\n WHEN p.age BETWEEN 51 AND 70 THEN 'elderly'\r\n END AS age_group\r\n FROM\r\n People p\r\n```\r\n\r\n* 10.2 `If` class\r\n\r\nUsefull to conditional Sum, Avg, Count...\r\n\r\n**Example:**\r\n\r\n Select('Loan', \r\n penalty=If('days_late', Sum, gt(0))\r\n )\r\n # ...OR... penalty=If('days_late > 0', Sum)\r\n\r\nresults...\r\n```\r\nSELECT\r\n Sum(CASE\r\n WHEN days_late > 0 THEN penalty\r\n ELSE 0\r\n END)\r\nFROM\r\n Emprestimo\r\n```\r\n\r\n---\r\n\r\n### 11 - optimize method\r\n p1 = Select.parse(\"\"\"\r\n SELECT * FROM Product p\r\n WHERE (p.category = 'Gizmo'\r\n OR p.category = 'Gadget'\r\n OR p.category = 'Doohickey')\r\n AND NOT price <= 387.64\r\n AND YEAR(last_sale) = 2024\r\n ORDER BY\r\n category\r\n \"\"\")[0]\r\n p1.optimize() # <<===============\r\n p2 = Select.parse(\"\"\"\r\n SELECT category FROM Product p\r\n WHERE category IN ('Gizmo','Gadget','Doohickey')\r\n and p.price > 387.64\r\n and p.last_sale >= '2024-01-01'\r\n and p.last_sale <= '2024-12-31'\r\n ORDER BY p.category LIMIT 100\r\n \"\"\")[0]\r\n p1 == p2 # --- True!\r\n\r\n This will...\r\n* Replace `OR` conditions to `SELECT IN ...`\r\n* Put `LIMIT` if no fields or conditions defined;\r\n* Normalizes inverted conditions;\r\n* Auto includes fields present in `ORDER/GROUP BY`;\r\n* Replace `YEAR` function with date range comparison.\r\n\r\n> The method allows you to select which rules you want to apply in the optimization...Or define your own rules!\r\n\r\n>> NOTE: When a joined table is used only as a filter, it is possible that it can be changed to a sub-query:\r\n\r\n query = Select(\r\n 'Installments i', due_date=Field, customer=Select(\r\n 'Customer c', id=PrimaryKey,\r\n name=endswith('Smith')\r\n )\r\n )\r\n print(query)\r\n print('-----')\r\n query.optimize([RuleReplaceJoinBySubselect])\r\n print(query)\r\n```\r\nSELECT\r\n i.due_date\r\nFROM\r\n Installments i\r\n JOIN Customer c ON (i.customer = c.id)\r\nWHERE\r\n c.name LIKE '%Smith'\r\n-----\r\nSELECT\r\n i.due_date\r\nFROM\r\n Installments i\r\nWHERE\r\n i.customer IN (SELECT c.id FROM Customer c WHERE c.name LIKE '%Smith')\r\n```\r\n\r\n---\r\n\r\n### 12 - Adding multiple fields at once\r\n```\r\n query = Select('post p')\r\n query.add_fields(\r\n 'user_id, created_at',\r\n [OrderBy, GroupBy]\r\n )\r\n```\r\n...is the same as...\r\n```\r\n query = Select(\r\n 'post p',\r\n user_id=[Field, GroupBy, OrderBy],\r\n created_at=[Field, GroupBy, OrderBy]\r\n )\r\n```\r\n\r\n### 13 - Change parser engine\r\n```\r\na, c, m = Select.parse(\r\n \"\"\"\r\n Actor(name, id ?age = 40)\r\n <- Cast(actor_id, movie_id) ->\r\n Movie(id ^title)\r\n \"\"\",\r\n CypherParser\r\n # ^^^ recognizes syntax like Neo4J queries\r\n)\r\n```\r\n\r\n**print(a+c+m)**\r\n```\r\nSELECT\r\n act.name,\r\n mov.title\r\nFROM\r\n Cast cas\r\n JOIN Movie mov ON (cas.movie_id = mov.id)\r\n JOIN Actor act ON (cas.actor_id = act.id)\r\nWHERE\r\n act.age = 40\r\nORDER BY\r\n mov.title\r\n```\r\n---\r\n\r\n#### Separators and meaning:\r\n<a id=\"cypher_separators\"></a>\r\n\r\n* `( )` Delimits a table and its fields\r\n* `,` Separate fields\r\n* `?` For simple conditions (> < = <>)\r\n* `<-` connects to the table on the left\r\n* `->` connects to the table on the right\r\n* `^` Put the field in the ORDER BY clause\r\n* `@` Immediately after the table/field name, it indicates the grouping field.\r\n* `$` For SQL functions like **avg**$_field_, **sum**$_field_, **count**$_field_...\r\n* `*` Sets the primary key field.\r\n* `:` Allows you to assign an alias to the field or expression.\r\n\r\n\r\n---\r\n## `detect` function\r\n\r\nIt is useful to write a query in a few lines, without specifying the script type (cypher, mongoDB, SQL, Neo4J...)\r\n### Examples:\r\n\r\n> **13.1 - Relationship**\r\n```\r\nquery = detect(\r\n 'MATCH(c:Customer)<-[:Order]->(p:Product)RETURN c, p'\r\n)\r\nprint(query)\r\n```\r\n##### output:\r\n SELECT * FROM\r\n Order ord\r\n LEFT JOIN Customer cus ON (ord.customer_id = cus.id)\r\n RIGHT JOIN Product pro ON (ord.product_id = pro.id)\r\n> **13.2 - Grouping**\r\n```\r\nquery = detect(\r\n 'People@gender(avg$age?region=\"SOUTH\"^count$qtde)'\r\n)\r\nprint(query)\r\n```\r\n##### output:\r\n SELECT\r\n peo.gender,\r\n Avg(peo.age),\r\n Count(*) as qtde\r\n FROM\r\n People peo\r\n WHERE\r\n peo.region = \"SOUTH\"\r\n GROUP BY\r\n peo.gender\r\n ORDER BY\r\n peo.qtde\r\n\r\n> **13.3 - Many conditions...**\r\n```\r\n print( detect('''\r\n db.people.find({\r\n {\r\n $or: [\r\n {status:{$eq:\"B\"}},\r\n age:{$lt:50}\r\n ]\r\n },\r\n age:{$gte:18}, status:{$eq:\"A\"}\r\n },{\r\n name: 1, user_id: 1\r\n }).sort({\r\n user_id: -1\r\n })\r\n ''') )\r\n```\r\n#### output:\r\n SELECT\r\n peo.name,\r\n peo.user_id\r\n FROM\r\n people peo\r\n WHERE\r\n ( peo. = 'B' OR peo.age < 50 ) AND\r\n peo.age >= 18 AND\r\n peo.status = 'A'\r\n ORDER BY\r\n peo.user_id DESC\r\n\r\n> **13.4 - Relations with same table twice (or more)**\r\n\r\nAutomatically assigns aliases to each side of the relationship (In this example, one user invites another to add to their contact list)\r\n```\r\n print( detect(\r\n 'User(^name,id) <-Contact(requester,guest)-> User(id,name)'\r\n # ^^^ u1 ^^^ u2\r\n ) )\r\n```\r\n SELECT\r\n u1.name,\r\n u2.name\r\n FROM\r\n Contact con\r\n RIGHT JOIN User u2 ON (con.guest = u2.id)\r\n LEFT JOIN User u1 ON (con.requester = u1.id)\r\n ORDER BY\r\n u1.name\r\n\r\n---\r\n### `translate_to` method\r\nFrom a Select object, it returns the text to a script in any of the languages \u200b\u200bbelow:\r\n* QueryLanguage - default\r\n* MongoDBLanguage\r\n* Neo4JLanguage\r\n* DatabricksLanguage\r\n* PandasLanguage\r\n\r\n---\r\n### 14 - Window Function\r\n\r\nAggregation functions (Avg, Min, Max, Sum, Count) -- or Window functions (Lead, Lag, Row_Number, Rank) -- have the **over** method...\r\n\r\n query=Select(\r\n 'Enrollment e',\r\n payment=Sum().over(\r\n student_id=Partition, due_date=OrderBy,\r\n # _=Rows(Current(), Following(5)), \r\n # ^^^-------> ROWS BETWEEN CURRENT ROW AND 5 FOLLOWING\r\n # _=Rows(Preceding(3), Following()),\r\n # ^^^-------> ROWS BETWEEN 3 PRECEDING AND UNBOUNDED FOLLOWING\r\n # _=Rows(Preceding(3)) \r\n # ^^^-------> ROWS 3 PRECEDING\r\n ).As('sum_per_student')\r\n )\r\n\r\n...that generates the following query:\r\n\r\n```\r\nSELECT\r\n Sum(e.payment) OVER(\r\n PARTITION BY student_id\r\n ORDER BY due_date\r\n ) as sum_per_student\r\nFROM\r\n Enrollment e\r\n```\r\n---\r\n### 15 - The `As` method:\r\n query=Select(\r\n 'Customers c',\r\n phone=[\r\n Not.is_null(),\r\n SubString(1, 4).As('area_code', GroupBy)\r\n ],\r\n customer_id=[\r\n Count().As('customer_count', OrderBy),\r\n Having.count(gt(5))\r\n ]\r\n )\r\nYou can use the result of a function as a new field -- and optionally use it in ORDER BY and/or GROUP BY clause(s):\r\n```\r\nSELECT\r\n SubString(c.phone, 1, 4) as area_code,\r\n Count(c.customer_id) as customer_count\r\nFROM\r\n Customers c\r\nWHERE\r\n NOT c.phone IS NULL\r\nGROUP BY\r\n area_code HAVING Count(c.customer_id) > 5\r\nORDER BY\r\n customer_count\r\n```\r\n---\r\n### 16 - Function classes\r\nYou may use this functions:\r\n* SubString\r\n* Round\r\n* DateDiff\r\n* Year\r\n* Current_Date\r\n* Avg\r\n* Min\r\n* Max\r\n* Sum\r\n* Count\r\n* Lag\r\n* Lead\r\n* Row_Number\r\n* Rank\r\n* Coalesce\r\n* Cast\r\n> Some of these functions may vary in syntax depending on the database.\r\nFor example, if your query is going to run on Oracle, do the following:\r\n\r\n`Function.dialect = Dialect.ORACLE`\r\n\r\n\r\n> Most of this functions you can use nested inside each other.\r\n*Example:*\r\n```\r\n Select(...\r\n event_date=Substring(\r\n Cast(\"CHAR\"), 12, 19 \r\n ).As('time')\r\n )\r\n```\r\nResults...\r\n```\r\n SELECT ...\r\n SubString(Cast(event_date As char), 12, 19) as time\r\n```\r\n\r\n>> `Function.auto_convert` option (default: True)\r\n\r\n- Put Cast(...) when there is a difference between the types of the parameter and the return of the nested function\r\n```\r\nbirth=Round( DateDiff(Current_Date()) ).As('age')\r\n```\r\n...Returns...\r\n```\r\nSELECT\r\n Round(\r\n Cast(Current_Date() - p.birth As FLOAT)\r\n /* ^^^ */\r\n ) as age\r\n...\r\n```\r\n---\r\n\r\n### 16.1 - _GroupBy as instance_\r\n\r\nAnother way to use GroupBy is to pass functions as parameters:\r\n\r\n```\r\n Function.dialect = Dialect.ORACLE\r\n query = Select(\r\n 'Sales s',\r\n ref_date=GroupBy(\r\n ref_year=Year, qty_sold=Sum('quantity'),\r\n vendor=Select(\r\n 'Vendor v',\r\n id=[PrimaryKey, Field], name=Field\r\n )\r\n )\r\n )\r\n print(query)\r\n```\r\nresults..\r\n```\r\nSELECT\r\n Extract(Year FROM s.ref_date) as ref_year,\r\n Sum(quantity) as qty_sold,\r\n v.id,\r\n v.name\r\nFROM\r\n Sales s\r\n JOIN Vendor v ON (s.vendor = v.id)\r\nGROUP BY\r\n ref_year,\r\n v.id,\r\n v.name\r\n```\r\n\r\n---\r\n### 17 - CTE and Recursive classes\r\n\r\n* **17.1 - _CTE class_**\r\n```\r\n query = Select(\r\n 'SocialMedia s', post=Count, reaction=Sum, user=GroupBy\r\n )\r\n print( CTE('Metrics', [query]) )\r\n```\r\nThe result is...\r\n```\r\n WITH Metrics AS (\r\n SELECT Count(s.post), Sum(s.reaction) FROM SocialMedia s GROUP BY user\r\n )SELECT * FROM Metrics\r\n```\r\n\r\n* **17.2 - _Recursive class_**\r\n```\r\nq1 = Select(\r\n 'SocialMedia me', name=[ eq(MY_NAME), Field ]\r\n)\r\nq2 = Select(\r\n 'SocialMedia you' name=Field, id=Where.formula('{af} = n.friend')\r\n)\r\nprint( Recursive('Network', [q1, q2]) )\r\n```\r\nThe result is...\r\n```\r\nWITH RECURSIVE Network AS (\r\n SELECT me.name FROM SocialMedia me WHERE \r\n me.name = 'J\u00falio Cascalles'\r\nUNION ALL\r\n SELECT you.name FROM SocialMedia you , Network n\r\n WHERE you.id = n.friend\r\n)SELECT * FROM Network\r\n```\r\n\r\n* **17.2.1 - The `create` method** ... parameters :\r\n<a id=\"cte_create_method\"></a>\r\n - name: The name of the CTE\r\n - pattern: A cypher script that defines the tables used\r\n - formula: The format for `Where.formula` method _(*)_\r\n - init_value: The value for the condition in the first table\r\n - format (optional): If tables are files or internet hiperlinks, you may especify the extension and/or folder...\r\n> Example: \r\n```\r\n R = Recursive.create(\r\n 'Route R', 'Flyght(departure, arrival)',\r\n '[2] = R.[1]', 'JFK', format='.csv'\r\n ) # ^^^--- Flyghts from JFK airport\r\n```\r\n\r\n_...Creates a recursive CTE called Route, using Flyght table, where the recursivity condition is Flyght.arrival equals to Route.departure_\r\n>> (*) -- Note that [1] and [2] refers to first field and second field. \ud83d\ude09\r\n\r\nResult:\r\n\r\n WITH RECURSIVE Route AS (\r\n SELECT f1.departure, f1.arrival\r\n FROM Flyght.csv f1\r\n WHERE f1.departure = 'JFK'\r\n UNION ALL\r\n SELECT f2.departure, f2.arrival\r\n FROM Flyght.csv f2\r\n , Route R\r\n WHERE f2.arrival = R.departure\r\n )SELECT * FROM Route R\r\n\r\n**17.2.2 - The `join` method** \r\n\r\nIn the previous example, if you add this code...\r\n`R.join('Airport(*id,name)', 'departure, arrival', format='.csv')`\r\n\r\n...The result would be:\r\n\r\n WITH RECURSIVE Route AS (\r\n SELECT f1.departure, f1.arrival\r\n FROM Flyght.csv f1\r\n WHERE f1.departure = 'JFK'\r\n UNION ALL\r\n SELECT f2.departure, f2.arrival\r\n FROM Flyght.csv f2\r\n , Route R\r\n WHERE f2.arrival = R.departure\r\n )SELECT\r\n a1.name, a2.name\r\n FROM\r\n Route R\r\n JOIN Airport.csv a2 ON (R.arrival = a2.id)\r\n JOIN Airport.csv a1 ON (R.departure = a1.id)\r\n\r\n\r\n**17.2.3 - The `counter` method** \r\nAdds an increment field in queries inside CTE:\r\n> Examples:\r\n* `R.counter('stops', 0)` # -- counter starts with 0 and increment +1\r\n* `R2.counter('generation', 5, '- 1')` # -- for the code below...\r\n```\r\nR2 = Recursive.create(\r\n 'Ancestors a', 'People(id,name,father,mother,birth)',\r\n '(% = a.father OR % = a.mother)', 32630, '.parquet'\r\n)\r\n```\r\n...Results:\r\n\r\n WITH RECURSIVE Ancestors AS (\r\n SELECT p1.id, p1.name, p1.father, p1.mother, p1.birth,\r\n 5 AS generation /* <<---- Most current generation ------------*/\r\n FROM People.parquet p1 WHERE p1.id = 32630\r\n UNION ALL\r\n SELECT p2.id, p2.name, p2.father, p2.mother, p2.birth,\r\n (generation- 1) AS generation /* <<-- Previous generation -----*/\r\n FROM People.parquet p2 , Ancestors a WHERE (p2.id = a.father OR p2.id = a.mother)\r\n )SELECT * FROM Ancestors a\r\n\r\n\r\n>> Note: Comments added later.\r\n---\r\n\r\n### 17.3 - CTEFactory class\r\nCTEFactory exchanges subqueries for CTEs, simply by passing the text of the \"dirty\" query:\r\n\r\n*Example*:\r\n```\r\nprint(\r\n CTEFactory(\"\"\"\r\n SELECT u001.name, agg_sales.total\r\n FROM (\r\n SELECT * FROM Users u\r\n WHERE u.status = 'active'\r\n ) AS u001\r\n JOIN (\r\n SELECT s.user_id, Sum(s.value) as total\r\n FROM Sales s\r\n GROUP BY s.user_id\r\n )\r\n As agg_sales\r\n ON u001.id = agg_sales.user_id\r\n ORDER BY u001.name\r\n \"\"\") \r\n)\r\n```\r\nresults...\r\n```\r\n WITH u001 AS (\r\n SELECT * FROM Users u\r\n WHERE u.status = 'active'\r\n ),\r\n WITH agg_sales AS (\r\n SELECT s.user_id, Sum(s.value) as total\r\n FROM Sales s\r\n GROUP BY s.user_id\r\n )\r\n SELECT\r\n u001.name,\r\n agg_sales.total\r\n FROM\r\n u001 u001\r\n JOIN agg_sales agg_sales ON\r\n (u001.id = agg_sales.user_id)\r\n ORDER BY\r\n u001.name\r\n```\r\n\r\n#### 17.3.1 - You can also pass a Cypher script like in the example below:\r\n\r\n cte = CTEFactory(\"\"\"\r\n Annual_Sales_per_Vendor[\r\n Sales(\r\n year$ref_date:ref_year@, sum$quantity:qty_sold,\r\n vendor) <- Vendor(id, name:vendors_name@)\r\n ]\r\n\r\n [-1](**, ref_year) -> Goal(year, target)\r\n \"\"\")\r\n print(cte)\r\n\r\nresults...\r\n```\r\nWITH Annual_Sales_per_Vendor AS (\r\n SELECT ven.name as vendors_name\r\n , Year(sal.ref_date) as ref_year\r\n , Sum(sal.quantity) as qty_sold\r\n FROM Vendor ven LEFT JOIN Sales sal ON (ven.id = sal.vendor\r\n GROUP BY ven.name, ref_year\r\n)\r\nSELECT\r\n aspv.vendors_name,\r\n aspv.ref_year,\r\n aspv.qty_sold,\r\n goa.target\r\nFROM\r\n Annual_Sales_per_Vendor aspv\r\n RIGHT JOIN Goal goa ON (aspv.ref_year = goa.year)\r\n```\r\nFor more details, see the [Cypher syntax](#cypher_separators) or [CTE create method](#cte_create_method)!\r\n\r\n\r\n---\r\n",
"bugtrack_url": null,
"license": null,
"summary": "Allows you to create objects for parts of SQL query commands. Also to combine these objects by joining them, adding or removing parts...",
"version": "1.20250726",
"project_urls": {
"Homepage": "https://github.com/julio-cascalles/sql_blocks"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5064b05117f784d0828bf400a1f0ce3dffceee69e8460c72230e44717be3c9aa",
"md5": "9dd3453c03f7a91d416083e55bd7ea25",
"sha256": "f9c9cab23821d7eea8a0ffcd1e2be3e2fed2bbab0ff9cfbf3120e2724feef62b"
},
"downloads": -1,
"filename": "sql_blocks-1.20250726-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9dd3453c03f7a91d416083e55bd7ea25",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 30832,
"upload_time": "2025-07-26T16:37:00",
"upload_time_iso_8601": "2025-07-26T16:37:00.246858Z",
"url": "https://files.pythonhosted.org/packages/50/64/b05117f784d0828bf400a1f0ce3dffceee69e8460c72230e44717be3c9aa/sql_blocks-1.20250726-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0b3f1f794f04f4d5ba56665c5a7b37b92834fafc2a1f63734b5c6c2aa34dfb10",
"md5": "ace0d97c450330c1128bead73329d9f5",
"sha256": "c95c4b5ccc36f618d50075658d6ff3546bed62b2bee8977ec9cef8399d683282"
},
"downloads": -1,
"filename": "sql_blocks-1.20250726.tar.gz",
"has_sig": false,
"md5_digest": "ace0d97c450330c1128bead73329d9f5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 39240,
"upload_time": "2025-07-26T16:37:01",
"upload_time_iso_8601": "2025-07-26T16:37:01.655025Z",
"url": "https://files.pythonhosted.org/packages/0b/3f/1f794f04f4d5ba56665c5a7b37b92834fafc2a1f63734b5c6c2aa34dfb10/sql_blocks-1.20250726.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-26 16:37:01",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "julio-cascalles",
"github_project": "sql_blocks",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "sql-blocks"
}