sql-blocks

Name	sql-blocks JSON
Version	1.20250726 JSON
	download
home_page	https://github.com/julio-cascalles/sql_blocks
Summary	Allows you to create objects for parts of SQL query commands. Also to combine these objects by joining them, adding or removing parts...
upload_time	2025-07-26 16:37:01
maintainer	None
docs_url	None
author	Júlio Cascalles
requires_python	>=3.8
license	None
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            # SQL_Blocks

## _SQL_Blocks_ is useful for building complex SQL commands through smaller query blocks:

---

### 1 - You can assemble a simple object that will then be converted into an SQL command:

> a = Select('Actor') # --> SELECT * FROM Actor act

_Note that an alias "act" has been added._

You can specify your own alias:  `a = Select('Actor a')`

---
### 2 - You can also add a field, contains this...

* a = Select('Actor a', **name=Field**)

* Here are another ways to add a field:
    - Select('Actor a', name=Distinct )

    - Select('Actor a', name=NamedField('actors_name'))

    - Select(
        'Actor a', 
        name=NamedField('actors_name', Distinct)
    )


    2.1 -- Using expression as a field:
```
    Select(
        'Product',
        due_date=NamedField(
            'YEAR_ref',
            ExpressionField('extract(year from {f})') #  <<---
        )
    )
```
...should return: 
**SELECT extract(year from due_date) as YEAR_ref...**

Possible tags in ExpressionField:
* {f} - The field name;
* {af} - The field name preceded by the table alias;
    > Can be written as {a.f} or %
* {t} - The table name;
* {a} - Only the table alias.
---

### 3 - To set conditions, use **Where**:
* For example, `a = Select(... age=gt(45) )`

    Some possible conditions:
    * field=eq(value) - ...the field is EQUAL to the value;
    * field=gt(value) - ...the field is GREATER than the value;
    * field=lt(value) - ...the field is LESS than the value;

> You may use Where.**eq**, Where.**gt**, Where.**lt** ... or simply **eq**, **gt**, **lt** ... 😉

3.1 -- If you want to filter the field on a range of values:

`a = Select( 'Actor a', age=Between(45, 69) )`
...but if it is a time slot within the same day, you can do it like this:
`Select(..., event_date=SameDay("2024-10-03"))`
This results in
```
    SELECT ...
    WHERE
        event_date >= '2024-10-03 00:00:00' AND
        event_date <= '2024-10-03 23:59:59'
```
---

3.2 -- Sub-queries:
```
query = Select('Movie m', title=Field,
    id=SelectIN(
        'Review r',
        rate=gt(4.5),
        movie_id=Distinct
    )
)
```

**>> print(query)** 

        SELECT
            m.title
        FROM
            Movie m
        WHERE
            m.id IN (
                SELECT DISTINCT r.movie
                FROM Review r WHERE r.rate > 4.5
            )
    
3.3 -- Optional conditions:
```
    OR=Options(
        genre=eq("Sci-Fi"),
        awards=contains("Oscar")
    )
    AND=Options(
        ..., name=startswith('Chris')
    )
```

3.4 -- Negative conditions use the _Not_ class instead of _Where_
```
based_on_book=Not.is_null()
```

3.5 -- List of values
```
hash_tag=inside(['space', 'monster', 'gore'])
```

3.6 -- Combining ExpressionField with Where condition:
* The **formula** method allows you to write an expression as a condition:
```
query=Select(
    'Folks f2',
    id=Where.formula('({af} = a.father OR {af} = a.mother)')
)
```
> Results: `WHERE...f2.id = a.father OR f2.id = a.mother`

---
### 4 - A field can be two things at the same time:

* m = Select('Movie m' release_date=[Field, OrderBy])
    - This means that the field will appear in the results and also that the query will be ordered by that field.
* Applying **GROUP BY** to item 3.2, it would look contains this:
    ```    
    SelectIN(
        'Review r', movie=[GroupBy, Distinct],
        rate=Having.avg(gt(4.5))
    )
    ```
---
### 5 - Relationships:
```
    query = Select('Actor a', name=Field,
        cast=Select('Cast c', id=PrimaryKey)
    )
```
**>> print(query)**    
```
SELECT
    a.name
FROM
    Actor a
    JOIN Cast c ON (a.cast = c.id)    
```

---
**5.1  Multiple tables without JOIN**
> Warning: This is **NOT** recommended!  ⛔


#### Example:
    singer = Select(
        "Singer artist", id=PrimaryKey,
        name=NamedField('artist_name')
    )
    album = Select (
        "Album album",
        name=NamedField('album_name'),
        artist_id=Where.join(singer), #  <===== 👀
    )
**>> print(query)**    

    SELECT
            album.name as album_name,
            artist.name as artist_name,
            album.year_recorded
    FROM
            Album album
            ,Singer artist
    WHERE
            (album.artist_id = artist.id)

(*) --> For more than one relationship, use the **pairs** parameter.


---
### 6 - The reverse process (parse):
```
text = """
        SELECT
                cas.role,
                m.title,
                m.release_date,
                a.name as actors_name
        FROM
                Actor a
                LEFT JOIN Cast cas ON (a.cast = cas.id)
                LEFT JOIN Movie m ON (cas.movie = m.id)
        WHERE
                (
                    m.genre = 'Sci-Fi'
                    OR
                    m.awards LIKE '%Oscar%'
                )
                AND a.age <= 69 AND a.age >= 45
        ORDER BY
                m.release_date DESC
"""
```

`a, c, m = Select.parse(text)`

**6.1  --- print(a)**
```
    SELECT
            a.name as actors_name
    FROM
            Actor a
    WHERE
            a.age <= 69
            AND a.age >= 45
```

**6.2 --- print(c)**

    SELECT
            c.role
    FROM
            Cast c

**6.3 --- print(m)**

    SELECT
            m.title,
            m.release_date
    FROM
            Movie m
    WHERE
            ( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )
    ORDER BY
            m.release_date DESC



**6.4 --- print(a+c)**

    SELECT
            a.name as actors_name,
            cas.role
    FROM
            Actor a
            JOIN Cast cas ON (a.cast = cas.id)
    WHERE
            a.age >= 45
            AND a.age <= 69

**6.5 --- print(c+m)** 
> `... or  print(m+c)`

    SELECT
            cas.role,
            m.title,
            m.release_date,
            m.director
    FROM
            Cast cas
            JOIN Movie m ON (cas.movie = m.id)
    WHERE
            ( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )
            AND m.director LIKE '%Coppola%'
    ORDER BY
            m.release_date,
            m.director
---

### 7 - You can add or delete attributes directly in objects:
* a(gender=Field)
* m.delete('director')

---

### 8 - Defining relationship on separate objects:
```
a = Select...
c = Select...
m = Select...
```
`a + c => ERROR: "No relationship found between Actor and Cast"`

8.1 - But...

    a( cast=ForeignKey('Cast') )
    c(id=PrimaryKey)

**a + c => Ok!**

8.2

    c( movie=ForeignKey('Movie') )
    m(id=PrimaryKey)

> **c + m => Ok!**
>> **m + c => Ok!**


---
**8.3 Difference between queries**
```
    STATUS_DELIVERED_OK = 93
    orders = Select('orders',
        customer_id=ForeignKey('customers'),
        status=eq(STATUS_DELIVERED_OK)
    )
    customers = Select('customers'
        id=PrimaryKey, name=Field
    )
    gap = orders - customers
```
return _customers without orders_:

    SELECT
            c.name
    FROM
            customers c
    WHERE
            NOT c.id IN (
                SELECT o.customer_id FROM orders o
                WHERE o.status = 93
            )
---

### 9 - Comparing objects

9.1
```
        a1 = Select.parse('''
                SELECT gender, Max(act.age) FROM Actor act
                WHERE act.age <= 69 AND act.age >= 45
                GROUP BY gender
            ''')[0]

        a2 = Select('Actor',
            age=[ Between(45, 69), Max ],
            gender=[GroupBy, Field]
        )       
```
> **a1 == a2 # --- True!**



9.2
```
    m1 = Select.parse("""
        SELECT title, release_date FROM Movie m ORDER BY release_date 
        WHERE m.genre = 'Sci-Fi' AND m.awards LIKE '%Oscar%'
    """)[0]

    m2 = Select.parse("""
        SELECT release_date, title
        FROM Movie m
        WHERE m.awards LIKE '%Oscar%' AND m.genre = 'Sci-Fi'
        ORDER BY release_date 
    """)[0]
```

**m1 == m2  #  --- True!**


9.3
```
best_movies = SelectIN(
    Review=Table('role'),
    rate=[GroupBy, Having.avg(gt(4.5))]
)
m1 = Select(
    Movie=Table('title,release_date'),
    id=best_movies
)

sql = "SELECT rev.role FROM Review rev GROUP BY rev.rate HAVING Avg(rev.rate) > 4.5"
m2 = Select(
    'Movie', release_date=Field, title=Field,
    id=Where(f"IN ({sql})")
)
```
**m1 == m2 # --- True!**

---

### 10 - CASE...WHEN...THEN
    Select(
        'Product',
        label=Case('price').when(
            lt(50), 'cheap'
        ).when(
            gt(100), 'expensive'
        ).else_value(
            'normal'
        )
    )

* 10.1 - If the labels used in the CASE are based on ranges of values in sequence, you can use the **Range class**:

        query = Select(
            'People p',
            age_group=Range('age',{  # <<---------- 
                'adult': 50,
                'teenager': 17,
                'child': 10,
                'elderly': 70,
                'young': 21,
            })
        )
is equivalent to...
```
        SELECT
                CASE
                    WHEN p.age BETWEEN 0 AND 10 THEN 'child'
                    WHEN p.age BETWEEN 11 AND 17 THEN 'teenager'
                    WHEN p.age BETWEEN 18 AND 21 THEN 'young'
                    WHEN p.age BETWEEN 22 AND 50 THEN 'adult'
                    WHEN p.age BETWEEN 51 AND 70 THEN 'elderly'
                END AS age_group
        FROM
                People p
```

* 10.2 `If` class

Usefull to conditional Sum, Avg, Count...

**Example:**

    Select('Loan', 
        penalty=If('days_late', Sum, gt(0))
    )
    # ...OR... penalty=If('days_late > 0', Sum)

results...
```
SELECT
        Sum(CASE
                WHEN days_late > 0 THEN penalty
                ELSE 0
        END)
FROM
        Emprestimo
```

---

### 11 - optimize method
    p1 = Select.parse("""
            SELECT * FROM Product p
            WHERE (p.category = 'Gizmo'
                    OR p.category = 'Gadget'
                    OR p.category = 'Doohickey')
                AND NOT price <= 387.64
                AND YEAR(last_sale) = 2024
            ORDER BY
                category
        """)[0]
        p1.optimize() #  <<===============
        p2 = Select.parse("""
            SELECT category FROM Product p
            WHERE category IN ('Gizmo','Gadget','Doohickey')
                and p.price > 387.64
                and p.last_sale >= '2024-01-01'
                and p.last_sale <= '2024-12-31'
            ORDER BY p.category LIMIT 100
        """)[0]
        p1 == p2 # --- True!

 This will...
* Replace `OR` conditions to `SELECT IN ...`
* Put `LIMIT` if no fields or conditions defined;
* Normalizes inverted conditions;
* Auto includes fields present in `ORDER/GROUP BY`;
* Replace `YEAR` function with date range comparison.

> The method allows you to select which rules you want to apply in the optimization...Or define your own rules!

>> NOTE: When a joined table is used only as a filter, it is possible that it can be changed to a sub-query:

    query = Select(
        'Installments i', due_date=Field,  customer=Select(
            'Customer c', id=PrimaryKey,
            name=endswith('Smith')
        )
    )
    print(query)
    print('-----')
    query.optimize([RuleReplaceJoinBySubselect])
    print(query)
```
SELECT
        i.due_date
FROM
        Installments i
        JOIN Customer c ON (i.customer = c.id)
WHERE
        c.name LIKE '%Smith'
-----
SELECT
        i.due_date
FROM
        Installments i
WHERE
        i.customer IN (SELECT c.id FROM Customer c WHERE c.name LIKE '%Smith')
```

---

### 12 - Adding multiple fields at once
```
    query = Select('post p')
    query.add_fields(
        'user_id, created_at',
        [OrderBy, GroupBy]
    )
```
...is the same as...
```
    query = Select(
        'post p',
        user_id=[Field, GroupBy, OrderBy],
        created_at=[Field, GroupBy, OrderBy]
    )
```

### 13 - Change parser engine
```
a, c, m = Select.parse(
    """
        Actor(name, id ?age = 40)
        <- Cast(actor_id, movie_id) ->
        Movie(id ^title)
    """,
    CypherParser
    # ^^^ recognizes syntax like Neo4J queries
)
```

**print(a+c+m)**
```
SELECT
        act.name,
        mov.title
FROM
        Cast cas
        JOIN Movie mov ON (cas.movie_id = mov.id)
        JOIN Actor act ON (cas.actor_id = act.id)
WHERE
        act.age = 40
ORDER BY
        mov.title
```
---

#### Separators and meaning:
<a id="cypher_separators"></a>

* `(  )`  Delimits a table and its fields
* `,` Separate fields
* `?` For simple conditions (> < = <>)
* `<-` connects to the table on the left
* `->` connects to the table on the right
* `^` Put the field in the ORDER BY clause
* `@` Immediately after the table/field name, it indicates the grouping field.
* `$` For SQL functions like **avg**$_field_, **sum**$_field_, **count**$_field_...
* `*` Sets the primary key field.
* `:` Allows you to assign an alias to the field or expression.


---
## `detect` function

It is useful to write a query in a few lines, without specifying the script type (cypher, mongoDB, SQL, Neo4J...)
### Examples:

> **13.1 - Relationship**
```
query = detect(
    'MATCH(c:Customer)<-[:Order]->(p:Product)RETURN c, p'
)
print(query)
```
##### output:
    SELECT * FROM
        Order ord
        LEFT JOIN Customer cus ON (ord.customer_id = cus.id)
        RIGHT JOIN Product pro ON (ord.product_id = pro.id)
> **13.2 - Grouping**
```
query = detect(
    'People@gender(avg$age?region="SOUTH"^count$qtde)'
)
print(query)
```
##### output:
    SELECT
            peo.gender,
            Avg(peo.age),
            Count(*) as qtde
    FROM
            People peo
    WHERE
            peo.region = "SOUTH"
    GROUP BY
            peo.gender
    ORDER BY
            peo.qtde

> **13.3 - Many conditions...**
```
    print( detect('''
        db.people.find({
            {
                $or: [
                    {status:{$eq:"B"}},
                    age:{$lt:50}
                ]
            },
            age:{$gte:18},  status:{$eq:"A"}
        },{
            name: 1, user_id: 1
        }).sort({
            user_id: -1
        })
    ''') )
```
#### output:
    SELECT
            peo.name,
            peo.user_id
    FROM
            people peo
    WHERE
            ( peo. = 'B' OR peo.age < 50 ) AND
            peo.age >= 18 AND
            peo.status = 'A'
    ORDER BY
            peo.user_id DESC

> **13.4 - Relations with same table twice (or more)**

Automatically assigns aliases to each side of the relationship (In this example, one user invites another to add to their contact list)
```
    print( detect(
        'User(^name,id) <-Contact(requester,guest)-> User(id,name)'
       # ^^^ u1                                        ^^^ u2
    ) )
```
    SELECT
            u1.name,
            u2.name
    FROM
            Contact con
            RIGHT JOIN User u2 ON (con.guest = u2.id)
            LEFT JOIN User u1 ON (con.requester = u1.id)
    ORDER BY
            u1.name

---
### `translate_to` method
From a Select object, it returns the text to a script in any of the languages below:
* QueryLanguage - default
* MongoDBLanguage
* Neo4JLanguage
* DatabricksLanguage
* PandasLanguage

---
### 14 - Window Function

Aggregation functions (Avg, Min, Max, Sum, Count) -- or Window functions (Lead, Lag, Row_Number, Rank) -- have the **over** method...

    query=Select(
        'Enrollment e',
        payment=Sum().over(
            student_id=Partition, due_date=OrderBy,
            # _=Rows(Current(), Following(5)), 
               # ^^^-------> ROWS BETWEEN CURRENT ROW AND 5 FOLLOWING
            # _=Rows(Preceding(3), Following()),
               # ^^^-------> ROWS BETWEEN 3 PRECEDING AND UNBOUNDED FOLLOWING
            # _=Rows(Preceding(3)) 
               # ^^^-------> ROWS 3 PRECEDING
        ).As('sum_per_student')
    )

...that generates the following query:

```
SELECT
        Sum(e.payment) OVER(
                PARTITION BY student_id
                ORDER BY due_date
        ) as sum_per_student
FROM
        Enrollment e
```
---
### 15 - The `As` method:
    query=Select(
        'Customers c',
        phone=[
            Not.is_null(),
            SubString(1, 4).As('area_code', GroupBy)
        ],
        customer_id=[
            Count().As('customer_count', OrderBy),
            Having.count(gt(5))
        ]
    )
You can use the result of a function as a new field -- and optionally use it in ORDER BY and/or GROUP BY clause(s):
```
SELECT
        SubString(c.phone, 1, 4) as area_code,
        Count(c.customer_id) as customer_count
FROM
        Customers c
WHERE
        NOT c.phone IS NULL
GROUP BY
        area_code HAVING Count(c.customer_id) > 5
ORDER BY
        customer_count
```
---
### 16 - Function classes
You may use this functions:
* SubString
* Round
* DateDiff
* Year
* Current_Date
* Avg
* Min
* Max
* Sum
* Count
* Lag
* Lead
* Row_Number
* Rank
* Coalesce
* Cast
> Some of these functions may vary in syntax depending on the database.
For example, if your query is going to run on Oracle, do the following:

`Function.dialect = Dialect.ORACLE`


>  Most of this functions you can use nested inside each other.
*Example:*
```
    Select(...
        event_date=Substring(
            Cast("CHAR"), 12, 19 
        ).As('time')
    )
```
Results...
```
    SELECT ...
    SubString(Cast(event_date As char), 12, 19) as time
```

>> `Function.auto_convert` option (default: True)

- Put Cast(...) when there is a difference between the types of the parameter and the return of the nested function
```
birth=Round( DateDiff(Current_Date()) ).As('age')
```
...Returns...
```
SELECT
    Round(
        Cast(Current_Date() - p.birth As FLOAT)
       /* ^^^  */
    ) as age
...
```
---

### 16.1 - _GroupBy as instance_

Another way to use GroupBy is to pass functions as parameters:

```
    Function.dialect = Dialect.ORACLE
    query = Select(
        'Sales s',
        ref_date=GroupBy(
            ref_year=Year, qty_sold=Sum('quantity'),
            vendor=Select(
                'Vendor v',
                id=[PrimaryKey, Field], name=Field
            )
        )
    )
    print(query)
```
results..
```
SELECT
        Extract(Year FROM s.ref_date) as ref_year,
        Sum(quantity) as qty_sold,
        v.id,
        v.name
FROM
        Sales s
        JOIN Vendor v ON (s.vendor = v.id)
GROUP BY
        ref_year,
        v.id,
        v.name
```

---
### 17 - CTE and Recursive classes

* **17.1 - _CTE class_**
```
    query = Select(
        'SocialMedia s', post=Count, reaction=Sum, user=GroupBy
    )
    print( CTE('Metrics', [query]) )
```
The result is...
```
    WITH Metrics AS (
            SELECT Count(s.post), Sum(s.reaction) FROM SocialMedia s GROUP BY user
    )SELECT * FROM Metrics
```

* **17.2 - _Recursive class_**
```
q1 = Select(
    'SocialMedia me', name=[ eq(MY_NAME), Field ]
)
q2 = Select(
    'SocialMedia you' name=Field, id=Where.formula('{af} = n.friend')
)
print( Recursive('Network', [q1, q2]) )
```
The result is...
```
WITH RECURSIVE Network AS (
        SELECT me.name FROM SocialMedia me WHERE 
        me.name = 'Júlio Cascalles'
UNION ALL
        SELECT you.name FROM SocialMedia you , Network n
        WHERE  you.id = n.friend
)SELECT * FROM Network
```

* **17.2.1 - The `create` method** ... parameters :
<a id="cte_create_method"></a>
    - name: The name of the CTE
    - pattern: A cypher script that defines the tables used
    - formula: The format for `Where.formula` method _(*)_
    - init_value: The value for the condition in the first table
    - format (optional): If tables are files or internet hiperlinks, you may especify the extension and/or folder...
> Example: 
```
    R = Recursive.create(
        'Route R', 'Flyght(departure, arrival)',
        '[2] = R.[1]',  'JFK',  format='.csv'
    ) #                  ^^^--- Flyghts from JFK airport
```

_...Creates a recursive CTE called Route, using Flyght table, where the recursivity condition is Flyght.arrival equals to Route.departure_
>> (*) -- Note that [1] and [2] refers to first field and second field. 😉

Result:

    WITH RECURSIVE Route AS (
            SELECT f1.departure, f1.arrival
            FROM Flyght.csv f1
            WHERE f1.departure = 'JFK'
    UNION ALL
            SELECT f2.departure, f2.arrival
            FROM Flyght.csv f2
            , Route R
            WHERE  f2.arrival = R.departure
    )SELECT * FROM Route R

**17.2.2 - The `join` method** 

In the previous example, if you add this code...
`R.join('Airport(*id,name)', 'departure, arrival', format='.csv')`

...The result would be:

    WITH RECURSIVE Route AS (
        SELECT f1.departure, f1.arrival
        FROM Flyght.csv f1
        WHERE f1.departure = 'JFK'
    UNION ALL
        SELECT f2.departure, f2.arrival
        FROM Flyght.csv f2
        , Route R
        WHERE  f2.arrival = R.departure
    )SELECT
        a1.name, a2.name
    FROM
        Route R
        JOIN Airport.csv a2 ON (R.arrival = a2.id)
        JOIN Airport.csv a1 ON (R.departure = a1.id)


**17.2.3 - The `counter` method** 
Adds an increment field in queries inside CTE:
> Examples:
* `R.counter('stops', 0)` # -- counter starts with 0 and increment +1
* `R2.counter('generation', 5, '- 1')` # -- for the code below...
```
R2 = Recursive.create(
    'Ancestors a', 'People(id,name,father,mother,birth)',
    '(% = a.father OR % = a.mother)', 32630, '.parquet'
)
```
...Results:

    WITH RECURSIVE Ancestors AS (
        SELECT p1.id, p1.name, p1.father, p1.mother, p1.birth,
        5 AS generation  /* <<---- Most current generation ------------*/
        FROM People.parquet p1 WHERE p1.id = 32630
    UNION ALL
        SELECT p2.id, p2.name, p2.father, p2.mother, p2.birth,
        (generation- 1) AS generation /* <<-- Previous generation -----*/
        FROM People.parquet p2 , Ancestors a WHERE  (p2.id = a.father OR p2.id = a.mother)
    )SELECT * FROM Ancestors a


>> Note: Comments added later.
---

### 17.3 - CTEFactory class
CTEFactory exchanges subqueries for CTEs, simply by passing the text of the "dirty" query:

*Example*:
```
print(
        CTEFactory("""
            SELECT u001.name, agg_sales.total
            FROM (
                SELECT * FROM Users u
                WHERE u.status = 'active'
            ) AS u001
            JOIN (
                SELECT s.user_id, Sum(s.value) as total
                FROM Sales s
                GROUP BY s.user_id
            )
            As agg_sales
            ON u001.id = agg_sales.user_id
            ORDER BY u001.name
        """)        
)
```
results...
```
    WITH u001 AS (
        SELECT * FROM Users u
        WHERE u.status = 'active'
    ),
    WITH agg_sales AS (
        SELECT s.user_id, Sum(s.value) as total
        FROM Sales s
        GROUP BY s.user_id
    )
    SELECT
            u001.name,
            agg_sales.total
    FROM
            u001 u001
            JOIN agg_sales agg_sales ON
            (u001.id = agg_sales.user_id)
    ORDER BY
            u001.name
```

#### 17.3.1 - You can also pass a Cypher script like in the example below:

    cte = CTEFactory("""
        Annual_Sales_per_Vendor[
            Sales(
                year$ref_date:ref_year@, sum$quantity:qty_sold,
            vendor) <- Vendor(id, name:vendors_name@)
        ]

        [-1](**, ref_year) -> Goal(year, target)
    """)
    print(cte)

results...
```
WITH Annual_Sales_per_Vendor AS (
    SELECT ven.name as vendors_name
    , Year(sal.ref_date) as ref_year
    , Sum(sal.quantity) as qty_sold
    FROM Vendor ven LEFT JOIN Sales sal ON (ven.id = sal.vendor
    GROUP BY ven.name, ref_year
)
SELECT
        aspv.vendors_name,
        aspv.ref_year,
        aspv.qty_sold,
        goa.target
FROM
        Annual_Sales_per_Vendor aspv
        RIGHT JOIN Goal goa ON (aspv.ref_year = goa.year)
```
For more details, see the [Cypher syntax](#cypher_separators) or [CTE create method](#cte_create_method)!


---

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/julio-cascalles/sql_blocks",
    "name": "sql-blocks",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "J\u00falio Cascalles",
    "author_email": "Julio Cascalles <julio.cascalles@outlook.com>",
    "download_url": "https://files.pythonhosted.org/packages/0b/3f/1f794f04f4d5ba56665c5a7b37b92834fafc2a1f63734b5c6c2aa34dfb10/sql_blocks-1.20250726.tar.gz",
    "platform": null,
    "description": "# SQL_Blocks\r\n\r\n## _SQL_Blocks_ is useful for building complex SQL commands through smaller query blocks:\r\n\r\n---\r\n\r\n### 1 - You can assemble a simple object that will then be converted into an SQL command:\r\n\r\n> a = Select('Actor') # --> SELECT * FROM Actor act\r\n\r\n_Note that an alias \"act\" has been added._\r\n\r\nYou can specify your own alias:  `a = Select('Actor a')`\r\n\r\n---\r\n### 2 - You can also add a field, contains this...\r\n\r\n* a = Select('Actor a', **name=Field**)\r\n\r\n* Here are another ways to add a field:\r\n    - Select('Actor a', name=Distinct )\r\n\r\n    - Select('Actor a', name=NamedField('actors_name'))\r\n\r\n    - Select(\r\n        'Actor a', \r\n        name=NamedField('actors_name', Distinct)\r\n    )\r\n\r\n\r\n    2.1 -- Using expression as a field:\r\n```\r\n    Select(\r\n        'Product',\r\n        due_date=NamedField(\r\n            'YEAR_ref',\r\n            ExpressionField('extract(year from {f})') #  <<---\r\n        )\r\n    )\r\n```\r\n...should return: \r\n**SELECT extract(year from due_date) as YEAR_ref...**\r\n\r\nPossible tags in ExpressionField:\r\n* {f} - The field name;\r\n* {af} - The field name preceded by the table alias;\r\n    > Can be written as {a.f} or %\r\n* {t} - The table name;\r\n* {a} - Only the table alias.\r\n---\r\n\r\n### 3 - To set conditions, use **Where**:\r\n* For example, `a = Select(... age=gt(45) )`\r\n\r\n    Some possible conditions:\r\n    * field=eq(value) - ...the field is EQUAL to the value;\r\n    * field=gt(value) - ...the field is GREATER than the value;\r\n    * field=lt(value) - ...the field is LESS than the value;\r\n\r\n> You may use Where.**eq**, Where.**gt**, Where.**lt** ... or simply **eq**, **gt**, **lt** ... \ud83d\ude09\r\n\r\n3.1 -- If you want to filter the field on a range of values:\r\n\r\n`a = Select( 'Actor a', age=Between(45, 69) )`\r\n...but if it is a time slot within the same day, you can do it like this:\r\n`Select(..., event_date=SameDay(\"2024-10-03\"))`\r\nThis results in\r\n```\r\n    SELECT ...\r\n    WHERE\r\n        event_date >= '2024-10-03 00:00:00' AND\r\n        event_date <= '2024-10-03 23:59:59'\r\n```\r\n---\r\n\r\n3.2 -- Sub-queries:\r\n```\r\nquery = Select('Movie m', title=Field,\r\n    id=SelectIN(\r\n        'Review r',\r\n        rate=gt(4.5),\r\n        movie_id=Distinct\r\n    )\r\n)\r\n```\r\n\r\n**>> print(query)** \r\n\r\n        SELECT\r\n            m.title\r\n        FROM\r\n            Movie m\r\n        WHERE\r\n            m.id IN (\r\n                SELECT DISTINCT r.movie\r\n                FROM Review r WHERE r.rate > 4.5\r\n            )\r\n    \r\n3.3 -- Optional conditions:\r\n```\r\n    OR=Options(\r\n        genre=eq(\"Sci-Fi\"),\r\n        awards=contains(\"Oscar\")\r\n    )\r\n    AND=Options(\r\n        ..., name=startswith('Chris')\r\n    )\r\n```\r\n\r\n3.4 -- Negative conditions use the _Not_ class instead of _Where_\r\n```\r\nbased_on_book=Not.is_null()\r\n```\r\n\r\n3.5 -- List of values\r\n```\r\nhash_tag=inside(['space', 'monster', 'gore'])\r\n```\r\n\r\n3.6 -- Combining ExpressionField with Where condition:\r\n* The **formula** method allows you to write an expression as a condition:\r\n```\r\nquery=Select(\r\n    'Folks f2',\r\n    id=Where.formula('({af} = a.father OR {af} = a.mother)')\r\n)\r\n```\r\n> Results: `WHERE...f2.id = a.father OR f2.id = a.mother`\r\n\r\n---\r\n### 4 - A field can be two things at the same time:\r\n\r\n* m = Select('Movie m' release_date=[Field, OrderBy])\r\n    - This means that the field will appear in the results and also that the query will be ordered by that field.\r\n* Applying **GROUP BY** to item 3.2, it would look contains this:\r\n    ```    \r\n    SelectIN(\r\n        'Review r', movie=[GroupBy, Distinct],\r\n        rate=Having.avg(gt(4.5))\r\n    )\r\n    ```\r\n---\r\n### 5 - Relationships:\r\n```\r\n    query = Select('Actor a', name=Field,\r\n        cast=Select('Cast c', id=PrimaryKey)\r\n    )\r\n```\r\n**>> print(query)**    \r\n```\r\nSELECT\r\n    a.name\r\nFROM\r\n    Actor a\r\n    JOIN Cast c ON (a.cast = c.id)    \r\n```\r\n\r\n---\r\n**5.1  Multiple tables without JOIN**\r\n> Warning: This is **NOT** recommended!  \u26d4\r\n\r\n\r\n#### Example:\r\n    singer = Select(\r\n        \"Singer artist\", id=PrimaryKey,\r\n        name=NamedField('artist_name')\r\n    )\r\n    album = Select (\r\n        \"Album album\",\r\n        name=NamedField('album_name'),\r\n        artist_id=Where.join(singer), #  <===== \ud83d\udc40\r\n    )\r\n**>> print(query)**    \r\n\r\n    SELECT\r\n            album.name as album_name,\r\n            artist.name as artist_name,\r\n            album.year_recorded\r\n    FROM\r\n            Album album\r\n            ,Singer artist\r\n    WHERE\r\n            (album.artist_id = artist.id)\r\n\r\n(*) --> For more than one relationship, use the **pairs** parameter.\r\n\r\n\r\n---\r\n### 6 - The reverse process (parse):\r\n```\r\ntext = \"\"\"\r\n        SELECT\r\n                cas.role,\r\n                m.title,\r\n                m.release_date,\r\n                a.name as actors_name\r\n        FROM\r\n                Actor a\r\n                LEFT JOIN Cast cas ON (a.cast = cas.id)\r\n                LEFT JOIN Movie m ON (cas.movie = m.id)\r\n        WHERE\r\n                (\r\n                    m.genre = 'Sci-Fi'\r\n                    OR\r\n                    m.awards LIKE '%Oscar%'\r\n                )\r\n                AND a.age <= 69 AND a.age >= 45\r\n        ORDER BY\r\n                m.release_date DESC\r\n\"\"\"\r\n```\r\n\r\n`a, c, m = Select.parse(text)`\r\n\r\n**6.1  --- print(a)**\r\n```\r\n    SELECT\r\n            a.name as actors_name\r\n    FROM\r\n            Actor a\r\n    WHERE\r\n            a.age <= 69\r\n            AND a.age >= 45\r\n```\r\n\r\n**6.2 --- print(c)**\r\n\r\n    SELECT\r\n            c.role\r\n    FROM\r\n            Cast c\r\n\r\n**6.3 --- print(m)**\r\n\r\n    SELECT\r\n            m.title,\r\n            m.release_date\r\n    FROM\r\n            Movie m\r\n    WHERE\r\n            ( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )\r\n    ORDER BY\r\n            m.release_date DESC\r\n\r\n\r\n\r\n**6.4 --- print(a+c)**\r\n\r\n    SELECT\r\n            a.name as actors_name,\r\n            cas.role\r\n    FROM\r\n            Actor a\r\n            JOIN Cast cas ON (a.cast = cas.id)\r\n    WHERE\r\n            a.age >= 45\r\n            AND a.age <= 69\r\n\r\n**6.5 --- print(c+m)** \r\n> `... or  print(m+c)`\r\n\r\n    SELECT\r\n            cas.role,\r\n            m.title,\r\n            m.release_date,\r\n            m.director\r\n    FROM\r\n            Cast cas\r\n            JOIN Movie m ON (cas.movie = m.id)\r\n    WHERE\r\n            ( m.genre = 'Sci-Fi' OR m.awards LIKE '%Oscar%' )\r\n            AND m.director LIKE '%Coppola%'\r\n    ORDER BY\r\n            m.release_date,\r\n            m.director\r\n---\r\n\r\n### 7 - You can add or delete attributes directly in objects:\r\n* a(gender=Field)\r\n* m.delete('director')\r\n\r\n---\r\n\r\n### 8 - Defining relationship on separate objects:\r\n```\r\na = Select...\r\nc = Select...\r\nm = Select...\r\n```\r\n`a + c => ERROR: \"No relationship found between Actor and Cast\"`\r\n\r\n8.1 - But...\r\n\r\n    a( cast=ForeignKey('Cast') )\r\n    c(id=PrimaryKey)\r\n\r\n**a + c => Ok!**\r\n\r\n8.2\r\n\r\n    c( movie=ForeignKey('Movie') )\r\n    m(id=PrimaryKey)\r\n\r\n> **c + m => Ok!**\r\n>> **m + c => Ok!**\r\n\r\n\r\n---\r\n**8.3 Difference between queries**\r\n```\r\n    STATUS_DELIVERED_OK = 93\r\n    orders = Select('orders',\r\n        customer_id=ForeignKey('customers'),\r\n        status=eq(STATUS_DELIVERED_OK)\r\n    )\r\n    customers = Select('customers'\r\n        id=PrimaryKey, name=Field\r\n    )\r\n    gap = orders - customers\r\n```\r\nreturn _customers without orders_:\r\n\r\n    SELECT\r\n            c.name\r\n    FROM\r\n            customers c\r\n    WHERE\r\n            NOT c.id IN (\r\n                SELECT o.customer_id FROM orders o\r\n                WHERE o.status = 93\r\n            )\r\n---\r\n\r\n### 9 - Comparing objects\r\n\r\n9.1\r\n```\r\n        a1 = Select.parse('''\r\n                SELECT gender, Max(act.age) FROM Actor act\r\n                WHERE act.age <= 69 AND act.age >= 45\r\n                GROUP BY gender\r\n            ''')[0]\r\n\r\n        a2 = Select('Actor',\r\n            age=[ Between(45, 69), Max ],\r\n            gender=[GroupBy, Field]\r\n        )       \r\n```\r\n> **a1 == a2 # --- True!**\r\n\r\n\r\n\r\n9.2\r\n```\r\n    m1 = Select.parse(\"\"\"\r\n        SELECT title, release_date FROM Movie m ORDER BY release_date \r\n        WHERE m.genre = 'Sci-Fi' AND m.awards LIKE '%Oscar%'\r\n    \"\"\")[0]\r\n\r\n    m2 = Select.parse(\"\"\"\r\n        SELECT release_date, title\r\n        FROM Movie m\r\n        WHERE m.awards LIKE '%Oscar%' AND m.genre = 'Sci-Fi'\r\n        ORDER BY release_date \r\n    \"\"\")[0]\r\n```\r\n\r\n**m1 == m2  #  --- True!**\r\n\r\n\r\n9.3\r\n```\r\nbest_movies = SelectIN(\r\n    Review=Table('role'),\r\n    rate=[GroupBy, Having.avg(gt(4.5))]\r\n)\r\nm1 = Select(\r\n    Movie=Table('title,release_date'),\r\n    id=best_movies\r\n)\r\n\r\nsql = \"SELECT rev.role FROM Review rev GROUP BY rev.rate HAVING Avg(rev.rate) > 4.5\"\r\nm2 = Select(\r\n    'Movie', release_date=Field, title=Field,\r\n    id=Where(f\"IN ({sql})\")\r\n)\r\n```\r\n**m1 == m2 # --- True!**\r\n\r\n---\r\n\r\n### 10 - CASE...WHEN...THEN\r\n    Select(\r\n        'Product',\r\n        label=Case('price').when(\r\n            lt(50), 'cheap'\r\n        ).when(\r\n            gt(100), 'expensive'\r\n        ).else_value(\r\n            'normal'\r\n        )\r\n    )\r\n\r\n* 10.1 - If the labels used in the CASE are based on ranges of values \u200b\u200bin sequence, you can use the **Range class**:\r\n\r\n        query = Select(\r\n            'People p',\r\n            age_group=Range('age',{  # <<---------- \r\n                'adult': 50,\r\n                'teenager': 17,\r\n                'child': 10,\r\n                'elderly': 70,\r\n                'young': 21,\r\n            })\r\n        )\r\nis equivalent to...\r\n```\r\n        SELECT\r\n                CASE\r\n                    WHEN p.age BETWEEN 0 AND 10 THEN 'child'\r\n                    WHEN p.age BETWEEN 11 AND 17 THEN 'teenager'\r\n                    WHEN p.age BETWEEN 18 AND 21 THEN 'young'\r\n                    WHEN p.age BETWEEN 22 AND 50 THEN 'adult'\r\n                    WHEN p.age BETWEEN 51 AND 70 THEN 'elderly'\r\n                END AS age_group\r\n        FROM\r\n                People p\r\n```\r\n\r\n* 10.2 `If` class\r\n\r\nUsefull to conditional Sum, Avg, Count...\r\n\r\n**Example:**\r\n\r\n    Select('Loan', \r\n        penalty=If('days_late', Sum, gt(0))\r\n    )\r\n    # ...OR... penalty=If('days_late > 0', Sum)\r\n\r\nresults...\r\n```\r\nSELECT\r\n        Sum(CASE\r\n                WHEN days_late > 0 THEN penalty\r\n                ELSE 0\r\n        END)\r\nFROM\r\n        Emprestimo\r\n```\r\n\r\n---\r\n\r\n### 11 - optimize method\r\n    p1 = Select.parse(\"\"\"\r\n            SELECT * FROM Product p\r\n            WHERE (p.category = 'Gizmo'\r\n                    OR p.category = 'Gadget'\r\n                    OR p.category = 'Doohickey')\r\n                AND NOT price <= 387.64\r\n                AND YEAR(last_sale) = 2024\r\n            ORDER BY\r\n                category\r\n        \"\"\")[0]\r\n        p1.optimize() #  <<===============\r\n        p2 = Select.parse(\"\"\"\r\n            SELECT category FROM Product p\r\n            WHERE category IN ('Gizmo','Gadget','Doohickey')\r\n                and p.price > 387.64\r\n                and p.last_sale >= '2024-01-01'\r\n                and p.last_sale <= '2024-12-31'\r\n            ORDER BY p.category LIMIT 100\r\n        \"\"\")[0]\r\n        p1 == p2 # --- True!\r\n\r\n This will...\r\n* Replace `OR` conditions to `SELECT IN ...`\r\n* Put `LIMIT` if no fields or conditions defined;\r\n* Normalizes inverted conditions;\r\n* Auto includes fields present in `ORDER/GROUP BY`;\r\n* Replace `YEAR` function with date range comparison.\r\n\r\n> The method allows you to select which rules you want to apply in the optimization...Or define your own rules!\r\n\r\n>> NOTE: When a joined table is used only as a filter, it is possible that it can be changed to a sub-query:\r\n\r\n    query = Select(\r\n        'Installments i', due_date=Field,  customer=Select(\r\n            'Customer c', id=PrimaryKey,\r\n            name=endswith('Smith')\r\n        )\r\n    )\r\n    print(query)\r\n    print('-----')\r\n    query.optimize([RuleReplaceJoinBySubselect])\r\n    print(query)\r\n```\r\nSELECT\r\n        i.due_date\r\nFROM\r\n        Installments i\r\n        JOIN Customer c ON (i.customer = c.id)\r\nWHERE\r\n        c.name LIKE '%Smith'\r\n-----\r\nSELECT\r\n        i.due_date\r\nFROM\r\n        Installments i\r\nWHERE\r\n        i.customer IN (SELECT c.id FROM Customer c WHERE c.name LIKE '%Smith')\r\n```\r\n\r\n---\r\n\r\n### 12 - Adding multiple fields at once\r\n```\r\n    query = Select('post p')\r\n    query.add_fields(\r\n        'user_id, created_at',\r\n        [OrderBy, GroupBy]\r\n    )\r\n```\r\n...is the same as...\r\n```\r\n    query = Select(\r\n        'post p',\r\n        user_id=[Field, GroupBy, OrderBy],\r\n        created_at=[Field, GroupBy, OrderBy]\r\n    )\r\n```\r\n\r\n### 13 - Change parser engine\r\n```\r\na, c, m = Select.parse(\r\n    \"\"\"\r\n        Actor(name, id ?age = 40)\r\n        <- Cast(actor_id, movie_id) ->\r\n        Movie(id ^title)\r\n    \"\"\",\r\n    CypherParser\r\n    # ^^^ recognizes syntax like Neo4J queries\r\n)\r\n```\r\n\r\n**print(a+c+m)**\r\n```\r\nSELECT\r\n        act.name,\r\n        mov.title\r\nFROM\r\n        Cast cas\r\n        JOIN Movie mov ON (cas.movie_id = mov.id)\r\n        JOIN Actor act ON (cas.actor_id = act.id)\r\nWHERE\r\n        act.age = 40\r\nORDER BY\r\n        mov.title\r\n```\r\n---\r\n\r\n#### Separators and meaning:\r\n<a id=\"cypher_separators\"></a>\r\n\r\n* `(  )`  Delimits a table and its fields\r\n* `,` Separate fields\r\n* `?` For simple conditions (> < = <>)\r\n* `<-` connects to the table on the left\r\n* `->` connects to the table on the right\r\n* `^` Put the field in the ORDER BY clause\r\n* `@` Immediately after the table/field name, it indicates the grouping field.\r\n* `$` For SQL functions like **avg**$_field_, **sum**$_field_, **count**$_field_...\r\n* `*` Sets the primary key field.\r\n* `:` Allows you to assign an alias to the field or expression.\r\n\r\n\r\n---\r\n## `detect` function\r\n\r\nIt is useful to write a query in a few lines, without specifying the script type (cypher, mongoDB, SQL, Neo4J...)\r\n### Examples:\r\n\r\n> **13.1 - Relationship**\r\n```\r\nquery = detect(\r\n    'MATCH(c:Customer)<-[:Order]->(p:Product)RETURN c, p'\r\n)\r\nprint(query)\r\n```\r\n##### output:\r\n    SELECT * FROM\r\n        Order ord\r\n        LEFT JOIN Customer cus ON (ord.customer_id = cus.id)\r\n        RIGHT JOIN Product pro ON (ord.product_id = pro.id)\r\n> **13.2 - Grouping**\r\n```\r\nquery = detect(\r\n    'People@gender(avg$age?region=\"SOUTH\"^count$qtde)'\r\n)\r\nprint(query)\r\n```\r\n##### output:\r\n    SELECT\r\n            peo.gender,\r\n            Avg(peo.age),\r\n            Count(*) as qtde\r\n    FROM\r\n            People peo\r\n    WHERE\r\n            peo.region = \"SOUTH\"\r\n    GROUP BY\r\n            peo.gender\r\n    ORDER BY\r\n            peo.qtde\r\n\r\n> **13.3 - Many conditions...**\r\n```\r\n    print( detect('''\r\n        db.people.find({\r\n            {\r\n                $or: [\r\n                    {status:{$eq:\"B\"}},\r\n                    age:{$lt:50}\r\n                ]\r\n            },\r\n            age:{$gte:18},  status:{$eq:\"A\"}\r\n        },{\r\n            name: 1, user_id: 1\r\n        }).sort({\r\n            user_id: -1\r\n        })\r\n    ''') )\r\n```\r\n#### output:\r\n    SELECT\r\n            peo.name,\r\n            peo.user_id\r\n    FROM\r\n            people peo\r\n    WHERE\r\n            ( peo. = 'B' OR peo.age < 50 ) AND\r\n            peo.age >= 18 AND\r\n            peo.status = 'A'\r\n    ORDER BY\r\n            peo.user_id DESC\r\n\r\n> **13.4 - Relations with same table twice (or more)**\r\n\r\nAutomatically assigns aliases to each side of the relationship (In this example, one user invites another to add to their contact list)\r\n```\r\n    print( detect(\r\n        'User(^name,id) <-Contact(requester,guest)-> User(id,name)'\r\n       # ^^^ u1                                        ^^^ u2\r\n    ) )\r\n```\r\n    SELECT\r\n            u1.name,\r\n            u2.name\r\n    FROM\r\n            Contact con\r\n            RIGHT JOIN User u2 ON (con.guest = u2.id)\r\n            LEFT JOIN User u1 ON (con.requester = u1.id)\r\n    ORDER BY\r\n            u1.name\r\n\r\n---\r\n### `translate_to` method\r\nFrom a Select object, it returns the text to a script in any of the languages \u200b\u200bbelow:\r\n* QueryLanguage - default\r\n* MongoDBLanguage\r\n* Neo4JLanguage\r\n* DatabricksLanguage\r\n* PandasLanguage\r\n\r\n---\r\n### 14 - Window Function\r\n\r\nAggregation functions (Avg, Min, Max, Sum, Count) -- or Window functions (Lead, Lag, Row_Number, Rank) -- have the **over** method...\r\n\r\n    query=Select(\r\n        'Enrollment e',\r\n        payment=Sum().over(\r\n            student_id=Partition, due_date=OrderBy,\r\n            # _=Rows(Current(), Following(5)), \r\n               # ^^^-------> ROWS BETWEEN CURRENT ROW AND 5 FOLLOWING\r\n            # _=Rows(Preceding(3), Following()),\r\n               # ^^^-------> ROWS BETWEEN 3 PRECEDING AND UNBOUNDED FOLLOWING\r\n            # _=Rows(Preceding(3)) \r\n               # ^^^-------> ROWS 3 PRECEDING\r\n        ).As('sum_per_student')\r\n    )\r\n\r\n...that generates the following query:\r\n\r\n```\r\nSELECT\r\n        Sum(e.payment) OVER(\r\n                PARTITION BY student_id\r\n                ORDER BY due_date\r\n        ) as sum_per_student\r\nFROM\r\n        Enrollment e\r\n```\r\n---\r\n### 15 - The `As` method:\r\n    query=Select(\r\n        'Customers c',\r\n        phone=[\r\n            Not.is_null(),\r\n            SubString(1, 4).As('area_code', GroupBy)\r\n        ],\r\n        customer_id=[\r\n            Count().As('customer_count', OrderBy),\r\n            Having.count(gt(5))\r\n        ]\r\n    )\r\nYou can use the result of a function as a new field -- and optionally use it in ORDER BY and/or GROUP BY clause(s):\r\n```\r\nSELECT\r\n        SubString(c.phone, 1, 4) as area_code,\r\n        Count(c.customer_id) as customer_count\r\nFROM\r\n        Customers c\r\nWHERE\r\n        NOT c.phone IS NULL\r\nGROUP BY\r\n        area_code HAVING Count(c.customer_id) > 5\r\nORDER BY\r\n        customer_count\r\n```\r\n---\r\n### 16 - Function classes\r\nYou may use this functions:\r\n* SubString\r\n* Round\r\n* DateDiff\r\n* Year\r\n* Current_Date\r\n* Avg\r\n* Min\r\n* Max\r\n* Sum\r\n* Count\r\n* Lag\r\n* Lead\r\n* Row_Number\r\n* Rank\r\n* Coalesce\r\n* Cast\r\n> Some of these functions may vary in syntax depending on the database.\r\nFor example, if your query is going to run on Oracle, do the following:\r\n\r\n`Function.dialect = Dialect.ORACLE`\r\n\r\n\r\n>  Most of this functions you can use nested inside each other.\r\n*Example:*\r\n```\r\n    Select(...\r\n        event_date=Substring(\r\n            Cast(\"CHAR\"), 12, 19 \r\n        ).As('time')\r\n    )\r\n```\r\nResults...\r\n```\r\n    SELECT ...\r\n    SubString(Cast(event_date As char), 12, 19) as time\r\n```\r\n\r\n>> `Function.auto_convert` option (default: True)\r\n\r\n- Put Cast(...) when there is a difference between the types of the parameter and the return of the nested function\r\n```\r\nbirth=Round( DateDiff(Current_Date()) ).As('age')\r\n```\r\n...Returns...\r\n```\r\nSELECT\r\n    Round(\r\n        Cast(Current_Date() - p.birth As FLOAT)\r\n       /* ^^^  */\r\n    ) as age\r\n...\r\n```\r\n---\r\n\r\n### 16.1 - _GroupBy as instance_\r\n\r\nAnother way to use GroupBy is to pass functions as parameters:\r\n\r\n```\r\n    Function.dialect = Dialect.ORACLE\r\n    query = Select(\r\n        'Sales s',\r\n        ref_date=GroupBy(\r\n            ref_year=Year, qty_sold=Sum('quantity'),\r\n            vendor=Select(\r\n                'Vendor v',\r\n                id=[PrimaryKey, Field], name=Field\r\n            )\r\n        )\r\n    )\r\n    print(query)\r\n```\r\nresults..\r\n```\r\nSELECT\r\n        Extract(Year FROM s.ref_date) as ref_year,\r\n        Sum(quantity) as qty_sold,\r\n        v.id,\r\n        v.name\r\nFROM\r\n        Sales s\r\n        JOIN Vendor v ON (s.vendor = v.id)\r\nGROUP BY\r\n        ref_year,\r\n        v.id,\r\n        v.name\r\n```\r\n\r\n---\r\n### 17 - CTE and Recursive classes\r\n\r\n* **17.1 - _CTE class_**\r\n```\r\n    query = Select(\r\n        'SocialMedia s', post=Count, reaction=Sum, user=GroupBy\r\n    )\r\n    print( CTE('Metrics', [query]) )\r\n```\r\nThe result is...\r\n```\r\n    WITH Metrics AS (\r\n            SELECT Count(s.post), Sum(s.reaction) FROM SocialMedia s GROUP BY user\r\n    )SELECT * FROM Metrics\r\n```\r\n\r\n* **17.2 - _Recursive class_**\r\n```\r\nq1 = Select(\r\n    'SocialMedia me', name=[ eq(MY_NAME), Field ]\r\n)\r\nq2 = Select(\r\n    'SocialMedia you' name=Field, id=Where.formula('{af} = n.friend')\r\n)\r\nprint( Recursive('Network', [q1, q2]) )\r\n```\r\nThe result is...\r\n```\r\nWITH RECURSIVE Network AS (\r\n        SELECT me.name FROM SocialMedia me WHERE \r\n        me.name = 'J\u00falio Cascalles'\r\nUNION ALL\r\n        SELECT you.name FROM SocialMedia you , Network n\r\n        WHERE  you.id = n.friend\r\n)SELECT * FROM Network\r\n```\r\n\r\n* **17.2.1 - The `create` method** ... parameters :\r\n<a id=\"cte_create_method\"></a>\r\n    - name: The name of the CTE\r\n    - pattern: A cypher script that defines the tables used\r\n    - formula: The format for `Where.formula` method _(*)_\r\n    - init_value: The value for the condition in the first table\r\n    - format (optional): If tables are files or internet hiperlinks, you may especify the extension and/or folder...\r\n> Example: \r\n```\r\n    R = Recursive.create(\r\n        'Route R', 'Flyght(departure, arrival)',\r\n        '[2] = R.[1]',  'JFK',  format='.csv'\r\n    ) #                  ^^^--- Flyghts from JFK airport\r\n```\r\n\r\n_...Creates a recursive CTE called Route, using Flyght table, where the recursivity condition is Flyght.arrival equals to Route.departure_\r\n>> (*) -- Note that [1] and [2] refers to first field and second field. \ud83d\ude09\r\n\r\nResult:\r\n\r\n    WITH RECURSIVE Route AS (\r\n            SELECT f1.departure, f1.arrival\r\n            FROM Flyght.csv f1\r\n            WHERE f1.departure = 'JFK'\r\n    UNION ALL\r\n            SELECT f2.departure, f2.arrival\r\n            FROM Flyght.csv f2\r\n            , Route R\r\n            WHERE  f2.arrival = R.departure\r\n    )SELECT * FROM Route R\r\n\r\n**17.2.2 - The `join` method** \r\n\r\nIn the previous example, if you add this code...\r\n`R.join('Airport(*id,name)', 'departure, arrival', format='.csv')`\r\n\r\n...The result would be:\r\n\r\n    WITH RECURSIVE Route AS (\r\n        SELECT f1.departure, f1.arrival\r\n        FROM Flyght.csv f1\r\n        WHERE f1.departure = 'JFK'\r\n    UNION ALL\r\n        SELECT f2.departure, f2.arrival\r\n        FROM Flyght.csv f2\r\n        , Route R\r\n        WHERE  f2.arrival = R.departure\r\n    )SELECT\r\n        a1.name, a2.name\r\n    FROM\r\n        Route R\r\n        JOIN Airport.csv a2 ON (R.arrival = a2.id)\r\n        JOIN Airport.csv a1 ON (R.departure = a1.id)\r\n\r\n\r\n**17.2.3 - The `counter` method** \r\nAdds an increment field in queries inside CTE:\r\n> Examples:\r\n* `R.counter('stops', 0)` # -- counter starts with 0 and increment +1\r\n* `R2.counter('generation', 5, '- 1')` # -- for the code below...\r\n```\r\nR2 = Recursive.create(\r\n    'Ancestors a', 'People(id,name,father,mother,birth)',\r\n    '(% = a.father OR % = a.mother)', 32630, '.parquet'\r\n)\r\n```\r\n...Results:\r\n\r\n    WITH RECURSIVE Ancestors AS (\r\n        SELECT p1.id, p1.name, p1.father, p1.mother, p1.birth,\r\n        5 AS generation  /* <<---- Most current generation ------------*/\r\n        FROM People.parquet p1 WHERE p1.id = 32630\r\n    UNION ALL\r\n        SELECT p2.id, p2.name, p2.father, p2.mother, p2.birth,\r\n        (generation- 1) AS generation /* <<-- Previous generation -----*/\r\n        FROM People.parquet p2 , Ancestors a WHERE  (p2.id = a.father OR p2.id = a.mother)\r\n    )SELECT * FROM Ancestors a\r\n\r\n\r\n>> Note: Comments added later.\r\n---\r\n\r\n### 17.3 - CTEFactory class\r\nCTEFactory exchanges subqueries for CTEs, simply by passing the text of the \"dirty\" query:\r\n\r\n*Example*:\r\n```\r\nprint(\r\n        CTEFactory(\"\"\"\r\n            SELECT u001.name, agg_sales.total\r\n            FROM (\r\n                SELECT * FROM Users u\r\n                WHERE u.status = 'active'\r\n            ) AS u001\r\n            JOIN (\r\n                SELECT s.user_id, Sum(s.value) as total\r\n                FROM Sales s\r\n                GROUP BY s.user_id\r\n            )\r\n            As agg_sales\r\n            ON u001.id = agg_sales.user_id\r\n            ORDER BY u001.name\r\n        \"\"\")        \r\n)\r\n```\r\nresults...\r\n```\r\n    WITH u001 AS (\r\n        SELECT * FROM Users u\r\n        WHERE u.status = 'active'\r\n    ),\r\n    WITH agg_sales AS (\r\n        SELECT s.user_id, Sum(s.value) as total\r\n        FROM Sales s\r\n        GROUP BY s.user_id\r\n    )\r\n    SELECT\r\n            u001.name,\r\n            agg_sales.total\r\n    FROM\r\n            u001 u001\r\n            JOIN agg_sales agg_sales ON\r\n            (u001.id = agg_sales.user_id)\r\n    ORDER BY\r\n            u001.name\r\n```\r\n\r\n#### 17.3.1 - You can also pass a Cypher script like in the example below:\r\n\r\n    cte = CTEFactory(\"\"\"\r\n        Annual_Sales_per_Vendor[\r\n            Sales(\r\n                year$ref_date:ref_year@, sum$quantity:qty_sold,\r\n            vendor) <- Vendor(id, name:vendors_name@)\r\n        ]\r\n\r\n        [-1](**, ref_year) -> Goal(year, target)\r\n    \"\"\")\r\n    print(cte)\r\n\r\nresults...\r\n```\r\nWITH Annual_Sales_per_Vendor AS (\r\n    SELECT ven.name as vendors_name\r\n    , Year(sal.ref_date) as ref_year\r\n    , Sum(sal.quantity) as qty_sold\r\n    FROM Vendor ven LEFT JOIN Sales sal ON (ven.id = sal.vendor\r\n    GROUP BY ven.name, ref_year\r\n)\r\nSELECT\r\n        aspv.vendors_name,\r\n        aspv.ref_year,\r\n        aspv.qty_sold,\r\n        goa.target\r\nFROM\r\n        Annual_Sales_per_Vendor aspv\r\n        RIGHT JOIN Goal goa ON (aspv.ref_year = goa.year)\r\n```\r\nFor more details, see the [Cypher syntax](#cypher_separators) or [CTE create method](#cte_create_method)!\r\n\r\n\r\n---\r\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Allows you to create objects for parts of SQL query commands. Also to combine these objects by joining them, adding or removing parts...",
    "version": "1.20250726",
    "project_urls": {
        "Homepage": "https://github.com/julio-cascalles/sql_blocks"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5064b05117f784d0828bf400a1f0ce3dffceee69e8460c72230e44717be3c9aa",
                "md5": "9dd3453c03f7a91d416083e55bd7ea25",
                "sha256": "f9c9cab23821d7eea8a0ffcd1e2be3e2fed2bbab0ff9cfbf3120e2724feef62b"
            },
            "downloads": -1,
            "filename": "sql_blocks-1.20250726-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9dd3453c03f7a91d416083e55bd7ea25",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 30832,
            "upload_time": "2025-07-26T16:37:00",
            "upload_time_iso_8601": "2025-07-26T16:37:00.246858Z",
            "url": "https://files.pythonhosted.org/packages/50/64/b05117f784d0828bf400a1f0ce3dffceee69e8460c72230e44717be3c9aa/sql_blocks-1.20250726-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0b3f1f794f04f4d5ba56665c5a7b37b92834fafc2a1f63734b5c6c2aa34dfb10",
                "md5": "ace0d97c450330c1128bead73329d9f5",
                "sha256": "c95c4b5ccc36f618d50075658d6ff3546bed62b2bee8977ec9cef8399d683282"
            },
            "downloads": -1,
            "filename": "sql_blocks-1.20250726.tar.gz",
            "has_sig": false,
            "md5_digest": "ace0d97c450330c1128bead73329d9f5",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 39240,
            "upload_time": "2025-07-26T16:37:01",
            "upload_time_iso_8601": "2025-07-26T16:37:01.655025Z",
            "url": "https://files.pythonhosted.org/packages/0b/3f/1f794f04f4d5ba56665c5a7b37b92834fafc2a1f63734b5c6c2aa34dfb10/sql_blocks-1.20250726.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-26 16:37:01",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "julio-cascalles",
    "github_project": "sql_blocks",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "sql-blocks"
}

Júlio Cascalles