gabiteodoru-parseq


- **Name**: gabiteodoru-parseq
- **Version**: 0.1.3
- **Home page**: https://github.com/gabiteodoru/parseq
- **Summary**: ParseQ: Q Language to Python Translator with AI-Powered Disambiguation
- **Upload time**: 2025-07-16 23:49:33
- **Maintainer**: None
- **Docs URL**: None
- **Author**: Gabi Teodoru
- **Requires Python**: >=3.8
- **License**: None
- **Keywords**: q kdb parseq compiler translator ai claude functional-programming
- **Requirements**: No requirements were recorded.

# ParseQ - Q Language to Python Translator

**An AI-Augmented Code Translation Pipeline**

ParseQ demonstrates a novel approach to program transformation: a **hybrid symbolic-neural transpilation system** that combines traditional compiler techniques with modern AI capabilities. This system showcases how to orchestrate multiple tools - AST parsers, rule-based transformers, and Large Language Models - to solve complex code translation problems that neither approach could handle alone.

## 🤯 Nested Chain of Thought Reasoning

When deployed as an MCP tool, ParseQ enables **recursive AI reasoning** - a fascinating example of collaborative AI workflows:

### **Multi-Level Reasoning Process:**
🧠 **Level 1**: Primary AI encounters complex q code: *"This is confusing, let me use the ParseQ tool..."*  
🔧 **Level 2**: ParseQ tool spins up isolated Claude session: *"I need to disambiguate these operators... looking at argument patterns... this `bang()` with positive int + table must be Enkey..."*  
💭 **Level 3**: Primary AI receives results: *"Based on the tool output, I can see this is actually a left join operation..."*

This creates **chain of thought squared** 🚀 - where one AI's reasoning process includes another AI's reasoning process, enabling:
- 🤖 **Distributed Reasoning**: Two AI instances collaborating on complex problems
- 🔄 **Recursive Problem Decomposition**: Delegating specialized analysis to expert AI sessions  
- 💡 **Meta-Cognitive Processing**: AI reasoning about AI reasoning
- 🛠️ **Collaborative AI Workflows**: Different AI instances specializing in different aspects

## What Makes This Interesting

This project represents an **AI-augmented pipeline** where:
- **Symbolic systems** handle the structured parsing and transformation
- **Neural systems** (Claude AI) resolve semantic ambiguities that are impossible to handle with pure rules
- **Tool orchestration** manages the multi-stage workflow with subprocess isolation and error handling

The result is a **hybrid symbolic-neural system** that translates q language expressions into readable, well-documented Python-like code by using AI-assisted disambiguation of heavily overloaded operators.

## Q Parsing Rules

The Python function `parseq0` hands the code to q for parsing and returns an expression made of lists (enclosed in `[]`), dictionaries (written `{ key_list : value_list }`), and atoms (everything else). Lists and dictionaries can be nested to any depth.
The rule is: in a list, the first entry is the function to be executed (whether or not it is marked as `Func`; `Func` simply means a built-in function or a function with an adverb), and all following entries are its parameters. A list with only one entry therefore represents a lambda (lazy evaluation). Everything inside a dictionary, however deeply it is nested, is also a lambda. Everything else is evaluated eagerly.

Here are some examples of what q does when `parseq0` is called:
```
q)var2string parse "f[(min s; min t)]"
"[Symbol[f], [Func[enlist], [Func[min], Symbol[s]], [Func[min], Symbol[t]]]]"
q)var2string parse "`s`t!(min s; min t)"
"[Func[!], [LSymbol[s, t]], [Func[enlist], [Func[min], Symbol[s]], [Func[min], Symbol[t]]]]"
q)var2string parse "`s`t!((min;s); (min;t))"
"[Func[!], [LSymbol[s, t]], [Func[enlist], [Func[enlist], Func[min], Symbol[s]], [Func[enlist], Func[min], Symbol[t]]]]"
q)var2string parse "select min s, maxs t from c"
"[Func[?], Symbol[c], [], Bool[0], {LSymbol[s, t]: [[Func[min], Symbol[s]], [Func[maxs], Symbol[t]]]}]"
q)var2string parse "exec min s from c"
"[Func[?], Symbol[c], [], [], [[Func[min], Symbol[s]]]]"
q)var2string parse "-1 \"asd\""
"[Long[-1], LChar[a, s, d]]"
q)var2string parse "f[min sums c]"
"[Symbol[f], [Func[min], [Func[sums], Symbol[c]]]]"
q)var2string parse "f[min x;max y]"
"[Symbol[f], [Func[min], Symbol[x]], [Func[max], Symbol[y]]]"
q)var2string parse "select min s from c"
"[Func[?], Symbol[c], [], Bool[0], {LSymbol[s]: [[Func[min], Symbol[s]]]}]"
```
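
To make the list/dictionary/atom rule concrete, here is a minimal sketch of a reader for the bracketed strings shown above. It is not the parser in `parseq.py`: the name `parse_node` and the `(tag, payload)` tuple shape are assumptions made for illustration, and payloads that themselves contain brackets are not handled.

```python
import re
from typing import Any, Tuple

# An atom looks like Tag[payload], e.g. Func[min] or LSymbol[s, t].
_ATOM = re.compile(r"([A-Za-z]+)\[([^\[\]]*)\]")


def parse_node(text: str, pos: int = 0) -> Tuple[Any, int]:
    """Parse one node starting at text[pos]; return (node, next_position)."""
    while text[pos] == " ":
        pos += 1
    if text[pos] == "[":  # list: [item, item, ...]
        items = []
        pos += 1
        while True:
            while text[pos] in " ,":
                pos += 1
            if text[pos] == "]":
                return items, pos + 1
            item, pos = parse_node(text, pos)
            items.append(item)
    if text[pos] == "{":  # dictionary: {keys: values}
        keys, pos = parse_node(text, pos + 1)
        while text[pos] in " :":
            pos += 1
        values, pos = parse_node(text, pos)
        while text[pos] == " ":
            pos += 1
        if text[pos] != "}":
            raise ValueError(f"expected '}}' at position {pos}")
        return {"keys": keys, "values": values}, pos + 1
    match = _ATOM.match(text, pos)  # atom: Tag[payload]
    if match is None:
        raise ValueError(f"unrecognised input at position {pos}")
    return (match.group(1), match.group(2)), match.end()


tree, _ = parse_node("[Symbol[f], [Func[min], [Func[sums], Symbol[c]]]]")
head, args = tree[0], tree[1:]  # first entry is the function, the rest are parameters
print(head, args)
```

The last two lines apply the rule directly: `head` is the function to execute and `args` are its parameters.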

## Overview

Q language operators are heavily overloaded - a single symbol like `!` or `?` can have 6+ different meanings depending on the number and types of arguments. This makes q code very difficult for non-experts to understand.

ParseQ solves this by:
1. **Parsing** q expressions into an Abstract Syntax Tree (AST)
2. **Flattening** nested function calls into step-by-step assignments
3. **Disambiguating** overloaded operators using Claude AI + documentation
4. **Generating** readable Python-like code with explanatory comments

## Architecture

```
Q Expression → AST Parser → Flattened Python → AI Disambiguation → Documented Code
```

### Example Transformation

**Input q code:**
```q
a lj 2!select min s, maxs t from c
```

**Parsed q code:**
```
[Func[lj], Symbol[a], [Func[!], Long[2], [Func[?], Symbol[c], [], Bool[0], {LSymbol[s, t] : [[Func[min], Symbol[s]], [Func[maxs], Symbol[t]]]}]]]
```

**Intermediate flattened code:**
```python
temp1 = query(`c, [], False, {[`s, `t]: [[min, `s], [maxs, `t]]})
temp2 = bang(2, temp1)
result = lj(`a, temp2)
```

**Final disambiguated code using orchestration:**
```python
# Select/Exec - functional qSQL query with 4 args (table, conditions, groupby, aggregations)
# Queries table `c with no conditions ([]), no groupby (False), 
# and aggregations that map columns s and t to min(s) and maxs(t)
temp1 = query(`c, [], False, {[`s, `t]: [[min, `s], [maxs, `t]]})

# Enkey - makes first 2 columns the key of the table (positive integer + table pattern)
temp2 = bang(2, temp1)

# Left join - joins table `a with temp2
result = lj(`a, temp2)
```

### Multiple Statement Example

**Input q code:**
```q
f:{x+1};f[5]
```

**Parsed q code:**
```
[Char[;], [Builtin[:], Symbol[f], Lambda[[x], [Builtin[+], Symbol[x], Long[1]]]], [Symbol[f], Long[5]]]
```

**Flattened code:**
```python
def f(x):
    return x + 1
f(5)
```

## Components

### 1. AST Parser (`parseq.py`)

- **Tokenizer**: Breaks q parse output into tokens
- **Parser**: Builds AST from bracketed LISP-like syntax
- **Node Types**: Symbol, Integer, Float, Boolean, String, Function, List, Dict
- **Type Conversions**:
  - `Symbol[name]` → `name` (variables - no backticks)
  - `Func[name]` → `name` (functions are variables)
  - `LSymbol[s,t]` → `` `s``, `` `t`` (individual symbols with backticks)
  - `Int[5]`, `Long[5]` → `5`
  - `Real[3]`, `Float[3.0]` → `3.0` (ensures decimal point)
  - `Bool[0]`, `Bool[1]` → `False`, `True`
  - `LLong[1,2,3]` → `[1, 2, 3]` (typed lists)
  - `Dict[keys, values]` → `{keys: values}`
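
The conversions listed above could be expressed roughly as follows. This is a sketch only: `render_atom` is a hypothetical helper, not a function exported by `parseq.py`, and it covers just the tags named in the list.

```python
def render_atom(tag: str, payload: str) -> str:
    """Render one Tag[payload] atom as Python-like source text."""
    if tag in ("Symbol", "Func"):
        return payload  # variables and functions: no backtick
    if tag == "LSymbol":
        # symbol literals keep their backticks
        return ", ".join("`" + name.strip() for name in payload.split(","))
    if tag in ("Int", "Long"):
        return str(int(payload))
    if tag in ("Real", "Float"):
        return repr(float(payload))  # repr guarantees a decimal point
    if tag == "Bool":
        return "True" if payload.strip() == "1" else "False"
    if tag == "LLong":
        return "[" + ", ".join(str(int(v)) for v in payload.split(",")) + "]"
    raise ValueError(f"unsupported tag: {tag}")


print(render_atom("Real", "3"))         # 3.0
print(render_atom("LLong", "1,2,3"))    # [1, 2, 3]
print(render_atom("LSymbol", "s, t"))   # `s, `t
```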

### 2. Flattening Engine

Converts deeply nested function calls into step-by-step assignments:
- Identifies nested function calls in AST
- Extracts intermediate expressions into temporary variables
- Generates linear sequence of assignment statements
- Maintains execution order and dependencies

**Smart Temporary Variable Management**:
- **Simple expressions**: `2+3` → `2 + 3` (no temp variables needed)
- **Nested expressions**: `1+2+3+4` → `temp1 = 3 + 4`, `temp2 = 2 + temp1`, `1 + temp2`
- **Final expression optimization**: Last temp variable is replaced with direct expression
- **Assignment handling**: Assignments return `None` to avoid extraneous output

**Output Examples**:
```python
# Simple arithmetic
pr('2+3')        # Output: 2 + 3

# Nested arithmetic  
pr('1+2+3+4')    # Output: temp1 = 3 + 4
                 #         temp2 = 2 + temp1
                 #         1 + temp2

# Variable assignment
pr('a:3')        # Output: a = 3

# Function definition
pr('f:{x+1}')    # Output: def f(x):
                 #             return x + 1

# Multiple statements (semicolon-separated)
pr('a:1;a+2')    # Output: a = 1
                 #         a + 2

# Function definition with call
pr('f:{x+1};f[5]')  # Output: def f(x):
                    #             return x + 1
                    #         f(5)
```
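
The temp-variable behaviour shown in these examples can be approximated with a short recursive walk. The sketch below assumes the AST is a nested tuple of the form `(function, arg, arg, ...)`; the name `flatten` and that tuple encoding are illustrative choices, not the data structures used by ParseQ itself. Arguments are visited left to right, which is a simplification.

```python
from typing import List

INFIX = {"+", "-", "*", "/"}


def flatten(expr) -> List[str]:
    """Turn a nested call tree into a linear list of statements."""
    statements: List[str] = []
    counter = 0

    def render(head: str, args: List[str]) -> str:
        if head in INFIX and len(args) == 2:
            return f"{args[0]} {head} {args[1]}"
        return f"{head}({', '.join(args)})"

    def visit(node, top: bool) -> str:
        nonlocal counter
        if not isinstance(node, tuple):
            return str(node)  # atom: use as-is
        head, *args = node
        text = render(head, [visit(arg, False) for arg in args])
        if top:
            return text  # final expression: no temp variable
        counter += 1
        name = f"temp{counter}"
        statements.append(f"{name} = {text}")
        return name

    statements.append(visit(expr, True))
    return statements


# 1+2+3+4 in q groups from the right: 1+(2+(3+4))
for line in flatten(("+", 1, ("+", 2, ("+", 3, 4)))):
    print(line)
```

Only nested calls receive a temp variable; the outermost expression is emitted directly, which matches the "final expression optimization" described above.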

**Multiple Statement Support**: Handles Q's semicolon-separated statements:
- **Parsing**: `Char[;]` is recognized as a sequence operator in the AST
- **Execution Order**: Statements are processed sequentially, maintaining Q semantics
- **Return Value**: Returns the result of the last expression (Q semantics)
- **Lambda Sequences**: Supports multi-statement lambdas like `{a:x;x+a}`

**LSymbol Flattening**: Special handling for `LSymbol[a,b,c]` nodes:
- Creates individual symbol nodes instead of a list container
- Flattens symbols directly into parent context during parsing
- Prevents double-nesting: `` f[`a`b`c] `` becomes `` f(`a, `b, `c) ``, not `` f([[`a, `b, `c]]) ``

### 3. Glyph Translation

Maps q operator glyphs to readable variable names (the generated code treats functions as ordinary variables):
```python
glyph_map = {
    '@': 'at', '!': 'bang', ':': 'colon', '::': 'colon_colon',
    '-': 'dash', '.': 'dot', '$': 'dollar', '#': 'hash', 
    '?': 'query', '_': 'underscore', ',': 'comma'
}
```

Examples:
- `#` becomes `hash`
- `,` becomes `comma`
- `!` becomes `bang`
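
A lookup with a pass-through default is enough to apply such a table. In the sketch below the dictionary is copied from the README listing above, while `readable_name` is a hypothetical helper name.

```python
GLYPH_NAMES = {
    '@': 'at', '!': 'bang', ':': 'colon', '::': 'colon_colon',
    '-': 'dash', '.': 'dot', '$': 'dollar', '#': 'hash',
    '?': 'query', '_': 'underscore', ',': 'comma',
}


def readable_name(name: str) -> str:
    """Translate a glyph to its readable name; leave ordinary names untouched."""
    return GLYPH_NAMES.get(name, name)


print(readable_name('!'))    # bang
print(readable_name('min'))  # min
```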

### 4. Claude AI Integration (`callclaude.py`)

- **Isolation**: Creates separate subdirectories for each Claude session
- **Process Management**: Runs `claude -p` in print mode to avoid interactive sessions
- **Error Handling**: Captures timeouts, command errors, and subprocess failures
- **Session Management**: Uses unique directory names to prevent conflicts

**Session Isolation Design**: ParseQ keeps Claude conversations separate by creating a temporary `claude_session_*` directory for each one. This guarantees exactly one conversation per directory, a form of isolation that the Claude Code CLI does not currently support natively. The temporary directories are intentionally left in place during development so that behavior can be monitored and debugged. Directory cleanup will be added in a future version once the system is mature.
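
The isolation pattern can be sketched with the standard library alone. The code below is not the contents of `callclaude.py`: `run_isolated` is a hypothetical name, passing the prompt as the final command-line argument is an assumption, and the directory is created under the system temp location rather than as a project subdirectory.

```python
import subprocess
import sys
import tempfile
from typing import List, Optional


def run_isolated(prompt: str, command: Optional[List[str]] = None,
                 timeout: float = 120.0) -> str:
    """Run a command in a fresh working directory and return its output."""
    command = command or ["claude", "-p"]  # print mode, as described above
    # One directory per call; deliberately not removed afterwards.
    workdir = tempfile.mkdtemp(prefix="claude_session_")
    try:
        completed = subprocess.run(
            command + [prompt],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"Error: command timed out after {timeout} seconds"
    except FileNotFoundError:
        return f"Error: command not found: {command[0]}"
    if completed.returncode != 0:
        return f"Error: exit code {completed.returncode}: {completed.stderr.strip()}"
    return completed.stdout.strip()


if __name__ == "__main__":
    # Stand-in command so the sketch runs without the Claude CLI installed.
    echo = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]
    print(run_isolated("hello", command=echo))  # prints HELLO
```

Making the command a parameter keeps the sketch testable without network access or an installed CLI.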

### 5. Disambiguation System (`disambiguate.py`)

- **Reference Documentation**: Uses `q_operators.md` as context for AI disambiguation
- **Prompt Engineering**: Constructs detailed prompts with code + operator reference
- **AI Processing**: Leverages Claude's understanding to resolve operator ambiguity
- **Comment Generation**: Produces explanatory comments for each operation
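
Prompt construction of this kind might look like the following. The wording of the instructions and the name `build_prompt` are assumptions for illustration; the actual prompt in `disambiguate.py` is not reproduced here.

```python
def build_prompt(flattened_code: str, operator_reference: str) -> str:
    """Combine the flattened code with the operator reference into one prompt."""
    sections = [
        "Replace every ambiguous operator name in the code below with its "
        "specific variant and add a comment explaining each step.",
        "## Operator reference\n" + operator_reference.strip(),
        "## Code\n" + flattened_code.strip(),
    ]
    return "\n\n".join(sections)


prompt = build_prompt(
    "temp2 = bang(2, temp1)",
    "bang(positive int, table): Enkey",  # stand-in for the text of q_operators.md
)
print(prompt)
```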

### 6. Operator Reference (`q_operators.md`)

Comprehensive documentation of q operator overloading patterns:
- **Arity-based disambiguation**: Different meanings by argument count
- **Type-based disambiguation**: Different meanings by argument types
- **Pattern recognition**: Specific argument patterns (e.g., `0` vs positive integer)
- **Functional qSQL**: Complex 4-6 argument query operations
- **Context clues**: Usage patterns that indicate specific variants

## Recent Enhancements

### Partial Application Detection
ParseQ now intelligently detects and handles Q's partial application semantics:

- **Built-in Glyph Rules**: Variable-arity operators like `at()`, `bang()`, and `query()` dispatch immediately once enough arguments are provided, and create partials only when the argument count is below their minimum arity
- **User Function Rules**: Fixed-arity functions create partials whenever fewer arguments than expected are provided  
- **Syntax Awareness**: Distinguishes between function calls (`func(args)`) and function object assignments (`result = func`)
- **Arity Documentation**: Comments explain the before/after arity relationships when partials are created

Example transformation:
```python
# Before partial analysis:
temp4 = func2(x)  # func2 defined with 2 parameters

# After partial analysis:  
# Partial application: func2 expects 2 args but only 1 provided
temp4 = partial(func2, x)
```
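
The user-function rule can be sketched with `functools.partial`. The helper `call_or_partial` is hypothetical, and the analogy is loose: `functools.partial` binds leading positional arguments, whereas a q projection can leave any argument position open.

```python
from functools import partial


def call_or_partial(func, arity: int, *args):
    """Call func if all arguments are present, otherwise return a partial."""
    if len(args) < arity:
        return partial(func, *args)
    return func(*args)


def func2(x, y):
    return x + y


temp4 = call_or_partial(func2, 2, 10)  # only 1 of 2 args: a partial
print(temp4(5))                        # 15
print(call_or_partial(func2, 2, 10, 5))  # 15
```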

### Enhanced Function Disambiguation  
Added comprehensive support for additional Q operators:

- **`slash()` Functional**: Now disambiguates to `converge()`, `do()`, `while()`, or `reduce()` based on input function arity and usage context
- **Three-Level Arity Analysis**: Handles the complex arity relationships in functionals (slash arity, input function arity, output function arity)
- **Mandatory Replacement**: Ensures no ambiguous function names remain in output
- **Consistent Application**: Improved prompt engineering for reliable disambiguation

Example `slash()` disambiguation:
```python
# Before: ambiguous functional
temp5 = slash(temp4)

# After: specific variant based on arity analysis  
# Disambiguation: slash(unary_function) called with 1 arg → Converge
temp5 = converge(temp4)
```
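
The arity analysis can be summarised as a small decision function. The rules below follow q's documented behaviour for the `/` iterator (a unary function converges, repeats, or loops on a condition; a function of higher arity reduces) and are an assumption about what the disambiguation step is asked to reproduce, not an extract from ParseQ's prompts. `classify_slash` and its parameters are hypothetical names.

```python
def classify_slash(func_arity: int, n_args: int, left_is_int: bool = False) -> str:
    """Pick a variant name for f/ from the arity of f and how it is called."""
    if func_arity >= 2:
        return "reduce"    # binary (or higher) function: fold over the list
    if n_args == 1:
        return "converge"  # unary function, one argument: iterate to a fixed point
    # Unary function with a left argument: an integer means "do n times",
    # anything else is treated as a loop condition.
    return "do" if left_is_int else "while"


print(classify_slash(1, 1))                    # converge
print(classify_slash(1, 2, left_is_int=True))  # do
print(classify_slash(2, 1))                    # reduce
```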

### Q Assignment Support
ParseQ now properly handles Q's unified assignment syntax for both variables and functions:

- **Variable Assignments**: `x:3` → `x = 3`
- **Function Assignments**: `f:{x+1}` → `def f(x): return x + 1`
- **Arity Detection**: Only colon operations with exactly 2 arguments are treated as assignments
- **Lambda Integration**: Function assignments use the variable name as the function name instead of auto-generated names
- **Clean Output**: Eliminates redundant assignment statements for function definitions

**Key Implementation Features:**
- **Pre-flattening Detection**: Checks AST node types before flattening to identify lambda assignments
- **Extensible kwargs System**: `flatten_ast` accepts kwargs for passing context (e.g., `func_name`)
- **Duplicate Prevention**: Processes lambda assignments once instead of twice
- **Glyph Mapping**: Removed `:` from glyph translation to enable proper assignment handling

Example transformations:
```python
# Variable assignment
# q input: x:3
x = 3

# Function assignment
# q input: f:{x+1}
def f(x):
    return x + 1

# Before assignment support, the same input produced:
#   result = colon(f, func1)
# After assignment support, it produces the `def f(x)` form shown above.
```

### Lambda Return Optimization
Enhanced lambda code generation to eliminate unnecessary temporary variables:

- **Smart Return Statements**: Replaces `temp{n} = expr` with `return expr` when possible
- **Consistent Pattern**: Mirrors the expression-level optimization for `result =` assignments
- **Multi-line Support**: Handles both simple and complex lambda bodies

Example optimization:
```python
# Before optimization:
def func1(x):
    temp1 = x + 1
    temp2 = temp1 * 2
    temp3 = temp2 / 3
    return temp3

# After optimization:
def func1(x):
    temp1 = x + 1
    temp2 = temp1 * 2
    return temp2 / 3
```
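
One way to implement this rewrite is as a pass over the generated lines. The sketch below assumes the function body is available as a list of source lines and that temporaries are named `temp<n>`; `inline_final_temp` is a hypothetical name, not ParseQ's internal function.

```python
import re
from typing import List

_ASSIGN = re.compile(r"^(\s*)(temp\d+) = (.+)$")


def inline_final_temp(body: List[str]) -> List[str]:
    """Merge a trailing 'tempN = expr' / 'return tempN' pair into 'return expr'."""
    if len(body) < 2:
        return body
    assign = _ASSIGN.match(body[-2])
    if assign and body[-1].strip() == f"return {assign.group(2)}":
        indent, _, expr = assign.groups()
        return body[:-2] + [f"{indent}return {expr}"]
    return body


before = [
    "    temp1 = x + 1",
    "    temp2 = temp1 * 2",
    "    temp3 = temp2 / 3",
    "    return temp3",
]
print("\n".join(inline_final_temp(before)))
```

Only the last temporary is folded into the `return`; earlier temporaries stay, which keeps intermediate values inspectable.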

### Standalone Function Output
Improved output formatting for standalone lambda expressions:

- **Clean Termination**: Eliminates confusing `result = func{n}` lines for standalone functions
- **Context Awareness**: Detects when final expression is a function definition
- **Consistent Behavior**: Maintains `result =` for non-function expressions

## Key Features

### Variable vs Symbol Distinction
ParseQ distinguishes between two fundamental Q concepts:
- **Variables**: Named references (functions, variables) → `name` (no backticks)
  - `Symbol[f]` becomes `f` 
  - `Func[min]` becomes `min`
- **Symbols**: Literal symbol values → `` `symbol`` (with backticks)
  - `LSymbol[a,b,c]` becomes `` `a``, `` `b``, `` `c``

This preserves Q's semantic distinction where `f` references a variable but `` `f`` is a symbol literal.

### Step-by-Step Execution
Instead of nested function calls, ParseQ generates linear assignment sequences that are:
- **Debuggable**: Can inspect intermediate values
- **Readable**: Clear data flow from top to bottom
- **Modifiable**: Easy to rearrange or comment individual steps

### AI-Powered Disambiguation
Uses Claude AI to resolve operator ambiguity by:
- Analyzing argument patterns and types
- Consulting comprehensive operator documentation
- Generating contextual explanations
- Adding semantic comments explaining actual operations

## Usage

### Basic Parsing
```python
from parseq import parseq0, convert_lisp_to_flat_statements

# Get raw q parse output
raw = parseq0('a lj 2!select min s, maxs t from c')

# Convert to flattened Python-like code
flattened = convert_lisp_to_flat_statements(raw)
print(flattened)
```

### Full Disambiguation
```python
from disambiguate import disambiguate_q_code

# Full pipeline: parse + flatten + disambiguate
result = disambiguate_q_code('a lj 2!select min s, maxs t from c')
print(result)
```

### Manual Claude Queries
```python
from callclaude import ask_claude

# Ask Claude questions about q operations
response = ask_claude("What does the q operator ! do with 2 arguments?")
print(response)
```

## Files

- **`parseq.py`**: Core AST parser, tokenizer, and flattening engine
- **`parseq.q`**: Q initialization script that creates function mappings, type definitions, and AST serialization utilities for converting q parse trees into string representations
- **`parseq_ns.q`**: Namespace-safe version of parseq.q with all globals prefixed with `.parseq.` to enable safe usage from remote q connections without namespace pollution
- **`callclaude.py`**: Claude AI integration for isolated subprocess calls
- **`disambiguate.py`**: Main disambiguation pipeline
- **`q_operators.md`**: Comprehensive q operator reference documentation
- **`README.md`**: This documentation

## Important Note

**Namespace Impact**: This Python package will create variables and functions in the `.parseq` namespace of your q session. While this is contained within a single namespace, users should be aware that some namespace modification occurs.

## Dependencies

- **Python 3.8+**: For dataclasses and type hints
- **Claude CLI**: Must have `claude` command available in PATH
- **Q/KDB+**: For generating parse trees (via qmcp connection)

## Limitations

- Requires q connection for parsing (uses q's built-in parser)
- Claude API calls can be slow (several seconds per disambiguation)
- Currently handles expressions, not full q scripts
- Some advanced q constructs may not be fully supported

## Future Enhancements

- **Caching**: Store disambiguation results to avoid repeated AI calls
- **Batch Processing**: Handle multiple expressions in single Claude call
- **Type Inference**: Better argument type detection for disambiguation
- **Full Script Support**: Handle q scripts with multiple statements
- **Optimization**: Reduce temporary variables where possible
- **Syntax Highlighting**: Visual indication of operator types in output

## Contributing

The system is modular and extensible:
1. **Parser Extensions**: Add new AST node types in `parseq.py`
2. **Operator Support**: Update `q_operators.md` with new operator documentation
3. **AI Prompting**: Improve disambiguation prompts in `disambiguate.py`
4. **Output Formats**: Add different code generation targets

## Examples

See the main functions for working examples:
- Basic parsing and flattening
- AI-powered disambiguation
- Symbol conversion and glyph mapping
- Step-by-step code generation

            

Raw data

- **Author email**: gabiteodoru@gmail.com
- **Download URL**: https://files.pythonhosted.org/packages/d6/03/c05a14431784664686f43a86519f1248ff82a4448273896b14e8c711e8e1/gabiteodoru_parseq-0.1.3.tar.gz
Basic Parsing\n```python\nfrom parseq import parseq0, convert_lisp_to_flat_statements\n\n# Get raw q parse output\nraw = parseq0('a lj 2!select min s, maxs t from c')\n\n# Convert to flattened Python-like code\nflattened = convert_lisp_to_flat_statements(raw)\nprint(flattened)\n```\n\n### Full Disambiguation\n```python\nfrom disambiguate import disambiguate_q_code\n\n# Full pipeline: parse + flatten + disambiguate\nresult = disambiguate_q_code('a lj 2!select min s, maxs t from c')\nprint(result)\n```\n\n### Manual Claude Queries\n```python\nfrom callclaude import ask_claude\n\n# Ask Claude questions about q operations\nresponse = ask_claude(\"What does the q operator ! do with 2 arguments?\")\nprint(response)\n```\n\n## Files\n\n- **`parseq.py`**: Core AST parser, tokenizer, and flattening engine\n- **`parseq.q`**: Q initialization script that creates function mappings, type definitions, and AST serialization utilities for converting q parse trees into string representations\n- **`parseq_ns.q`**: Namespace-safe version of parseq.q with all globals prefixed with `.parseq.` to enable safe usage from remote q connections without namespace pollution\n- **`callclaude.py`**: Claude AI integration for isolated subprocess calls\n- **`disambiguate.py`**: Main disambiguation pipeline\n- **`q_operators.md`**: Comprehensive q operator reference documentation\n- **`README.md`**: This documentation\n\n## Important Note\n\n**Namespace Impact**: This Python package will create variables and functions in the `.parseq` namespace of your q session. 
While this is contained within a single namespace, users should be aware that some namespace modification occurs.\n\n## Dependencies\n\n- **Python 3.8+**: For dataclasses and type hints\n- **Claude CLI**: Must have `claude` command available in PATH\n- **Q/KDB+**: For generating parse trees (via qmcp connection)\n\n## Limitations\n\n- Requires q connection for parsing (uses q's built-in parser)\n- Claude API calls can be slow (several seconds per disambiguation)\n- Currently handles expressions, not full q scripts\n- Some advanced q constructs may not be fully supported\n\n## Future Enhancements\n\n- **Caching**: Store disambiguation results to avoid repeated AI calls\n- **Batch Processing**: Handle multiple expressions in single Claude call\n- **Type Inference**: Better argument type detection for disambiguation\n- **Full Script Support**: Handle q scripts with multiple statements\n- **Optimization**: Reduce temporary variables where possible\n- **Syntax Highlighting**: Visual indication of operator types in output\n\n## Contributing\n\nThe system is modular and extensible:\n1. **Parser Extensions**: Add new AST node types in `parseq.py`\n2. **Operator Support**: Update `q_operators.md` with new operator documentation\n3. **AI Prompting**: Improve disambiguation prompts in `disambiguate.py`\n4. **Output Formats**: Add different code generation targets\n\n## Examples\n\nSee the main functions for working examples:\n- Basic parsing and flattening\n- AI-powered disambiguation\n- Symbol conversion and glyph mapping\n- Step-by-step code generation\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "ParseQ: Q Language to Python Translator with AI-Powered Disambiguation",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://github.com/gabiteodoru/parseq"
    },
    "split_keywords": [
        "q",
        "kdb",
        "parseq",
        "compiler",
        "translator",
        "ai",
        "claude",
        "functional-programming"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "73ce0dff2203c309295cb4544e6b663d6d77b7dc853050a1955c84ade165ef6d",
                "md5": "146905427b04106b54902c7e5ccc09e7",
                "sha256": "03f03808aaa3f74c7229adfcf3643764e76935dd22501355ec22a68a079b6108"
            },
            "downloads": -1,
            "filename": "gabiteodoru_parseq-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "146905427b04106b54902c7e5ccc09e7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 27276,
            "upload_time": "2025-07-16T23:49:32",
            "upload_time_iso_8601": "2025-07-16T23:49:32.259093Z",
            "url": "https://files.pythonhosted.org/packages/73/ce/0dff2203c309295cb4544e6b663d6d77b7dc853050a1955c84ade165ef6d/gabiteodoru_parseq-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d603c05a14431784664686f43a86519f1248ff82a4448273896b14e8c711e8e1",
                "md5": "250f879f0d6c6f4c73bc412b4284a09f",
                "sha256": "ddeb1491548d43909c50fa7d80dba3d0e693a6c183230f76c89c646e38311ac4"
            },
            "downloads": -1,
            "filename": "gabiteodoru_parseq-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "250f879f0d6c6f4c73bc412b4284a09f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 25920,
            "upload_time": "2025-07-16T23:49:33",
            "upload_time_iso_8601": "2025-07-16T23:49:33.556925Z",
            "url": "https://files.pythonhosted.org/packages/d6/03/c05a14431784664686f43a86519f1248ff82a4448273896b14e8c711e8e1/gabiteodoru_parseq-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-16 23:49:33",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "gabiteodoru",
    "github_project": "parseq",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "gabiteodoru-parseq"
}
        