strictjson

- **Version**: 5.1.3
- **Summary**: A Strict JSON Framework for LLM Outputs, that fixes problems that json.loads() cannot solve
- **Upload time**: 2024-08-05 02:32:18
- **Author email**: John Tan Chong Min \<tanchongmin@gmail.com\>
- **Requires Python**: >=3.8
            # Strict JSON v5.1.3
[UPDATE]: For an agentic framework, check out TaskGen (the official agentic framework built on StrictJSON). This keeps the StrictJSON repo neater, and this GitHub repo will focus on using StrictJSON for LLM output parsing
- https://github.com/simbianai/taskgen

### A Strict JSON Framework for LLM Outputs, that fixes problems that json.loads() cannot solve
- Works for JSON outputs with multiple ' or " or { or } or \ or unmatched braces/brackets that may break a json.loads()

### Base Functionalities (see Tutorial.ipynb)
- Ensures LLM outputs conform to a dictionary based on a JSON format (HUGE: nested lists and dictionaries now supported)
- Supports `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Enum[]`, `bool` type forcing with LLM-based error correction, as well as LLM-based error correction using `type: ensure <restriction>`, and (advanced) custom user checks using `custom_checks`
- Easy construction of LLM-based functions using ```Function``` (Note: renamed from `strict_function` to keep in line with naming convention of capitalised class groups. `strict_function` still works for legacy support.)
- Easy integration with OpenAI JSON Mode by setting `openai_json_mode = True`
- Exposes the `llm` variable in `strict_json` and `Function` for easy use of self-defined LLMs
- `AsyncFunction` and `strict_json_async` for async (and faster) processing

### Tutorials and Community Support
- Created: 7 Apr 2023
- Collaborators welcome
- Video tutorial (Ask Me Anything): [https://www.youtube.com/watch?v=L4aytve5v1Q](https://www.youtube.com/watch?v=L4aytve5v1Q)
- Video tutorial: [https://www.youtube.com/watch?v=1N-znDTlhNc](https://www.youtube.com/watch?v=1N-znDTlhNc)
- Discussion Channel (my Discord - John's AI Group): [discord.gg/bzp87AHJy5](https://discord.gg/bzp87AHJy5)

## How do I use this? 
1. Install the package via the command line: ```pip install strictjson```
2. Import the required functions from ```strictjson```
3. Set up the relevant API Keys for your LLM if needed. Refer to ```Tutorial.ipynb``` for how to do it for Jupyter Notebooks.

## How does it work?
- Extracts JSON values as a string using a special regex (delimiters are added to ```key``` to make ```###key###```) to split keys and values. (New!) Also works for nested datatypes by splitting recursively.
- Uses ```ast.literal_eval``` to best match the extracted output value to a literal (e.g. int, string, dict).
- Ensures that all JSON fields are output by the LLM, with optional type checking; if a field is missing or fails a check, the error message is fed back to the LLM so it can iteratively correct its generation (default: 3 tries)
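To illustrate the idea (a minimal sketch, not the library's actual regex or parser), delimited keys can be split out and their values matched to Python literals like this:

```python
import ast
import re

def parse_value(raw: str):
    """Best-effort literal matching, falling back to the raw string."""
    try:
        return ast.literal_eval(raw.strip())
    except (ValueError, SyntaxError):
        return raw.strip()

# Illustrative LLM output using the ###key### delimiters
llm_output = "{'###Sentiment###': 'Positive', '###Words###': 7}"

# Split keys and values on the delimiters (simplified, non-nested case)
pairs = re.findall(r"###(\w+)###'?\s*:\s*([^,}]+)", llm_output)
result = {key: parse_value(val) for key, val in pairs}
print(result)  # {'Sentiment': 'Positive', 'Words': 7}
```

Because the delimiters, not the quotes and braces, carry the structure, stray quotes or unmatched brackets inside a value do not break the extraction the way they would break `json.loads()`.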

# Features
## 1. Basic Generation

- **system_prompt**: Describe what you want the LLM to become: "You are a \<purpose in life\>"
- **user_prompt**: The user input. Later, when we use it as a function, this is the function input
- **output_format**: JSON of output variables in a dictionary, with the key as the output key, and the value as the output description
    - The output keys will be preserved exactly, while the LLM will generate content to match the description of the value as best as possible
- **llm**: The llm you want to use. Takes in `system_prompt` and `user_prompt` and outputs the LLM-generated string

#### Example LLM Definition
```python
def llm(system_prompt: str, user_prompt: str) -> str:
    ''' Here, we use OpenAI for illustration, you can change it to your own LLM '''
    # ensure your LLM imports are all within this function
    from openai import OpenAI
    
    # define your own LLM here
    client = OpenAI()
    response = client.chat.completions.create(
        model='gpt-4o-mini',
        temperature = 0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content
```

#### Example Usage
```python
res = strict_json(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'},
                    llm = llm)
                                    
print(res)
```

#### Example Output
```{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}```

## 2. Advanced Generation
- More advanced demonstration involving code that would typically break ```json.loads()```

#### Example Usage
```python
res = strict_json(system_prompt = 'You are a code generator, generating code to fulfil a task',
                    user_prompt = 'Given array p, output a function named func_sum to return its sum',
                    output_format = {'Elaboration': 'How you would do it',
                                     'C': 'Code',
                                    'Python': 'Code'},
                    llm = llm)
                                    
print(res)
```

#### Example Output
```{'Elaboration': 'Use a loop to iterate through each element in the array and add it to a running total.', ```

```'C': 'int func_sum(int p[], int size) {\n    int sum = 0;\n    for (int i = 0; i < size; i++) {\n        sum += p[i];\n    }\n    return sum;\n}', ```

```'Python': 'def func_sum(p):\n    sum = 0\n    for num in p:\n        sum += num\n    return sum'}```

## 3. Type forcing output variables
- Generally, ```strict_json``` will infer the data type automatically for you for the output fields
- However, if you would like very specific data types, you can do data forcing using ```type: <data_type>``` at the last part of the output field description
- ```<data_type>``` must be of the form `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Array[]`, `Enum[]`, `bool` for type checking to work
- `code` removes all unicode escape characters that might interfere with normal code running
- `Enum` and `List` are not case sensitive, so `enum` and `list` work just as well
- For `Enum[list_of_category_names]`, it is best to give an "Other" category in case the LLM fails to classify correctly with the other options
- If `list` or `List[]` is not formatted correctly in the LLM's output, we will correct it by asking the LLM to list out the elements line by line
- For `dict`, we can further check whether keys are present using `Dict[list_of_key_names]`
- Other types will first be forced by rule-based conversion; any further errors will be fed into the LLM's error feedback mechanism
- If `<data_type>` is not one of the specified data types, it can still be useful to shape the output for the LLM; however, no type checking will be done
- Note: The LLM understands the word `Array` better than `List` since `Array` is the official JSON type, so in the backend, any type containing the word `List` will be converted to `Array`
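As a rough sketch of the rule-based conversion step described above (illustrative only; `force_type` is a hypothetical helper, not the library's implementation):

```python
import ast

def force_type(value, data_type: str):
    """Hypothetical rule-based conversion; in strict_json, any conversion
    error would instead be fed back to the LLM for correction."""
    data_type = data_type.lower()
    if data_type == "int":
        return int(value)
    if data_type == "float":
        return float(value)
    if data_type == "bool":
        # Accept common truthy spellings an LLM might emit
        return str(value).strip().lower() in ("true", "yes", "1")
    if data_type in ("list", "array"):
        return list(ast.literal_eval(value)) if isinstance(value, str) else list(value)
    return value

print(force_type("7", "int"), force_type("true", "bool"), force_type("[1, 2]", "list"))
# 7 True [1, 2]
```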

### LLM-based checks
- If you would like the LLM to ensure that a requirement is met, use `type: ensure <requirement>`
- This will run an LLM check on whether the requirement is met. If it is not, the LLM will generate what needs to be done to meet the requirement, and that feedback is fed into the error-correcting loop of `strict_json`
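The check-and-retry behaviour might be sketched like this (a simplified stand-in for the actual loop; `generate` and `check` below are stub callables, not real LLM calls):

```python
def generate_with_checks(generate, check, max_tries=3):
    """Regenerate with feedback until the check passes or tries run out."""
    feedback = ""
    output = None
    for _ in range(max_tries):
        output = generate(feedback)
        ok, feedback = check(output)
        if ok:
            break
    return output

# Stub generator that only meets the requirement once given feedback
def generate(feedback):
    return "lie about your age" if feedback else "a generic quote"

# Stub check standing in for the LLM judging "quote contains the word age"
def check(output):
    met = "age" in output
    return met, "" if met else "quote must contain the word age"

print(generate_with_checks(generate, check))  # lie about your age
```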

#### Example Usage 1
```python
res = strict_json(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment, type: Enum["Pos", "Neg", "Other"]',
                                    'Adjectives': 'Array of adjectives, type: List[str]',
                                    'Words': 'Number of words, type: int',
                                    'In English': 'Whether sentence is in English, type: bool'},
                  llm = llm)
                                    
print(res)
```

#### Example Output 1
```{'Sentiment': 'Pos', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7, 'In English': True}```

#### Example Usage 2
```python
res = strict_json(system_prompt = 'You are an expert at organising birthday parties',
                    user_prompt = 'Give me some information on how to organise a birthday',
                    output_format = {'Famous Quote about Age': 'type: ensure quote contains the word age',
                                    'Lucky draw numbers': '3 numbers from 1-50, type: List[int]',
                                    'Sample venues': 'Describe two venues, type: List[Dict["Venue", "Description"]]'},
                    llm = llm)

print(res)
```

#### Example Output 2
`Using LLM to check "The secret of staying young is to live honestly, eat slowly, and lie about your age. - Lucille Ball" to see if it adheres to "quote contains the word age" Requirement Met: True`


```{'Famous Quote about Age': 'The secret of staying young is to live honestly, eat slowly, and lie about your age. - Lucille Ball',```
```'Lucky draw numbers': [7, 21, 35],```

```'Sample venues': [{'Venue': 'Beachside Resort', 'Description': 'A beautiful resort with stunning views of the beach. Perfect for a summer birthday party.'}, {'Venue': 'Indoor Trampoline Park', 'Description': 'An exciting venue with trampolines and fun activities. Ideal for an active and energetic birthday celebration.'}]}```

## 4. Functions
- Enhances ```strict_json()``` with a function-like interface for repeated use of modular LLM-based functions (or wraps external functions)
- Use angle brackets <> to enclose input variable names. First input variable name to appear in `fn_description` will be first input variable and second to appear will be second input variable. For example, `fn_description = 'Adds up two numbers, <var1> and <var2>'` will result in a function with first input variable `var1` and second input variable `var2`
- (Optional) If you would like greater specificity in your function's input, you can describe the variable after the : in the input variable name, e.g. `<var1: an integer from 10 to 30>`. Here, `var1` is the input variable and `an integer from 10 to 30` is the description.
- (Optional) If your description of the variable is one of `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Array[]`, `Enum[]`, `bool`, we will enforce type checking when generating the function inputs in `get_next_subtask` method of the `Agent` class. Example: `<var1: int>`. Refer to Section 3. Type Forcing Output Variables for details.
- Inputs (primary):
    - **fn_description**: String. Function description to describe process of transforming input variables to output variables. Variables must be enclosed in <> and listed in order of appearance in function input.
        - New feature: If `external_fn` is provided and no `fn_description` is provided, we will automatically parse out the `fn_description` from the docstring of `external_fn`. The docstring should contain the names of all compulsory input variables
        - New feature: If `external_fn` is provided and no `output_format` is provided, then we will automatically derive the `output_format` from the function signature
    - **output_format**: Dict. Dictionary containing output variables names and description for each variable.
    
- Inputs (optional):
    - **examples** - Dict or List[Dict]. Examples in Dictionary form with the input and output variables (list if more than one)
    - **external_fn** - Python Function. If defined, instead of using LLM to process the function, we will run the external function. 
        If there are multiple outputs of this function, we will map it to the keys of `output_format` in a one-to-one fashion
    - **fn_name** - String. If provided, this will be the name of the function. Otherwise, if `external_fn` is provided, it will be the name of `external_fn`. Otherwise, we will use LLM to generate a function name from the `fn_description`
    - **kwargs** - Dict. Additional arguments you would like to pass on to the strict_json function
        
- Outputs:
    JSON of output variables in a dictionary (similar to ```strict_json```)
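The angle-bracket convention above might be implemented roughly like this (a sketch, not the library's actual parser; `parse_input_vars` is a hypothetical helper):

```python
import re

def parse_input_vars(fn_description: str):
    """Extract input variable names, in order of appearance,
    from <var> or <var: description> tags."""
    return [tag.split(":")[0].strip()
            for tag in re.findall(r"<([^<>]+)>", fn_description)]

print(parse_input_vars('Output a sentence with <obj> and <entity> in the style of <emotion>'))
# ['obj', 'entity', 'emotion']
print(parse_input_vars('Convert input <x: a binary number in base 2> to base 10'))
# ['x']
```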
    
#### Example Usage 1 (Description only)
```python
# basic configuration with variable names (in order of appearance in fn_description)
fn = Function(fn_description = 'Output a sentence with <obj> and <entity> in the style of <emotion>', 
                     output_format = {'output': 'sentence'},
                     llm = llm)

# Use the function
fn('ball', 'dog', 'happy') #obj, entity, emotion
```

#### Example Output 1
```{'output': 'The happy dog chased the ball.'}```

#### Example Usage 2 (Examples only)
```python
# Construct the function: infer pattern from just examples without description (here it is multiplication)
fn = Function(fn_description = 'Map <var1> and <var2> to output based on examples', 
                     output_format = {'output': 'final answer'}, 
                     examples = [{'var1': 3, 'var2': 2, 'output': 6}, 
                                 {'var1': 5, 'var2': 3, 'output': 15}, 
                                 {'var1': 7, 'var2': 4, 'output': 28}],
                     llm = llm)

# Use the function
fn(2, 10) #var1, var2
```

#### Example Output 2
```{'output': 20}```

#### Example Usage 3 (Description and Examples)
```python
# Construct the function: description and examples with variable names
# variable names will be referenced in order of appearance in fn_description
fn = Function(fn_description = 'Output the sum and difference of <num1> and <num2>', 
                 output_format = {'sum': 'sum of two numbers', 
                                  'difference': 'absolute difference of two numbers'},
                 examples = {'num1': 2, 'num2': 4, 'sum': 6, 'difference': 2},
                 llm = llm)

# Use the function
fn(3, 4) #num1, num2
```

#### Example Output 3
```{'sum': 7, 'difference': 1}```

#### Example Usage 4 (External Function with automatic inference of fn_description and output_format - Preferred)
```python
# Docstring should provide all input variables; otherwise we will add them in automatically
# We will ignore shared_variables, *args and **kwargs
# No need to define llm in Function for External Functions
from typing import List
def add_number_to_list(num1: int, num_list: List[int], *args, **kwargs) -> List[int]:
    '''Adds num1 to num_list'''
    num_list.append(num1)
    return num_list

fn = Function(external_fn = add_number_to_list)

# Show the processed function docstring
print(str(fn))

# Use the function
fn(3, [2, 4, 5])
```
#### Example Output 4
`Description: Adds <num1: int> to <num_list: list>`

`Input: ['num1', 'num_list']`

`Output: {'num_list': 'Array of numbers'}`

`{'num_list': [2, 4, 5, 3]}`
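Signature-based inference of the input variables might work roughly like this under the hood (a sketch using `inspect`; `infer_inputs` is a hypothetical helper, not the library's code):

```python
import inspect

def infer_inputs(fn):
    """Read compulsory input names from a function signature,
    skipping shared_variables, *args and **kwargs as the docs describe."""
    sig = inspect.signature(fn)
    return [name for name, p in sig.parameters.items()
            if p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
            and name != "shared_variables"]

def add_number_to_list(num1, num_list, *args, **kwargs):
    '''Adds num1 to num_list'''
    num_list.append(num1)
    return num_list

print(infer_inputs(add_number_to_list))  # ['num1', 'num_list']
```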

#### Example Usage 5 (External Function with manually defined fn_description and output_format - Legacy Approach)
```python
def binary_to_decimal(x):
    return int(str(x), 2)

# an external function with a single output variable, with an expressive variable description
fn = Function(fn_description = 'Convert input <x: a binary number in base 2> to base 10', 
            output_format = {'output1': 'x in base 10'},
            external_fn = binary_to_decimal,
            llm = llm)

# Use the function
fn(10) #x
```

#### Example Output 5
```{'output1': 2}```

## 5. Integrating with OpenAI JSON Mode
- If you want to use the OpenAI JSON Mode, you can simply add in ```openai_json_mode = True``` and set ```model = 'gpt-4-1106-preview'``` or ```model = 'gpt-3.5-turbo-1106'``` in ```strict_json``` or ```Function```
- We will set model to ```gpt-3.5-turbo-1106``` by default if you provide an invalid model
- This does not work with the `llm` variable
- Note that type checking does not work with OpenAI JSON Mode

#### Example Usage
```python
res = strict_json(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'},
                    model = 'gpt-3.5-turbo-1106', # Set the model
                    openai_json_mode = True) # Toggle this to True
                                    
print(res)
```

#### Example Output
```{'Sentiment': 'positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 6}```

## 6. Nested Outputs
- StrictJSON supports nested outputs like nested lists and dictionaries

#### Example Input
```python
res = strict_json(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': ['Type of Sentiment', 
                                                   'Strength of Sentiment, type: Enum[1, 2, 3, 4, 5]'],
                                    'Adjectives': "Name and Description as separate keys, type: List[Dict['Name', 'Description']]",
                                    'Words': {
                                        'Number of words': 'Word count', 
                                        'Language': {
                                              'English': 'Whether it is English, type: bool',
                                              'Chinese': 'Whether it is Chinese, type: bool'
                                                  },
                                        'Proper Words': 'Whether the words are proper in the native language, type: bool'
                                        }
                                    },
                 llm = llm)

print(res)
```

#### Example Output
`{'Sentiment': ['Positive', 3],`

`'Adjectives': [{'Name': 'beautiful', 'Description': 'pleasing to the senses'}, {'Name': 'sunny', 'Description': 'filled with sunshine'}],`

`'Words':`

`     {'Number of words': 6,`
    
`     'Language': {'English': True, 'Chinese': False},`

`     'Proper Words': True}`
    
`}`

## 7. Return as JSON
- By default, `strict_json` returns a Python dictionary (`return_as_json=False`)
- If you need the output as a JSON string instead, simply set `return_as_json=True`
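The difference is just a string versus a dictionary; conceptually:

```python
import json

# The default (return_as_json=False) gives a Python dictionary
result_dict = {'Sentiment': 'Positive', 'Words': 7}

# return_as_json=True would give the equivalent JSON string instead
result_json = json.dumps(result_dict)

print(type(result_json).__name__)              # str
print(json.loads(result_json) == result_dict)  # True
```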

## 8. Async Mode

- `AsyncFunction` and `strict_json_async`
    - These are the async equivalents of `Function` and `strict_json`
    - You will need to define an LLM that can operate in async mode
    - Everything is the same as the sync version of the functions, except you use the `await` keyword when calling `AsyncFunction` and `strict_json_async`
    
    
- Using async lets you run multiple processes in parallel simultaneously, resulting in a much faster workflow
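The speedup comes from `asyncio` concurrency: independent calls can be awaited together rather than one after another. A minimal sketch with a stand-in coroutine (not a real LLM call):

```python
import asyncio

async def fake_llm_call(prompt: str) -> str:
    # Stand-in for an awaited LLM request
    await asyncio.sleep(0.01)
    return f"response to {prompt}"

async def main():
    # Three calls run concurrently instead of sequentially
    return await asyncio.gather(*(fake_llm_call(p) for p in ("a", "b", "c")))

results = asyncio.run(main())
print(results)  # ['response to a', 'response to b', 'response to c']
```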

#### Example LLM in Async Mode
```python
async def llm_async(system_prompt: str, user_prompt: str):
    ''' Here, we use OpenAI for illustration, you can change it to your own LLM '''
    # ensure your LLM imports are all within this function
    from openai import AsyncOpenAI
    
    # define your own LLM here
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model='gpt-4o-mini',
        temperature = 0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content
```

#### Example Input (strict_json_async)
```python
res = await strict_json_async(system_prompt = 'You are a classifier',
                    user_prompt = 'It is a beautiful and sunny day',
                    output_format = {'Sentiment': 'Type of Sentiment',
                                    'Adjectives': 'Array of adjectives',
                                    'Words': 'Number of words'},
                                     llm = llm_async) # set this to your own LLM

print(res)
```

#### Example Output
`{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}`

#### Example Input (AsyncFunction)
```python
fn =  AsyncFunction(fn_description = 'Output a sentence with <obj> and <entity> in the style of <emotion>', 
                     output_format = {'output': 'sentence'},
                     llm = llm_async) # set this to your own LLM

res = await fn('ball', 'dog', 'happy') #obj, entity, emotion

print(res)
```

#### Example Output
`{'output': 'The dog happily chased the ball.'}`

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "strictjson",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": "John Tan Chong Min <tanchongmin@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/b2/5c/395b886b8156d818ce868c65ae5d0114f76b2eac67244e6611d658a41bc1/strictjson-5.1.3.tar.gz",
    "platform": null,
    "description": "# Strict JSON v5.1.3\n[UPDATE]: For Agentic Framework, do check out TaskGen (the official Agentic Framework building on StrictJSON). This will make the StrictJSON repo neater and this github will focus on using StrictJSON for LLM Output Parsing\n- https://github.com/simbianai/taskgen\n\n### A Strict JSON Framework for LLM Outputs, that fixes problems that json.loads() cannot solve\n- Works for JSON outputs with multiple ' or \" or { or } or \\ or unmatched braces/brackets that may break a json.loads()\n\n### Base Functionalities (see Tutorial.ipynb)\n- Ensures LLM outputs into a dictionary based on a JSON format (HUGE: Nested lists and dictionaries now supported)\n- Supports `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Enum[]`, `bool` type forcing with LLM-based error correction, as well as LLM-based error correction using `type: ensure <restriction>`, and (advanced) custom user checks using `custom_checks`\n- Easy construction of LLM-based functions using ```Function``` (Note: renamed from `strict_function` to keep in line with naming convention of capitalised class groups. `strict_function` still works for legacy support.)\n- Easy integration with OpenAI JSON Mode by setting `openai_json_mode = True`\n- Exposing of llm variable for `strict_json` and `Function` for easy use of self-defined LLMs\n- `AsyncFunction` and `strict_json_async` for async (and faster) processing\n\n### Tutorials and Community Support\n- Created: 7 Apr 2023\n- Collaborators welcome\n- Video tutorial (Ask Me Anything): [https://www.youtube.com/watch?v=L4aytve5v1Q](https://www.youtube.com/watch?v=L4aytve5v1Q)\n- Video tutorial: [https://www.youtube.com/watch?v=IjTUKAciTCg](https://www.youtube.com/watch?v=1N-znDTlhNc)\n- Discussion Channel (my discord - John's AI Group): [discord.gg/bzp87AHJy5](discord.gg/bzp87AHJy5)\n\n## How do I use this? \n1. Download package via command line ```pip install strictjson```\n2. 
Import the required functions from ```strictjson```\n3. Set up the relevant API Keys for your LLM if needed. Refer to ```Tutorial.ipynb``` for how to do it for Jupyter Notebooks.\n\n## How does it work?\n- Extract JSON values as a string using a special regex (add delimiters to ```key``` to make ```###key###```) to split keys and values. (New!) Also works for nested datatypes by splitting recursively.\n- Uses ```ast.literal_eval``` to best match the extracted output value to a literal (e.g. int, string, dict).\n- Ensures that all JSON fields are output by LLM, with optional type checking, if not it will feed in error message to LLM to iteratively correct its generation (default: 3 tries)\n\n# Features:\n# 1. Basic Generation\n\n- **system_prompt**: Write in whatever you want the LLM to become. \"You are a \\<purpose in life\\>\"\n- **user_prompt**: The user input. Later, when we use it as a function, this is the function input\n- **output_format**: JSON of output variables in a dictionary, with the key as the output key, and the value as the output description\n    - The output keys will be preserved exactly, while the LLM will generate content to match the description of the value as best as possible\n- **llm**: The llm you want to use. 
Takes in `system_prompt` and `user_prompt` and outputs the LLM-generated string\n\n#### Example LLM Definition\n```python\ndef llm(system_prompt: str, user_prompt: str) -> str:\n    ''' Here, we use OpenAI for illustration, you can change it to your own LLM '''\n    # ensure your LLM imports are all within this function\n    from openai import OpenAI\n    \n    # define your own LLM here\n    client = OpenAI()\n    response = client.chat.completions.create(\n        model='gpt-4o-mini',\n        temperature = 0,\n        messages=[\n            {\"role\": \"system\", \"content\": system_prompt},\n            {\"role\": \"user\", \"content\": user_prompt}\n        ]\n    )\n    return response.choices[0].message.content\n```\n\n#### Example Usage\n```python\nres = strict_json(system_prompt = 'You are a classifier',\n                    user_prompt = 'It is a beautiful and sunny day',\n                    output_format = {'Sentiment': 'Type of Sentiment',\n                                    'Adjectives': 'Array of adjectives',\n                                    'Words': 'Number of words'},\n                    llm = llm)\n                                    \nprint(res)\n```\n\n#### Example Output\n```{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}```\n\n## 2. 
Advanced Generation\n- More advanced demonstration involving code that would typically break ```json.loads()```\n\n#### Example Usage\n```python\nres = strict_json(system_prompt = 'You are a code generator, generating code to fulfil a task',\n                    user_prompt = 'Given array p, output a function named func_sum to return its sum',\n                    output_format = {'Elaboration': 'How you would do it',\n                                     'C': 'Code',\n                                    'Python': 'Code'},\n                    llm = llm)\n                                    \nprint(res)\n```\n\n#### Example Output\n```{'Elaboration': 'Use a loop to iterate through each element in the array and add it to a running total.', ```\n\n```'C': 'int func_sum(int p[], int size) {\\n    int sum = 0;\\n    for (int i = 0; i < size; i++) {\\n        sum += p[i];\\n    }\\n    return sum;\\n}', ```\n\n```'Python': 'def func_sum(p):\\n    sum = 0\\n    for num in p:\\n        sum += num\\n    return sum'}```\n\n## 3. 
Type forcing output variables\n- Generally, ```strict_json``` will infer the data type automatically for you for the output fields\n- However, if you would like very specific data types, you can do data forcing using ```type: <data_type>``` at the last part of the output field description\n- ```<data_type>``` must be of the form `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Array[]`, `Enum[]`, `bool` for type checking to work\n- `code` removes all unicode escape characters that might interfere with normal code running\n- The `Enum` and `List` are not case sensitive, so `enum` and `list` works just as well\n- For `Enum[list_of_category_names]`, it is best to give an \"Other\" category in case the LLM fails to classify correctly with the other options.\n- If `list` or `List[]` is not formatted correctly in LLM's output, we will correct it by asking the LLM to list out the elements line by line\n- For `dict`,  we can further check whether keys are present using `Dict[list_of_key_names]`\n- Other types will first be forced by rule-based conversion, any further errors will be fed into LLM's error feedback mechanism\n- If `<data_type>` is not the specified data types, it can still be useful to shape the output for the LLM. However, no type checking will be done.\n- Note: LLM understands the word `Array` better than `List` since `Array` is the official JSON object type, so in the backend, any type with the word `List` will be converted to `Array`.\n\n### LLM-based checks\n- If you would like the LLM to ensure that the type is being met, use `type: ensure <requirement>`\n- This will run a LLM to check if the requirement is met. 
If requirement is not met, the LLM will generate what needs to be done to meet the requirement, which will be fed into the error-correcting loop of `strict_json`\n\n#### Example Usage 1\n```python\nres = strict_json(system_prompt = 'You are a classifier',\n                    user_prompt = 'It is a beautiful and sunny day',\n                    output_format = {'Sentiment': 'Type of Sentiment, type: Enum[\"Pos\", \"Neg\", \"Other\"]',\n                                    'Adjectives': 'Array of adjectives, type: List[str]',\n                                    'Words': 'Number of words, type: int',\n                                    'In English': 'Whether sentence is in English, type: bool'},\n                  llm = llm)\n                                    \nprint(res)\n```\n\n#### Example Output 1\n```{'Sentiment': 'Pos', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7, 'In English': True}```\n\n#### Example Usage 2\n```python\nres = strict_json(system_prompt = 'You are an expert at organising birthday parties',\n                    user_prompt = 'Give me some information on how to organise a birthday',\n                    output_format = {'Famous Quote about Age': 'type: ensure quote contains the word age',\n                                    'Lucky draw numbers': '3 numbers from 1-50, type: List[int]',\n                                    'Sample venues': 'Describe two venues, type: List[Dict[\"Venue\", \"Description\"]]'},\n                    llm = llm)\n\nprint(res)\n```\n\n#### Example Output 2\n`Using LLM to check \"The secret of staying young is to live honestly, eat slowly, and lie about your age. - Lucille Ball\" to see if it adheres to \"quote contains the word age\" Requirement Met: True`\n\n\n```{'Famous Quote about Age': 'The secret of staying young is to live honestly, eat slowly, and lie about your age. 
- Lucille Ball',```\n```'Lucky draw numbers': [7, 21, 35],```\n\n```'Sample venues': [{'Venue': 'Beachside Resort', 'Description': 'A beautiful resort with stunning views of the beach. Perfect for a summer birthday party.'}, {'Venue': 'Indoor Trampoline Park', 'Description': 'An exciting venue with trampolines and fun activities. Ideal for an active and energetic birthday celebration.'}]}```\n\n## 4. Functions\n- Enhances ```strict_json()``` with a function-like interface for repeated use of modular LLM-based functions (or wraps external functions)\n- Use angle brackets <> to enclose input variable names. First input variable name to appear in `fn_description` will be first input variable and second to appear will be second input variable. For example, `fn_description = 'Adds up two numbers, <var1> and <var2>'` will result in a function with first input variable `var1` and second input variable `var2`\n- (Optional) If you would like greater specificity in your function's input, you can describe the variable after the : in the input variable name, e.g. `<var1: an integer from 10 to 30>`. Here, `var1` is the input variable and `an integer from 10 to 30` is the description.\n- (Optional) If your description of the variable is one of `int`, `float`, `str`, `dict`, `list`, `array`, `code`, `Dict[]`, `List[]`, `Array[]`, `Enum[]`, `bool`, we will enforce type checking when generating the function inputs in `get_next_subtask` method of the `Agent` class. Example: `<var1: int>`. Refer to Section 3. Type Forcing Output Variables for details.\n- Inputs (primary):\n    - **fn_description**: String. Function description to describe process of transforming input variables to output variables. Variables must be enclosed in <> and listed in order of appearance in function input.\n        - New feature: If `external_fn` is provided and no `fn_description` is provided, then we will automatically parse out the fn_description based on docstring of `external_fn`. 
          The docstring should contain the names of all compulsory input variables.
        - New feature: If `external_fn` is provided and no `output_format` is provided, we will automatically derive the `output_format` from the function signature
    - **output_format**: Dict. Dictionary containing output variable names and a description for each variable.

- Inputs (optional):
    - **examples** - Dict or List[Dict]. Examples in dictionary form with the input and output variables (use a list if there is more than one example)
    - **external_fn** - Python Function. If defined, instead of using the LLM to process the function, we will run the external function.
        If there are multiple outputs of this function, we will map them to the keys of `output_format` in a one-to-one fashion
    - **fn_name** - String. If provided, this will be the name of the function. Otherwise, if `external_fn` is provided, it will be the name of `external_fn`. Otherwise, we will use the LLM to generate a function name from the `fn_description`
    - **kwargs** - Dict.
      Additional arguments you would like to pass on to the `strict_json` function

- Outputs:
    JSON of output variables in a dictionary (similar to ```strict_json```)

#### Example Usage 1 (Description only)
```python
# basic configuration with variable names (in order of appearance in fn_description)
fn = Function(fn_description = 'Output a sentence with <obj> and <entity> in the style of <emotion>',
              output_format = {'output': 'sentence'},
              llm = llm)

# Use the function
fn('ball', 'dog', 'happy') #obj, entity, emotion
```

#### Example Output 1
```{'output': 'The happy dog chased the ball.'}```

#### Example Usage 2 (Examples only)
```python
# Construct the function: infer pattern from just examples without description (here it is multiplication)
fn = Function(fn_description = 'Map <var1> and <var2> to output based on examples',
              output_format = {'output': 'final answer'},
              examples = [{'var1': 3, 'var2': 2, 'output': 6},
                          {'var1': 5, 'var2': 3, 'output': 15},
                          {'var1': 7, 'var2': 4, 'output': 28}],
              llm = llm)

# Use the function
fn(2, 10) #var1, var2
```

#### Example Output 2
```{'output': 20}```

#### Example Usage 3 (Description and Examples)
```python
# Construct the function: description and examples with variable names
# variable names will be referenced in order of appearance in fn_description
fn = Function(fn_description = 'Output the sum and difference of <num1> and <num2>',
              output_format = {'sum': 'sum of two numbers',
                               'difference': 'absolute difference of two numbers'},
              examples = {'num1': 2, 'num2': 4, 'sum': 6, 'difference': 2},
              llm = llm)

# Use the function
fn(3, 4) #num1, num2
```

#### Example Output 3
```{'sum': 7, 'difference': 1}```

#### Example Usage 4 (External Function with automatic inference of fn_description and output_format - Preferred)
```python
# Docstring should provide all input variables, otherwise we will add it in automatically
# We will ignore shared_variables, *args and **kwargs
# No need to define llm in Function for External Functions
from typing import List

def add_number_to_list(num1: int, num_list: List[int], *args, **kwargs) -> List[int]:
    '''Adds num1 to num_list'''
    num_list.append(num1)
    return num_list

fn = Function(external_fn = add_number_to_list)

# Show the processed function docstring
print(str(fn))

# Use the function
fn(3, [2, 4, 5])
```

#### Example Output 4
`Description: Adds <num1: int> to <num_list: list>`

`Input: ['num1', 'num_list']`

`Output: {'num_list': 'Array of numbers'}`

`{'num_list': [2, 4, 5, 3]}`

#### Example Usage 5 (External Function with manually defined fn_description and output_format - Legacy Approach)
```python
def binary_to_decimal(x):
    return int(str(x), 2)

# an external function with a single output variable, with an expressive variable description
fn = Function(fn_description = 'Convert input <x: a binary number in base 2> to base 10',
              output_format = {'output1': 'x in base 10'},
              external_fn = binary_to_decimal,
              llm = llm)

# Use the function
fn(10) #x
```

#### Example Output 5
```{'output1': 2}```

## 5. Integrating with OpenAI JSON Mode
- If you want to use the OpenAI JSON Mode, simply add ```openai_json_mode = True``` and set ```model = 'gpt-4-1106-preview'``` or ```model = 'gpt-3.5-turbo-1106'``` in ```strict_json``` or ```Function```
- We will set the model to ```gpt-3.5-turbo-1106``` by default if you provide an invalid model
- This does not work with the `llm` variable
- Note that type checking does not work with OpenAI JSON Mode

#### Example Usage
```python
res = strict_json(system_prompt = 'You are a classifier',
                  user_prompt = 'It is a beautiful and sunny day',
                  output_format = {'Sentiment': 'Type of Sentiment',
                                   'Adjectives': 'Array of adjectives',
                                   'Words': 'Number of words'},
                  model = 'gpt-3.5-turbo-1106', # Set the model
                  openai_json_mode = True) # Toggle this to True

print(res)
```

#### Example Output
```{'Sentiment': 'positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 6}```

## 6. Nested Outputs
- StrictJSON supports nested outputs like nested lists and dictionaries

#### Example Input
```python
res = strict_json(system_prompt = 'You are a classifier',
                  user_prompt = 'It is a beautiful and sunny day',
                  output_format = {'Sentiment': ['Type of Sentiment',
                                                 'Strength of Sentiment, type: Enum[1, 2, 3, 4, 5]'],
                                   'Adjectives': "Name and Description as separate keys, type: List[Dict['Name', 'Description']]",
                                   'Words': {
                                       'Number of words': 'Word count',
                                       'Language': {
                                           'English': 'Whether it is English, type: bool',
                                           'Chinese': 'Whether it is Chinese, type: bool'
                                       },
                                       'Proper Words': 'Whether the words are proper in the native language, type: bool'
                                   }
                                  },
                  llm = llm)

print(res)
```

#### Example Output
`{'Sentiment': ['Positive', 3],`

`'Adjectives': [{'Name': 'beautiful', 'Description': 'pleasing to the senses'}, {'Name': 'sunny', 'Description': 'filled with sunshine'}],`

`'Words':`

`     {'Number of words': 6,`

`     'Language': {'English': True, 'Chinese': False},`

`     'Proper Words': True}`

`}`

## 7. Return as JSON
- By default, `strict_json` returns a Python dictionary
- If you need the output as a JSON string instead, simply set `return_as_json=True` (it defaults to `False`)

## 8. Async Mode

- `AsyncFunction` and `strict_json_async`
    - These are the async equivalents of `Function` and `strict_json`
    - You will need to define an LLM that can operate in async mode
    - Everything is the same as the sync version of the functions, except you use the `await` keyword when calling `AsyncFunction` and `strict_json_async`

- Using async lets you run multiple LLM calls in parallel, resulting in a much faster workflow

#### Example LLM in Async Mode
```python
async def llm_async(system_prompt: str, user_prompt: str):
    ''' Here, we use OpenAI for illustration, you can change it to your own LLM '''
    # ensure your LLM imports are all within this function
    from openai import AsyncOpenAI

    # define your own LLM here
    client = AsyncOpenAI()
    response = await client.chat.completions.create(
        model='gpt-4o-mini',
        temperature = 0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response.choices[0].message.content
```

#### Example Input (strict_json_async)
```python
res = await strict_json_async(system_prompt = 'You are a classifier',
                              user_prompt = 'It is a beautiful and sunny day',
                              output_format = {'Sentiment': 'Type of Sentiment',
                                               'Adjectives': 'Array of adjectives',
                                               'Words': 'Number of words'},
                              llm = llm_async) # set this to your own LLM

print(res)
```

#### Example Output
`{'Sentiment': 'Positive', 'Adjectives': ['beautiful', 'sunny'], 'Words': 7}`

#### Example Input (AsyncFunction)
```python
fn = AsyncFunction(fn_description = 'Output a sentence with <obj> and <entity> in the style of <emotion>',
                   output_format = {'output': 'sentence'},
                   llm = llm_async) # set this to your own LLM

res = await fn('ball', 'dog', 'happy') #obj, entity, emotion

print(res)
```

#### Example Output
`{'output': 'The dog happily chased the ball.'}`
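To illustrate the parallel speedup concretely, the sketch below runs several async calls concurrently with `asyncio.gather`. Note that `stub_llm_async` and its canned reply are hypothetical stand-ins for a real async LLM call (such as the `llm_async` defined above); the point is only the concurrency pattern, not StrictJSON internals.

```python
import asyncio

async def stub_llm_async(system_prompt: str, user_prompt: str) -> str:
    # Hypothetical stand-in for a real async LLM call
    await asyncio.sleep(0.1)  # simulate network latency
    return '{"Sentiment": "Positive"}'

async def classify(sentence: str) -> str:
    return await stub_llm_async('You are a classifier', sentence)

async def main() -> list:
    # The three calls run concurrently, so total wall time is
    # roughly one call's latency (~0.1s) rather than the sum (~0.3s)
    return await asyncio.gather(
        classify('It is a beautiful and sunny day'),
        classify('The weather is terrible'),
        classify('Nothing much happened today'),
    )

results = asyncio.run(main())
print(results)
```

The same pattern applies to `strict_json_async` and `AsyncFunction` calls: gather the awaitables instead of awaiting them one at a time.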
    "bugtrack_url": null,
    "license": null,
    "summary": "A Strict JSON Framework for LLM Outputs, that fixes problems that json.loads() cannot solve",
    "version": "5.1.3",
    "project_urls": {
        "Homepage": "https://github.com/tanchongmin/strictjson",
        "Issues": "https://github.com/tanchongmin/strictjson/issues"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b8043201bd989e4860f84dfdac6d432907eb5dbd4eefb94a9c39a195bec5dde0",
                "md5": "ae6b5df521c476ad99b09e60ca312a11",
                "sha256": "bb5b661ea6ddc808da21d1d76598d58b73782a301da3428aaee2eafec5984f39"
            },
            "downloads": -1,
            "filename": "strictjson-5.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ae6b5df521c476ad99b09e60ca312a11",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 27012,
            "upload_time": "2024-08-05T02:32:16",
            "upload_time_iso_8601": "2024-08-05T02:32:16.436585Z",
            "url": "https://files.pythonhosted.org/packages/b8/04/3201bd989e4860f84dfdac6d432907eb5dbd4eefb94a9c39a195bec5dde0/strictjson-5.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b25c395b886b8156d818ce868c65ae5d0114f76b2eac67244e6611d658a41bc1",
                "md5": "dd21a04b4416658fabee43d3b752f21e",
                "sha256": "d319097b384f3ba0a1b00e1be7c43626bf892396cd358f53b3bc94f94be4e0a1"
            },
            "downloads": -1,
            "filename": "strictjson-5.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "dd21a04b4416658fabee43d3b752f21e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 27001,
            "upload_time": "2024-08-05T02:32:18",
            "upload_time_iso_8601": "2024-08-05T02:32:18.306232Z",
            "url": "https://files.pythonhosted.org/packages/b2/5c/395b886b8156d818ce868c65ae5d0114f76b2eac67244e6611d658a41bc1/strictjson-5.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-05 02:32:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tanchongmin",
    "github_project": "strictjson",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "strictjson"
}