| Field | Value |
| --- | --- |
| Name | llmprogram |
| Version | 0.1.1 |
| Summary | A Python package for creating and running LLM programs. |
| Author | Dipankar Sarkar |
| License | MIT |
| Requires Python | >=3.9 |
| Upload time | 2025-08-11 15:41:09 |
# LLM Program
`llmprogram` is a Python package that provides a structured and powerful way to create and run programs that use Large Language Models (LLMs). It uses a YAML-based configuration to define the behavior of your LLM programs, making them easy to create, manage, and share.
## How is `llmprogram` different?
There are many libraries and frameworks available for working with LLMs. Here’s what makes `llmprogram` different:
* **Focus on Programmatic LLM-Chains:** `llmprogram` is designed to create self-contained, reusable "programs" that can be chained together to build more complex applications. The YAML-based configuration makes it easy to define and version these programs.
* **Data Quality and Validation:** The built-in input and output validation using JSON schemas ensures that your programs are robust and that the data flowing through them is correct. This is crucial for building reliable LLM-powered applications.
* **Dataset Generation as a First-Class Citizen:** `llmprogram` is designed with the entire lifecycle of an LLM application in mind, from development to production and fine-tuning. The automatic logging to a SQLite database makes it incredibly easy to create high-quality datasets for fine-tuning your own models.
* **Simplicity and Intuitiveness:** The YAML configuration is easy to read and write, and the Python API is simple and intuitive. This makes it easy to get started and to build complex applications without a steep learning curve.
## Features
* **YAML-based Configuration:** Define your LLM programs using simple and intuitive YAML files.
* **Input/Output Validation:** Use JSON schemas to validate the inputs and outputs of your programs, ensuring data integrity.
* **Jinja2 Templating:** Use the power of Jinja2 templates to create dynamic prompts for your LLMs.
* **Caching:** Built-in support for Redis caching to save time and reduce costs.
* **Execution Logging:** Automatically log program executions to a SQLite database for analysis and debugging.
* **Streaming:** Support for streaming responses from the LLM.
* **Extensible with Tools:** Extend the functionality of your programs by adding custom tools (functions) that the LLM can call.
* **Batch Processing:** Process multiple inputs in parallel for improved performance.
* **CLI for Dataset Generation:** A command-line interface to generate instruction datasets for LLM fine-tuning from your logged data.
## Getting Started
### Installation
```bash
pip install llmprogram
```
### Usage
1. **Set your OpenAI API Key:**
```bash
export OPENAI_API_KEY='your-api-key'
```
2. **Create a program YAML file:**
Create a file named `sentiment_analysis.yaml`:
```yaml
name: sentiment_analysis
description: Analyzes the sentiment of a given text.
version: 1.0.0

model:
  provider: openai
  name: gpt-4.1-mini
  temperature: 0.5
  max_tokens: 100
  response_format: json_object

system_prompt: |
  You are a sentiment analysis expert. Analyze the sentiment of the given text and return a JSON response with the following format:
  - sentiment (string): "positive", "negative", or "neutral"
  - score (number): A score from -1 (most negative) to 1 (most positive)

input_schema:
  type: object
  required:
    - text
  properties:
    text:
      type: string
      description: The text to analyze.

output_schema:
  type: object
  required:
    - sentiment
    - score
  properties:
    sentiment:
      type: string
      enum: ["positive", "negative", "neutral"]
    score:
      type: number
      minimum: -1
      maximum: 1

template: |
  Analyze the following text:
  {{text}}
```
3. **Run the program:**
Create a file named `run_sentiment_analysis.py`:
```python
import asyncio
from llmprogram import LLMProgram

async def main():
    program = LLMProgram('sentiment_analysis.yaml')
    result = await program(text='I love this new product! It is amazing.')
    print(result)

if __name__ == '__main__':
    asyncio.run(main())
```
Run the script:
```bash
python run_sentiment_analysis.py
```
## Configuration
The behavior of each LLM program is defined in a YAML file. Here are the key sections:
* `name`, `description`, `version`: Basic metadata for your program.
* `model`: Defines the LLM provider, model name, and other parameters like `temperature` and `max_tokens`.
* `system_prompt`: The instructions that are given to the LLM to guide its behavior.
* `input_schema`: A JSON schema that defines the expected input for the program. The program will validate the input against this schema before execution.
* `output_schema`: A JSON schema that defines the expected output from the LLM. The program will validate the LLM's output against this schema.
* `template`: A Jinja2 template that is used to generate the prompt that is sent to the LLM. The template is rendered with the input variables.
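Since `input_schema` and `output_schema` are plain JSON Schema, the validation step can be reasoned about outside the library too. As a rough stdlib-only sketch (the library itself presumably uses a full JSON Schema validator, which handles far more keywords than this), a minimal check of the `required` and `type` rules might look like:

```python
# Minimal, stdlib-only sketch of the `required`/`type` checks a full
# JSON Schema validator performs; llmprogram's actual validation is
# assumed to be more complete than this.
TYPE_MAP = {"object": dict, "string": str, "number": (int, float)}

def check_input(payload: dict, schema: dict) -> list[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for key in schema.get("required", []):
        if key not in payload:
            errors.append(f"missing required property: {key}")
    for key, rules in schema.get("properties", {}).items():
        expected = TYPE_MAP.get(rules.get("type"))
        if key in payload and expected and not isinstance(payload[key], expected):
            errors.append(f"{key}: expected {rules['type']}")
    return errors

schema = {
    "type": "object",
    "required": ["text"],
    "properties": {"text": {"type": "string"}},
}
```

Running `check_input({"text": "hello"}, schema)` returns an empty list, while a missing or mistyped `text` produces a descriptive error.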
## Using with other OpenAI-compatible endpoints
You can use `llmprogram` with any OpenAI-compatible endpoint, such as [Ollama](https://ollama.ai/). To do this, you can pass the `api_key` and `base_url` to the `LLMProgram` constructor:
```python
program = LLMProgram(
    'your_program.yaml',
    api_key='your-api-key',  # optional, defaults to OPENAI_API_KEY env var
    base_url='http://localhost:11434/v1'  # example for Ollama
)
```
## Caching
`llmprogram` supports caching of LLM responses to Redis to improve performance and reduce costs. To enable caching, you need to have a Redis server running.
By default, caching is enabled. You can disable it or configure the Redis connection and cache TTL (time-to-live) when you create an `LLMProgram` instance:
```python
program = LLMProgram(
    'your_program.yaml',
    enable_cache=True,
    redis_url="redis://localhost:6379",
    cache_ttl=3600  # in seconds
)
```
## Logging and Dataset Generation
`llmprogram` automatically logs every execution of a program to a SQLite database. The database file is created in the same directory as the program YAML file, with a `.db` extension.
This logging feature is not just for debugging; it's also a powerful tool for creating high-quality datasets for fine-tuning your own LLMs. Each record in the log contains:
* `function_input`: The input given to the program.
* `function_output`: The final, validated output returned by the program.
* `llm_input`: The prompt sent to the LLM.
* `llm_output`: The raw response from the LLM.
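The log is an ordinary SQLite file, so it can be inspected with the stdlib `sqlite3` module. The table name is not documented here, so the sketch below discovers it from `sqlite_master` rather than hard-coding it (an assumption to verify against your own `.db` file):

```python
import sqlite3

def preview_log(db_path: str, limit: int = 5) -> list[tuple]:
    """Return up to `limit` rows from the first table in an execution log."""
    con = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        if not tables:
            return []
        # Assumption: the first (likely only) table holds the execution records.
        return con.execute(
            f"SELECT * FROM {tables[0]} LIMIT ?", (limit,)).fetchall()
    finally:
        con.close()
```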
### Generating a Dataset
You can use the built-in CLI to generate an instruction dataset from the logged data. The dataset is created in JSONL format, which is commonly used for fine-tuning.
```bash
llmprogram generate-dataset /path/to/your_program.db /path/to/your_dataset.jsonl
```
Each line in the output file will be a JSON object with the following keys:
* `instruction`: The system prompt and the user prompt, combined to form the instruction for the LLM.
* `output`: The output from the LLM.
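Because the output is plain JSONL (one JSON object per line), loading it back for inspection or further processing takes only the stdlib:

```python
import json

def load_dataset(path: str) -> list[dict]:
    """Read a JSONL dataset: one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```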
## Command-Line Interface (CLI)
`llmprogram` comes with a command-line interface for common tasks.
### `generate-dataset`
Generate an instruction dataset for LLM fine-tuning from a SQLite log file.
**Usage:**
```bash
llmprogram generate-dataset <database_path> <output_path>
```
**Arguments:**
* `database_path`: The path to the SQLite database file.
* `output_path`: The path to write the generated dataset to.
## Examples
You can find more examples in the `examples` directory:
* **Sentiment Analysis:** A simple program to analyze the sentiment of a piece of text. (`examples/sentiment_analysis.yaml`)
* **Code Generator:** A program that generates Python code from a natural language description. (`examples/code_generator.yaml`)
* **Email Generator:** A program that generates a professional email based on a few inputs. (`examples/email_generator.yaml`)
To run the examples, navigate to the `examples` directory and run the corresponding `run_*.py` script.
## Development
To run the tests for this package, you will need to install `pytest`:
```bash
pip install pytest
```
Then, you can run the tests from the root directory of the project:
```bash
pytest
```