# JsonAI — Production-Ready Structured JSON Generation with LLMs
JsonAI is a comprehensive Python library for generating structured JSON data using Large Language Models (LLMs). It provides enterprise-grade features including robust JSON schema validation, multiple model backends, REST API, React frontend, CLI interface, and production deployment configurations.
Current version: 0.15.1
## 🔔 What's New in 0.15.1
- Stabilized FastAPI REST API with endpoints for sync/async generation, batch processing, stats, cache management, and schema validation
- Performance suite:
  - PerformanceMonitor async timing fixes
  - CachedJsonformer with LRU/TTL caching
  - BatchProcessor for efficient concurrent execution
  - OptimizedJsonformer combines caching + batch processing with warmup
- Async generation improvements:
  - FullAsyncJsonformer (aliased as AsyncJsonformer in the API)
  - AsyncJsonformer wrapper in main.py for async tool execution
- Logging hygiene: lazy logging interpolation to reduce overhead
- Packaging: PyPI publish flow cleaned; version bumped to 0.15.1
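The "lazy logging interpolation" change refers to the standard `logging` idiom of passing format arguments to the logger instead of pre-formatting them. A minimal sketch of the pattern (the function name is illustrative, not part of jsonAI):

```python
import logging

logger = logging.getLogger("jsonAI")

def log_result(result):
    # Lazy interpolation: the "%s" is only rendered if the record is
    # actually emitted, so this costs almost nothing when DEBUG is off.
    logger.debug("generated result: %s", result)
    # An eager f-string, logger.debug(f"generated result: {result}"),
    # would format the message even when the log level filters it out.
```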
## 🚀 Features
### Core Capabilities
- Multiple LLM Backends: Ollama, OpenAI, and HuggingFace Transformers
- Full JSON Schema Coverage: primitives, arrays, objects, enums, nested structures, oneOf
- Performance Optimization: caching (LRU/TTL), batch processing, async operations
- Production Ready: Docker, FastAPI, monitoring, scaling considerations
### Interfaces & APIs
- REST API: FastAPI-based service with OpenAPI docs
- React Frontend: Modern web interface for JSON generation
- CLI Interface: Command-line tools for automation and batch processing
- Python Library: Programmatic access with sync and async support
### Enterprise Features
- Caching System: Intelligent multi-level caching (LRU/TTL)
- Batch Processing: Concurrent batch execution
- Performance Monitoring: Built-in metrics via PerformanceMonitor
- Schema Validation: Comprehensive validation with jsonschema
- Multiple Output Formats: JSON, YAML, XML, and CSV
## 📦 Installation
### Option 1: pip (Recommended)
```bash
pip install jsonai
```
### Option 2: From Source
```bash
git clone https://github.com/yourusername/JsonAI.git
cd JsonAI
poetry install
```
### Option 3: Docker
```bash
# Quick start with Docker
docker run -p 8000:8000 jsonai:latest
# Full stack with Docker Compose
docker-compose up -d
```
## Architecture Overview
The `jsonAI` library is modular and consists of the following components:
- **Jsonformer** (jsonAI.main): Orchestrates generation, formatting, and validation
- **TypeGenerator**: Generates values for each JSON Schema type
- **OutputFormatter**: Converts data into JSON, YAML, XML, CSV
- **SchemaValidator**: Validates data with jsonschema
- **ToolRegistry**: Registers and resolves Python/MCP tools
- **Async Paths**:
  - **FullAsyncJsonformer** (jsonAI.async_jsonformer): asynchronous generator taking model_backend, json_schema, prompt (aliased as AsyncJsonformer in the API)
  - **AsyncJsonformer wrapper** (jsonAI.main): wraps a Jsonformer instance for async tool execution
## Testing
The project includes comprehensive tests for each component and integration:
- **Unit Tests**: Test individual components.
- **Integration Tests**: Validate the interaction between components.
To run tests:
```bash
pytest tests/
```
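To illustrate the unit-test style, a schema-shape assertion can be written pytest-style. The helper below is hypothetical and kept library-free so it runs standalone; jsonAI's own tests exercise the real components:

```python
def matches_property_types(data, schema):
    """Check top-level properties of `data` against a simple object schema."""
    type_map = {"string": str, "integer": int, "boolean": bool,
                "number": (int, float)}
    for name, spec in schema.get("properties", {}).items():
        if name in data and not isinstance(data[name], type_map[spec["type"]]):
            return False
    return True

def test_generated_object_types():
    schema = {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    }
    # A well-typed object passes; a mistyped one fails.
    assert matches_property_types({"name": "Ada", "age": 36}, schema)
    assert not matches_property_types({"name": "Ada", "age": "36"}, schema)
```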
## Quick API Start (FastAPI)
Run the API with uvicorn:
```bash
uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000
```
Then open http://localhost:8000/docs for interactive Swagger UI.
### REST Endpoints
- POST /generate — synchronous generation
- POST /generate/async — asynchronous generation
- POST /generate/batch — concurrent batch generation
- GET /stats — performance and cache statistics
- DELETE /cache — clear all caches
- POST /validate — validate a JSON schema
Minimal cURL examples:
```bash
# Sync generate
curl -X POST http://localhost:8000/generate -H "Content-Type: application/json" -d '{
"prompt": "Generate a simple user object",
"schema": {"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}}},
"model_name": "ollama",
"model_path": "llama3"
}'
# Async generate
curl -X POST http://localhost:8000/generate/async -H "Content-Type: application/json" -d '{
"prompt": "Generate a simple user object",
"schema": {"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}}},
"model_name": "ollama",
"model_path": "llama3"
}'
# Batch generate
curl -X POST http://localhost:8000/generate/batch -H "Content-Type: application/json" -d '{
"requests": [
{"prompt":"User 1","schema":{"type":"object","properties":{"name":{"type":"string"}}},"model_name":"ollama","model_path":"llama3"},
{"prompt":"User 2","schema":{"type":"object","properties":{"name":{"type":"string"}}},"model_name":"ollama","model_path":"llama3"}
],
"max_concurrent": 5
}'
```
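The same payloads can be sent from Python with only the standard library. This is a sketch: the field names mirror the cURL examples above, while the helper names and `base_url` are assumptions for a local deployment:

```python
import json
from urllib import request

def build_request_body(prompt, schema, model_name="ollama", model_path="llama3"):
    """Assemble the JSON body used by /generate and /generate/async."""
    return {
        "prompt": prompt,
        "schema": schema,
        "model_name": model_name,
        "model_path": model_path,
    }

def generate_sync(prompt, schema, base_url="http://localhost:8000"):
    """POST to /generate and return the parsed JSON response."""
    body = json.dumps(build_request_body(prompt, schema)).encode("utf-8")
    req = request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:  # raises URLError if the API is down
        return json.loads(resp.read())
```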
## Examples
### Basic JSON Generation
```python
from jsonAI.main import Jsonformer
# Suppose you have a backend that implements ModelBackend
from jsonAI.model_backends import DummyBackend
backend = DummyBackend() # replace with OllamaBackend/OpenAIBackend/etc.
schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
"isStudent": {"type": "boolean"}
}
}
prompt = "Generate a person's profile."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt)
output = jsonformer()
print(output)
```
### YAML Output
```python
schema = {
"type": "object",
"properties": {
"city": {"type": "string"},
"population": {"type": "integer"}
}
}
prompt = "Generate a city profile."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format="yaml")
output = jsonformer()
print(output)
```
### CSV Output
```python
schema = {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"score": {"type": "number"}
}
}
}
prompt = "Generate a list of students and their scores."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format="csv")
output = jsonformer()
print(output)
```
### CLI Example
#### Basic CLI Usage
```bash
python -m jsonAI.cli generate --schema schema.json --prompt "Generate a product" --output-format json
```
#### Using Ollama Backend (Recommended for LLMs)
```bash
python -m jsonAI.cli generate --schema complex_schema.json \
--prompt "Generate a comprehensive person profile as JSON." \
--use-ollama --ollama-model llama3
```
#### Features
- Robustly extracts the first valid JSON object from any LLM output (even if wrapped in `<answer>` tags or surrounded by extra text)
- Supports all JSON schema types: primitives, enums, arrays, objects, null, oneOf, nested/complex
- Validates output against the schema and warns if invalid
- Pretty-prints objects/arrays, prints primitives/null as-is
- Production-ready for any schema and LLM output style
#### Example Output
```json
{
"id": "profile with all supported JSON schema types.",
"name": "re",
"age": 30,
"is_active": true,
"email": "example@example.com",
"roles": ["admin", "user"],
"address": {"street": "123 Main St", "city": "Anytown", "zip": "12345", "country": "USA"},
"preferences": {"newsletter": true, "theme": "dark", "language": "en"},
"tags": ["tech", "developer"],
"score": 95,
"metadata": {"key1": "value1", "key2": "value2"},
"status": "active",
"history": [{"date": "2023-01-01", "event": "joined", "details": "Account created"}],
"profile_picture": "https://example.com/avatar.jpg",
"settings": {"notifications": true, "privacy": "private"},
"null_field": null
}
```
See `complex_schema.json` for a comprehensive schema example.
### Tool Calling Example
```python
from jsonAI.main import Jsonformer
from jsonAI.tool_registry import ToolRegistry

def send_email(email):
print(f"Sending email to {email}")
return "Email sent"
tool_registry = ToolRegistry()
tool_registry.register_tool("send_email", send_email)
schema = {
"type": "object",
"properties": {
"email": {"type": "string", "format": "email"}
},
"x-jsonai-tool-call": {
"name": "send_email",
"arguments": {"email": "email"}
}
}
prompt = "Generate a user email."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, tool_registry=tool_registry)
output = jsonformer()
print(output)
```
### MCP Integration Example
```python
def mcp_callback(tool_name, server_name, kwargs):
# Simulate MCP call
return f"Called {tool_name} on {server_name} with {kwargs}"
schema = {
"type": "object",
"properties": {
"query": {"type": "string"}
},
"x-jsonai-tool-call": {
"name": "search_tool",
"arguments": {"query": "query"}
}
}
prompt = "Generate a search query."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, mcp_callback=mcp_callback)
output = jsonformer()
print(output)
```
### Complex Schema Example
```python
schema = {
"type": "object",
"properties": {
"user": {
"type": "object",
"properties": {
"id": {"type": "uuid"},
"name": {"type": "string"},
"email": {"type": "string", "format": "email"}
}
},
"roles": {
"type": "array",
"items": {"type": "string", "enum": ["admin", "user", "guest"]}
},
"profile": {
"oneOf": [
{"type": "object", "properties": {"age": {"type": "integer"}}},
{"type": "object", "properties": {"birthdate": {"type": "date"}}}
]
}
},
"x-jsonai-tool-call": {
"name": "send_welcome_email",
"arguments": {"email": "user.email"}
}
}
# ...set up backend, prompt, and tool_registry as in the earlier examples...
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, tool_registry=tool_registry)
output = jsonformer()
print(output)
```
### XML Output
```python
schema = {
"type": "object",
"properties": {
"book": {
"type": "object",
"properties": {
"title": {"type": "string"},
"author": {"type": "string"},
"year": {"type": "integer"}
}
}
}
}
prompt = "Generate details for a book."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format="xml")
output = jsonformer()
print(output)
```
### Tool Chaining Example
You can chain multiple tools together using the `x-jsonai-tool-chain` schema key. Each tool in the chain receives arguments from the generated data and/or previous tool outputs.
```python
from jsonAI.main import Jsonformer
from jsonAI.tool_registry import ToolRegistry
def add(x, y):
return {"sum": x + y}
def multiply(sum, factor):
return {"product": sum * factor}
registry = ToolRegistry()
registry.register_tool("add", add)
registry.register_tool("multiply", multiply)
schema = {
"type": "object",
"properties": {
"x": {"type": "integer"},
"y": {"type": "integer"},
"factor": {"type": "integer"}
},
"x-jsonai-tool-chain": [
{
"name": "add",
"arguments": {"x": "x", "y": "y"}
},
{
"name": "multiply",
"arguments": {"sum": "sum", "factor": "factor"}
}
]
}
prompt = "Calculate (x + y) * factor."
jsonformer = Jsonformer(
model_backend=None, # Not used in this example
json_schema=schema,
prompt=prompt,
tool_registry=registry
)
# Provide input data (simulate generated data)
jsonformer.value = {"x": 2, "y": 3, "factor": 4}
generated = jsonformer.generate_data()
result = jsonformer._execute_tool_call(generated)
print(result)
# Output will include all intermediate and final tool results.
```
## Performance and Caching
JsonAI includes a performance suite to optimize throughput and latency.
- **PerformanceMonitor**: measures durations for operations (async-safe)
- **CachedJsonformer**: two-level caching
  - LRU cache for simple schema-based results
  - TTL cache for prompt-based entries for complex schemas
- **OptimizedJsonformer**: all performance features plus cache warmup and batch helpers
- **BatchProcessor**: asynchronous concurrent processing (configurable semaphore)
Example:
```python
from jsonAI.performance import OptimizedJsonformer
from jsonAI.model_backends import DummyBackend
backend = DummyBackend()
schema = {"type":"object","properties":{"name":{"type":"string"}}}
jsonformer = OptimizedJsonformer(
model=backend, # accepts a ModelBackend
tokenizer=backend.tokenizer,
schema=schema,
cache_size=1000,
cache_ttl=3600
)
# Single generation (cached)
print(jsonformer.generate("Generate a name"))
# Batch generation
requests = [
{"prompt":"User A","kwargs":{}},
{"prompt":"User B","kwargs":{}}
]
print(jsonformer.generate_batch(requests))
```
To inspect performance and cache stats at runtime, use the REST API `GET /stats` or:
```python
jsonformer.get_comprehensive_stats()
```
## Output Format × Type Coverage
| Type      | Example        | JSON | XML | YAML | CSV* |
|-----------|----------------|------|-----|------|------|
| number    | 3.14 | ✅ | ✅ | ✅ | ✅ |
| integer   | 42 | ✅ | ✅ | ✅ | ✅ |
| boolean   | true | ✅ | ✅ | ✅ | ✅ |
| string    | "hello" | ✅ | ✅ | ✅ | ✅ |
| datetime  | "2023-06-29T12:00:00Z" | ✅ | ✅ | ✅ | ✅ |
| date      | "2023-06-29" | ✅ | ✅ | ✅ | ✅ |
| time      | "12:00:00" | ✅ | ✅ | ✅ | ✅ |
| uuid      | "123e4567-e89b-12d3-a456-426614174000" | ✅ | ✅ | ✅ | ✅ |
| binary    | "SGVsbG8=" | ✅ | ✅ | ✅ | ✅ |
| null      | null | ✅ | (⚠️) | ✅ | (⚠️) |
| array     | [1,2,3] | ✅ | ✅ | ✅ | (⚠️) |
| object    | {"a":1} | ✅ | ✅ | ✅ | (⚠️) |
| enum      | "red" | ✅ | ✅ | ✅ | ✅ |
| p_enum    | "blue" | ✅ | ✅ | ✅ | ✅ |
| p_integer | 7 | ✅ | ✅ | ✅ | ✅ |

✅ = Supported
⚠️ = Supported with caveats (e.g., nulls in XML/CSV, arrays/objects in CSV)
*CSV: Only arrays of objects (tabular) are practical
## Integrations & Capabilities
- LLMs: HuggingFace Transformers, OpenAI, Ollama (vLLM patterns apply)
- FastAPI: See `jsonAI/api.py` and `examples/fastapi_example.py`
- Tool Registry: Register and call Python or MCP tools from schemas; supports tool chaining via `x-jsonai-tool-chain`
- Async Support:
  - `FullAsyncJsonformer` for async generation with `model_backend`/`json_schema`/`prompt`
  - `AsyncJsonformer` wrapper (jsonAI.main) for async tool execution
See the [examples/](examples/) directory for more advanced usage and integration patterns.
## License
This project is licensed under the MIT License.
## Native Library Usage
JsonAI leverages high-performance native libraries for data processing and extensibility:
- **PyYAML** for YAML serialization
- **lxml** for XML output
- **cachetools** for caching
- **requests** and **aiohttp** for HTTP
- **jsonschema** for validation
For tabular or batch data processing, **pandas** is recommended for reliability and performance. If you extend JsonAI or build custom output logic, prefer mature libraries such as pandas or numpy over hand-rolled serialization.
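As a minimal example of the tabular case, an array-of-objects result (the only shape that maps cleanly to CSV, per the coverage table above) can be flattened with the standard library; `pandas.DataFrame(rows).to_csv()` is the heavier-duty equivalent. The helper name here is illustrative, not a jsonAI API:

```python
import csv
import io

def rows_to_csv(rows):
    """Flatten a list of flat dicts (array-of-objects) into CSV text."""
    if not rows:
        return ""
    buf = io.StringIO()
    # Column order comes from the first row's keys.
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```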
## Multi-Environment Support
JsonAI supports multiple environments: dev, qa, perf, cte, and prod. Each environment has its own `.env` file at the project root.
- **Local Development:**
  Copy or rename the desired `.env.*` file to `.env` before running locally.
  ```bash
  cp .env.dev .env
  uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000
  ```
- **Docker Compose:**
  Edit `docker-compose.yml` to set the `env_file` for the desired environment (e.g., `.env.prod`), or override at runtime:
  ```bash
  docker-compose --env-file .env.qa up -d
  ```
- **Docker:**
  Pass the environment file at runtime:
  ```bash
  docker run --env-file .env.prod -p 8000:8000 jsonai:latest
  ```
- **CI/CD:**
  The GitHub Actions workflow tests all environments by copying the correct `.env.*` file to `.env` for each matrix job.
- **APP_ENV Variable:**
  The Dockerfile sets `APP_ENV` (default: dev) for extensibility. You can override this at runtime.
See `docs/deployment.md` for more details.
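The `.env` convention above can also be read at runtime with a few lines of standard-library code. This is a sketch of the format, not jsonAI's loader; in practice python-dotenv or the framework's own settings machinery is typical:

```python
import os

def parse_env_lines(lines):
    """Parse KEY=VALUE pairs from .env-style lines ('#' comments skipped)."""
    values = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

def load_env_file(path=".env"):
    with open(path) as fh:
        return parse_env_lines(fh)

# The Dockerfile's APP_ENV default can be read the same way at runtime:
APP_ENV = os.environ.get("APP_ENV", "dev")
```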
## Deployment
- API:
- `uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000`
- CORS is enabled by default for development; harden for production
- Docker:
- `docker build -t jsonai:latest .`
- `docker run -p 8000:8000 jsonai:latest`
- Docker Compose:
- `docker-compose up -d`
- See `docs/deployment.md` for more
## Versioning and Release
PyPI forbids reusing the same filename for the same version. Always bump the version:
```bash
poetry version patch # or minor/major
poetry build
poetry publish -u __token__ -p $PYPI_TOKEN
```
Automate in CI by bumping on tags and using repository secrets for tokens.
## Streaming Support
JsonAI supports streaming data generation (experimental API in examples). Example pattern:
```python
jsonformer = Jsonformer(model_backend, json_schema, prompt)
for data_chunk in jsonformer.stream_generate_data():
print(data_chunk)
```
For async streaming, adapt the pattern with the async wrapper as needed.
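One way to adapt the synchronous generator for asyncio is to drain it on a worker thread so the event loop stays responsive. This wrapper is an illustration of the pattern, not part of the JsonAI API; `sync_gen` stands in for `jsonformer.stream_generate_data()`:

```python
import asyncio

async def astream(sync_gen):
    """Yield chunks from a synchronous generator without blocking the loop."""
    loop = asyncio.get_running_loop()
    it = iter(sync_gen)
    sentinel = object()
    while True:
        # next() runs on the default executor thread; the sentinel marks
        # exhaustion so StopIteration never crosses the thread boundary.
        chunk = await loop.run_in_executor(None, next, it, sentinel)
        if chunk is sentinel:
            return
        yield chunk

async def demo(chunks):
    # With a real Jsonformer, process each chunk as it arrives
    # instead of collecting them into a list.
    return [chunk async for chunk in astream(chunks)]
```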
Raw data
{
"_id": null,
"home_page": "https://github.com/kishoretvk/GenerativeJson",
"name": "jsonAI",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": "json, llm, schema, generation, fastapi, openai, ollama, transformers",
"author": "1rgs",
"author_email": "kishoretvk9@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/73/c4/60dec1163ddb8211ec09dd9a26eae8598b877a61593b04686502c2c3f517/jsonai-0.15.2.2.tar.gz",
"platform": null,
"description": "# JsonAI \u2014 Production-Ready Structured JSON Generation with LLMs\n\nJsonAI is a comprehensive Python library for generating structured JSON data using Large Language Models (LLMs). It provides enterprise-grade features including robust JSON schema validation, multiple model backends, REST API, React frontend, CLI interface, and production deployment configurations.\n\nCurrent version: 0.15.1\n\n## \ud83d\udd14 What\u2019s New in 0.15.1\n\n- Stabilized FastAPI REST API with endpoints for sync/async generation, batch processing, stats, cache management, and schema validation\n- Performance suite:\n - PerformanceMonitor async timing fixes\n - CachedJsonformer with LRU/TTL caching\n - BatchProcessor for efficient concurrent execution\n - OptimizedJsonformer combines caching + batch processing with warmup\n- Async generation improvements:\n - FullAsyncJsonformer (aliased as AsyncJsonformer in the API)\n - AsyncJsonformer wrapper in main.py for async tool execution\n- Logging hygiene: lazy logging interpolation to reduce overhead\n- Packaging: PyPI publish flow cleaned; version bumped to 0.15.1\n\n## \ud83d\ude80 Features\n\n### Core Capabilities\n- Multiple LLM Backends: Ollama, OpenAI, and HuggingFace Transformers\n- Full JSON Schema Coverage: primitives, arrays, objects, enums, nested structures, oneOf\n- Performance Optimization: caching (LRU/TTL), batch processing, async operations\n- Production Ready: Docker, FastAPI, monitoring, scaling considerations\n\n### Interfaces & APIs\n- REST API: FastAPI-based service with OpenAPI docs\n- React Frontend: Modern web interface for JSON generation\n- CLI Interface: Command-line tools for automation and batch processing\n- Python Library: Programmatic access with sync and async support\n\n### Enterprise Features\n- Caching System: Intelligent multi-level caching (LRU/TTL)\n- Batch Processing: Concurrent batch execution\n- Performance Monitoring: Built-in metrics via PerformanceMonitor\n- Schema 
Validation: Comprehensive validation with jsonschema\n- Multiple Output Formats: JSON, YAML, XML, and CSV\n\n## \ud83d\udce6 Installation\n\n### Option 1: pip (Recommended)\n```bash\npip install jsonai\n```\n\n### Option 2: From Source\n```bash\ngit clone https://github.com/yourusername/JsonAI.git\ncd JsonAI\npoetry install\n```\n\n### Option 3: Docker\n```bash\n# Quick start with Docker\ndocker run -p 8000:8000 jsonai:latest\n\n# Full stack with Docker Compose\ndocker-compose up -d\n```\n\n## Architecture Overview\n\nThe `jsonAI` library is modular and consists of the following components:\n\n- **Jsonformer** (jsonAI.main): Orchestrates generation, formatting, and validation\n- **TypeGenerator**: Generates values for each JSON Schema type\n- **OutputFormatter**: Converts data into JSON, YAML, XML, CSV\n- **SchemaValidator**: Validates data with jsonschema\n- **ToolRegistry**: Registers and resolves Python/MCP tools\n- **Async Paths**:\n - **FullAsyncJsonformer** (jsonAI.async_jsonformer): asynchronous generator taking model_backend, json_schema, prompt (aliased as AsyncJsonformer in API)\n - **AsyncJsonformer wrapper** (jsonAI.main): wraps a Jsonformer instance for async tool execution\n\n## Testing\n\nThe project includes comprehensive tests for each component and integration:\n\n- **Unit Tests**: Test individual components.\n- **Integration Tests**: Validate the interaction between components.\n\nTo run tests:\n\n```bash\npytest tests/\n```\n\n## Quick API Start (FastAPI)\n\nRun the API with uvicorn:\n\n```bash\nuvicorn jsonAI.api:app --host 0.0.0.0 --port 8000\n```\n\nThen open http://localhost:8000/docs for interactive Swagger UI.\n\n### REST Endpoints\n\n- POST /generate \u2014 synchronous generation\n- POST /generate/async \u2014 asynchronous generation\n- POST /generate/batch \u2014 concurrent batch generation\n- GET /stats \u2014 performance and cache statistics\n- DELETE /cache \u2014 clear all caches\n- POST /validate \u2014 validate a JSON 
schema\n\nMinimal cURL examples:\n\n```bash\n# Sync generate\ncurl -X POST http://localhost:8000/generate -H \"Content-Type: application/json\" -d '{\n \"prompt\": \"Generate a simple user object\",\n \"schema\": {\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"age\":{\"type\":\"integer\"}}},\n \"model_name\": \"ollama\",\n \"model_path\": \"llama3\"\n}'\n\n# Async generate\ncurl -X POST http://localhost:8000/generate/async -H \"Content-Type: application/json\" -d '{\n \"prompt\": \"Generate a simple user object\",\n \"schema\": {\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"age\":{\"type\":\"integer\"}}},\n \"model_name\": \"ollama\",\n \"model_path\": \"llama3\"\n}'\n\n# Batch generate\ncurl -X POST http://localhost:8000/generate/batch -H \"Content-Type: application/json\" -d '{\n \"requests\": [\n {\"prompt\":\"User 1\",\"schema\":{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"}}},\"model_name\":\"ollama\",\"model_path\":\"llama3\"},\n {\"prompt\":\"User 2\",\"schema\":{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"}}},\"model_name\":\"ollama\",\"model_path\":\"llama3\"}\n ],\n \"max_concurrent\": 5\n}'\n```\n\n## Examples\n\n### Basic JSON Generation\n\n```python\nfrom jsonAI.main import Jsonformer\n\n# Suppose you have a backend that implements ModelBackend\nfrom jsonAI.model_backends import DummyBackend\nbackend = DummyBackend() # replace with OllamaBackend/OpenAIBackend/etc.\n\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"name\": {\"type\": \"string\"},\n \"age\": {\"type\": \"integer\"},\n \"isStudent\": {\"type\": \"boolean\"}\n }\n}\nprompt = \"Generate a person's profile.\"\njsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt)\noutput = jsonformer()\nprint(output)\n```\n\n\n### XML Output\n### YAML Output\n\n```python\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"city\": {\"type\": \"string\"},\n \"population\": 
{\"type\": \"integer\"}\n }\n}\nprompt = \"Generate a city profile.\"\njsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format=\"yaml\")\noutput = jsonformer()\nprint(output)\n```\n\n### CSV Output\n\n```python\nschema = {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"object\",\n \"properties\": {\n \"name\": {\"type\": \"string\"},\n \"score\": {\"type\": \"number\"}\n }\n }\n}\nprompt = \"Generate a list of students and their scores.\"\njsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format=\"csv\")\noutput = jsonformer()\nprint(output)\n```\n\n\n### CLI Example\n\n#### Basic CLI Usage\n\n```bash\npython -m jsonAI.cli generate --schema schema.json --prompt \"Generate a product\" --output-format json\n```\n\n#### Using Ollama Backend (Recommended for LLMs)\n\n```bash\npython -m jsonAI.cli generate --schema complex_schema.json \\\n --prompt \"Generate a comprehensive person profile as JSON.\" \\\n --use-ollama --ollama-model llama3\n```\n\n#### Features\n- Robustly extracts the first valid JSON object from any LLM output (even if wrapped in <answer> tags or surrounded by extra text)\n- Supports all JSON schema types: primitives, enums, arrays, objects, null, oneOf, nested/complex\n- Validates output against the schema and warns if invalid\n- Pretty-prints objects/arrays, prints primitives/null as-is\n- Production-ready for any schema and LLM output style\n\n#### Example Output\n\n```json\n{\n \"id\": \"profile with all supported JSON schema types.\",\n \"name\": \"re\",\n \"age\": 30,\n \"is_active\": true,\n \"email\": \"example@example.com\",\n \"roles\": [\"admin\", \"user\"],\n \"address\": {\"street\": \"123 Main St\", \"city\": \"Anytown\", \"zip\": \"12345\", \"country\": \"USA\"},\n \"preferences\": {\"newsletter\": true, \"theme\": \"dark\", \"language\": \"en\"},\n \"tags\": [\"tech\", \"developer\"],\n \"score\": 95,\n \"metadata\": {\"key1\": \"value1\", \"key2\": 
\"value2\"},\n \"status\": \"active\",\n \"history\": [{\"date\": \"2023-01-01\", \"event\": \"joined\", \"details\": \"Account created\"}],\n \"profile_picture\": \"https://example.com/avatar.jpg\",\n \"settings\": {\"notifications\": true, \"privacy\": \"private\"},\n \"null_field\": null\n}\n```\n\nSee `complex_schema.json` for a comprehensive schema example.\n\n### Tool Calling Example\n\n```python\ndef send_email(email):\n print(f\"Sending email to {email}\")\n return \"Email sent\"\n\ntool_registry = ToolRegistry()\ntool_registry.register_tool(\"send_email\", send_email)\n\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"email\": {\"type\": \"string\", \"format\": \"email\"}\n },\n \"x-jsonai-tool-call\": {\n \"name\": \"send_email\",\n \"arguments\": {\"email\": \"email\"}\n }\n}\nprompt = \"Generate a user email.\"\njsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, tool_registry=tool_registry)\noutput = jsonformer()\nprint(output)\n```\n\n### MCP Integration Example\n\n```python\ndef mcp_callback(tool_name, server_name, kwargs):\n # Simulate MCP call\n return f\"Called {tool_name} on {server_name} with {kwargs}\"\n\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"query\": {\"type\": \"string\"}\n },\n \"x-jsonai-tool-call\": {\n \"name\": \"search_tool\",\n \"arguments\": {\"query\": \"query\"}\n }\n}\njsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, mcp_callback=mcp_callback)\noutput = jsonformer()\nprint(output)\n```\n\n### Complex Schema Example\n\n```python\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"user\": {\n \"type\": \"object\",\n \"properties\": {\n \"id\": {\"type\": \"uuid\"},\n \"name\": {\"type\": \"string\"},\n \"email\": {\"type\": \"string\", \"format\": \"email\"}\n }\n },\n \"roles\": {\n \"type\": \"array\",\n \"items\": {\"type\": \"string\", \"enum\": [\"admin\", \"user\", \"guest\"]}\n },\n \"profile\": {\n \"oneOf\": [\n {\"type\": 
\"object\", \"properties\": {\"age\": {\"type\": \"integer\"}}},\n {\"type\": \"object\", \"properties\": {\"birthdate\": {\"type\": \"date\"}}}\n ]\n }\n },\n \"x-jsonai-tool-call\": {\n \"name\": \"send_welcome_email\",\n \"arguments\": {\"email\": \"user.email\"}\n }\n}\n# ...setup model, tokenizer, tool_registry, etc...\njsonformer = Jsonformer(model, tokenizer, schema, prompt, tool_registry=tool_registry)\noutput = jsonformer()\nprint(output)\n```\n\n```python\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"book\": {\n \"type\": \"object\",\n \"properties\": {\n \"title\": {\"type\": \"string\"},\n \"author\": {\"type\": \"string\"},\n \"year\": {\"type\": \"integer\"}\n }\n }\n }\n}\n\nprompt = \"Generate details for a book.\"\njsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format=\"xml\")\noutput = jsonformer()\nprint(output)\n```\n\n### Tool Chaining Example\n\nYou can chain multiple tools together using the `x-jsonai-tool-chain` schema key. 
Each tool in the chain receives arguments from the generated data and/or previous tool outputs.\n\n```python\nfrom jsonAI.main import Jsonformer\nfrom jsonAI.tool_registry import ToolRegistry\n\ndef add(x, y):\n return {\"sum\": x + y}\n\ndef multiply(sum, factor):\n return {\"product\": sum * factor}\n\nregistry = ToolRegistry()\nregistry.register_tool(\"add\", add)\nregistry.register_tool(\"multiply\", multiply)\n\nschema = {\n \"type\": \"object\",\n \"properties\": {\n \"x\": {\"type\": \"integer\"},\n \"y\": {\"type\": \"integer\"},\n \"factor\": {\"type\": \"integer\"}\n },\n \"x-jsonai-tool-chain\": [\n {\n \"name\": \"add\",\n \"arguments\": {\"x\": \"x\", \"y\": \"y\"}\n },\n {\n \"name\": \"multiply\",\n \"arguments\": {\"sum\": \"sum\", \"factor\": \"factor\"}\n }\n ]\n}\n\nprompt = \"Calculate (x + y) * factor.\"\njsonformer = Jsonformer(\n model_backend=None, # Not used in this example\n json_schema=schema,\n prompt=prompt,\n tool_registry=registry\n)\n# Provide input data (simulate generated data)\njsonformer.value = {\"x\": 2, \"y\": 3, \"factor\": 4}\ngenerated = jsonformer.generate_data()\nresult = jsonformer._execute_tool_call(generated)\nprint(result)\n# Output will include all intermediate and final tool results.\n```\n\n## Performance and Caching\n\nJsonAI includes a performance suite to optimize throughput and latency.\n\n- **PerformanceMonitor**: measures durations for operations (async-safe)\n- **CachedJsonformer**: two-level caching\n - LRU cache for simple schema-based results\n - TTL cache for prompt-based entries for complex schemas\n- **OptimizedJsonformer**: all performance features plus cache warmup and batch helpers\n- **BatchProcessor**: asynchronous concurrent processing (configurable semaphore)\n\nExample:\n\n```python\nfrom jsonAI.performance import OptimizedJsonformer\nfrom jsonAI.model_backends import DummyBackend\n\nbackend = DummyBackend()\nschema = 
{"type": "object", "properties": {"name": {"type": "string"}}}

jsonformer = OptimizedJsonformer(
    model=backend,  # accepts a ModelBackend
    tokenizer=backend.tokenizer,
    schema=schema,
    cache_size=1000,
    cache_ttl=3600
)

# Single generation (cached)
print(jsonformer.generate("Generate a name"))

# Batch generation
requests = [
    {"prompt": "User A", "kwargs": {}},
    {"prompt": "User B", "kwargs": {}}
]
print(jsonformer.generate_batch(requests))
```

To inspect performance and cache stats at runtime, use the REST API `GET /stats` endpoint or call:

```python
jsonformer.get_comprehensive_stats()
```

## Output Format × Type Coverage

| Type      | Example                                | JSON | XML | YAML | CSV* |
|-----------|----------------------------------------|------|-----|------|------|
| number    | 3.14                                   | ✅   | ✅  | ✅   | ✅   |
| integer   | 42                                     | ✅   | ✅  | ✅   | ✅   |
| boolean   | true                                   | ✅   | ✅  | ✅   | ✅   |
| string    | "hello"                                | ✅   | ✅  | ✅   | ✅   |
| datetime  | "2023-06-29T12:00:00Z"                 | ✅   | ✅  | ✅   | ✅   |
| date      | "2023-06-29"                           | ✅   | ✅  | ✅   | ✅   |
| time      | "12:00:00"                             | ✅   | ✅  | ✅   | ✅   |
| uuid      | "123e4567-e89b-12d3-a456-426614174000" | ✅   | ✅  | ✅   | ✅   |
| binary    | "SGVsbG8="                             | ✅   | ✅  | ✅   | ✅   |
| null      | null                                   | ✅   | ⚠️   | ✅   | ⚠️   |
| array     | [1,2,3]                                | ✅   | ✅  | ✅   | ⚠️   |
| object    | {"a":1}                                | ✅   | ✅  | ✅   | ⚠️   |
| enum      | "red"                                  | ✅   | ✅  | ✅   | ✅   |
| p_enum    | "blue"                                 | ✅   | ✅  | ✅   | ✅   |
| p_integer | 7                                      | ✅   | ✅  | ✅   | ✅   |

✅ = Supported
⚠️ = Supported with caveats (e.g., nulls in XML/CSV; arrays/objects in CSV)
*CSV: only arrays of objects (tabular data) are practical

## Integrations & Capabilities

- LLMs: HuggingFace Transformers, OpenAI, Ollama (vLLM patterns apply)
- FastAPI: see `jsonAI/api.py` and `examples/fastapi_example.py`
- Tool Registry: register and call Python or MCP tools from schemas; supports tool chaining via `x-jsonai-tool-chain`
- Async Support:
  - `FullAsyncJsonformer` for async generation with `model_backend`/`json_schema`/`prompt`
  - `AsyncJsonformer` wrapper (`jsonAI.main`) for async tool execution

See the [examples/](examples/) directory for more advanced usage and integration patterns.

## License

This project is licensed under the MIT License.

## Native Library Usage

JsonAI leverages high-performance native libraries for data processing and extensibility:

- **PyYAML** for YAML serialization
- **lxml** for XML output
- **cachetools** for caching
- **requests** and **aiohttp** for HTTP
- **jsonschema** for validation

For tabular or batch data processing, **pandas** is recommended for reliability and performance. If you extend JsonAI or build custom output logic, prefer native libraries such as pandas or numpy for best results.

## Multi-Environment Support

JsonAI supports multiple environments: dev, qa, perf, cte, and prod. Each environment has its own `.env` file at the project root.

- **Local Development:**
  Copy or rename the desired `.env.*` file to `.env` before running locally.
  ```bash
  cp .env.dev .env
  uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000
  ```

- **Docker Compose:**
  Edit `docker-compose.yml` to set the `env_file` for the desired environment (e.g., `.env.prod`), or override at runtime:
  ```bash
  docker-compose --env-file .env.qa up -d
  ```

- **Docker:**
  Pass the environment file at runtime:
  ```bash
  docker run --env-file .env.prod -p 8000:8000 jsonai:latest
  ```

- **CI/CD:**
  The GitHub Actions workflow tests all environments by copying the correct `.env.*` file to `.env` for each matrix job.

- **APP_ENV Variable:**
  The Dockerfile sets `APP_ENV` (default: dev) for extensibility; you can override it at runtime.

See `docs/deployment.md` for more details.

## Deployment

- API:
  - `uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000`
  - CORS is enabled by default for development; harden it for production
- Docker:
  - `docker build -t jsonai:latest .`
  - `docker run -p 8000:8000 jsonai:latest`
- Docker Compose:
  - `docker-compose up -d`
- See `docs/deployment.md` for more

## Versioning and Release

PyPI forbids reusing the same filename for the same version, so always bump the version before publishing:

```bash
poetry version patch  # or minor/major
poetry build
poetry publish -u __token__ -p $PYPI_TOKEN
```

Automate this in CI by bumping on tags and using repository secrets for tokens.

## Streaming Support

JsonAI supports streaming data generation (experimental API in examples). Example pattern:

```python
jsonformer = Jsonformer(model_backend, json_schema, prompt)
for data_chunk in jsonformer.stream_generate_data():
    print(data_chunk)
```

For async streaming, adapt the pattern with the async wrapper as needed.
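One way to adapt a blocking streaming generator for async code is to pull each chunk in a thread-pool executor. This is a minimal sketch, not JsonAI's API: `stream_generate_data` below is a stand-in for the real generator, and `astream` is a generic helper you would write yourself.

```python
import asyncio
from typing import Any, AsyncIterator, Iterator


def stream_generate_data() -> Iterator[dict]:
    """Stand-in for a blocking streaming generator (hypothetical data)."""
    for chunk in ({"name": "Ada"}, {"name": "Grace"}):
        yield chunk


async def astream(sync_gen: Iterator[Any]) -> AsyncIterator[Any]:
    """Adapt a blocking generator to async by fetching each item in a thread."""
    loop = asyncio.get_running_loop()
    sentinel = object()
    while True:
        # next(sync_gen, sentinel) runs in the default executor so the
        # event loop is never blocked while the model produces a chunk.
        item = await loop.run_in_executor(None, next, sync_gen, sentinel)
        if item is sentinel:
            break
        yield item


async def collect_chunks() -> list:
    chunks = []
    async for chunk in astream(stream_generate_data()):
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    print(asyncio.run(collect_chunks()))
```

The same wrapper works for any synchronous generator, so it can bridge the experimental streaming API into an async FastAPI handler without blocking the event loop.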
## Project Metadata

- License: MIT
- Latest PyPI release: 0.15.2.2 (released 2025-08-16)
- Requires Python: >=3.9, <4.0
- Homepage: https://github.com/kishoretvk/GenerativeJson
- Documentation: https://github.com/kishoretvk/GenerativeJson#readme
- Repository: https://github.com/kishoretvk/GenerativeJson
- Keywords: json, llm, schema, generation, fastapi, openai, ollama, transformers