jsonAI

Name: jsonAI
Version: 0.15.2.2
Home page: https://github.com/kishoretvk/GenerativeJson
Summary: A Python library for dynamic JSON generation based on schemas using language models.
Upload time: 2025-08-16 23:42:17
Author: 1rgs
Requires Python: <4.0,>=3.9
License: MIT
Keywords: json, llm, schema, generation, fastapi, openai, ollama, transformers
# JsonAI — Production-Ready Structured JSON Generation with LLMs

JsonAI is a comprehensive Python library for generating structured JSON data using Large Language Models (LLMs). It provides enterprise-grade features including robust JSON schema validation, multiple model backends, REST API, React frontend, CLI interface, and production deployment configurations.

Current version: 0.15.1

## 🔔 What’s New in 0.15.1

- Stabilized FastAPI REST API with endpoints for sync/async generation, batch processing, stats, cache management, and schema validation
- Performance suite:
  - PerformanceMonitor async timing fixes
  - CachedJsonformer with LRU/TTL caching
  - BatchProcessor for efficient concurrent execution
  - OptimizedJsonformer combines caching + batch processing with warmup
- Async generation improvements:
  - FullAsyncJsonformer (aliased as AsyncJsonformer in the API)
  - AsyncJsonformer wrapper in main.py for async tool execution
- Logging hygiene: lazy logging interpolation to reduce overhead
- Packaging: PyPI publish flow cleaned; version bumped to 0.15.1

## 🚀 Features

### Core Capabilities
- Multiple LLM Backends: Ollama, OpenAI, and HuggingFace Transformers
- Full JSON Schema Coverage: primitives, arrays, objects, enums, nested structures, oneOf
- Performance Optimization: caching (LRU/TTL), batch processing, async operations
- Production Ready: Docker, FastAPI, monitoring, scaling considerations

### Interfaces & APIs
- REST API: FastAPI-based service with OpenAPI docs
- React Frontend: Modern web interface for JSON generation
- CLI Interface: Command-line tools for automation and batch processing
- Python Library: Programmatic access with sync and async support

### Enterprise Features
- Caching System: Intelligent multi-level caching (LRU/TTL)
- Batch Processing: Concurrent batch execution
- Performance Monitoring: Built-in metrics via PerformanceMonitor
- Schema Validation: Comprehensive validation with jsonschema
- Multiple Output Formats: JSON, YAML, XML, and CSV

## 📦 Installation

### Option 1: pip (Recommended)
```bash
pip install jsonai
```

### Option 2: From Source
```bash
git clone https://github.com/yourusername/JsonAI.git
cd JsonAI
poetry install
```

### Option 3: Docker
```bash
# Quick start with Docker
docker run -p 8000:8000 jsonai:latest

# Full stack with Docker Compose
docker-compose up -d
```

## Architecture Overview

The `jsonAI` library is modular and consists of the following components:

- **Jsonformer** (jsonAI.main): Orchestrates generation, formatting, and validation
- **TypeGenerator**: Generates values for each JSON Schema type
- **OutputFormatter**: Converts data into JSON, YAML, XML, CSV
- **SchemaValidator**: Validates data with jsonschema
- **ToolRegistry**: Registers and resolves Python/MCP tools
- **Async Paths**:
  - **FullAsyncJsonformer** (jsonAI.async_jsonformer): asynchronous generator taking model_backend, json_schema, prompt (aliased as AsyncJsonformer in API)
  - **AsyncJsonformer wrapper** (jsonAI.main): wraps a Jsonformer instance for async tool execution
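
The validation step can be pictured with a minimal, hand-rolled type check. This is only an illustration of the idea; the real `SchemaValidator` delegates to the `jsonschema` package:

```python
# Illustrative sketch only: SchemaValidator itself uses jsonschema,
# not this hand-rolled check.
def check_types(data, schema):
    kinds = {"object": dict, "array": list, "string": str,
             "integer": int, "number": (int, float), "boolean": bool}
    if not isinstance(data, kinds[schema["type"]]):
        return False
    if schema["type"] == "object":
        # Recurse into any declared properties that are present.
        return all(key not in data or check_types(data[key], sub)
                   for key, sub in schema.get("properties", {}).items())
    return True

schema = {"type": "object",
          "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}
print(check_types({"name": "Ada", "age": 36}, schema))  # True
print(check_types({"name": 1}, schema))                 # False
```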

## Testing

The project includes comprehensive tests for each component and integration:

-   **Unit Tests**: Test individual components.
-   **Integration Tests**: Validate the interaction between components.

To run tests:

```bash
pytest tests/
```

## Quick API Start (FastAPI)

Run the API with uvicorn:

```bash
uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000
```

Then open http://localhost:8000/docs for interactive Swagger UI.

### REST Endpoints

- POST /generate — synchronous generation
- POST /generate/async — asynchronous generation
- POST /generate/batch — concurrent batch generation
- GET /stats — performance and cache statistics
- DELETE /cache — clear all caches
- POST /validate — validate a JSON schema

Minimal cURL examples:

```bash
# Sync generate
curl -X POST http://localhost:8000/generate -H "Content-Type: application/json" -d '{
  "prompt": "Generate a simple user object",
  "schema": {"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}}},
  "model_name": "ollama",
  "model_path": "llama3"
}'

# Async generate
curl -X POST http://localhost:8000/generate/async -H "Content-Type: application/json" -d '{
  "prompt": "Generate a simple user object",
  "schema": {"type":"object","properties":{"name":{"type":"string"},"age":{"type":"integer"}}},
  "model_name": "ollama",
  "model_path": "llama3"
}'

# Batch generate
curl -X POST http://localhost:8000/generate/batch -H "Content-Type: application/json" -d '{
  "requests": [
    {"prompt":"User 1","schema":{"type":"object","properties":{"name":{"type":"string"}}},"model_name":"ollama","model_path":"llama3"},
    {"prompt":"User 2","schema":{"type":"object","properties":{"name":{"type":"string"}}},"model_name":"ollama","model_path":"llama3"}
  ],
  "max_concurrent": 5
}'
```
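
The same request can be issued from Python. This sketch only builds and prints the request body; the actual HTTP call, which assumes a server running on localhost:8000 and the `requests` package, is left commented out:

```python
import json

# Same payload shape as the /generate cURL example above.
payload = {
    "prompt": "Generate a simple user object",
    "schema": {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    },
    "model_name": "ollama",
    "model_path": "llama3",
}

body = json.dumps(payload, indent=2)
print(body)

# With the API running locally (and requests installed):
# import requests
# resp = requests.post("http://localhost:8000/generate", json=payload, timeout=60)
# resp.raise_for_status()
# print(resp.json())
```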

## Examples

### Basic JSON Generation

```python
from jsonAI.main import Jsonformer

# Suppose you have a backend that implements ModelBackend
from jsonAI.model_backends import DummyBackend
backend = DummyBackend()  # replace with OllamaBackend/OpenAIBackend/etc.

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "isStudent": {"type": "boolean"}
    }
}
prompt = "Generate a person's profile."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt)
output = jsonformer()
print(output)
```


### YAML Output

```python
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"}
    }
}
prompt = "Generate a city profile."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format="yaml")
output = jsonformer()
print(output)
```

### CSV Output

```python
schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "score": {"type": "number"}
        }
    }
}
prompt = "Generate a list of students and their scores."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format="csv")
output = jsonformer()
print(output)
```


### CLI Example

#### Basic CLI Usage

```bash
python -m jsonAI.cli generate --schema schema.json --prompt "Generate a product" --output-format json
```

#### Using Ollama Backend (Recommended for LLMs)

```bash
python -m jsonAI.cli generate --schema complex_schema.json \
  --prompt "Generate a comprehensive person profile as JSON." \
  --use-ollama --ollama-model llama3
```

#### Features
- Robustly extracts the first valid JSON object from any LLM output (even if wrapped in `<answer>` tags or surrounded by extra text)
- Supports all JSON schema types: primitives, enums, arrays, objects, null, oneOf, nested/complex
- Validates output against the schema and warns if invalid
- Pretty-prints objects/arrays, prints primitives/null as-is
- Production-ready for any schema and LLM output style
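
The extraction behavior described above can be sketched with the stdlib `json.JSONDecoder.raw_decode`, which decodes the first valid value starting at a given index. This is an illustration of the technique, not jsonAI's actual implementation:

```python
import json

def extract_first_json(text: str):
    """Return the first decodable JSON value found in arbitrary LLM output.

    Illustrative sketch only; jsonAI's internal extraction may differ.
    """
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "{[":  # candidate start of an object or array
            try:
                value, _end = decoder.raw_decode(text, i)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON here; keep scanning
    return None

print(extract_first_json('<answer>{"name": "Ada"}</answer> trailing text'))
```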

#### Example Output

```json
{
  "id": "profile with all supported JSON schema types.",
  "name": "re",
  "age": 30,
  "is_active": true,
  "email": "example@example.com",
  "roles": ["admin", "user"],
  "address": {"street": "123 Main St", "city": "Anytown", "zip": "12345", "country": "USA"},
  "preferences": {"newsletter": true, "theme": "dark", "language": "en"},
  "tags": ["tech", "developer"],
  "score": 95,
  "metadata": {"key1": "value1", "key2": "value2"},
  "status": "active",
  "history": [{"date": "2023-01-01", "event": "joined", "details": "Account created"}],
  "profile_picture": "https://example.com/avatar.jpg",
  "settings": {"notifications": true, "privacy": "private"},
  "null_field": null
}
```

See `complex_schema.json` for a comprehensive schema example.

### Tool Calling Example

```python
from jsonAI.tool_registry import ToolRegistry

def send_email(email):
    print(f"Sending email to {email}")
    return "Email sent"

tool_registry = ToolRegistry()
tool_registry.register_tool("send_email", send_email)

schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"}
    },
    "x-jsonai-tool-call": {
        "name": "send_email",
        "arguments": {"email": "email"}
    }
}
prompt = "Generate a user email."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, tool_registry=tool_registry)
output = jsonformer()
print(output)
```

### MCP Integration Example

```python
def mcp_callback(tool_name, server_name, kwargs):
    # Simulate MCP call
    return f"Called {tool_name} on {server_name} with {kwargs}"

schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"}
    },
    "x-jsonai-tool-call": {
        "name": "search_tool",
        "arguments": {"query": "query"}
    }
}
prompt = "Generate a search query."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, mcp_callback=mcp_callback)
output = jsonformer()
print(output)
```

### Complex Schema Example

```python
schema = {
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {
                "id": {"type": "uuid"},
                "name": {"type": "string"},
                "email": {"type": "string", "format": "email"}
            }
        },
        "roles": {
            "type": "array",
            "items": {"type": "string", "enum": ["admin", "user", "guest"]}
        },
        "profile": {
            "oneOf": [
                {"type": "object", "properties": {"age": {"type": "integer"}}},
                {"type": "object", "properties": {"birthdate": {"type": "date"}}}
            ]
        }
    },
    "x-jsonai-tool-call": {
        "name": "send_welcome_email",
        "arguments": {"email": "user.email"}
    }
}
# ...set up backend and tool_registry as in the earlier examples...
prompt = "Generate a new user record."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, tool_registry=tool_registry)
output = jsonformer()
print(output)
```

### XML Output

```python
schema = {
    "type": "object",
    "properties": {
        "book": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
                "year": {"type": "integer"}
            }
        }
    }
}

prompt = "Generate details for a book."
jsonformer = Jsonformer(model_backend=backend, json_schema=schema, prompt=prompt, output_format="xml")
output = jsonformer()
print(output)
```

### Tool Chaining Example

You can chain multiple tools together using the `x-jsonai-tool-chain` schema key. Each tool in the chain receives arguments from the generated data and/or previous tool outputs.

```python
from jsonAI.main import Jsonformer
from jsonAI.tool_registry import ToolRegistry

def add(x, y):
    return {"sum": x + y}

def multiply(sum, factor):
    return {"product": sum * factor}

registry = ToolRegistry()
registry.register_tool("add", add)
registry.register_tool("multiply", multiply)

schema = {
    "type": "object",
    "properties": {
        "x": {"type": "integer"},
        "y": {"type": "integer"},
        "factor": {"type": "integer"}
    },
    "x-jsonai-tool-chain": [
        {
            "name": "add",
            "arguments": {"x": "x", "y": "y"}
        },
        {
            "name": "multiply",
            "arguments": {"sum": "sum", "factor": "factor"}
        }
    ]
}

prompt = "Calculate (x + y) * factor."
jsonformer = Jsonformer(
    model_backend=None,  # Not used in this example
    json_schema=schema,
    prompt=prompt,
    tool_registry=registry
)
# Provide input data (simulate generated data)
jsonformer.value = {"x": 2, "y": 3, "factor": 4}
generated = jsonformer.generate_data()
result = jsonformer._execute_tool_call(generated)
print(result)
# Output will include all intermediate and final tool results.
```

## Performance and Caching

JsonAI includes a performance suite to optimize throughput and latency.

- **PerformanceMonitor**: measures durations for operations (async-safe)
- **CachedJsonformer**: two-level caching
  - LRU cache for simple schema-based results
  - TTL cache for prompt-based entries for complex schemas
- **OptimizedJsonformer**: all performance features plus cache warmup and batch helpers
- **BatchProcessor**: asynchronous concurrent processing (configurable semaphore)

Example:

```python
from jsonAI.performance import OptimizedJsonformer
from jsonAI.model_backends import DummyBackend

backend = DummyBackend()
schema = {"type":"object","properties":{"name":{"type":"string"}}}

jsonformer = OptimizedJsonformer(
    model=backend,          # accepts a ModelBackend
    tokenizer=backend.tokenizer,
    schema=schema,
    cache_size=1000,
    cache_ttl=3600
)

# Single generation (cached)
print(jsonformer.generate("Generate a name"))

# Batch generation
requests = [
  {"prompt":"User A","kwargs":{}},
  {"prompt":"User B","kwargs":{}}
]
print(jsonformer.generate_batch(requests))
```

To inspect performance and cache stats at runtime, use the REST API `GET /stats` or:
```python
jsonformer.get_comprehensive_stats()
```

## Output Format × Type Coverage


| Type      | Example                                | JSON | XML  | YAML | CSV* |
|-----------|----------------------------------------|------|------|------|------|
| number    | 3.14                                   | ✅   | ✅   | ✅   | ✅   |
| integer   | 42                                     | ✅   | ✅   | ✅   | ✅   |
| boolean   | true                                   | ✅   | ✅   | ✅   | ✅   |
| string    | "hello"                                | ✅   | ✅   | ✅   | ✅   |
| datetime  | "2023-06-29T12:00:00Z"                 | ✅   | ✅   | ✅   | ✅   |
| date      | "2023-06-29"                           | ✅   | ✅   | ✅   | ✅   |
| time      | "12:00:00"                             | ✅   | ✅   | ✅   | ✅   |
| uuid      | "123e4567-e89b-12d3-a456-426614174000" | ✅   | ✅   | ✅   | ✅   |
| binary    | "SGVsbG8="                             | ✅   | ✅   | ✅   | ✅   |
| null      | null                                   | ✅   | (⚠️) | ✅   | (⚠️) |
| array     | [1,2,3]                                | ✅   | ✅   | ✅   | (⚠️) |
| object    | {"a":1}                                | ✅   | ✅   | ✅   | (⚠️) |
| enum      | "red"                                  | ✅   | ✅   | ✅   | ✅   |
| p_enum    | "blue"                                 | ✅   | ✅   | ✅   | ✅   |
| p_integer | 7                                      | ✅   | ✅   | ✅   | ✅   |

✅ = Supported
⚠️ = Supported with caveats (e.g., nulls in XML/CSV, arrays/objects in CSV)
*CSV: Only arrays of objects (tabular) are practical
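
The CSV caveat above can be illustrated with the stdlib `csv` module: an array of flat objects maps naturally to rows, while nulls and nested values do not. This is a standalone sketch, not jsonAI's `OutputFormatter`:

```python
import csv
import io

# An array of flat objects is the tabular case that CSV handles well.
rows = [{"name": "Ada", "score": 95.0},
        {"name": "Alan", "score": 88.5}]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```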


## Integrations & Capabilities

- LLMs: HuggingFace Transformers, OpenAI, Ollama (vLLM patterns apply)
- FastAPI: See `jsonAI/api.py` and `examples/fastapi_example.py`
- Tool Registry: Register and call Python or MCP tools from schemas; supports tool chaining via `x-jsonai-tool-chain`
- Async Support:
  - `FullAsyncJsonformer` for async generation with `model_backend/json_schema/prompt`
  - `AsyncJsonformer` wrapper (jsonAI.main) for async tool execution

See the [examples/](examples/) directory for more advanced usage and integration patterns.

## License

This project is licensed under the MIT License.

## Native Library Usage

JsonAI leverages high-performance native libraries for data processing and extensibility:

- **PyYAML** for YAML serialization
- **lxml** for XML output
- **cachetools** for caching
- **requests** and **aiohttp** for HTTP
- **jsonschema** for validation

For any tabular or batch data processing, it is recommended to use **pandas** for reliability and performance. If you extend JsonAI or build custom output logic, prefer native libraries like pandas, numpy, or others for best results.

## Multi-Environment Support

JsonAI supports multiple environments: dev, qa, perf, cte, and prod. Each environment has its own `.env` file at the project root.

- **Local Development:**  
  Copy or rename the desired `.env.*` file to `.env` before running locally.
  ```bash
  cp .env.dev .env
  uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000
  ```

- **Docker Compose:**  
  Edit `docker-compose.yml` to set the `env_file` for the desired environment (e.g., `.env.prod`).  
  Or override at runtime:
  ```bash
  docker-compose --env-file .env.qa up -d
  ```

- **Docker:**  
  Pass the environment file at runtime:
  ```bash
  docker run --env-file .env.prod -p 8000:8000 jsonai:latest
  ```

- **CI/CD:**  
  The GitHub Actions workflow tests all environments by copying the correct `.env.*` file to `.env` for each matrix job.

- **APP_ENV Variable:**  
  The Dockerfile sets `APP_ENV` (default: dev) for extensibility. You can override this at runtime.

See `docs/deployment.md` for more details.
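
Reading `APP_ENV` at startup might look like the following; the variable name comes from the Dockerfile convention above, but the lookup logic here is an assumption, not jsonAI's actual configuration code:

```python
import os

# APP_ENV defaults to "dev", matching the Dockerfile default described above.
app_env = os.environ.get("APP_ENV", "dev")
env_file = f".env.{app_env}"
print(f"Settings will be read from {env_file}")
```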

## Deployment

- API:
  - `uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000`
  - CORS is enabled by default for development; harden for production
- Docker:
  - `docker build -t jsonai:latest .`
  - `docker run -p 8000:8000 jsonai:latest`
- Docker Compose:
  - `docker-compose up -d`
- See `docs/deployment.md` for more

## Versioning and Release

PyPI forbids reusing the same filename for the same version. Always bump the version:

```bash
poetry version patch  # or minor/major
poetry build
poetry publish -u __token__ -p $PYPI_TOKEN
```

Automate in CI by bumping on tags and using repository secrets for tokens.

## Streaming Support

JsonAI supports streaming data generation (experimental API in examples). Example pattern:

```python
jsonformer = Jsonformer(model_backend, json_schema, prompt)
for data_chunk in jsonformer.stream_generate_data():
    print(data_chunk)
```

For async streaming, adapt the pattern with the async wrapper as needed.

            

\n  Or override at runtime:\n  ```bash\n  docker-compose --env-file .env.qa up -d\n  ```\n\n- **Docker:**  \n  Pass the environment file at runtime:\n  ```bash\n  docker run --env-file .env.prod -p 8000:8000 jsonai:latest\n  ```\n\n- **CI/CD:**  \n  The GitHub Actions workflow tests all environments by copying the correct `.env.*` file to `.env` for each matrix job.\n\n- **APP_ENV Variable:**  \n  The Dockerfile sets `APP_ENV` (default: dev) for extensibility. You can override this at runtime.\n\nSee `docs/deployment.md` for more details.\n\n## Deployment\n\n- API:\n  - `uvicorn jsonAI.api:app --host 0.0.0.0 --port 8000`\n  - CORS is enabled by default for development; harden for production\n- Docker:\n  - `docker build -t jsonai:latest .`\n  - `docker run -p 8000:8000 jsonai:latest`\n- Docker Compose:\n  - `docker-compose up -d`\n- See `docs/deployment.md` for more\n\n## Versioning and Release\n\nPyPI forbids reusing the same filename for the same version. Always bump the version:\n\n```bash\npoetry version patch  # or minor/major\npoetry build\npoetry publish -u __token__ -p $PYPI_TOKEN\n```\n\nAutomate in CI by bumping on tags and using repository secrets for tokens.\n\n## Streaming Support\n\nJsonAI supports streaming data generation (experimental API in examples). Example pattern:\n\n```python\njsonformer = Jsonformer(model_backend, json_schema, prompt)\nfor data_chunk in jsonformer.stream_generate_data():\n    print(data_chunk)\n```\n\nFor async streaming, adapt the pattern with the async wrapper as needed.\n",
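One way to adapt the streaming loop above for async use is with `async for`. The sketch below is self-contained and hypothetical: `stream_chunks` stands in for an async generation source (it is not part of the JsonAI API), and `consume` mirrors the synchronous loop:

```python
import asyncio

# Hypothetical stand-in for a backend that yields partial results while
# generation is in flight; not part of the JsonAI API.
async def stream_chunks():
    for chunk in ({"name": "Al"}, {"name": "Alice"}):
        await asyncio.sleep(0)  # yield control, as real I/O would
        yield chunk

async def consume():
    # Mirrors the sync `for data_chunk in ...` loop, but with `async for`
    chunks = []
    async for data_chunk in stream_chunks():
        chunks.append(data_chunk)
    return chunks

print(asyncio.run(consume()))
```

Each partial result arrives as soon as the source yields it, so a caller can render or validate incrementally instead of waiting for the full document.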
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python library for dynamic JSON generation based on schemas using language models.",
    "version": "0.15.2.2",
    "project_urls": {
        "Documentation": "https://github.com/kishoretvk/GenerativeJson#readme",
        "Homepage": "https://github.com/kishoretvk/GenerativeJson",
        "Repository": "https://github.com/kishoretvk/GenerativeJson"
    },
    "split_keywords": [
        "json",
        " llm",
        " schema",
        " generation",
        " fastapi",
        " openai",
        " ollama",
        " transformers"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b3301009a51a5a5b29153a1aa1619282c27db00d156ea4c1f33e11e6da03270c",
                "md5": "ca34197b8faa03d7dab52e3de99ad460",
                "sha256": "c2b40f1319c2c685c7ca34917c00021d81e2402d99d0a26858f72ea68348b007"
            },
            "downloads": -1,
            "filename": "jsonai-0.15.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ca34197b8faa03d7dab52e3de99ad460",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 47935,
            "upload_time": "2025-08-16T23:42:16",
            "upload_time_iso_8601": "2025-08-16T23:42:16.495495Z",
            "url": "https://files.pythonhosted.org/packages/b3/30/1009a51a5a5b29153a1aa1619282c27db00d156ea4c1f33e11e6da03270c/jsonai-0.15.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "73c460dec1163ddb8211ec09dd9a26eae8598b877a61593b04686502c2c3f517",
                "md5": "d23ff0ac44605556616209971992d62d",
                "sha256": "ef999f557a4aa147840845d9a5c747a74d4bca44be292d6109f9289f0d76f9e9"
            },
            "downloads": -1,
            "filename": "jsonai-0.15.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "d23ff0ac44605556616209971992d62d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 44342,
            "upload_time": "2025-08-16T23:42:17",
            "upload_time_iso_8601": "2025-08-16T23:42:17.910834Z",
            "url": "https://files.pythonhosted.org/packages/73/c4/60dec1163ddb8211ec09dd9a26eae8598b877a61593b04686502c2c3f517/jsonai-0.15.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-16 23:42:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kishoretvk",
    "github_project": "GenerativeJson",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "jsonai"
}
        