# Synthetic Data Client

Package `synthgen-client`, version 0.0.1 (uploaded 2025-03-01 17:07:29), requires Python >=3.8.

A Python client library for interacting with [Synthgen](https://github.com/nasirus/synthgen), a synthetic data generation API framework.

## Installation

```bash
pip install synthgen-client
```

## Features

- Async/await support
- Type hints and validation using Pydantic
- Comprehensive error handling
- Streaming support for large exports
- Batch operations support
- Rich CLI progress displays
- Token usage and cost tracking

## Quick Start

```python
from synthgen import SynthgenClient
from synthgen.models import Task

# Initialize the client
client = SynthgenClient(
    base_url="https://api.synthgen.example.com",
    api_key="your-api-key"
)


# Example of a task using a local LLM provider
provider = "http://host.docker.internal:11434/v1/chat/completions"
model = "qwen2.5:0.5b"
api_key = "api_key"

# Create a single task
task = Task(
    custom_id="test",
    method="POST",
    url=provider,
    api_key=api_key,
    body={
        "model": model,
        "messages": [{"role": "user", "content": "solve 2x + 4 = 10"}],
    },
)

# Create a batch of tasks
tasks = [task]
for i in range(1, 10):
    tasks.append(
        Task(
            # Zero-pad the counter so IDs stay uniform (task-002 ... task-010)
            custom_id=f"task-{i + 1:03d}",
            method="POST",
            url=provider,
            api_key=api_key,
            body={
                "model": model,
                "messages": [{"role": "user", "content": f"solve {i}x + 4 = 10"}],
            },
        )
    )

# Submit and monitor batch processing with cost tracking
results = client.monitor_batch(
    tasks=tasks,
    cost_by_1m_input_token=0.01,
    cost_by_1m_output_token=0.03
)

# Process results
for result in results:
    print(f"Task {result.message_id}: {result.status}")
    if result.body:
        print(f"Generated {len(result.body.get('data', []))} records")
```

## Configuration

The client can be configured in three ways: through environment variables, direct parameters, or a configuration file.

### Environment Variables

```bash
# Set these environment variables
export SYNTHGEN_BASE_URL="http://localhost:8002"
export SYNTHGEN_API_KEY="your-api-key"
```

Then initialize the client without parameters:

```python
client = SynthgenClient()
```
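Before initializing the client without parameters, it can help to confirm that both variables are actually set. The following is a minimal sketch using only the standard library; `missing_env_vars` is a hypothetical helper, not part of the library, and treating both variables as required is an assumption.

```python
import os

# Variable names taken from the section above.
REQUIRED_VARS = ("SYNTHGEN_BASE_URL", "SYNTHGEN_API_KEY")


def missing_env_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Set these before calling SynthgenClient():", ", ".join(missing))
    else:
        print("Environment is configured.")
```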

### Direct Parameters

```python
client = SynthgenClient(
    base_url="http://localhost:8002",
    api_key="your-api-key",
    timeout=7200  # 2 hours
)
```

### Configuration File

```python
# config.json
# {
#   "base_url": "http://localhost:8002",
#   "api_key": "your-api-key",
#   "timeout": 7200
# }

client = SynthgenClient(config_file="config.json")
```
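The configuration file can also be generated from Python. This sketch writes the three keys shown above with the standard `json` module; `write_config` is a hypothetical helper, and whether the client accepts keys beyond these three is not covered here.

```python
import json
import tempfile
from pathlib import Path


def write_config(path, base_url, api_key, timeout=7200):
    """Write a config file with the three keys shown above and return the dict."""
    config = {"base_url": base_url, "api_key": api_key, "timeout": timeout}
    Path(path).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# Write to a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "config.json"
    written = write_config(config_path, "http://localhost:8002", "your-api-key")
    loaded = json.loads(config_path.read_text(encoding="utf-8"))

print(loaded == written)
# With the library installed: client = SynthgenClient(config_file="config.json")
```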

## Batch Processing

Submit tasks as a batch with `create_batch`, then follow progress with `monitor_batch`. Passing `tasks` directly to `monitor_batch` does both in one step:

```python
# Create a batch of tasks
# (`Task`, `provider`, `model`, and `api_key` are defined as in Quick Start)
tasks = [
    Task(
        custom_id="task-001",
        method="POST",
        url=provider,
        api_key=api_key,
        body={
            "model": model,
            "messages": [{"role": "user", "content": "solve 2x + 4 = 10"}],
        },
        dataset="customers",
        use_cache=True,
    ),
    # Add more tasks...
]

# Submit batch and get batch_id
response = client.create_batch(tasks)
batch_id = response.batch_id

# Monitor batch progress with rich UI
results = client.monitor_batch(batch_id=batch_id)

# Or submit and monitor in one step
results = client.monitor_batch(tasks=tasks)
```
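For larger batches, the per-task arguments can be generated from a list of prompts. The sketch below builds plain dictionaries using the `Task` field names from the examples above, so it runs without the library installed; `build_task_kwargs` and its `prefix` parameter are hypothetical.

```python
def build_task_kwargs(prompts, url, model, api_key, prefix="task"):
    """Build one keyword-argument dict per prompt, with zero-padded custom IDs."""
    return [
        {
            "custom_id": f"{prefix}-{index:03d}",
            "method": "POST",
            "url": url,
            "api_key": api_key,
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for index, prompt in enumerate(prompts, start=1)
    ]


prompts = [f"solve {i}x + 4 = 10" for i in range(1, 4)]
task_kwargs = build_task_kwargs(
    prompts,
    url="http://host.docker.internal:11434/v1/chat/completions",
    model="qwen2.5:0.5b",
    api_key="api_key",
)
print([kw["custom_id"] for kw in task_kwargs])
# With the library installed: tasks = [Task(**kw) for kw in task_kwargs]
```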

## Health Checks

```python
# Check system health
health = client.check_health()
print(f"System status: {health.status}")
print(f"API: {health.services.api}")
print(f"RabbitMQ: {health.services.rabbitmq}")
print(f"Elasticsearch: {health.services.elasticsearch}")
print(f"Queue consumers: {health.services.queue_consumers}")
```

## Task Management

```python
# Get task by ID
task = client.get_task("task-message-id")
print(f"Task status: {task.status}")
print(f"Completion time: {task.completed_at}")

# Delete a task
client.delete_task("task-message-id")
```

## Batch Management

```python
# Get all batches
batches = client.get_batches()
print(f"Total batches: {batches.total}")

# Get specific batch
batch = client.get_batch("batch-id")
print(f"Completed tasks: {batch.completed_tasks}/{batch.total_tasks}")
print(f"Token usage: {batch.total_tokens}")

# Get all tasks in a batch
tasks = client.get_batch_tasks("batch-id")

# Get only failed tasks
from synthgen.models import TaskStatus
failed_tasks = client.get_batch_tasks("batch-id", task_status=TaskStatus.FAILED)

# Delete a batch
client.delete_batch("batch-id")
```
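Once the tasks of a batch have been fetched, a per-status tally gives a quick overview. This sketch uses stand-in objects with a `status` attribute rather than real responses; the string status values are placeholders, and `count_by_status` is a hypothetical helper.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_status(tasks):
    """Count tasks per status; works for any hashable status value."""
    return Counter(task.status for task in tasks)


# Stand-ins for the objects returned by get_batch_tasks().
fetched = [
    SimpleNamespace(status="COMPLETED"),
    SimpleNamespace(status="COMPLETED"),
    SimpleNamespace(status="FAILED"),
]

counts = count_by_status(fetched)
for status, count in counts.items():
    print(f"{status}: {count}")
```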

## Context Manager Support

The client supports the context manager protocol for automatic resource cleanup:

```python
with SynthgenClient() as client:
    health = client.check_health()
    # Client will be automatically closed when exiting the with block
```
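The protocol itself can be illustrated with a stand-in class. `DemoClient` below is hypothetical and says nothing about how the library implements cleanup; it only shows that `__exit__` runs, and the resource is closed, even when the block raises.

```python
class DemoClient:
    """Stand-in for illustration only; not the synthgen implementation."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not suppress exceptions raised inside the block


demo = DemoClient()
try:
    with demo as client:
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(demo.closed)  # prints True
```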

## Error Handling

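The client retries requests automatically and raises `APIError` for API failures. The exception exposes the error message, the status code, and the response when one is available: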

```python
from synthgen.exceptions import APIError

try:
    result = client.get_task("non-existent-id")
except APIError as e:
    print(f"API Error: {e.message}")
    print(f"Status code: {e.status_code}")
    if e.response:
        print(f"Response: {e.response.text}")
```
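The README does not describe the retry policy. As a general illustration of the idea, here is a sketch of retrying with exponential backoff; `call_with_retries`, its defaults, and the `flaky` function are all hypothetical and do not reflect the client's actual behavior.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky():
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record delays instead of sleeping, so the demo runs instantly
result = call_with_retries(flaky, sleep=delays.append)
print(result, calls["count"], delays)
```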

## Models

The library uses Pydantic models for type validation and serialization:

- `Task`: Represents a task to be submitted
- `TaskResponse`: Contains task results and metadata
- `Batch`: Contains batch status and statistics
- `BatchList`: Paginated list of batches
- `HealthResponse`: System health information
- `TaskStatus`: Enum of possible task states (PENDING, PROCESSING, COMPLETED, FAILED)

## Advanced Usage

### Monitoring Existing Batches

```python
# Monitor an existing batch
results = client.monitor_batch(
    batch_id="existing-batch-id",
    cost_by_1m_input_token=0.01,
    cost_by_1m_output_token=0.03
)
```
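The parameter names suggest prices per one million tokens. Under that assumption, the cost arithmetic would look like the sketch below; `estimate_cost` is a hypothetical helper, and the library's own calculation may differ.

```python
def estimate_cost(input_tokens, output_tokens,
                  cost_by_1m_input_token, cost_by_1m_output_token):
    """Estimate cost, assuming both rates are prices per 1,000,000 tokens."""
    input_cost = input_tokens / 1_000_000 * cost_by_1m_input_token
    output_cost = output_tokens / 1_000_000 * cost_by_1m_output_token
    return input_cost + output_cost


# 2M input tokens at 0.01 per 1M, plus 0.5M output tokens at 0.03 per 1M.
cost = estimate_cost(2_000_000, 500_000, 0.01, 0.03)
print(f"Estimated cost: {cost:.4f}")
```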

### Customizing Batch Creation

```python
# Create batch with custom chunk size for large batches
response = client.create_batch(tasks, chunk_size=500)
```
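To see what a chunk size implies, the sketch below splits a list into consecutive slices of at most `chunk_size` items. It illustrates the general technique only; `chunked` is a hypothetical helper, and how `create_batch` splits and submits chunks internally is not documented here.

```python
def chunked(items, chunk_size):
    """Yield consecutive slices of items, each with at most chunk_size elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# 1200 items with a chunk size of 500 give two full chunks and a remainder.
sizes = [len(chunk) for chunk in chunked(list(range(1200)), 500)]
print(sizes)  # prints [500, 500, 200]
```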

## Requirements

- Python 3.8+
- httpx>=0.24.0
- pydantic>=2.0.0
- rich (for progress displays)

## License

MIT
            
