# 🎯 Syda - AI-Powered Synthetic Data Generation
> **Generate high-quality synthetic data with AI while preserving referential integrity**
Syda seamlessly integrates with **Anthropic Claude**, **OpenAI GPT** and **Google Gemini** models to create realistic test data, maintain privacy compliance, and accelerate development workflows.
## 📚 Documentation
**📖 For detailed documentation, examples, and API reference, visit: [https://python.syda.ai/](https://python.syda.ai/)**
## ⚡ 30-Second Quick Start
```bash
pip install syda
```
Create a `.env` file with the API key for the provider you want to use:
```bash
# .env
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# OR
OPENAI_API_KEY=your_openai_api_key_here
# OR
GEMINI_API_KEY=your_gemini_api_key_here
```
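Before constructing a generator, it can help to confirm which of these keys the process can actually see. The following is a minimal sketch, assuming the variable names from the `.env` example above; `configured_providers` and the provider labels in `PROVIDER_KEYS` are hypothetical helpers for illustration and are not part of Syda.

```python
import os

# Environment variable names taken from the .env example above.
# The dictionary keys are illustrative labels only.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


def configured_providers(env=None):
    """Return the labels of providers whose API key is set to a non-empty value."""
    env = os.environ if env is None else env
    return [
        label
        for label, variable in PROVIDER_KEYS.items()
        if env.get(variable, "").strip()
    ]


if __name__ == "__main__":
    found = configured_providers()
    print("Configured providers:", ", ".join(found) or "none")
```

Call it after `load_dotenv()` so that values from the `.env` file are already present in `os.environ`.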
```python
"""
Syda 30-Second Quick Start Example
Demonstrates AI-powered synthetic data generation with perfect referential integrity
"""
from syda import SyntheticDataGenerator, ModelConfig
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
print("🚀 Starting Syda 30-Second Quick Start...")

# Configure AI model
generator = SyntheticDataGenerator(
    model_config=ModelConfig(
        provider="anthropic",
        model_name="claude-3-5-haiku-20241022"
    )
)

# Define schemas with rich descriptions for better AI understanding
schemas = {
    # Categories schema with table and column descriptions
    'categories': {
        '__table_description__': 'Product categories for organizing items in the e-commerce catalog',
        'id': {
            'type': 'number',
            'description': 'Unique identifier for the category',
            'primary_key': True
        },
        'name': {
            'type': 'text',
            'description': 'Category name (Electronics, Home Decor, Sports, etc.)'
        },
        'description': {
            'type': 'text',
            'description': 'Detailed description of what products belong in this category'
        }
    },

    # Products schema with table and column descriptions and foreign keys
    'products': {
        '__table_description__': 'Individual products available for purchase with pricing and category assignment',
        '__foreign_keys__': {
            'category_id': ['categories', 'id']  # products.category_id references categories.id
        },
        'id': {
            'type': 'number',
            'description': 'Unique product identifier',
            'primary_key': True
        },
        'name': {
            'type': 'text',
            'description': 'Product name and title'
        },
        'category_id': {
            'type': 'foreign_key',
            'description': 'Reference to the category this product belongs to'
        },
        'price': {
            'type': 'number',
            'description': 'Product price in USD'
        }
    }
}

# Generate data with perfect referential integrity
print("📊 Generating categories and products...")
results = generator.generate_for_schemas(
    schemas=schemas,
    sample_sizes={"categories": 5, "products": 20},
    output_dir="data"
)

print("✅ Generated realistic data with perfect foreign key relationships!")
print("📂 Check the 'data' folder for categories.csv and products.csv")
# Check data/ folder for categories.csv and products.csv
```
## 🚀 Why Developers Love Syda
| Feature | Benefit | Example |
|---------|---------|---------|
| 🤖 **Multi-AI Provider** | No vendor lock-in | Claude, GPT, Gemini models |
| 🔗 **Zero Orphaned Records** | Perfect referential integrity | `product.category_id` → `category.id` ✅ |
| 🏗️ **SQLAlchemy Native** | Use existing models directly | `Customer`, `Contact` classes → CSV data |
| 📊 **Multiple Schema Formats** | Flexible input options | SQLAlchemy, YAML, JSON, Dict |
| 📄 **Document Generation** | AI-powered PDFs linked to data | Product catalogs, receipts, contracts |
| 🔧 **Custom Generators** | Complex business logic | Tax calculations, pricing rules, arrays |
| 🛡️ **Privacy-First** | Protect real user data | GDPR/CCPA compliant testing |
| ⚡ **Developer Experience** | Just works | Type hints, great docs |
## 🛒 Retail Example
### 1. Define your schemas
<details>
<summary><strong>📋 Click to view schema files</strong> (category_schema.yml & product_schema.yml)</summary>
**category_schema.yml:**
```yaml
__table_name__: Category
__description__: Retail product categories

id:
  type: integer
  description: Unique category ID
  constraints:
    primary_key: true
    not_null: true
    min: 1
    max: 1000

name:
  type: string
  description: Category name
  constraints:
    not_null: true
    length: 50
    unique: true

parent_id:
  type: integer
  description: Parent category ID for hierarchical categories; 0 if the category has no parent
  constraints:
    min: 0
    max: 1000

description:
  type: text
  description: Detailed category description
  constraints:
    length: 500

active:
  type: boolean
  description: Whether the category is active
  constraints:
    not_null: true
```
**product_schema.yml:**
```yaml
__table_name__: Product
__description__: Retail products
__foreign_keys__:
  category_id: [Category, id]

id:
  type: integer
  description: Unique product ID
  constraints:
    primary_key: true
    not_null: true
    min: 1
    max: 10000

name:
  type: string
  description: Product name
  constraints:
    not_null: true
    length: 100
    unique: true

category_id:
  type: integer
  description: Category ID for the product
  constraints:
    not_null: true
    min: 1
    max: 1000

sku:
  type: string
  description: Stock Keeping Unit - unique product code
  constraints:
    not_null: true
    pattern: '^P[A-Z]{2}-\d{5}$'
    length: 10
    unique: true

price:
  type: float
  description: Product price in USD
  constraints:
    not_null: true
    min: 0.99
    max: 9999.99
    decimals: 2

stock_quantity:
  type: integer
  description: Current stock level
  constraints:
    not_null: true
    min: 0
    max: 10000

is_featured:
  type: boolean
  description: Whether the product is featured
  constraints:
    not_null: true
```
</details>
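The `pattern` constraint on `sku` is a regular expression: the letter `P`, two uppercase letters, a hyphen, and five digits. As a rough sketch of how generated values could be checked against it after the fact, the snippet below applies the same pattern with Python's `re` module. `is_valid_sku` is a hypothetical helper, and treating `length: 10` as a maximum length is an assumption; how Syda itself enforces these constraints is not shown here.

```python
import re

# Pattern copied from product_schema.yml.
SKU_PATTERN = re.compile(r"^P[A-Z]{2}-\d{5}$")
MAX_SKU_LENGTH = 10  # assumption: `length: 10` is an upper bound


def is_valid_sku(sku: str) -> bool:
    """Check a SKU against the schema's pattern and (assumed) maximum length."""
    return len(sku) <= MAX_SKU_LENGTH and SKU_PATTERN.fullmatch(sku) is not None


if __name__ == "__main__":
    for candidate in ["PSM-12345", "psm-12345", "PSM-1234", "XSM-12345"]:
        print(candidate, is_valid_sku(candidate))
```

A check like this is handy in a test suite that consumes the generated `products.csv`.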
### 2. Generate structured data
<details>
<summary><strong>🐍 Click to view Python code</strong></summary>
```python
from syda import SyntheticDataGenerator, ModelConfig
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Configure your AI model
config = ModelConfig(
    provider="anthropic",
    model_name="claude-3-5-haiku-20241022"
)

# Create generator
generator = SyntheticDataGenerator(model_config=config)

# Define your schemas (structured data only)
schemas = {
    "categories": "category_schema.yml",
    "products": "product_schema.yml"
}

# Generate synthetic data with relationships intact
results = generator.generate_for_schemas(
    schemas=schemas,
    sample_sizes={"categories": 5, "products": 20},
    output_dir="output",
    prompts={
        "Category": "Generate retail product categories with hierarchical structure.",
        "Product": "Generate retail products with names, SKUs, prices, and descriptions. Ensure a good variety of prices and categories."
    }
)

# Perfect referential integrity guaranteed! 🎯
print("✅ Generated realistic data with perfect foreign key relationships!")
```
</details>
**Output:**
```bash
📂 output/
├── 📊 categories.csv    # 5 product categories with hierarchical structure
└── 📊 products.csv      # 20 products, all with valid category_id references
```
### 3. Want to generate documents too? Add document templates!
To generate **AI-powered documents** along with your structured data, simply add the product catalog schema and update your code:
<details>
<summary><strong>📄 Click to view document schema</strong> (product_catalog_schema.yml)</summary>
**product_catalog_schema.yml (Document Template):**
```yaml
__template__: true
__description__: Product catalog page template
__name__: ProductCatalog
__depends_on__: [Product, Category]
__foreign_keys__:
  product_name: [Product, name]
  category_name: [Category, name]
  product_price: [Product, price]
  product_sku: [Product, sku]
__template_source__: templates/product_catalog.html
__input_file_type__: html
__output_file_type__: pdf

# Product information (linked to Product table)
product_name:
  type: string
  length: 100
  description: Name of the featured product

category_name:
  type: string
  length: 50
  description: Category this product belongs to

product_sku:
  type: string
  length: 10
  description: Product SKU code

product_price:
  type: float
  decimals: 2
  description: Product price in USD

# Marketing content (AI-generated)
product_description:
  type: text
  length: 500
  description: Detailed marketing description of the product

key_features:
  type: text
  length: 300
  description: Bullet points of key product features

marketing_tagline:
  type: string
  length: 100
  description: Catchy marketing tagline for the product

availability_status:
  type: string
  enum: ["In Stock", "Limited Stock", "Out of Stock", "Pre-Order"]
  description: Current availability status
```
</details>
<details>
<summary><strong>🎨 Click to view HTML template</strong> (templates/product_catalog.html)</summary>
**Create the Jinja HTML template** (`templates/product_catalog.html`):
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>{{ product_name }} - Product Catalog</title>
<style>
body {
font-family: 'Arial', sans-serif;
max-width: 800px;
margin: 0 auto;
padding: 40px;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: #333;
}
.catalog-page {
background: white;
padding: 40px;
border-radius: 15px;
box-shadow: 0 10px 30px rgba(0,0,0,0.2);
}
.product-header {
text-align: center;
margin-bottom: 30px;
border-bottom: 3px solid #667eea;
padding-bottom: 20px;
}
.product-name {
font-size: 36px;
font-weight: bold;
color: #2c3e50;
margin-bottom: 10px;
}
.category-sku {
font-size: 16px;
color: #7f8c8d;
margin-bottom: 15px;
}
.price {
font-size: 32px;
color: #e74c3c;
font-weight: bold;
}
.tagline {
font-style: italic;
font-size: 18px;
color: #34495e;
text-align: center;
margin: 20px 0;
padding: 15px;
background: #ecf0f1;
border-radius: 8px;
}
.description {
font-size: 16px;
line-height: 1.6;
margin: 25px 0;
text-align: justify;
}
.features {
background: #f8f9fa;
padding: 20px;
border-radius: 8px;
margin: 25px 0;
}
.features h3 {
color: #2c3e50;
margin-top: 0;
}
.availability {
text-align: center;
font-size: 18px;
font-weight: bold;
padding: 15px;
border-radius: 8px;
margin-top: 30px;
}
.in-stock { background: #d4edda; color: #155724; }
.limited-stock { background: #fff3cd; color: #856404; }
.out-of-stock { background: #f8d7da; color: #721c24; }
.pre-order { background: #d1ecf1; color: #0c5460; }
</style>
</head>
<body>
<div class="catalog-page">
<div class="product-header">
<div class="product-name">{{ product_name }}</div>
<div class="category-sku">{{ category_name }} Category | SKU: {{ product_sku }}</div>
<div class="price">${{ "%.2f"|format(product_price) }}</div>
</div>
<div class="tagline">"{{ marketing_tagline }}"</div>
<div class="description">
{{ product_description }}
</div>
<div class="features">
<h3>KEY FEATURES:</h3>
{{ key_features }}
</div>
<div class="availability {{ availability_status.lower().replace(' ', '-') }}">
Availability: {{ availability_status }}
</div>
</div>
</body>
</html>
```
</details>
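Two expressions in the template do a little work: `"%.2f"|format(product_price)` formats the price to two decimals, and `availability_status.lower().replace(' ', '-')` turns the status into one of the CSS class names defined in the `<style>` block. The sketch below reproduces both in plain Python so the mapping is easy to see; the function names are hypothetical and exist only for this illustration.

```python
# Status values copied from the `enum` in product_catalog_schema.yml.
STATUSES = ["In Stock", "Limited Stock", "Out of Stock", "Pre-Order"]

# CSS classes defined in the template's <style> block.
CSS_CLASSES = {"in-stock", "limited-stock", "out-of-stock", "pre-order"}


def availability_css_class(status: str) -> str:
    """Mirror the template expression: lowercase, spaces replaced by hyphens."""
    return status.lower().replace(" ", "-")


def format_price(price: float) -> str:
    """Mirror the template's price line: a dollar sign plus printf-style "%.2f"."""
    return "$" + "%.2f" % price


if __name__ == "__main__":
    for status in STATUSES:
        css_class = availability_css_class(status)
        print(f"{status!r} -> {css_class!r} (styled: {css_class in CSS_CLASSES})")
    print(format_price(999.99))
```

Every value in the schema's `enum` maps to a class that the stylesheet defines, so a status outside that list would render without a background colour.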
<details>
<summary><strong>🐍 Click to view updated Python code</strong> (with document generation)</summary>
```python
# Same setup as before...
from syda import SyntheticDataGenerator, ModelConfig
from dotenv import load_dotenv

load_dotenv()
config = ModelConfig(provider="anthropic", model_name="claude-3-5-haiku-20241022")
generator = SyntheticDataGenerator(model_config=config)

# Define your schemas (structured data)
schemas = {
    "categories": "category_schema.yml",
    "products": "product_schema.yml",
    # 🆕 Add document templates
    "product_catalogs": "product_catalog_schema.yml"
}

# Generate both structured data AND documents.
# The document template is declared in product_catalog_schema.yml
# (`__template__: true`), so it is registered through `schemas`.
results = generator.generate_for_schemas(
    schemas=schemas,
    sample_sizes={
        "categories": 5,
        "products": 20,
        "product_catalogs": 10  # 🆕 Add this line
    },
    output_dir="output",
    prompts={
        "Category": "Generate retail product categories with hierarchical structure.",
        "Product": "Generate retail products with names, SKUs, prices, and descriptions. Ensure a good variety of prices and categories.",
        "ProductCatalog": "Generate compelling product catalog pages with marketing descriptions, key features, and sales copy."  # 🆕 Add this line
    }
)

print("✅ Generated structured data + AI-powered product catalogs!")
```
</details>
**Enhanced Output:**
```bash
📂 output/
├── 📊 categories.csv       # 5 product categories with hierarchical structure
├── 📊 products.csv         # 20 products, all with valid category_id references
└── 📄 product_catalogs/    # 🆕 AI-generated marketing documents
    ├── catalog_1.pdf       # Product names match products.csv
    ├── catalog_2.pdf       # Prices match products.csv
    ├── catalog_3.pdf       # Perfect data consistency!
    ├── ...
    └── catalog_10.pdf
```
## 📊 See It In Action
### **Realistic Retail Data + AI-Generated Product Catalogs**
**Categories Table:**
```csv
id,name,parent_id,description,active
1,Electronics,0,Electronic devices and accessories,true
2,Smartphones,1,Mobile phones and accessories,true
3,Laptops,1,Portable computers and accessories,true
4,Clothing,0,Apparel and fashion items,true
5,Men's Clothing,4,Men's apparel and accessories,true
```
**Products Table (with matching category_id):**
```csv
id,name,category_id,sku,price,stock_quantity,is_featured
1,iPhone 15 Pro,2,PSM-12345,999.99,50,true
2,MacBook Air M3,3,PLA-67890,1299.99,25,true
3,Samsung Galaxy S24,2,PSA-11111,899.99,75,false
4,Dell XPS 13,3,PDE-22222,1099.99,30,false
5,Men's Cotton T-Shirt,5,PMC-33333,24.99,200,false
```
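Referential integrity in these two tables means that every `category_id` in the products table, and every non-zero `parent_id` in the categories table, points at an existing category `id`. The following is a small sketch of how that could be verified with only the standard library, using the sample rows above as inline text. `read_rows` and `find_orphans` are hypothetical helpers, not Syda functions; in practice the same check could be run on the generated `output/categories.csv` and `output/products.csv`.

```python
import csv
import io

# Sample rows copied from the tables above.
CATEGORIES_CSV = """\
id,name,parent_id,description,active
1,Electronics,0,Electronic devices and accessories,true
2,Smartphones,1,Mobile phones and accessories,true
3,Laptops,1,Portable computers and accessories,true
4,Clothing,0,Apparel and fashion items,true
5,Men's Clothing,4,Men's apparel and accessories,true
"""

PRODUCTS_CSV = """\
id,name,category_id,sku,price,stock_quantity,is_featured
1,iPhone 15 Pro,2,PSM-12345,999.99,50,true
2,MacBook Air M3,3,PLA-67890,1299.99,25,true
3,Samsung Galaxy S24,2,PSA-11111,899.99,75,false
4,Dell XPS 13,3,PDE-22222,1099.99,30,false
5,Men's Cotton T-Shirt,5,PMC-33333,24.99,200,false
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def find_orphans(child_rows, fk_column, parent_rows, pk_column="id", root_values=()):
    """Return child rows whose foreign key matches no parent primary key.

    Values in root_values (for example "0" for top-level categories) mean
    "no parent" and are skipped.
    """
    parent_ids = {row[pk_column] for row in parent_rows}
    return [
        row
        for row in child_rows
        if row[fk_column] not in root_values and row[fk_column] not in parent_ids
    ]


categories = read_rows(CATEGORIES_CSV)
products = read_rows(PRODUCTS_CSV)

orphan_products = find_orphans(products, "category_id", categories)
orphan_categories = find_orphans(categories, "parent_id", categories, root_values={"0"})

print(f"Orphaned products: {len(orphan_products)}")
print(f"Categories with a missing parent: {len(orphan_categories)}")
```

Keys are compared as strings because `csv.DictReader` does not convert types; that is sufficient for an equality check like this one.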
**Generated Product Catalog PDF Content:**
```
IPHONE 15 PRO
Smartphones Category | SKU: PSM-12345
$999.99
Revolutionary Performance, Unmatched Design
Experience the future of mobile technology with the iPhone 15 Pro.
Featuring the powerful A17 Pro chip, this device delivers unprecedented
performance for both work and play. The titanium design combines
durability with elegance, while the advanced camera system captures
professional-quality photos and videos.
KEY FEATURES:
• A17 Pro chip with 6-core GPU
• Pro camera system with 3x optical zoom
• Titanium design with Action Button
• USB-C connectivity
• All-day battery life
"Innovation that fits in your pocket"
Availability: In Stock
```
> 🎯 **Perfect Integration**: The PDF catalog contains **actual product names, SKUs, and prices** from the CSV data, plus **AI-generated marketing content** - zero inconsistencies!
### 4. Need custom business logic? Add custom generators!
For advanced scenarios requiring **custom calculations** or **complex business rules**, you can add custom generator functions:
<details>
<summary><strong>🔧 Click to view custom generators example</strong></summary>
```python
# Define custom generator functions
def calculate_tax(row, parent_dfs=None, **kwargs):
    """Calculate tax amount based on subtotal and tax rate"""
    subtotal = row.get('subtotal', 0)
    tax_rate = row.get('tax_rate', 8.5)  # Default 8.5%
    return round(subtotal * (tax_rate / 100), 2)


def calculate_total(row, parent_dfs=None, **kwargs):
    """Calculate final total: subtotal + tax - discount"""
    subtotal = row.get('subtotal', 0)
    tax_amount = row.get('tax_amount', 0)
    discount = row.get('discount_amount', 0)
    return round(subtotal + tax_amount - discount, 2)


def generate_receipt_items(row, parent_dfs=None, **kwargs):
    """Generate receipt items based on actual transactions"""
    items = []

    if parent_dfs and 'Product' in parent_dfs and 'Transaction' in parent_dfs:
        products_df = parent_dfs['Product']
        transactions_df = parent_dfs['Transaction']

        # Get customer's transactions
        customer_id = row.get('customer_id')
        customer_transactions = transactions_df[
            transactions_df['customer_id'] == customer_id
        ]

        # Build receipt items from actual transaction data
        for _, tx in customer_transactions.iterrows():
            product = products_df[products_df['id'] == tx['product_id']].iloc[0]
            items.append({
                "product_name": product['name'],
                "sku": product['sku'],
                "quantity": int(tx['quantity']),
                "unit_price": float(product['price']),
                "item_total": round(tx['quantity'] * product['price'], 2)
            })

    return items


# Add custom generators to your generation
custom_generators = {
    "ProductCatalog": {
        "tax_amount": calculate_tax,
        "total": calculate_total,
        "items": generate_receipt_items
    }
}

# Generate with custom business logic
# (reuses `generator` and `schemas` from the previous step)
results = generator.generate_for_schemas(
    schemas=schemas,
    sample_sizes={"categories": 5, "products": 20, "product_catalogs": 10},
    output_dir="output",
    custom_generators=custom_generators,  # 🆕 Add this line
    prompts={
        "Category": "Generate retail product categories with hierarchical structure.",
        "Product": "Generate retail products with names, SKUs, prices, and descriptions.",
        "ProductCatalog": "Generate compelling product catalog pages with marketing copy."
    }
)

print("✅ Generated data with custom business logic!")
```
</details>
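Because a custom generator is an ordinary function of the current row, it can be exercised on its own without calling a model. The sketch below re-declares the two calculation functions from the example and calls them with a plain `dict` standing in for the row. That stand-in is an assumption: the exact row object Syda passes is not shown here, but these functions only need something that supports `.get`.

```python
def calculate_tax(row, parent_dfs=None, **kwargs):
    """Tax amount from the row's subtotal and percentage tax rate (default 8.5%)."""
    subtotal = row.get("subtotal", 0)
    tax_rate = row.get("tax_rate", 8.5)
    return round(subtotal * (tax_rate / 100), 2)


def calculate_total(row, parent_dfs=None, **kwargs):
    """Final total: subtotal + tax - discount."""
    subtotal = row.get("subtotal", 0)
    tax_amount = row.get("tax_amount", 0)
    discount = row.get("discount_amount", 0)
    return round(subtotal + tax_amount - discount, 2)


# A plain dict stands in for the row (assumption, see the note above).
row = {"subtotal": 200.0, "tax_rate": 10.0, "discount_amount": 5.0}

# Order matters: the total reads the tax amount computed just before it.
row["tax_amount"] = calculate_tax(row)
row["total"] = calculate_total(row)

print(row["tax_amount"], row["total"])
```

Note that `calculate_total` depends on `tax_amount` already being present in the row, so derived fields need to be computed in dependency order.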
> 🎯 **Custom generators let you:**
> - **Calculate fields** based on other data (taxes, totals, discounts)
> - **Access related data** from other tables via `parent_dfs`
> - **Implement complex business rules** (pricing logic, inventory rules)
> - **Generate structured data** (arrays, nested objects, JSON)
## 🏗️ Works with Your Existing SQLAlchemy Models
Already using **SQLAlchemy**? Syda works directly with your existing models - no schema conversion needed!
<details>
<summary><strong>🏗️ Click to view SQLAlchemy example</strong></summary>
```python
from sqlalchemy import Column, Integer, String, Float, ForeignKey, Boolean
# declarative_base is importable from sqlalchemy.orm since SQLAlchemy 1.4
from sqlalchemy.orm import declarative_base, relationship
from syda import SyntheticDataGenerator, ModelConfig
from dotenv import load_dotenv

load_dotenv()

Base = declarative_base()


# Your existing SQLAlchemy models
class Customer(Base):
    __tablename__ = 'customers'

    id = Column(Integer, primary_key=True)
    name = Column(String(100), nullable=False, comment='Customer organization name')
    industry = Column(String(50), comment='Industry sector')
    annual_revenue = Column(Float, comment='Annual revenue in USD')
    status = Column(String(20), comment='Active, Inactive, or Prospect')

    # Relationships work perfectly
    contacts = relationship("Contact", back_populates="customer")


class Contact(Base):
    __tablename__ = 'contacts'

    id = Column(Integer, primary_key=True)
    customer_id = Column(Integer, ForeignKey('customers.id'), nullable=False)
    first_name = Column(String(50), nullable=False)
    last_name = Column(String(50), nullable=False)
    email = Column(String(100), nullable=False, unique=True)
    position = Column(String(100), comment='Job title')
    is_primary = Column(Boolean, comment='Primary contact for customer')

    customer = relationship("Customer", back_populates="contacts")


# Generate data directly from your models
config = ModelConfig(provider="anthropic", model_name="claude-3-5-haiku-20241022")
generator = SyntheticDataGenerator(model_config=config)

results = generator.generate_for_sqlalchemy_models(
    sqlalchemy_models=[Customer, Contact],
    sample_sizes={"Customer": 10, "Contact": 25},
    output_dir="crm_data"
)

print("✅ Generated CRM data with perfect foreign key relationships!")
```
</details>
**Output:**
```bash
📂 crm_data/
├── 📊 customers.csv    # 10 companies with realistic industry data
└── 📊 contacts.csv     # 25 contacts, all with valid customer_id references
```
> 🎯 **Zero Configuration**: Each column's `comment` becomes an AI generation hint, `ForeignKey` relationships are maintained automatically, and `nullable=False` constraints are respected!
## 🤝 Contributing
We would **love your contributions**! Syda is an open-source project that thrives on community involvement.
### **Ways to Contribute**
- **Report bugs** - Help us identify and fix issues
- **💡 Suggest features** - Share your ideas for new capabilities
- **Improve docs** - Help make our documentation even better
- **🔧 Submit code** - Fix bugs, add features, optimize performance
- **🧪 Add examples** - Show how Syda works in your domain
- **⭐ Star the repo** - Help others discover Syda
### **How to Get Started**
1. **Check our [Contributing Guide](CONTRIBUTING.md)** for detailed instructions
2. **Browse [open issues](https://github.com/syda-ai/syda/issues)** to find something to work on
3. **Join discussions** in our GitHub Issues and Discussions
4. **Fork the repo** and submit your first pull request!
### 🎯 **Good First Issues**
Looking for ways to contribute? Check out issues labeled:
- `good first issue` - Perfect for newcomers
- `help wanted` - We'd especially appreciate help here
- `documentation` - Help improve our docs
- `examples` - Add new use cases and examples

**Every contribution matters - from fixing typos to adding major features!**

---

**⭐ Star this repo** if Syda helps your workflow • **[Read the docs](https://python.syda.ai)** for detailed guides • **[Report issues](https://github.com/syda-ai/syda/issues)** to help us improve

**Package metadata:**
- Name: `syda`
- Requires Python: `>=3.8`
- Author: Rama Krishna Kumar Lingamgunta <lrkkumar2606@gmail.com>
- Keywords: synthetic data, AI, machine learning, data generation, testing, privacy, SQLAlchemy, OpenAI, Anthropic, Claude, Google, Gemini, GPT
- Source distribution: https://files.pythonhosted.org/packages/f5/70/3175776358ecc6e57b9e8cd576c83cdcef789cd259eaac0d0d311b6303eb/syda-0.0.2.tar.gz
\nFeaturing the powerful A17 Pro chip, this device delivers unprecedented \nperformance for both work and play. The titanium design combines \ndurability with elegance, while the advanced camera system captures \nprofessional-quality photos and videos.\n\nKEY FEATURES:\n\u2022 A17 Pro chip with 6-core GPU\n\u2022 Pro camera system with 3x optical zoom \n\u2022 Titanium design with Action Button\n\u2022 USB-C connectivity\n\u2022 All-day battery life\n\n\"Innovation that fits in your pocket\"\n\nAvailability: In Stock\n```\n\n> \ud83c\udfaf **Perfect Integration**: The PDF catalog contains **actual product names, SKUs, and prices** from the CSV data, plus **AI-generated marketing content** - zero inconsistencies!\n\n\n### 4. Need custom business logic? Add custom generators!\n\nFor advanced scenarios requiring **custom calculations** or **complex business rules**, you can add custom generator functions:\n\n<details>\n<summary><strong>\ud83d\udd27 Click to view custom generators example</strong></summary>\n\n```python\n# Define custom generator functions\ndef calculate_tax(row, parent_dfs=None, **kwargs):\n \"\"\"Calculate tax amount based on subtotal and tax rate\"\"\"\n subtotal = row.get('subtotal', 0)\n tax_rate = row.get('tax_rate', 8.5) # Default 8.5%\n return round(subtotal * (tax_rate / 100), 2)\n\ndef calculate_total(row, parent_dfs=None, **kwargs):\n \"\"\"Calculate final total: subtotal + tax - discount\"\"\"\n subtotal = row.get('subtotal', 0)\n tax_amount = row.get('tax_amount', 0)\n discount = row.get('discount_amount', 0)\n return round(subtotal + tax_amount - discount, 2)\n\ndef generate_receipt_items(row, parent_dfs=None, **kwargs):\n \"\"\"Generate receipt items based on actual transactions\"\"\"\n items = []\n \n if parent_dfs and 'Product' in parent_dfs and 'Transaction' in parent_dfs:\n products_df = parent_dfs['Product']\n transactions_df = parent_dfs['Transaction']\n \n # Get customer's transactions\n customer_id = row.get('customer_id')\n 
customer_transactions = transactions_df[\n transactions_df['customer_id'] == customer_id\n ]\n \n # Build receipt items from actual transaction data\n for _, tx in customer_transactions.iterrows():\n product = products_df[products_df['id'] == tx['product_id']].iloc[0]\n \n items.append({\n \"product_name\": product['name'],\n \"sku\": product['sku'],\n \"quantity\": int(tx['quantity']),\n \"unit_price\": float(product['price']),\n \"item_total\": round(tx['quantity'] * product['price'], 2)\n })\n \n return items\n\n# Add custom generators to your generation\ncustom_generators = {\n \"ProductCatalog\": {\n \"tax_amount\": calculate_tax,\n \"total\": calculate_total,\n \"items\": generate_receipt_items\n }\n}\n\n# Generate with custom business logic\nresults = generator.generate_for_schemas(\n schemas=schemas,\n templates=templates,\n sample_sizes={\"categories\": 5, \"products\": 20, \"product_catalogs\": 10},\n output_dir=\"output\",\n custom_generators=custom_generators, # \ud83c\udd95 Add this line\n prompts={\n \"Category\": \"Generate retail product categories with hierarchical structure.\",\n \"Product\": \"Generate retail products with names, SKUs, prices, and descriptions.\",\n \"ProductCatalog\": \"Generate compelling product catalog pages with marketing copy.\"\n }\n)\n\nprint(\"\u2705 Generated data with custom business logic!\")\n```\n\n</details>\n\n> \ud83c\udfaf **Custom generators let you:**\n> - **Calculate fields** based on other data (taxes, totals, discounts)\n> - **Access related data** from other tables via `parent_dfs`\n> - **Implement complex business rules** (pricing logic, inventory rules)\n> - **Generate structured data** (arrays, nested objects, JSON)\n\n## \ud83c\udfd7\ufe0f Works with Your Existing SQLAlchemy Models\n\nAlready using **SQLAlchemy**? 
Syda works directly with your existing models, so no schema conversion is needed!\n\n<details>\n<summary><strong>\ud83c\udfd7\ufe0f Click to view SQLAlchemy example</strong></summary>\n\n```python\nfrom sqlalchemy import Column, Integer, String, Float, ForeignKey, Boolean\n# declarative_base lives in sqlalchemy.orm since SQLAlchemy 1.4; the sqlalchemy.ext.declarative path is deprecated in 2.0\nfrom sqlalchemy.orm import declarative_base, relationship\nfrom syda import SyntheticDataGenerator, ModelConfig\nfrom dotenv import load_dotenv\n\nload_dotenv()\n\nBase = declarative_base()\n\n# Your existing SQLAlchemy models\nclass Customer(Base):\n __tablename__ = 'customers'\n \n id = Column(Integer, primary_key=True)\n name = Column(String(100), nullable=False, comment='Customer organization name')\n industry = Column(String(50), comment='Industry sector')\n annual_revenue = Column(Float, comment='Annual revenue in USD')\n status = Column(String(20), comment='Active, Inactive, or Prospect')\n \n # One customer has many contacts\n contacts = relationship(\"Contact\", back_populates=\"customer\")\n\nclass Contact(Base):\n __tablename__ = 'contacts'\n \n id = Column(Integer, primary_key=True)\n customer_id = Column(Integer, ForeignKey('customers.id'), nullable=False)\n first_name = Column(String(50), nullable=False)\n last_name = Column(String(50), nullable=False)\n email = Column(String(100), nullable=False, unique=True)\n position = Column(String(100), comment='Job title')\n is_primary = Column(Boolean, comment='Primary contact for customer')\n \n customer = relationship(\"Customer\", back_populates=\"contacts\")\n\n# Generate data directly from your models\nconfig = ModelConfig(provider=\"anthropic\", model_name=\"claude-3-5-haiku-20241022\")\ngenerator = SyntheticDataGenerator(model_config=config)\n\nresults = generator.generate_for_sqlalchemy_models(\n sqlalchemy_models=[Customer, Contact],\n sample_sizes={\"Customer\": 10, \"Contact\": 25},\n output_dir=\"crm_data\"\n)\n\nprint(\"\u2705 Generated CRM data with perfect foreign key 
relationships!\")\n```\n\n</details>\n\n**Output:**\n```bash\n\ud83d\udcc2 crm_data/\n\u251c\u2500\u2500 \ud83d\udcca customers.csv # 10 companies with realistic industry data\n\u2514\u2500\u2500 \ud83d\udcca contacts.csv # 25 contacts, all with valid customer_id references\n```\n\n> \ud83c\udfaf **Zero Configuration**: Your SQLAlchemy column `comment` values become AI generation hints, `ForeignKey` relationships are automatically maintained, and `nullable=False` constraints are respected!\n\n\n## \ud83e\udd1d Contributing\n\nWe would **love your contributions**! Syda is an open-source project that thrives on community involvement.\n\n### \ud83c\udf1f **Ways to Contribute**\n\n- **\ud83d\udc1b Report bugs** - Help us identify and fix issues\n- **\ud83d\udca1 Suggest features** - Share your ideas for new capabilities\n- **\ud83d\udcdd Improve docs** - Help make our documentation even better\n- **\ud83d\udd27 Submit code** - Fix bugs, add features, optimize performance\n- **\ud83e\uddea Add examples** - Show how Syda works in your domain\n- **\u2b50 Star the repo** - Help others discover Syda\n\n### \ud83d\udccb **How to Get Started**\n\n1. **Check our [Contributing Guide](CONTRIBUTING.md)** for detailed instructions\n2. **Browse [open issues](https://github.com/syda-ai/syda/issues)** to find something to work on\n3. **Join discussions** in our GitHub Issues and Discussions\n4. **Fork the repo** and submit your first pull request!\n\n### \ud83c\udfaf **Good First Issues**\n\nLooking for ways to contribute? 
Check out issues labeled:\n- `good first issue` - Perfect for newcomers\n- `help wanted` - We'd especially appreciate help here\n- `documentation` - Help improve our docs\n- `examples` - Add new use cases and examples\n\n**Every contribution matters - from fixing typos to adding major features!** \ud83d\ude4f\n\n---\n\n**\u2b50 Star this repo** if Syda helps your workflow \u2022 **\ud83d\udcd6 [Read the docs](https://python.syda.ai)** for detailed guides \u2022 **\ud83d\udc1b [Report issues](https://github.com/syda-ai/syda/issues)** to help us improve\n",
"bugtrack_url": null,
"license": "LGPL-3.0-or-later",
"summary": "A Python library for AI-powered synthetic data generation with referential integrity",
"version": "0.0.2",
"project_urls": {
"Changelog": "https://github.com/syda-ai/syda/blob/main/CHANGELOG.md",
"Documentation": "https://python.syda.ai",
"Homepage": "https://github.com/syda-ai/syda",
"Issues": "https://github.com/syda-ai/syda/issues",
"Repository": "https://github.com/syda-ai/syda.git"
},
"split_keywords": [
"synthetic data",
" ai",
" machine learning",
" data generation",
" testing",
" privacy",
" sqlalchemy",
" openai",
" anthropic",
" claude",
" google",
" gemini",
" gpt"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "ba7567552deb39642ecb5235f869c0eb8fc94d73493251399537bd62828506b6",
"md5": "0894bedec383f37480914000cb0f5e62",
"sha256": "d8c327ae604da846a5d4166580f551bf27003ff04b4fb83cfc925e6eddfe5a93"
},
"downloads": -1,
"filename": "syda-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0894bedec383f37480914000cb0f5e62",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 46664,
"upload_time": "2025-08-23T04:56:03",
"upload_time_iso_8601": "2025-08-23T04:56:03.009156Z",
"url": "https://files.pythonhosted.org/packages/ba/75/67552deb39642ecb5235f869c0eb8fc94d73493251399537bd62828506b6/syda-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "f5703175776358ecc6e57b9e8cd576c83cdcef789cd259eaac0d0d311b6303eb",
"md5": "10e38efa5a8217570739efc4bd8d8a34",
"sha256": "8d012638d06188ea42dd027f5621234fa1a55b3503c92d14452078f8adffb821"
},
"downloads": -1,
"filename": "syda-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "10e38efa5a8217570739efc4bd8d8a34",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 146422,
"upload_time": "2025-08-23T04:56:04",
"upload_time_iso_8601": "2025-08-23T04:56:04.281318Z",
"url": "https://files.pythonhosted.org/packages/f5/70/3175776358ecc6e57b9e8cd576c83cdcef789cd259eaac0d0d311b6303eb/syda-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-23 04:56:04",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "syda-ai",
"github_project": "syda",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pydantic",
"specs": [
[
">=",
"2.4.2"
]
]
},
{
"name": "python-dotenv",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "sqlalchemy",
"specs": [
[
">=",
"2.0.23"
]
]
},
{
"name": "pandas",
"specs": [
[
">=",
"2.0.3"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.24.3"
]
]
},
{
"name": "networkx",
"specs": [
[
">=",
"3.1"
]
]
},
{
"name": "jsonref",
"specs": [
[
">=",
"1.1.0"
]
]
},
{
"name": "openai",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "anthropic",
"specs": [
[
">=",
"0.7.0"
]
]
},
{
"name": "instructor",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "google-genai",
"specs": [
[
">=",
"1.30.0"
]
]
},
{
"name": "python-magic",
"specs": [
[
">=",
"0.4.27"
]
]
},
{
"name": "python-docx",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "openpyxl",
"specs": [
[
">=",
"3.1.2"
]
]
},
{
"name": "weasyprint",
"specs": [
[
">=",
"65.1"
]
]
},
{
"name": "pyyaml",
"specs": [
[
">=",
"6.0.1"
]
]
},
{
"name": "pytest",
"specs": [
[
">=",
"7.4.0"
]
]
},
{
"name": "boto3",
"specs": [
[
">=",
"1.28.0"
]
]
},
{
"name": "azure-storage-blob",
"specs": [
[
">=",
"12.19.0"
]
]
},
{
"name": "pdfplumber",
"specs": [
[
">=",
"0.10.3"
]
]
},
{
"name": "pillow",
"specs": [
[
">=",
"10.0.1"
]
]
},
{
"name": "pytesseract",
"specs": [
[
">=",
"0.3.10"
]
]
},
{
"name": "sqlalchemy-utils",
"specs": [
[
">=",
"0.41.1"
]
]
},
{
"name": "mkdocs-material",
"specs": [
[
">=",
"9.6.15"
]
]
},
{
"name": "mkdocs",
"specs": [
[
">=",
"1.6.1"
]
]
},
{
"name": "mkdocs-macros-plugin",
"specs": [
[
">=",
"1.3.7"
]
]
}
],
"lcname": "syda"
}