# Mockingbird 🐦
**Generate realistic mock data with relationships in seconds**
Mockingbird is a powerful CLI tool that generates realistic mock data with proper relationships and referential integrity. Perfect for testing, development, demos, and populating databases with meaningful data.
## ✨ Key Features
- **🎯 Realistic Data**: Generate names, emails, addresses, and more using the Faker library
- **🔗 Relational Integrity**: Create proper foreign key relationships between entities
- **📝 Simple Configuration**: Define your data structure in an intuitive YAML blueprint
- **🎲 Reproducible**: Use seeds to generate the same dataset consistently
- **📊 Multiple Formats**: Output to CSV, JSON, or Parquet
- **⚡ Fast Generation**: Efficiently create large datasets
- **🏗️ Complex Relationships**: Support for multi-level references and contextual data
## Project Home
[Project Home](https://mockingbird.smallapps.in/)
## 🚀 Quick Start
### Installation
```bash
pip install mockingbird-cli
```
### Create Your First Dataset
1. **Initialize a blueprint:**
```bash
mockingbird init
```
2. **Define your data structure** in `Blueprint.yaml`:
```yaml
Users:
  count: 100
  fields:
    user_id: {generator: sequence, config: {start_at: 1}}
    name: {generator: faker, config: {generator: name}}
    email: {generator: faker, config: {generator: email}}
    status: {generator: choice, config: {choices: ["active", "inactive"], weights: [0.8, 0.2]}}

Orders:
  count: 250
  fields:
    order_id: {generator: sequence, config: {start_at: 1000}}
    user_id: {generator: ref, config: {ref: Users.user_id}}
    order_date: {generator: faker, config: {generator: date_time_this_year}}
    amount: {generator: faker, config: {generator: pydecimal, left_digits: 3, right_digits: 2, positive: true}}
```
3. **Generate your data:**
```bash
mockingbird generate Blueprint.yaml
```
4. **Find your data** in the `output_data/` directory as CSV files!
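
Once the files exist, a quick sanity check is to confirm that every foreign key in a child table points at a real parent row. The following is a minimal sketch using only the standard library. The helper `find_orphans` is hypothetical (it is not part of Mockingbird), and the in-memory CSV text stands in for the generated files.

```python
import csv
import io


def find_orphans(parent_csv, child_csv, parent_key, child_key):
    """Return child key values that have no matching row in the parent table."""
    parent_ids = {row[parent_key] for row in csv.DictReader(parent_csv)}
    child_ids = {row[child_key] for row in csv.DictReader(child_csv)}
    return sorted(child_ids - parent_ids)


# Tiny in-memory stand-ins for the generated Users and Orders files.
users = io.StringIO("user_id,name\n1,Ada\n2,Grace\n")
orders = io.StringIO("order_id,user_id\n1000,1\n1001,2\n1002,1\n")
print(find_orphans(users, orders, "user_id", "user_id"))  # [] means no orphans
```

To run it against real output, pass open file handles instead of the `StringIO` objects. The exact file names inside `output_data/` are an assumption here, so check the directory first.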
## 🎯 Use Cases
- **🧪 Testing**: Create realistic test datasets for your applications
- **🔧 Development**: Populate development databases with meaningful data
- **📊 Demos**: Generate impressive demo data for presentations
- **⚡ Performance Testing**: Create large datasets to test system performance
- **🎓 Learning**: Practice with realistic data for tutorials and courses
## 🛠️ Generators
Mockingbird provides powerful generators for different data types:
| Generator | Purpose | Example |
|-----------|---------|---------|
| `sequence` | Auto-incrementing numbers | User IDs, Order numbers |
| `faker` | Realistic fake data | Names, emails, addresses |
| `choice` | Random selection from options | Status, categories, types |
| `ref` | Reference other entities | Foreign keys, relationships |
| `timestamp` | Random dates/times | Creation dates, events |
| `expr` | Custom expressions | Calculated fields, conditions |
| `enum` | Cycle through values | Round-robin assignments |
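
To make the table concrete, here is a rough Python analogy for what `sequence`, `enum`, and `choice` do. It is a conceptual sketch built on the standard library, not Mockingbird's implementation. The function names simply mirror the generator names, and the default `start_at=1` is an assumption.

```python
import itertools
import random


def sequence(start_at=1):
    """Auto-incrementing integers, analogous to the `sequence` generator."""
    return itertools.count(start_at)


def enum(values):
    """Cycle through values in order, analogous to the `enum` generator."""
    return itertools.cycle(values)


def choice(rng, choices, weights=None):
    """Weighted random pick, analogous to the `choice` generator."""
    return rng.choices(choices, weights=weights, k=1)[0]


rng = random.Random(42)  # a fixed seed makes the random picks repeatable
ids = sequence(start_at=1000)
tiers = enum(["gold", "silver", "bronze"])
rows = [
    {
        "order_id": next(ids),
        "tier": next(tiers),
        "status": choice(rng, ["active", "inactive"], [0.8, 0.2]),
    }
    for _ in range(4)
]
print([r["order_id"] for r in rows])  # [1000, 1001, 1002, 1003]
print([r["tier"] for r in rows])      # ['gold', 'silver', 'bronze', 'gold']
```

The difference between `choice` and `enum` is the point of the sketch: `choice` samples at random (optionally weighted), while `enum` walks the list in a fixed round-robin order.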
## 📖 Examples
### E-commerce Dataset
```yaml
Categories:
  count: 5
  fields:
    category_id: {generator: sequence, config: {start_at: 100}}
    name: {generator: choice, config: {choices: ["Electronics", "Books", "Clothing", "Home", "Sports"]}}

Products:
  count: 50
  fields:
    product_id: {generator: sequence, config: {start_at: 200}}
    name: {generator: faker, config: {generator: catch_phrase}}
    category_id: {generator: ref, config: {ref: Categories.category_id}}
    price: {generator: faker, config: {generator: pydecimal, left_digits: 3, right_digits: 2, positive: true}}

Customers:
  count: 25
  fields:
    customer_id: {generator: sequence, config: {start_at: 1000}}
    name: {generator: faker, config: {generator: name}}
    email: {generator: faker, config: {generator: email}}

Orders:
  count: 75
  fields:
    order_id: {generator: sequence, config: {start_at: 3000}}
    customer_id: {generator: ref, config: {ref: Customers.customer_id}}
    customer_name: {generator: ref, config: {use_record_from: customer_id, field_to_get: name}}
    order_date: {generator: faker, config: {generator: date_time_this_year}}

OrderItems:
  count: 200
  fields:
    item_id: {generator: sequence, config: {start_at: 4000}}
    order_id: {generator: ref, config: {ref: Orders.order_id}}
    product_id: {generator: ref, config: {ref: Products.product_id}}
    quantity: {generator: faker, config: {generator: random_int, min: 1, max: 4}}
    unit_price: {generator: ref, config: {use_record_from: product_id, field_to_get: price}}
```
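
Because these entities reference each other, parents have to exist before their children can point at them. One way to picture that ordering is a topological sort over the `ref` targets. The sketch below is only an illustration of the idea, not how Mockingbird resolves dependencies internally. The `blueprint` dict is a trimmed, hand-written stand-in for the YAML, and `entity_dependencies` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# A trimmed, hand-written stand-in for the parsed blueprint above.
blueprint = {
    "Categories": {"fields": {"category_id": {"generator": "sequence"}}},
    "Products": {"fields": {
        "category_id": {"generator": "ref", "config": {"ref": "Categories.category_id"}},
    }},
    "Customers": {"fields": {"customer_id": {"generator": "sequence"}}},
    "Orders": {"fields": {
        "customer_id": {"generator": "ref", "config": {"ref": "Customers.customer_id"}},
        "customer_name": {"generator": "ref",
                          "config": {"use_record_from": "customer_id", "field_to_get": "name"}},
    }},
    "OrderItems": {"fields": {
        "order_id": {"generator": "ref", "config": {"ref": "Orders.order_id"}},
        "product_id": {"generator": "ref", "config": {"ref": "Products.product_id"}},
    }},
}


def entity_dependencies(blueprint):
    """Map each entity to the set of entities its `ref` fields point at."""
    deps = {}
    for entity, spec in blueprint.items():
        deps[entity] = set()
        for field in spec["fields"].values():
            target = field.get("config", {}).get("ref")
            if field["generator"] == "ref" and target:
                deps[entity].add(target.split(".")[0])
    return deps


# static_order() yields every entity after all of the entities it depends on.
generation_order = list(TopologicalSorter(entity_dependencies(blueprint)).static_order())
print(generation_order)
```

Any valid order places `Categories` before `Products`, `Customers` before `Orders`, and both `Orders` and `Products` before `OrderItems`.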
### User Activity Tracking
```yaml
Users:
  count: 50
  fields:
    user_id: {generator: sequence}
    username: {generator: faker, config: {generator: user_name}}
    email: {generator: faker, config: {generator: email}}

Events:
  count: 500
  fields:
    event_id: {generator: sequence, config: {start_at: 10000}}
    user_id: {generator: ref, config: {ref: Users.user_id}}
    event_type: {generator: choice, config: {choices: ["login", "logout", "view_page", "purchase"]}}
    timestamp: {generator: timestamp, config: {start_date: "2024-01-01", end_date: "2024-12-31"}}
```
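
The `timestamp` field above draws a random moment between `start_date` and `end_date`. A plain-Python approximation of that idea might look like the sketch below. `random_timestamp` is a hypothetical helper, and treating `end_date` as midnight at the start of that day is an assumption, since Mockingbird's boundary handling is not described here.

```python
import random
from datetime import datetime, timedelta


def random_timestamp(rng, start_date, end_date):
    """Pick a uniformly random second between two ISO dates (both ends inclusive)."""
    start = datetime.fromisoformat(start_date)
    end = datetime.fromisoformat(end_date)
    span_seconds = int((end - start).total_seconds())
    return start + timedelta(seconds=rng.randint(0, span_seconds))


rng = random.Random(7)  # seeded so repeated runs give the same timestamps
stamps = [random_timestamp(rng, "2024-01-01", "2024-12-31") for _ in range(3)]
in_range = all(datetime(2024, 1, 1) <= t <= datetime(2024, 12, 31) for t in stamps)
print(in_range)  # True
```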
## 🎛️ Command Line Options
```bash
# Basic generation
mockingbird generate Blueprint.yaml
# Custom output directory and format
mockingbird generate Blueprint.yaml --output-dir ./data --format parquet
# Reproducible data with seed
mockingbird generate Blueprint.yaml --seed 42
# Different output formats
mockingbird generate Blueprint.yaml --format json
mockingbird generate Blueprint.yaml --format parquet
```
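
If you rely on `--seed` for reproducible fixtures, it can be worth confirming that two runs really match. One possible check is to generate into two directories with the same seed (for example with `--seed 42 --output-dir run_a` and `--seed 42 --output-dir run_b`) and compare a digest of each directory. The `dir_digest` helper below is hypothetical, and the sketch assumes seeded output is byte-identical, which may not hold for formats that embed metadata.

```python
import hashlib
import tempfile
from pathlib import Path


def dir_digest(directory):
    """Hash every file under `directory` (relative names and bytes) into one SHA-256 hex digest."""
    digest = hashlib.sha256()
    root = Path(directory)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


# Demonstration with two temporary directories standing in for two seeded runs.
with tempfile.TemporaryDirectory() as run_a, tempfile.TemporaryDirectory() as run_b:
    for run in (run_a, run_b):
        Path(run, "Users.csv").write_text("user_id,name\n1,Ada\n")
    same = dir_digest(run_a) == dir_digest(run_b)
    Path(run_b, "Users.csv").write_text("user_id,name\n1,Grace\n")
    differs = dir_digest(run_a) != dir_digest(run_b)

print(same, differs)  # True True
```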
## 📋 Requirements
- Python 3.11 or higher
- No additional dependencies required
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🔗 Links
- **Documentation**: [Full User Manual](https://mockingbird.smallapps.in/)
## 🎉 Why Mockingbird?
Unlike other mock data generators, Mockingbird focuses on **relationships and realism**:
- ✅ **Smart References**: Automatic dependency resolution ensures data integrity
- ✅ **Contextual Data**: Pull related fields from the same record for consistency
- ✅ **Realistic Distributions**: Use weights to create realistic data patterns
- ✅ **Scalable**: Generate thousands of related records efficiently
- ✅ **Flexible Output**: Choose the format that works for your workflow
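
The "contextual data" idea is easiest to see in miniature: pick one parent record, then copy several of its fields into the child row so they stay consistent. The sketch below imitates what `use_record_from` and `field_to_get` express in a blueprint. It is an illustration in plain Python with made-up sample records, not Mockingbird's code.

```python
import random

# Made-up parent records standing in for a generated Customers table.
customers = [
    {"customer_id": 1000, "name": "Ada Lovelace"},
    {"customer_id": 1001, "name": "Grace Hopper"},
]
rng = random.Random(0)

orders = []
for order_id in range(3000, 3005):
    customer = rng.choice(customers)  # like `ref: Customers.customer_id`
    orders.append({
        "order_id": order_id,
        "customer_id": customer["customer_id"],
        # like `use_record_from: customer_id, field_to_get: name`
        "customer_name": customer["name"],
    })

# Every order's name matches the customer its ID points at.
name_by_id = {c["customer_id"]: c["name"] for c in customers}
consistent = all(name_by_id[o["customer_id"]] == o["customer_name"] for o in orders)
print(consistent)  # True
```

Picking the ID and the name independently would break this invariant, which is why both values come from the same chosen record.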
---
**Ready to generate some amazing mock data?** 🚀
```bash
pip install mockingbird-cli
mockingbird init
mockingbird generate Blueprint.yaml
```
Raw data
{
"_id": null,
"home_page": null,
"name": "mockingbird-cli",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "Prasad Bhamidipati <prasadbhamidi@gmail.com>",
"keywords": "faker, mock-data, data-generator, cli, testing, fixtures, synthetic-data, relational-data, csv, json, parquet, relationships",
"author": null,
"author_email": "Prasad Bhamidipati <prasadbhamidi@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/64/e1/69721bc37ad4d200e9651b1c650b7684ddadcea8ce64f926475139c168df/mockingbird_cli-0.5.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A powerful CLI tool for generating realistic mock data with relationships and referential integrity",
"version": "0.5.0",
"project_urls": {
"Homepage": "https://mockingbird.smallapps.in"
},
"split_keywords": [
"faker",
" mock-data",
" data-generator",
" cli",
" testing",
" fixtures",
" synthetic-data",
" relational-data",
" csv",
" json",
" parquet",
" relationships"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a1d8cde28101052b9a3d9b1cb5a1c23ae93af24904d134fbfb2ad953b79aa2eb",
"md5": "4127c2275102ea16e9b5aad4f19362b1",
"sha256": "bca8ec98e518da2664bd8e2cac271075b16e7105a0f4e586cf54cfc48a213a09"
},
"downloads": -1,
"filename": "mockingbird_cli-0.5.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4127c2275102ea16e9b5aad4f19362b1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 35738,
"upload_time": "2025-08-05T11:48:22",
"upload_time_iso_8601": "2025-08-05T11:48:22.047294Z",
"url": "https://files.pythonhosted.org/packages/a1/d8/cde28101052b9a3d9b1cb5a1c23ae93af24904d134fbfb2ad953b79aa2eb/mockingbird_cli-0.5.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "64e169721bc37ad4d200e9651b1c650b7684ddadcea8ce64f926475139c168df",
"md5": "4556af7b256e6d23691dd003ef04824b",
"sha256": "455ff8823d0ae4e85530dd42b024bf134939ebfbef0ac2d1eb9bfd191d9d2387"
},
"downloads": -1,
"filename": "mockingbird_cli-0.5.0.tar.gz",
"has_sig": false,
"md5_digest": "4556af7b256e6d23691dd003ef04824b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 34759,
"upload_time": "2025-08-05T11:48:23",
"upload_time_iso_8601": "2025-08-05T11:48:23.468589Z",
"url": "https://files.pythonhosted.org/packages/64/e1/69721bc37ad4d200e9651b1c650b7684ddadcea8ce64f926475139c168df/mockingbird_cli-0.5.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-05 11:48:23",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "mockingbird-cli"
}