mockpipe


:Name: mockpipe
:Version: 0.0.2
:Summary: Dummy data generator focusing on customisability and maintained relationships for mocking data pipelines
:Upload time: 2025-01-02 03:19:36
:Author: BenskiBoy
:Requires Python: >=3.8
:Keywords: mocking, data, faker, testing, generator, pipeline, pipe
:Requirements: click, duckdb, Faker, faker-commerce, jsonlines, PyYAML

::

  ███╗   ███╗ ██████╗  ██████╗██╗  ██╗██████╗ ██╗██████╗ ███████╗
  ████╗ ████║██╔═══██╗██╔════╝██║ ██╔╝██╔══██╗██║██╔══██╗██╔════╝
  ██╔████╔██║██║   ██║██║     █████╔╝ ██████╔╝██║██████╔╝█████╗  
  ██║╚██╔╝██║██║   ██║██║     ██╔═██╗ ██╔═══╝ ██║██╔═══╝ ██╔══╝  
  ██║ ╚═╝ ██║╚██████╔╝╚██████╗██║  ██╗██║     ██║██║     ███████╗
  ╚═╝     ╚═╝ ╚═════╝  ╚═════╝╚═╝  ╚═╝╚═╝     ╚═╝╚═╝     ╚══════╝

|pypi| |build| |license|

-------------

MockPipe
-------------

There are plenty of sample databases out there and plenty of ways to generate dummy data (e.g. Faker, which this project uses), but I couldn't find much for dynamically generating realistic data that reproduces the scenarios one might actually see coming out of an operational system's CDC feed.
This is an attempt to create a utility/library that can be used to set up such scenarios.

A YAML config defines a set of sample tables, with dummy default values for newly generated rows and a set of actions that are performed with a given frequency.

The dummy values invoke the Faker library to generate somewhat realistic entries. Other value types can refer to existing values within the same table or in other tables, so that relationships are maintained.

Data is stored in a DuckDB database, so outputs persist between executions and remain available for any other analysis or queries you may want to run.


Features
-------------
- **Dynamic Data Generation**: Generate sample tables from a YAML configuration, using dummy default values for newly generated rows.
- **Faker Integration**: Leverage the Faker library to create realistic entries.
- **Relationship Maintenance**: Support for data types that refer to existing values within the same table or other tables, ensuring relationships are preserved.
- **Action Frequency**: Define a set of actions to be performed with a certain frequency.
- **Persistence**: Data is persisted in a DuckDB database, allowing outputs to be saved between executions and enabling further analysis or queries.

Installation
-------------

To install Mockpipe, you can use pip:

.. code:: bash

  pip install mockpipe

Basic Usage
-------------

.. code:: python

  import mockpipe

  # Define your YAML configuration
  yaml_config = """
  tables:
    - name: users
      columns:
        - name: id
          type: integer
          primary_key: true
        - name: name
          type: string
          faker: name
        - name: email
          type: string
          faker: email
  actions:
    - table: users
      action: insert
      frequency: 1.0
  """

  # Initialize Mockpipe with the configuration
  mp = mockpipe.Mockpipe(yaml_config)

Command line Usage
--------------------

.. code:: bash

  Usage: mockpipe [OPTIONS]

  Options:
    --config_create     generate a sample config file
    --config PATH       path to yaml config file
    --steps INTEGER     Number of steps to execute initially
    --run-time INTEGER  Time to run the mockpipe process in seconds
    --version           Show the version and exit.
    --help              Show this message and exit.
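
As a rough illustration of how the options above combine, the sketch below builds a ``mockpipe`` command line from Python. The flag names are taken from the help text; the ``build_command`` helper and the ``mockpipe.yml`` file name are hypothetical and not part of the package.

.. code:: python

  import shlex

  def build_command(config=None, steps=None, run_time=None, config_create=False):
      """Return the argv list for a mockpipe invocation (hypothetical helper)."""
      argv = ["mockpipe"]
      if config_create:
          argv.append("--config_create")  # generate a sample config file
      if config is not None:
          argv += ["--config", str(config)]
      if steps is not None:
          argv += ["--steps", str(int(steps))]
      if run_time is not None:
          argv += ["--run-time", str(int(run_time))]
      return argv

  command = build_command(config="mockpipe.yml", steps=100, run_time=60)
  print(shlex.join(command))

Once the package is installed, the resulting list can be passed to ``subprocess.run``.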

Config Specification
--------------------
**Top Level Keys**

+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+
| key                | value type | allowed values | default value | sample    | explanation                                                                                             |
+====================+============+================+===============+===========+=========================================================================================================+
| db_path            | path       | any            | mockpipe.db   | sample.db | path of duckdb db                                                                                       |
+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+
| delete_behaviour   | string     | [soft, hard]   | soft          | soft      | whether deleted records will be marked as deleted with 'D' or actually hard deleted in the persisted db |
+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+
| inter_action_delay | float      | 0.0 ->         | 0.5           | 0.1       | delay between each action                                                                               |
+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+
| output             | table      |                |               |           | output format                                                                                           |
+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+


**Output**

+--------+------------+----------------+---------------+---------+------------------------+
| key    | value type | allowed values | default value | sample  | explanation            |
+========+============+================+===============+=========+========================+
| format | string     | [json, csv]    | json          | json    | file format output     |
+--------+------------+----------------+---------------+---------+------------------------+
| path   | path       | any            | extract       | extract | folder path for output |
+--------+------------+----------------+---------------+---------+------------------------+
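
To make the defaults in the two tables above concrete, here is a minimal sketch that merges user settings over the documented defaults and checks the allowed values. The ``with_defaults`` function is an illustration only; how mockpipe itself loads and validates its config is not shown in this document.

.. code:: python

  # Defaults copied from the "Top Level Keys" and "Output" tables.
  DEFAULTS = {
      "db_path": "mockpipe.db",
      "delete_behaviour": "soft",
      "inter_action_delay": 0.5,
      "output": {"format": "json", "path": "extract"},
  }

  def with_defaults(config):
      """Merge user settings over the documented defaults and check allowed values."""
      merged = {**DEFAULTS, **{k: v for k, v in config.items() if k != "output"}}
      merged["output"] = {**DEFAULTS["output"], **config.get("output", {})}
      if merged["delete_behaviour"] not in ("soft", "hard"):
          raise ValueError("delete_behaviour must be 'soft' or 'hard'")
      if float(merged["inter_action_delay"]) < 0.0:
          raise ValueError("inter_action_delay must be >= 0.0")
      if merged["output"]["format"] not in ("json", "csv"):
          raise ValueError("output.format must be 'json' or 'csv'")
      return merged

  settings = with_defaults({"db_path": "sample.db", "output": {"format": "csv"}})
  print(settings)

Keys that are not supplied, such as ``delete_behaviour`` here, fall back to the values in the "default value" column.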

**Tables**

+---------+------------+----------------+---------------+-----------+---------------------------------------+
| key     | value type | allowed values | default value | sample    | explanation                           |
+=========+============+================+===============+===========+=======================================+
| name    | string     | any            | N/A           | employees | table name used. Also used for output |
+---------+------------+----------------+---------------+-----------+---------------------------------------+
| fields  | table      |                |               |           | List of fields in table               |
+---------+------------+----------------+---------------+-----------+---------------------------------------+
| actions | table      |                |               |           | List of actions within table          |
+---------+------------+----------------+---------------+-----------+---------------------------------------+

**Fields**

+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+
| key       | value type | allowed values                                 | default value | sample              | explanation                           | Note                    |
+===========+============+================================================+===============+=====================+=======================================+=========================+
| name      | string     | any                                            | N/A           | order_date          | field name used. Also used for output |                         |
+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+
| type      | string     | [string, int, float, boolean]                  | N/A           | string              | data type of the field                |                         |
+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+
| value     | string     | [increment, static(*), table_random(), fake.*] | N/A           | fake.date_between   | how values are generated for new rows | See 'Field Values'      |
+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+
| arguments | list       | any                                            | N/A           |- "-1y"              | Arguments to pass to faker functions  | See 'Field Values'      |
|           |            |                                                |               |- "today"            |                                       |                         |
+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+
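
The table and field keys above might be combined as in the following sketch, which expresses one table definition as a Python dict mirroring the YAML structure and checks each field's ``type`` against the allowed values. The ``orders`` table, the ``check_fields`` helper, and the assumption that ``name``, ``type`` and ``value`` are all required are illustrative; the argument list is written under the key ``arguments``, as in the Actions table.

.. code:: python

  ALLOWED_TYPES = {"string", "int", "float", "boolean"}

  # Hypothetical table definition, shaped like the YAML config.
  orders_table = {
      "name": "orders",
      "fields": [
          {"name": "order_id", "type": "int", "value": "increment"},
          {"name": "order_status", "type": "string", "value": "static('pending')"},
          {"name": "order_date", "type": "string", "value": "fake.date_between",
           "arguments": ["-1y", "today"]},
      ],
  }

  def check_fields(table):
      """Return a list of problems found in a table's field definitions."""
      problems = []
      for field in table.get("fields", []):
          missing = {"name", "type", "value"} - field.keys()
          if missing:
              problems.append(f"{field.get('name', '?')}: missing {sorted(missing)}")
          elif field["type"] not in ALLOWED_TYPES:
              problems.append(f"{field['name']}: unknown type {field['type']!r}")
      return problems

  print(check_fields(orders_table))  # an empty list means no problems were found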

**Actions**

+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| key                 | value type    | allowed values                                   | default value | sample                                                       | explanation                                                                                                      | Note                |
+=====================+===============+==================================================+===============+==============================================================+==================================================================================================================+=====================+
| name                | string        | any                                              | N/A           | update_order_status                                          | name of action                                                                                                   |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| field               | string        | any                                              | N/A           | order_status                                                 | field which gets updated                                                                                         |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| action              | string        | [create, delete, set]                            | N/A           | set                                                          | type of action to perform                                                                                        |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| value               | string        | [increment, static(*), table_random(), fake.*]   | N/A           | fake.random_element                                          | value to set field to                                                                                            |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| arguments           | list          | any                                              | N/A           | ('pending', 'completed', 'shipped', 'delivered')             | if using faker, arguments to pass                                                                                |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| frequency           | float         | 0->1                                             | N/A           | 0.25                                                         | relative frequency of action                                                                                     |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| where_condition     | string        | <table>.<value> == <condition>                   | N/A           | products.product_id == table_random(products, product_id, 0) | where condition to limit which rows in table to apply action to                                                  | See where condition |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| action_condition    | string        | EFFECT_ONLY                                      | N/A           | EFFECT_ONLY                                                  | used to specify if the action is only ever to be invoked by another action (i.e., an effect)                     |                     |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| effect              | string        | <table>.<action>(<target_col>=<source_col>, ...) | N/A           | product.product_count(order_id=order_id)                     | After the specified action is executed, another action can be invoked, passing values onwards to the next action | See Effect          |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| effect_count        | [int, string] | 0->max(int), inherit                             | N/A           | inherit                                                      | if effect is set, how many times to invoke the next effect                                                       | See Effect          |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
| effect_count_random | string        | <min>,<max>                                      | N/A           | 1,5                                                          | if effect is set, how many times to invoke the next effect                                                       | See Effect          |
+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+
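
One way to read "relative frequency" is as a weight in a random draw over the actions that may be invoked directly. The sketch below assumes that interpretation and also assumes that ``EFFECT_ONLY`` actions are excluded from the draw; the action names are hypothetical and mockpipe's actual scheduling may differ.

.. code:: python

  import random

  actions = [
      {"name": "create_order", "action": "create", "frequency": 0.5},
      {"name": "update_order_status", "action": "set", "field": "order_status",
       "value": "fake.random_element",
       "arguments": ["pending", "completed", "shipped", "delivered"],
       "frequency": 0.25},
      {"name": "delete_order", "action": "delete", "frequency": 0.25},
      # Only reachable as an effect of another action, so never drawn directly.
      {"name": "product_count", "action": "set", "action_condition": "EFFECT_ONLY"},
  ]

  def pick_action(actions, rng):
      """Pick one directly invocable action, weighted by relative frequency."""
      candidates = [a for a in actions if a.get("action_condition") != "EFFECT_ONLY"]
      weights = [a["frequency"] for a in candidates]
      return rng.choices(candidates, weights=weights, k=1)[0]

  rng = random.Random(42)
  counts = {}
  for _ in range(1000):
      name = pick_action(actions, rng)["name"]
      counts[name] = counts.get(name, 0) + 1
  print(counts)

Over many draws, ``create_order`` should be selected roughly twice as often as either of the other two actions.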


**Field Values**

+-------------+--------------------------------------------------------------------------------------+
| type        | increment                                                                            |
+=============+======================================================================================+
| explanation | Only works for integer fields. Increments the value by 1 for each new row generated. |
+-------------+--------------------------------------------------------------------------------------+
| syntax      | ``increment``                                                                        |
+-------------+--------------------------------------------------------------------------------------+
| examples    | ``increment``                                                                        |
+-------------+--------------------------------------------------------------------------------------+

+-------------+------------------------------------------------------------------------------------------------------------------------------------+
| type        | static                                                                                                                             |
+=============+====================================================================================================================================+
| explanation | Will set a static value on each new row generated. This can be any value you want, but it will be the same for each row generated. |
+-------------+------------------------------------------------------------------------------------------------------------------------------------+
| syntax      | ``static(<value>)``                                                                                                                |
+-------------+------------------------------------------------------------------------------------------------------------------------------------+
| examples    | ``static(false), static(100), static('pending')``                                                                                  |
+-------------+------------------------------------------------------------------------------------------------------------------------------------+


+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| type        | table_random                                                                                                                                                                               |
+=============+============================================================================================================================================================================================+
| explanation | Will select a random value from the specified table for each new row generated. Note, will only select non-deleted rows. It's important to set a default value in case the table is empty. |
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| syntax      | ``table_random(<table_name>, <column_name>, <default_value>)``                                                                                                                             |
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| examples    | ``table_random(products, product_id, 0)``                                                                                                                                                  |
+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


+-------------+-----------------------------------------------------------------------------------------------------------------------+
| type        | fake.*                                                                                                                |
+=============+=======================================================================================================================+
| explanation | Will generate a value using the faker library. The arguments key can be used to pass arguments to the faker function. |
+-------------+-----------------------------------------------------------------------------------------------------------------------+
| syntax      | ``fake.<faker_function>``                                                                                             |
+-------------+-----------------------------------------------------------------------------------------------------------------------+
| examples    | fake.company                                                                                                          |
+-------------+-----------------------------------------------------------------------------------------------------------------------+
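
The four value syntaxes above can be told apart with a small parser. This is a minimal sketch for illustration, not mockpipe's own parsing code; the ``parse_value`` function and its return shape are assumptions.

.. code:: python

  import re

  TABLE_RANDOM = re.compile(r"^table_random\(\s*(\w+)\s*,\s*(\w+)\s*,\s*(.+?)\s*\)$")
  STATIC = re.compile(r"^static\((.*)\)$")

  def parse_value(spec):
      """Classify a field value string into (kind, details)."""
      spec = spec.strip()
      if spec == "increment":
          return ("increment", {})
      match = STATIC.match(spec)
      if match:
          return ("static", {"value": match.group(1).strip()})
      match = TABLE_RANDOM.match(spec)
      if match:
          table, column, default = match.groups()
          return ("table_random", {"table": table, "column": column, "default": default})
      if spec.startswith("fake.") and spec[5:].isidentifier():
          return ("fake", {"function": spec[5:]})
      raise ValueError(f"unrecognised field value: {spec!r}")

  for example in ["increment", "static('pending')",
                  "table_random(products, product_id, 0)", "fake.company"]:
      print(parse_value(example))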


**Effects**

The ``effect`` key specifies another action to invoke after the current action is executed, passing values on to that next action.
This is useful for chaining actions together to create one-to-one or one-to-many relationships. You can also specify how many times to invoke the next action.

effect: 

+-------------+---------------------------------------------------------------------------------+
| explanation | Which action to invoke after the current action is executed.                    |
+-------------+---------------------------------------------------------------------------------+
| syntax      | ``<table>.<action>(<target_col>=<source_col>, <target_col>=<source_col>, ...)`` |
+-------------+---------------------------------------------------------------------------------+
| example     | ``effect: product.product_count(order_id=order_id)``                            |
+-------------+---------------------------------------------------------------------------------+


effect_count:

+-------------+-----------------------------------------------------------------------------------------------------------------+
| explanation | If the effect is set, how many times to invoke the next effect. Note, can not be used with effect_count_random. |
+-------------+-----------------------------------------------------------------------------------------------------------------+
| syntax      | ``<int>``                                                                                                       |
+-------------+-----------------------------------------------------------------------------------------------------------------+
| example     | ``1``                                                                                                           |
+-------------+-----------------------------------------------------------------------------------------------------------------+



effect_count_random:

+-------------+----------------------------------------------------------------------------------------------------------+
| explanation | If the effect is set, how many times to invoke the next effect. Note, can not be used with effect_count. |
+-------------+----------------------------------------------------------------------------------------------------------+
| syntax      | ``<min>,<max>``                                                                                          |
+-------------+----------------------------------------------------------------------------------------------------------+
| example     | ``1,5``                                                                                                  |
+-------------+----------------------------------------------------------------------------------------------------------+


action_condition:

Used to specify if the action is only ever to be invoked by another action (i.e., an effect).

+-------------+-----------------------------------------------------------------------------------------------+
| explanation | Used to specify if the action is only ever to be invoked by another action (i.e., an effect). |
+-------------+-----------------------------------------------------------------------------------------------+
| syntax      | ``EFFECT_ONLY``                                                                               |
+-------------+-----------------------------------------------------------------------------------------------+
| example     | ``EFFECT_ONLY``                                                                               |
+-------------+-----------------------------------------------------------------------------------------------+
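
The effect keys above can be sketched as follows: one function splits an ``effect`` string into its parts, and another works out how many times the effect would be invoked. Both functions are hypothetical helpers. The sketch assumes the random range is inclusive and that an effect runs once when neither count key is set, and it does not model ``inherit``.

.. code:: python

  import random
  import re

  EFFECT = re.compile(r"^(\w+)\.(\w+)\((.*)\)$")

  def parse_effect(effect):
      """Split an effect string into table, action and a target->source column map."""
      match = EFFECT.match(effect.strip())
      if not match:
          raise ValueError(f"invalid effect: {effect!r}")
      table, action, args = match.groups()
      mapping = {}
      for pair in filter(None, (p.strip() for p in args.split(","))):
          target, source = (part.strip() for part in pair.split("="))
          mapping[target] = source
      return table, action, mapping

  def effect_repeats(action, rng):
      """Number of times to invoke the effect for one run of the action."""
      fixed = action.get("effect_count")
      ranged = action.get("effect_count_random")
      if fixed is not None and ranged is not None:
          raise ValueError("effect_count and effect_count_random are mutually exclusive")
      if ranged is not None:
          low, high = (int(n) for n in str(ranged).split(","))
          return rng.randint(low, high)  # inclusive on both ends (an assumption)
      if fixed == "inherit":
          raise NotImplementedError("'inherit' is not modelled in this sketch")
      return int(fixed) if fixed is not None else 1  # default of 1 is an assumption

  action = {"effect": "product.product_count(order_id=order_id)",
            "effect_count_random": "1,5"}
  print(parse_effect(action["effect"]))
  print(effect_repeats(action, random.Random(0)))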

**Where Condition**

+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| explanation                   | The where condition is used to limit which rows in the table an action is applied to. It can be set to a filter, e.g. where status=='pending' or it can perform a lookup to another table to get the value to filter on. |
+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| syntax                        | ``<table>.<value> == / != / >= / <= / > / < <condition>``                                                                                                                                                                |
+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_random condition syntax | ``table_random(<table_name>, <column_name>, <default_value>)``                                                                                                                                                           |
+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| static syntax                 | ``static(<value>)``                                                                                                                                                                                                      |
+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_random example          | ``products.product_id == table_random(orders, product_id, 0)``                                                                                                                                                           |
+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| static example                | ``products.product_id == static(1)``                                                                                                                                                                                     |
+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
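
As a sketch of how the where-condition syntax above breaks down, the code below splits a condition into table, column, operator and right-hand side, then filters in-memory rows against an already resolved value. The helper names and the sample rows are hypothetical; resolving ``table_random(...)`` or ``static(...)`` to a concrete value is left to the caller here.

.. code:: python

  import operator
  import re

  OPERATORS = {"==": operator.eq, "!=": operator.ne, ">=": operator.ge,
               "<=": operator.le, ">": operator.gt, "<": operator.lt}
  # Two-character operators are listed first so ">=" is not read as ">".
  WHERE = re.compile(r"^(\w+)\.(\w+)\s*(==|!=|>=|<=|>|<)\s*(.+)$")

  def parse_where(condition):
      """Split a where condition into (table, column, operator, right-hand side)."""
      match = WHERE.match(condition.strip())
      if match is None:
          raise ValueError(f"invalid where condition: {condition!r}")
      table, column, op, rhs = match.groups()
      return table, column, op, rhs.strip()

  def filter_rows(rows, condition, resolved_value):
      """Keep the rows whose column compares true against an already resolved value."""
      _, column, op, _ = parse_where(condition)
      compare = OPERATORS[op]
      return [row for row in rows if compare(row[column], resolved_value)]

  products = [{"product_id": 1}, {"product_id": 2}, {"product_id": 3}]
  print(parse_where("products.product_id == table_random(orders, product_id, 0)"))
  print(filter_rows(products, "products.product_id >= static(2)", 2))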

Future Enhancements
--------------------
- improved yaml config validation
- improved logging
- increased test coverage
- simplify action usage and allow for DuckDB functions
- support additional data output formats (e.g. xml, parquet)
- create custom faker functions to allow for more complex data generation
- better typing support


Contributing
-------------

Contributions are welcome. Please open an issue or submit a pull request on GitHub.


License
-------------

This project is licensed under the MIT License. See the LICENSE file for details.


Acknowledgements
-----------------

- `Faker <https://github.com/joke2k/faker>`_ - For generating realistic dummy data.
- `DuckDB <https://duckdb.org/>`_ - For data persistence and analysis.


.. |pypi| image:: https://img.shields.io/pypi/v/mockpipe.svg?style=flat-square&label=version
    :target: https://pypi.org/project/mockpipe/
    :alt: Latest version released on PyPI

.. |build| image:: https://github.com/BenskiBoy/mockpipe/actions/workflows/build.yml/badge.svg
    :target: https://github.com/BenskiBoy/mockpipe/actions/workflows/build.yml
    :alt: Build status of the master branch

.. |license| image:: https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square
    :target: https://raw.githubusercontent.com/BenskiBoy/mockpipe/master/LICENSE
    :alt: Package license

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "mockpipe",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "mocking data faker testing generator pipeline pipe",
    "author": "BenskiBoy",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/5c/90/a4b8a57bedfe484b84905cb9a3937334b5f1d8d628dcbc40d524413b3939/mockpipe-0.0.2.tar.gz",
    "platform": null,
    "description": "::\n\n  \u2588\u2588\u2588\u2557   \u2588\u2588\u2588\u2557 \u2588\u2588\u2588\u2588\u2588\u2588\u2557  \u2588\u2588\u2588\u2588\u2588\u2588\u2557\u2588\u2588\u2557  \u2588\u2588\u2557\u2588\u2588\u2588\u2588\u2588\u2588\u2557 \u2588\u2588\u2557\u2588\u2588\u2588\u2588\u2588\u2588\u2557 \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2557\n  \u2588\u2588\u2588\u2588\u2557 \u2588\u2588\u2588\u2588\u2551\u2588\u2588\u2554\u2550\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2554\u2550\u2550\u2550\u2550\u255d\u2588\u2588\u2551 \u2588\u2588\u2554\u255d\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2551\u2588\u2588\u2554\u2550\u2550\u2588\u2588\u2557\u2588\u2588\u2554\u2550\u2550\u2550\u2550\u255d\n  \u2588\u2588\u2554\u2588\u2588\u2588\u2588\u2554\u2588\u2588\u2551\u2588\u2588\u2551   \u2588\u2588\u2551\u2588\u2588\u2551     \u2588\u2588\u2588\u2588\u2588\u2554\u255d \u2588\u2588\u2588\u2588\u2588\u2588\u2554\u255d\u2588\u2588\u2551\u2588\u2588\u2588\u2588\u2588\u2588\u2554\u255d\u2588\u2588\u2588\u2588\u2588\u2557  \n  \u2588\u2588\u2551\u255a\u2588\u2588\u2554\u255d\u2588\u2588\u2551\u2588\u2588\u2551   \u2588\u2588\u2551\u2588\u2588\u2551     \u2588\u2588\u2554\u2550\u2588\u2588\u2557 \u2588\u2588\u2554\u2550\u2550\u2550\u255d \u2588\u2588\u2551\u2588\u2588\u2554\u2550\u2550\u2550\u255d \u2588\u2588\u2554\u2550\u2550\u255d  \n  \u2588\u2588\u2551 \u255a\u2550\u255d \u2588\u2588\u2551\u255a\u2588\u2588\u2588\u2588\u2588\u2588\u2554\u255d\u255a\u2588\u2588\u2588\u2588\u2588\u2588\u2557\u2588\u2588\u2551  \u2588\u2588\u2557\u2588\u2588\u2551     \u2588\u2588\u2551\u2588\u2588\u2551     \u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2557\n  \u255a\u2550\u255d     \u255a\u2550\u255d \u255a\u2550\u2550\u2550\u2550\u2550\u255d  \u255a\u2550\u2550\u2550\u2550\u2550\u255d\u255a\u2550\u255d  \u255a\u2550\u255d\u255a\u2550\u255d     \u255a\u2550\u255d\u255a\u2550\u255d     \u255a\u2550\u2550\u2550\u2550\u2550\u2550\u255d\n\n|pypi| |build| 
|license|\n\n-------------\n\nMockPipe\n-------------\n\nThere's a lot of sample databases out there and lots of ways to generate some dummy data (i.e. faker, which this project uses), but i couldn't find much in the way of dynamically generating realistic data that could be used to generate some scenarios that one might actually find coming out of a operational systems CDC feed.\nThis is an attampt to create a utility/library that can be used to setup some .\n\nFrom a yaml config a set of sample tables can be defined, using dummy default values for any newly generated rows along with a set of actions that can be performed with a certain frequency.\n\nThe dummy values actually invoke the Faker library to generate somewhat realistic entries, along with support for other data types that may refer to existing values within the table or other tables so that relationships can be maintained.\n\nData is persisted onto a duckdb database so the outputs can be persisted between executions and support any other analysis/queries you may want to do.\n\n\nFeatures\n-------------\n- **Dynamic Data Generation**: Generate sample tables from a YAML configuration, using dummy default values for newly generated rows.\n- **Faker Integration**: Leverage the Faker library to create realistic entries.\n- **Relationship Maintenance**: Support for data types that refer to existing values within the same table or other tables, ensuring relationships are preserved.\n- **Action Frequency**: Define a set of actions to be performed with a certain frequency.\n- **Persistence**: Data is persisted in a DuckDB database, allowing outputs to be saved between executions and enabling further analysis or queries.\n\nInstallation\n-------------\n\nTo install Mockpipe, you can use pip:\n\n.. code:: bash\n\n  pip install mockpipe\n\nBasic Usage\n-------------\n\n.. 
code:: python\n\n  import mockpipe\n\n  # Define your YAML configuration\n  yaml_config = \"\"\"\n  tables:\n    - name: users\n      columns:\n        - name: id\n          type: integer\n          primary_key: true\n        - name: name\n          type: string\n          faker: name\n        - name: email\n          type: string\n          faker: email\n  actions:\n    - table: users\n      action: insert\n      frequency: 1.0\n\n  # Initialize Mockpipe with the configuration\n  mp = mockpipe.Mockpipe(yaml_config)\n\nCommand line Usage\n--------------------\n\n.. code:: bash\n\n  Usage: mockpipe [OPTIONS]\n\n  Options:\n    --config_create     generate a sample config file\n    --config PATH       path to yaml config file\n    --steps INTEGER     Number of steps to execute initially\n    --run-time INTEGER  Time to run the mockpipe process in seconds\n    --version           Show the version and exit.\n    --help              Show this message and exit.\n\nConfig Specification\n--------------------\n**Top Level Keys**\n\n+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+\n| key                | value type | allowed values | default value | sample    | explanation                                                                                             |\n+====================+============+================+===============+===========+=========================================================================================================+\n| db_path            | path       | any            | mockpipe.db   | sample.db | path of duckdb db                                                                                       |\n+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+\n| delete_behaviour   
| string     | [soft, hard]   | soft          | soft      | whether deleted records will be marked as deleted with 'D' or actually hard deleted in the persisted db |\n+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+\n| inter_action_delay | float      | 0.0 ->         | 0.5           | 0.1       | delay between each action                                                                               |\n+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+\n| output             | table      |                |               |           | output format                                                                                           |\n+--------------------+------------+----------------+---------------+-----------+---------------------------------------------------------------------------------------------------------+\n\n\n**Output**\n\n+--------+------------+----------------+---------------+---------+------------------------+\n| key    | value type | allowed values | default value | sample  | explanation            |\n+========+============+================+===============+=========+========================+\n| format | string     | [json, csv]    | json          | json    | file format output     |\n+--------+------------+----------------+---------------+---------+------------------------+\n| path   | path       | any            | extract       | extract | folder path for output |\n+--------+------------+----------------+---------------+---------+------------------------+\n\n**Tables**\n\n+---------+------------+----------------+---------------+-----------+---------------------------------------+\n| key     | value type | allowed values | default value | sample    | explanation                
           |\n+=========+============+================+===============+===========+=======================================+\n| name    | string     | any            | N/A           | employees | table name used. Also used for output |\n+---------+------------+----------------+---------------+-----------+---------------------------------------+\n| fields  | table      |                |               |           | List of fields in table               |\n+---------+------------+----------------+---------------+-----------+---------------------------------------+\n| actions | table      |                |               |           | List of actions within table          |\n+---------+------------+----------------+---------------+-----------+---------------------------------------+\n\n**Fields**\n\n+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+\n| key       | value type | allowed values                                 | default value | sample              | explanation                           | Note                    |\n+===========+============+================================================+===============+=====================+=======================================+=========================+\n| name      | string     | any                                            | N/A           | order_date          | table name used. 
Also used for output |                         |\n+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+\n| type      | string     | [string, int, float, boolean]                  | N/A           | string              | List of fields in table               |                         |\n+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+\n| value     | string     | [increment, static(*), table_random(), fake.*] | N/A           | fake.date_between   | List of actions within table          | See 'Field Value Usage' |\n+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+\n| arugments | list       | any                                            | N/A           |- \"-1y\"              | Arguments to pass to faker functions  | See 'Field Value Usage' |\n|           |            |                                                |               |- \"today\"            |                                       |                         |\n+-----------+------------+------------------------------------------------+---------------+---------------------+---------------------------------------+-------------------------+\n\n**Actions**\n\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| key                 | value type    | allowed values                                   | default value | sample                                                       | 
explanation                                                                                                      | Note                |\n+=====================+===============+==================================================+===============+==============================================================+==================================================================================================================+=====================+\n| name                | string        | any                                              | N/A           | update_order_status                                          | name of action                                                                                                   |                     |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| field               | string        | any                                              | N/A           | order_status                                                 | field which gets updated                                                                                         |                     |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| action              | string        | [create, delete, set]                            | N/A           | set                                                          | type of action to perform                                                                                        |                     
|\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| value               | string        | [increment, static(*), table_random(), fake.*]   | N/A           | fake.random_element                                          | value to set field to                                                                                            |                     |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| arguments           | list          | any                                              | N/A           | ('pending', 'completed', 'shipped', 'delivered')             | if using faker, arguments to pass                                                                                |                     |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| frequency           | float         | 0->1                                             | N/A           | 0.25                                                         | relative frequency of action                                                                                     |                     
|\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| where_condition     | string        | <table>.<value> == <condition>                   | N/A           | products.product_id == table_random(products, product_id, 0) | where condition to limit which rows in table to apply action to                                                  | See where condition |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| action_condition    | string        | EFFECT_ONLY                                      | N/A           | EFFECT_ONLY                                                  | used to specify if the action is only ever to be invoked by another action (i.e., an effect)                     |                     |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| effect              | string        | <table>.<action>(<target_col>=<source_col>, ...) 
| N/A           | product.product_count(order_id=order_id)                     | After the specified action is executed, another action can be invoked, passing values onwards to the next action | See Effect          |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| effect_count        | [int, string] | 0->max(int), inherit                             | N/A           | inherit                                                      | if effect is set, how many times to invoke the next effect                                                       | See Effect          |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n| effect_count_random | string        | <min>,<max>                                      | N/A           | 1,5                                                          | if effect is set, how many times to invoke the next effect                                                       | See Effect          |\n+---------------------+---------------+--------------------------------------------------+---------------+--------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------+---------------------+\n\n\n**Field Values**\n\n+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| type        | 
increment                                                                                                                                                                             |\n+=============+=======================================================================================================================================================================================+\n| explanation | Will only wok for integer fields. It acts as you'd expect, incrementing the value by 1 for each new row generated and selecting a random value from the specified table respectively. |\n+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| syntax      | ``increment``                                                                                                                                                                         |\n+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| examples    | ``increment``                                                                                                                                                                         |\n+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n\n+-------------+------------------------------------------------------------------------------------------------------------------------------------+\n| type        | static                                                                                                                             
|\n+=============+====================================================================================================================================+\n| explanation | Will set a static value on each new row generated. This can be any value you want, but it will be the same for each row generated. |\n+-------------+------------------------------------------------------------------------------------------------------------------------------------+\n| syntax      | ``static(<value>)``                                                                                                                |\n+-------------+------------------------------------------------------------------------------------------------------------------------------------+\n| examples    | ``static(false), static(100), static('pending')``                                                                                  |\n+-------------+------------------------------------------------------------------------------------------------------------------------------------+\n\n\n+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| type        | table_random                                                                                                                                                                               |\n+=============+============================================================================================================================================================================================+\n| explanation | Will select a random value from the specified table for each new row generated. Note, will only select non-deleted rows. It's important to set a default value in case the table is empty. 
|\n+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| syntax      | ``table_random(<table_name>, <column_name>, <default_value>)``                                                                                                                             |\n+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| examples    | ``table_random(products, product_id, 0)``                                                                                                                                                  |\n+-------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n\n\n+-------------+-----------------------------------------------------------------------------------------------------------------------+\n| type        | fake.*                                                                                                                |\n+=============+=======================================================================================================================+\n| explanation | Will generate a value using the faker library. The arguments key can be used to pass arguments to the faker function. 
|\n+-------------+-----------------------------------------------------------------------------------------------------------------------+\n| syntax      | ``fake.<faker_function>``                                                                                             |\n+-------------+-----------------------------------------------------------------------------------------------------------------------+\n| examples    | fake.company                                                                                                          |\n+-------------+-----------------------------------------------------------------------------------------------------------------------+\n\n\n**Effects**\n\nThe effect is used to specify that after the specified action is executed, another action can be invoked, passing values onwards to the next action.\nThis can be useful for chaining actions together to create one to one, one to many relationships, you can also specify how many times to invoke the next \n\neffect: \n\n+-------------+--------------------------------------------------------------------------------+\n| explanation | Which action to invoke after the current action is executed.                   |\n+-------------+--------------------------------------------------------------------------------+\n| syntax      | ``<table>.<action>(<target_col>=<source_col>, <target_col=<source_col>, ...)`` |\n+-------------+--------------------------------------------------------------------------------+\n| example     | ``effect: product.product_count(order_id=order_id)``                           |\n+-------------+--------------------------------------------------------------------------------+\n\n\neffect_count:\n\n+-------------+-----------------------------------------------------------------------------------------------------------------+\n| explanation | If the effect is set, how many times to invoke the next effect. Note, can not be used with effect_count_random. 
|\n+-------------+-----------------------------------------------------------------------------------------------------------------+\n| syntax      | ``<int>``                                                                                                       |\n+-------------+-----------------------------------------------------------------------------------------------------------------+\n| example     | ``1``                                                                                                           |\n+-------------+-----------------------------------------------------------------------------------------------------------------+\n\n\n\neffect_count_random:\n\n+-------------+----------------------------------------------------------------------------------------------------------+\n| explanation | If the effect is set, how many times to invoke the next effect. Note, can not be used with effect_count. |\n+-------------+----------------------------------------------------------------------------------------------------------+\n| syntax      | ``<min>,<max>``                                                                                          |\n+-------------+----------------------------------------------------------------------------------------------------------+\n| example     | ``1,5``                                                                                                  |\n+-------------+----------------------------------------------------------------------------------------------------------+\n\n\naction_condition:\n\nUsed to specify if the action is only ever to be invoked by another action (i.e., an effect).\n\n+-------------+-----------------------------------------------------------------------------------------------+\n| explanation | Used to specify if the action is only ever to be invoked by another action (i.e., an effect). 
|\n+-------------+-----------------------------------------------------------------------------------------------+\n| syntax      | ``EFFECT_ONLY``                                                                               |\n+-------------+-----------------------------------------------------------------------------------------------+\n| example     | ``EFFECT_ONLY``                                                                               |\n+-------------+-----------------------------------------------------------------------------------------------+\n\n**Where Condition**\n\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| explanation                   | The where condition is used to limit which rows in the table an action is applied to. It can be set to a filter, i.e. where status=='pending' or it can perform a lookup to another table to get the value to filter on. 
|\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| syntax                        | ``<table>.<value> == / != / >= / <= / > / < <condition>``                                                                                                                                                                |\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| table_random condition syntax | ``table_random(<table_name>, <column_name>, <default_value>)``                                                                                                                                                           |\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| static syntax                 | ``static(<value>)``                                                                                                                                                                                                      |\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| table_random example          | ``products.product_id == table_random(orders, product_id, 0)``                                                                                                                           
                                |\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n| static example                | ``products.product_id == static(1)``                                                                                                                                                                                     |\n+-------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+\n\nFuture Enhancements\n--------------------\n- improved yaml config validation\n- improved logging\n- increased test coverage\n- simplyfy action usage and allow for duckdb functions\n- support additional data output formats (e.g. xml, parquet)\n- create custom faker functions to allow for more complex data generation\n- better typing support\n\n\nContributing\n-------------\n\nContributions are welcome, Please open an issue or submit a pull request on GitHub.\n\n\nLicense\n-------------\n\nThis project is licensed under the MIT License. See the LICENSE file for details.\n\n\nAcknowledgements\n-----------------\n\n- [Faker](https://github.com/joke2k/faker) - For generating realistic dummy data.\n- [DuckDB](https://duckdb.org/) - For data persistence and analysis.\n\n\n.. |pypi| image:: https://img.shields.io/pypi/v/mockpipe.svg?style=flat-square&label=version\n    :target: https://pypi.org/project/mockpipe/\n    :alt: Latest version released on PyPI\n\n.. |build| image:: https://github.com/BenskiBoy/mockpipe/actions/workflows/build.yml/badge.svg\n    :target: https://github.com/BenskiBoy/mockpipe/actions/workflows/build.yml\n    :alt: Build status of the master branch\n\n.. 
|license| image:: https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square\n    :target: https://raw.githubusercontent.com/BenskiBoy/mockpipe/master/LICENSE\n    :alt: Package license\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Dummy data generator focusing on customisability and maintained relationships for mocking data pipelines",
    "version": "0.0.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/BenskiBoy/mockpipe/issues",
        "Changes": "https://github.com/BenskiBoy/mockpipe/blob/master/CHANGELOG.md",
        "Source": "https://github.com/BenskiBoy/mockpipe"
    },
    "split_keywords": [
        "mocking",
        "data",
        "faker",
        "testing",
        "generator",
        "pipeline",
        "pipe"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4a144cf257ee4895bf034d523b11626a9430e2fc893ec1fe1cc37443456834cb",
                "md5": "72bd60dee58a5cbf30b512d32ee74092",
                "sha256": "7445e514dff36bf01cae8f6fdb2770c58a1212d715273903595ad02bcf6ea31f"
            },
            "downloads": -1,
            "filename": "mockpipe-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "72bd60dee58a5cbf30b512d32ee74092",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 24651,
            "upload_time": "2025-01-02T03:19:33",
            "upload_time_iso_8601": "2025-01-02T03:19:33.726718Z",
            "url": "https://files.pythonhosted.org/packages/4a/14/4cf257ee4895bf034d523b11626a9430e2fc893ec1fe1cc37443456834cb/mockpipe-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5c90a4b8a57bedfe484b84905cb9a3937334b5f1d8d628dcbc40d524413b3939",
                "md5": "b56ad8729a8d229c8493cf60ebdf9a8d",
                "sha256": "2529528432bc9db862bd4ec0095664f9c95bd51f823cd2242e0cb14beb279b78"
            },
            "downloads": -1,
            "filename": "mockpipe-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "b56ad8729a8d229c8493cf60ebdf9a8d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 33153,
            "upload_time": "2025-01-02T03:19:36",
            "upload_time_iso_8601": "2025-01-02T03:19:36.241304Z",
            "url": "https://files.pythonhosted.org/packages/5c/90/a4b8a57bedfe484b84905cb9a3937334b5f1d8d628dcbc40d524413b3939/mockpipe-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-02 03:19:36",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "BenskiBoy",
    "github_project": "mockpipe",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "click",
            "specs": [
                [
                    "==",
                    "8.1.7"
                ]
            ]
        },
        {
            "name": "duckdb",
            "specs": [
                [
                    "==",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "Faker",
            "specs": [
                [
                    "==",
                    "26.0.0"
                ]
            ]
        },
        {
            "name": "faker-commerce",
            "specs": [
                [
                    "==",
                    "1.0.4"
                ]
            ]
        },
        {
            "name": "jsonlines",
            "specs": [
                [
                    "==",
                    "4.0.0"
                ]
            ]
        },
        {
            "name": "PyYAML",
            "specs": [
                [
                    "==",
                    "6.0.1"
                ]
            ]
        }
    ],
    "lcname": "mockpipe"
}
        