fusion-utils

Name: fusion-utils
Version: 1.5.5
Home page: https://github.com/dylandoyle11/fusion_utils
Summary: A utility package for Fusion 2.0
Upload time: 2024-11-13 14:11:53
Author: Dylan D
Maintainer: None
Docs URL: None
Requires Python: None
License: None
Steps to convert a SQL pipeline to use the library:
1) Generate the MAIN and QA tasks.
2) Create the handler.
3) Ensure any table names can be converted, i.e. add them to the QA table mapping (`LKP_QA_TABLE_MAPPING`).
4) Add a pipeline-specific Slack channel if one doesn't exist, and update the Slack channel map.
5) Add the QA context variable.

TODO:
- How to import the module
- How to write QA tasks
- Details about task failures, etc.
- Aliases and how to access them
- External tasks

# Pipeline Class Documentation

## Overview

The `Pipeline` class is a comprehensive and versatile tool designed to manage and automate complex data processing workflows. It is ideal for environments where the integrity, accuracy, and sequential execution of tasks are critical. The primary purpose of the `Pipeline` is to facilitate the orchestration of a series of tasks, ensuring that each task is executed in the correct order and that dependencies between tasks are properly managed.

### Key Features:

1. **Task Management**: The `Pipeline` class allows you to define a sequence of tasks, including both regular and QA (Quality Assurance) tasks. Each task can be defined with specific queries, conditions for QA checks, and execution stages.

2. **Stage-Based Execution**: Tasks are organized into stages, with each stage representing a phase in the data processing workflow. The pipeline ensures that tasks within a stage are executed before moving on to the next stage, maintaining the logical flow of data processing.

3. **QA Integration**: The `Pipeline` integrates QA tasks that validate data at different stages of the pipeline. These QA tasks are designed to ensure that the data meets specified conditions before proceeding, helping to maintain data quality throughout the workflow.

4. **Logging and Monitoring**: The `Pipeline` class includes robust logging capabilities through the `FusionLogger` class. Logs can be directed to the console, files, or Slack channels, providing real-time monitoring and post-execution analysis of the pipeline's performance.

5. **Error Handling and Notifications**: The pipeline is equipped with detailed error handling mechanisms that capture and log errors, preventing the pipeline from proceeding in case of critical failures. Notifications can be sent via email, summarizing the execution status of the pipeline.

### Use Cases:

- **Data Aggregation and Transformation**: Automate the process of aggregating and transforming large datasets, preparing them for analysis or reporting. The pipeline ensures that each transformation step is completed successfully before moving to the next.

- **Data Validation**: Use QA tasks to validate the data at various stages, ensuring that it meets predefined conditions before being used in further processing or analysis.

- **Automated Reporting**: Automate the generation of reports by defining tasks that extract, transform, and load (ETL) data, followed by QA checks to ensure the accuracy of the final report.

- **Data Migration**: Facilitate the migration of data from one environment to another, using stages to manage the extraction, transformation, and loading of data in a controlled manner.

### Sample Usage
```python
from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task

# CREATE PIPELINE INSTANCE
# ctx is the run context supplied by the environment; ctx['QA'] carries the QA flag
pipeline = Pipeline('PL_Aggregate_ISR_DMA', QA_flag=ctx['QA'])
pipeline.set_email_recipients('dylan.doyle@jdpa.com')

# CONSTRUCT PIPELINE
task1 = Task(name='TRANSCOUNTS', query_definition='SELECT COUNT(*) FROM table')

task2 = Task(name='ALL_FK', query_definition='SELECT ALL_FK FROM table')

pipeline.add_task(task1)
pipeline.add_task(task2)

pipeline.execute_all()
```



## Implementation Steps

### 1. Generate the MAIN and QA Tasks

To start using the pipeline library, you need to define the MAIN and QA tasks that the pipeline will execute. MAIN tasks typically involve data processing, transformation, and loading, while QA tasks validate the outcomes of MAIN tasks to ensure data quality.

**Steps:**
- Define your tasks as `Task` objects with their corresponding SQL queries.
- Use the `add_task` method to include these tasks in the pipeline.
- Ensure QA tasks are properly flagged with `is_qa=True` and include conditions for validation.
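
A minimal sketch of a MAIN task paired with a QA task (the pipeline name, table names, and QA condition are placeholders; the condition string format follows the example shown later in this document):

```python
from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task

pipeline = Pipeline('PL_Example', QA_flag='false')  # hypothetical pipeline name

# MAIN task: processes and stages the data in stage 1
main_task = Task(
    name='STG_BRIDGE',
    query_definition='SELECT * FROM source_table',  # placeholder query
    stage=1,
)

# QA task: validates the MAIN task's output before later stages run
qa_task = Task(
    name='QA_STG_BRIDGE',
    query_definition='SELECT COUNT(*) AS count FROM staging_table',  # placeholder query
    is_qa=True,
    condition="df['count'] > 0",  # illustrative condition
    stage=2,
)

pipeline.add_task(main_task)
pipeline.add_task(qa_task)
```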

### 2. Create Handler

A handler is required to manage the execution of tasks within the pipeline. The handler is responsible for controlling the order of execution, managing dependencies, and handling any errors that occur.

**Steps:**
- Use the `Pipeline.execute_all()` method to execute all tasks sequentially by stages.
- The handler will manage task execution, logging, and error handling as defined in your pipeline class.
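
The library does not prescribe a handler shape; one reasonable pattern, sketched below, is a single entry-point function that builds the pipeline, registers the tasks, and calls `execute_all()` (names and queries are placeholders):

```python
from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task


def run_pipeline(ctx):
    """Handler: build the pipeline, register tasks, and execute them stage by stage."""
    pipeline = Pipeline('PL_Example', QA_flag=ctx['QA'])  # hypothetical pipeline name
    pipeline.set_email_recipients('team@example.com')     # placeholder recipient

    pipeline.add_task(Task(name='MAIN_STEP', query_definition='SELECT 1', stage=1))
    pipeline.add_task(Task(
        name='QA_MAIN_STEP',
        query_definition='SELECT 1 AS count',
        is_qa=True,
        condition="df['count'] == 1",  # illustrative condition
        stage=2,
    ))

    # execute_all() drives execution order, logging, and error handling
    pipeline.execute_all()
```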

### 3. Ensure Table Names Can Be Converted

To ensure that table names used in your SQL queries can be translated correctly for both QA and PROD environments, add them to the QA Table Mapping.

**Steps:**
- Add any new table aliases and their corresponding dataset names to the QA Table Mapping (`LKP_QA_TABLE_MAPPING`) in your BigQuery project.
- Use the `Pipeline.translate_tables()` method to automatically convert table names in your queries based on the environment.
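
Conceptually, the translation is a lookup from a table alias to an environment-specific dataset name. The sketch below only illustrates that idea with an in-memory dictionary and the `{ALIAS}` placeholder style used in the Best Practices example; the real mapping lives in `LKP_QA_TABLE_MAPPING` and is applied by `Pipeline.translate_tables()`.

```python
# Illustrative stand-in for the lookup performed via LKP_QA_TABLE_MAPPING;
# the mapping structure and placeholder syntax here are assumptions.
QA_TABLE_MAPPING = {
    'STG_BRIDGE': {'QA': 'qa_dataset.stg_bridge', 'PROD': 'prod_dataset.stg_bridge'},
}


def translate(query_definition: str, env: str) -> str:
    """Replace {ALIAS} placeholders with the dataset name for the given environment."""
    return query_definition.format(
        **{alias: targets[env] for alias, targets in QA_TABLE_MAPPING.items()}
    )


print(translate('SELECT * FROM {STG_BRIDGE}', env='QA'))
# -> SELECT * FROM qa_dataset.stg_bridge
```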

### 4. Add Specific Channel to Slack

If your pipeline needs to log to a specific Slack channel that doesn't already exist, you'll need to update the Slack channel map.

**Steps:**
- Check if the specific channel for your pipeline exists in the Slack channel mapping (`LKP_LOG_CHANNELS`).
- If not, add the new channel with the appropriate configurations for either QA or PROD notifications.
- Update the `Pipeline._get_log_channel()` method to retrieve the correct channel based on your pipeline name and QA flag.
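
As a rough illustration only (the actual mapping lives in `LKP_LOG_CHANNELS` and is read by `Pipeline._get_log_channel()`), the lookup can be thought of as keyed on the pipeline name and the QA flag:

```python
# Hypothetical sketch of the channel lookup; channel IDs are placeholders.
LOG_CHANNELS = {
    ('PL_Aggregate_ISR_DMA', True): 'C0000000QA',   # QA notifications
    ('PL_Aggregate_ISR_DMA', False): 'C0000000PR',  # PROD notifications
}


def get_log_channel(pipeline_name: str, qa_flag: bool) -> str:
    return LOG_CHANNELS[(pipeline_name, qa_flag)]
```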

### 5. Add QA Context Variable

The QA context variable is crucial for differentiating between QA and PROD runs, allowing the pipeline to switch contexts and perform QA-specific tasks.

**Steps:**
- Pass the QA flag as a context variable when initializing your pipeline instance (e.g., `Pipeline(name='Pipeline_Name', QA_flag=ctx['QA'])`).
- Ensure the QA flag is set correctly, either by passing 'true'/'false' strings or by setting the flag directly within your code.
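
A sketch of wiring the QA flag through, assuming the surrounding platform supplies a `ctx` dictionary whose `'QA'` entry is a `'true'`/`'false'` string:

```python
from fusion_utils.pipeline import Pipeline

# Illustrative stand-in for the context supplied by the platform (e.g. an AIC step)
ctx = {'QA': 'true'}

pipeline = Pipeline(name='Pipeline_Name', QA_flag=ctx['QA'])
```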

## Additional Sections

### How to Import the Module

To use the `Pipeline` class and its associated components (`Task`, `FusionLogger`, etc.), ensure you import them correctly into your script or notebook.

**Example:**
```python
from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task
```

## Best Practices
Task properties can be defined as dictionaries or arrays, then passed to a helper function that constructs the tasks and adds them to the pipeline. This way, Task details can be generated externally in previous steps and cleanly passed to the Pipeline object without overcrowding the Pipeline creation step.

```python
from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task

# CREATE PIPELINE INSTANCE
pipeline = Pipeline('PL_Aggregate_ISR_DMA', QA_flag=ctx['QA'])
pipeline.set_email_recipients('dylan.doyle@jdpa.com')

# CONSTRUCT PIPELINE
# Retrieve task definitions produced by a previous step
# (pyspark_utils is assumed to come from the runtime environment; it is not part of fusion_utils)
tasks = pyspark_utils.get_pandas_from_task_alias('queries').to_dict(orient='records')
qa_tasks = pyspark_utils.get_pandas_from_task_alias('qa_queries').to_dict(orient='records')

for task in tasks:
    pipeline.add_task(Task(**task))

for qa_task in qa_tasks:
    pipeline.add_task(Task(**qa_task))

pipeline.execute_all()
```

In this example, Task definitions are created in a previous AIC step, and the resulting DataFrame is then accessed to create the Task objects.

The creation of the Task Dictionaries can look something like this:

```python
import pandas as pd

# Query definitions for each task
STG_BRIDGE = '''select * from table'''
WRITE_BRIDGE = '''select * from {STG_BRIDGE} where condition = False'''

data = [
    {'name': 'STG_BRIDGE', 'query_definition': STG_BRIDGE, 'stage': 1},
    {'name': 'WRITE_BRIDGE', 'query_definition': WRITE_BRIDGE, 'stage': 2},
]

# The AIC step returns this DataFrame so later steps can ingest it
return pd.DataFrame(data)
```

This produces a DataFrame of task records that can be ingested by the Pipeline class using a helper function.
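
The helper function itself is not part of the package; a minimal version, with a hypothetical name and signature, could look like this:

```python
import pandas as pd

from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task


def add_tasks_from_dataframe(pipeline: Pipeline, df: pd.DataFrame) -> None:
    """Hypothetical helper: turn each DataFrame record into a Task and register it."""
    for record in df.to_dict(orient='records'):
        pipeline.add_task(Task(**record))
```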

## Pipeline Class

### Initialization

```python
Pipeline(name: str, QA_flag: Optional[str] = None)
```

**Parameters:**
- `name` (str): The name of the pipeline.
- `QA_flag` (Optional[str]): Flag to enable QA mode. If `None`, defaults to `True`.

### Methods

#### `set_email_recipients(recipients: Union[str, List[str]])`

Sets the email recipients for notifications.

**Parameters:**
- `recipients` (Union[str, List[str]]): A string or a list of email addresses.

**Usage Example:**
```python
pipeline.set_email_recipients('dylan.doyle@jdpa.com')
```

#### `add_task(task: Task)`

Adds a task to the pipeline.

**Parameters:**
- `task` (Task): The task to add.

**Usage Example:**
```python
task = Task(name='TASK_1', query_definition='SELECT * FROM dataset.table', table_alias='task_table')
pipeline.add_task(task)
```

#### `add_external_task(df: pd.DataFrame, temp_table_name: str)`

Adds an external task to the pipeline by loading a DataFrame into BigQuery.

**Parameters:**
- `df` (pd.DataFrame): DataFrame to load.
- `temp_table_name` (str): The name of the temporary table to create.

**Usage Example:**
```python
query_df = pd.DataFrame({'column': [1, 2, 3]})
pipeline.add_external_task(query_df, 'query_df')
```

#### `execute_all()`

Executes all tasks added to the pipeline in their respective stages.

**Usage Example:**
```python
pipeline.execute_all()
```

## Task Class

The `Task` class represents a unit of work within the pipeline. It contains the query definition, execution details, and any conditions that must be met for QA tasks.

### Task Class Initialization

```python
task = Task(name, query_definition, table_alias=None, query=None, is_qa=False, optional=False, stage=None, condition=None, include_html=False)
```

**Parameters:**
- `name` (str): The name of the task.
- `query_definition` (str): The SQL query definition for the task.
- `table_alias` (Optional[str]): Alias for the table.
- `query` (Optional[str]): Translated query.
- `is_qa` (bool): Flag indicating if this is a QA task.
- `optional` (bool): If `True`, the task can fail without halting the pipeline.
- `stage` (Optional[int]): The stage in which the task should be executed.
- `condition` (Optional[str]): A string representing a lambda function used for QA condition checking.
- `include_html` (bool): Whether to include an HTML representation of the dataframe in the logs.
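
**Usage Example** (the query, alias, and stage values are placeholders):

```python
from fusion_utils.task import Task

# Optional task: it can fail without halting the pipeline
task = Task(
    name='WRITE_BRIDGE',
    query_definition='SELECT * FROM {STG_BRIDGE} WHERE condition = False',  # placeholder query
    table_alias='write_bridge',  # alias for the task's table
    optional=True,
    stage=2,
)
```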

### Methods

#### `define_query(query_definition: str)`

Sets the query definition for the task.

**Parameters:**
- `query_definition` (str): The SQL query definition for the task.

#### `define_table_alias(table_alias: str)`

Sets the table alias for the task.

**Parameters:**
- `table_alias` (str): Alias for the table.

#### `define_optional(optional: bool)`

Sets whether the task is optional.

**Parameters:**
- `optional` (bool): If `True`, the task is optional.

## FusionLogger Class

The `FusionLogger` class is responsible for logging pipeline execution details. It can log to the console, a file, and a memory stream, and can also send log messages to Slack.

### Initialization

```python
logger = FusionLogger(slack_bot_token=None, slack_channel=None)
```

**Parameters:**
- `slack_bot_token` (Optional[str]): Token for Slack bot.
- `slack_channel` (Optional[str]): Slack channel ID for logging.

### Methods

#### `log(message: str, level: str = 'info')`

Logs a message at the specified logging level.

**Parameters:**
- `message` (str): The message to log.
- `level` (str): The logging level (`'info'`, `'warning'`, `'error'`, `'critical'`).

**Usage Example:**
```python
logger.log('This is an info message')
```

#### `get_log_contents() -> str`

Returns the contents of the log stored in the memory stream.

**Returns:**
- `str`: The contents of the log.

#### `attach_to_email(email_message: MIMEMultipart)`

Attaches the log contents as a text file to an email.

**Parameters:**
- `email_message` (MIMEMultipart): The email message object.
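
**Usage Example** (the `FusionLogger` import path is an assumption; only `log` and `attach_to_email` come from the class documented above):

```python
from email.mime.multipart import MIMEMultipart

from fusion_utils.logger import FusionLogger  # import path is an assumption

logger = FusionLogger()
logger.log('Pipeline finished', level='info')

# Build the notification email and attach the in-memory log as a text file
msg = MIMEMultipart()
msg['Subject'] = 'Pipeline execution log'
logger.attach_to_email(msg)
```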

## Example Usage

```python
task = Task(name='TRANSCOUNTS', query_definition='SELECT COUNT(*) FROM table', is_qa=True, condition="df['count'] == 100")
pipeline.add_task(task)
pipeline.execute_all()
```

## Additional Information

This documentation covers the essential methods and usage patterns of the `Pipeline`, `Task`, and `FusionLogger` classes. By following the examples provided, users can create, manage, and execute complex data pipelines with integrated QA checks and comprehensive logging.

**Note:** This is a high-level overview. For more advanced usage and configurations, please refer to the source code or additional detailed documentation.


#### External Tasks:
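
External data can be handed to the pipeline with `add_external_task`, documented above. A minimal sketch, assuming the loaded temporary table is then referenced from a downstream query (the exact reference syntax is an assumption):

```python
import pandas as pd

from fusion_utils.pipeline import Pipeline
from fusion_utils.task import Task

pipeline = Pipeline('PL_Example', QA_flag='false')  # hypothetical pipeline name

# Load an externally produced DataFrame into BigQuery as a temporary table
lookup_df = pd.DataFrame({'dma_code': [501, 803], 'region': ['NY', 'LA']})
pipeline.add_external_task(lookup_df, 'lookup_df')

# Downstream task consuming the external data; the {lookup_df} reference
# below is an assumption about how the temporary table is addressed
pipeline.add_task(Task(
    name='USE_LOOKUP',
    query_definition='SELECT * FROM {lookup_df}',
    stage=1,
))

pipeline.execute_all()
```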




            
