## Canonada
> ⚠️ Canonada is currently under development.

Canonada is a data science framework that helps you build production-ready streaming pipelines for data processing in Python.

[![GitHub branch check runs](https://img.shields.io/github/check-runs/rlado/canonada/master)](https://github.com/RLado/Canonada)
[![PyPI - Version](https://img.shields.io/pypi/v/canonada)](https://pypi.org/project/canonada/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/canonada)](https://pypi.org/project/canonada/)
## Why Canonada?
- **Standardized**: Provides a standardized way to structure your data projects
- **Modular**: Lets you build and visualize data pipelines with ease
- **Memory efficient**: Handles large datasets by streaming data through the pipeline instead of loading it all at once
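The streaming idea behind the last point can be illustrated in plain Python, independently of Canonada's own API (which is described in the project wiki). The sketch below chains generators so that only one record is held in memory at a time. The function names `read_records`, `clean`, and `total_amount`, and the `amount` column, are hypothetical and not part of Canonada.

```python
import csv
import io
from collections.abc import Iterable, Iterator


def read_records(handle: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one CSV row at a time instead of loading the whole file."""
    yield from csv.DictReader(handle)


def clean(rows: Iterable[dict[str, str]]) -> Iterator[dict[str, float]]:
    """Convert the 'amount' column to float, skipping rows where it is empty."""
    for row in rows:
        if row["amount"]:
            yield {"amount": float(row["amount"])}


def total_amount(rows: Iterable[dict[str, float]]) -> float:
    """Consume the stream while keeping only a running sum in memory."""
    return sum(row["amount"] for row in rows)


# An in-memory file stands in for a large dataset on disk (hypothetical data).
sample = io.StringIO("id,amount\n1,10.5\n2,\n3,4.5\n")
result = total_amount(clean(read_records(sample)))
print(result)
```

Because each stage is a generator, no stage ever materializes the full dataset; replacing `sample` with an open file handle would process a file of any size with the same memory footprint.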
## Features
- **Centralized control of data sources**: Manage all your data sources in one place, enabling you to keep your team in sync
- **Centralized control of the project configuration**: Manage all your project configurations in one place
- **Easy data loading**: Load data from sources such as CSV, JSON, and Parquet
- **Use functions as nodes**: Functions are the building blocks of Canonada. You can use any function as a node in your pipeline
- **Create streaming data pipelines**: Create parallel and sequential data pipelines with ease
- **Visualize your data pipeline**: Visualize your data pipelines, nodes and connections
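To make the "functions as nodes" idea concrete, here is a minimal plain-Python sketch of a sequential pipeline built from ordinary functions. It does not use Canonada's API; `Node`, `add_area`, `label_size`, and `run_sequential` are hypothetical names chosen for illustration, and Canonada's real node and pipeline classes may look different.

```python
from collections.abc import Callable, Iterable, Iterator
from functools import reduce

# A node is assumed here to be any function that maps a record to a record.
Node = Callable[[dict], dict]


def add_area(record: dict) -> dict:
    """Node: derive the area from width and height."""
    return {**record, "area": record["width"] * record["height"]}


def label_size(record: dict) -> dict:
    """Node: tag the record as 'large' or 'small' based on its area."""
    return {**record, "size": "large" if record["area"] >= 10 else "small"}


def run_sequential(nodes: list[Node], records: Iterable[dict]) -> Iterator[dict]:
    """Lazily pass each record through every node, in order."""
    for record in records:
        yield reduce(lambda acc, node: node(acc), nodes, record)


records = [{"width": 2, "height": 3}, {"width": 5, "height": 4}]
output = list(run_sequential([add_area, label_size], records))
print(output)
```

Keeping nodes as plain functions means each one can be unit-tested in isolation, which fits the `tests/` directory in the project layout.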
## Project Structure
```
canonada.toml
config/
    catalog.toml
    parameters.toml
    credentials.toml
data/
    ...
datahandlers/
    __init__.py
    custom_datahandler_1.py
    custom_datahandler_2.py
    ...
notebooks/
    ...
pipelines/
    __init__.py
    pipeline_1.py
    pipeline_2.py
    nodes_1/
        __init__.py
        node_1.py
        node_2.py
        ...
    nodes_2/
        __init__.py
        node_3.py
        node_4.py
        ...
    ...
systems/
    __init__.py
    system_1.py
    system_2.py
    ...
tests/
    test_node_group_1.py
    test_node_group_2.py
    ...
```
## Usage
Available commands:
```
Usage: canonada <command> <args>
Commands:
    new <project_name>                  - Create a new project
    catalog [list/params]               - List all available datasets or get the project parameters
    registry [pipelines/systems]        - List all available pipelines or systems
    run [pipelines/systems] <name(s)>   - Run a pipeline or system
    view [pipelines/systems] <name(s)>  - View a pipeline or system
    version                             - Print the version of Canonada
```
## Installation
Canonada is available on [PyPI](https://pypi.org/project/canonada/) and can be installed using pip:
```bash
pip install canonada
```
> Check out the [Getting Started](https://github.com/RLado/Canonada/wiki/GettingStarted) guide to learn how to create a new project with Canonada.
## Documentation
Check out the [project documentation](https://github.com/RLado/Canonada/wiki).
## Contributing
Contributions are welcome! If you have any suggestions, examples, datahandlers, bug reports, or feature requests, please open an issue or a discussion thread.
Raw data
{
"_id": null,
"home_page": null,
"name": "canonada",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "data science, streaming, pipeline, dataflow, canonada",
"author": null,
"author_email": "Ricard Lado <ricard@lado.one>",
"download_url": "https://files.pythonhosted.org/packages/3f/1e/90b893a09a3f27c5bf4708f8ea2adcb43f660f8cc329111e991aa665c082/canonada-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Canonada is a data science framework that helps you build production-ready streaming pipelines for data processing in Python.",
"version": "0.1.2",
"project_urls": {
"Homepage": "https://github.com/RLado/Canonada",
"Issues": "https://github.com/RLado/Canonada/issues"
},
"split_keywords": [
"data science",
" streaming",
" pipeline",
" dataflow",
" canonada"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f7015eb709361186e343f7318c11a0fa32420f3bdfd8b2f9da1b10d682b46fa8",
"md5": "6fc0458d91df56277561ecdee6691586",
"sha256": "2da0cdc6362fbe2abb58f2dde3cf3563a713f1a883c4c09ff6d030be7fc7253d"
},
"downloads": -1,
"filename": "canonada-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6fc0458d91df56277561ecdee6691586",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 18293,
"upload_time": "2024-12-16T12:22:05",
"upload_time_iso_8601": "2024-12-16T12:22:05.638906Z",
"url": "https://files.pythonhosted.org/packages/f7/01/5eb709361186e343f7318c11a0fa32420f3bdfd8b2f9da1b10d682b46fa8/canonada-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3f1e90b893a09a3f27c5bf4708f8ea2adcb43f660f8cc329111e991aa665c082",
"md5": "a21556d364994ab5f47977e50408fc43",
"sha256": "f7df51b38d6d430fc39a7d6fbdd53c51409e52ccfbf48c9f3937e701ef8f0d98"
},
"downloads": -1,
"filename": "canonada-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "a21556d364994ab5f47977e50408fc43",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 18074,
"upload_time": "2024-12-16T12:22:08",
"upload_time_iso_8601": "2024-12-16T12:22:08.045531Z",
"url": "https://files.pythonhosted.org/packages/3f/1e/90b893a09a3f27c5bf4708f8ea2adcb43f660f8cc329111e991aa665c082/canonada-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-16 12:22:08",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "RLado",
"github_project": "Canonada",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "build",
"specs": [
[
">=",
"1.2.2"
]
]
},
{
"name": "coverage",
"specs": [
[
">=",
"7.0.0"
]
]
},
{
"name": "setuptools",
"specs": [
[
">=",
"61.0"
]
]
}
],
"lcname": "canonada"
}