
_By [DataOmni Solutions](https://dataomnisolutions.com)_

OpenETL is a robust and scalable ETL (Extract, Transform, Load) application built with modern technologies like
**FastAPI**, **Next.js**, and **Apache Spark**. The application offers an intuitive, user-friendly interface that
simplifies the ETL process, letting users extract data from various sources, apply transformations,
and load it into their chosen target destinations.
### Key Features of OpenETL:
- **Backend**: Powered by **Python 3.12** and **FastAPI**, ensuring fast and efficient data processing and API
interactions.
- **Frontend**: Built with **Next.js**, providing a smooth and interactive user experience.
- **Compute Engine**: **Apache Spark** is integrated for distributed data processing, enabling scalable and
high-performance operations.
- **Task Execution**: Utilizes **Celery** to handle background task processing, ensuring reliable execution of
long-running operations.
- **Scheduling**: **APScheduler** is used to manage and schedule ETL jobs, allowing for automated workflows.
## Features
- **ETL with Full Load**: Easily extract data from different sources and load it into your preferred target location.
- **Scheduled Timing**: Schedule your ETL tasks to run at specific intervals, ensuring your data is always up-to-date.
- **User Interface**: A clean and user-friendly UI to monitor and control your ETL processes with ease.
- **Logging**: Comprehensive logging to track every action, error, and data transformation throughout the ETL pipeline.
- **Integration History**: Keep track of all your integration jobs with detailed records of past runs, including
statuses and errors.
- **Batch Processing**: Handle large volumes of data by processing it in batches for better efficiency.
- **Distributed Spark Computing**: Utilize Spark for distributed computing, allowing you to process large datasets
efficiently across multiple nodes.
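
The batch processing idea above can be illustrated with a short, generic sketch. This is not OpenETL's internal implementation; the function name `iter_batches` and the `batch_size` parameter are assumptions made for illustration only. It shows how a large stream of rows can be consumed in fixed-size chunks so that only one chunk is held in memory at a time.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most `batch_size` rows (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    # islice pulls up to batch_size items; an empty list means the source is exhausted.
    while batch := list(islice(iterator, batch_size)):
        yield batch


if __name__ == "__main__":
    for batch in iter_batches(range(7), batch_size=3):
        print(batch)  # [0, 1, 2] then [3, 4, 5] then [6]
```

Because the helper is a generator, the source can be any iterable, including a database cursor or a file reader, and the final batch may be shorter than `batch_size`.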
## Benchmark
Check the detailed performance benchmark of OpenETL [here](https://cdn.dataomnisolutions.com/main/app/benchmark.html).
---
## Getting Started
To get started with OpenETL, follow these steps:
## Environment Variables
OpenETL relies on a `.env` file for configuration. Ensure the following variables are defined in your local `.env` file,
and **update them** according to your environment:
### 1. Generate and set encryption key (do this once):
```python
from cryptography.fernet import Fernet
key = Fernet.generate_key()
print(key.decode()) # Save this in your .env file as DB_ENCRYPTION_KEY
```
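
Before starting the application, it can help to confirm that the value saved as `DB_ENCRYPTION_KEY` has the shape of a Fernet key, which is 32 bytes encoded as URL-safe base64. The sketch below uses only the standard library; the helper name `is_valid_fernet_key` is hypothetical and is not part of OpenETL or of the `cryptography` package.

```python
import base64
import binascii


def is_valid_fernet_key(key: str) -> bool:
    """Return True if `key` is URL-safe base64 that decodes to exactly 32 bytes."""
    try:
        raw = base64.urlsafe_b64decode(key.encode("ascii"))
    except (binascii.Error, ValueError):
        # Bad padding or non-ASCII input: not a usable key.
        return False
    return len(raw) == 32


if __name__ == "__main__":
    # An all-zero key is used here only to demonstrate the format; never use it in practice.
    demo_key = base64.urlsafe_b64encode(bytes(32)).decode()
    print(is_valid_fernet_key(demo_key))
    print(is_valid_fernet_key(""))
```

This only checks the key's format. It does not prove that the key matches data that was already encrypted with a different key.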
```bash
OPENETL_DOCUMENT_HOST=postgres
OPENETL_DOCUMENT_DB=openetl_db
OPENETL_DOCUMENT_SCHEMA=public
OPENETL_DOCUMENT_USER=openetl
OPENETL_DOCUMENT_PASS=openetl123
OPENETL_DOCUMENT_PORT=5432
OPENETL_DOCUMENT_ENGINE=PostgreSQL
OPENETL_HOME=/app
CELERY_BROKER_URL=redis://redis:6379/0
SPARK_MASTER=spark://spark-master:7077
SPARK_DRIVER_HOST=openetl-celery-worker-1
DB_ENCRYPTION_KEY=
```
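
An empty `DB_ENCRYPTION_KEY` or a missing variable is easy to overlook. The following sketch checks a `.env` file for the variables listed above. It is a simplified parser written for illustration (no quoting or `export` handling) and is not how OpenETL itself loads its configuration; `parse_env` and `missing_vars` are hypothetical names.

```python
from pathlib import Path

# Variable names taken from the .env listing above.
REQUIRED_VARS = (
    "OPENETL_DOCUMENT_HOST", "OPENETL_DOCUMENT_DB", "OPENETL_DOCUMENT_SCHEMA",
    "OPENETL_DOCUMENT_USER", "OPENETL_DOCUMENT_PASS", "OPENETL_DOCUMENT_PORT",
    "OPENETL_DOCUMENT_ENGINE", "OPENETL_HOME", "CELERY_BROKER_URL",
    "SPARK_MASTER", "SPARK_DRIVER_HOST", "DB_ENCRYPTION_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple NAME=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        values[name.strip()] = value.strip()
    return values


def missing_vars(values: dict[str, str]) -> list[str]:
    """Return required names that are absent or set to an empty string."""
    return [name for name in REQUIRED_VARS if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        missing = missing_vars(parse_env(env_path.read_text()))
        print("Missing or empty:", ", ".join(missing) if missing else "none")
    else:
        print("No .env file found in the current directory")
```

Run against the listing above unchanged, the check would report `DB_ENCRYPTION_KEY`, since it is left empty until a key is generated.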
### Using Docker
1. Ensure that you have Docker installed on your local machine.
2. Clone this repository to your local environment.
3. Open a terminal or command prompt and navigate to the project directory.
4. Build the `backend` image by running the following command:
   ```sh
   docker compose up --build -d backend
   ```
5. Launch the Docker containers:
   ```sh
   docker compose up --build -d
   ```
6. Open your web browser and visit `http://localhost:3001` to access the OpenETL application.

Once the containers are running, the API documentation is available
at [http://localhost:5009/docs](http://localhost:5009/docs), and the UI can be accessed
at [http://localhost:3001](http://localhost:3001).
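
To confirm that both services came up, a small reachability check can be run from the host. This is a minimal sketch using only the standard library. The two URLs come from the text above; the helper name `is_up` and its `opener` parameter (present only so the function can be exercised without a network) are assumptions for illustration.

```python
import urllib.request

ENDPOINTS = {
    "API docs": "http://localhost:5009/docs",
    "UI": "http://localhost:3001",
}


def is_up(url: str, timeout: float = 3.0, opener=urllib.request.urlopen) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError and HTTPError are OSError subclasses: refused, timed out, 4xx/5xx.
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name}: {'reachable' if is_up(url) else 'not reachable'} ({url})")
```

Containers can take a little while to become ready after `docker compose up`, so a "not reachable" result right after startup may simply mean the check ran too early.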
## Need More?
OpenETL is a free application that offers a range of powerful features. However, if you're looking for advanced
capabilities, we also offer Pro and Enterprise versions with additional features and customizations.
### Features Comparison
| Feature | Basic Version | Pro Version | Enterprise Version |
|--------------------------------------------|:---------------:|:---------------------:|:------------------------:|
| Free Full Load ETL | ✅ Available | ✅ Available | ✅ Available |
| Scheduled Timing | ✅ Available | ✅ Available | ✅ Available |
| User Interface (UI) | ✅ Available | ✅ Available | ✅ Available |
| Logging | ✅ Available | ✅ Available | ✅ Available |
| Integration History | ✅ Available | ✅ Available | ✅ Available |
| Batches | ✅ Available | ✅ Available | ✅ Available |
| Distributed Spark Computing (Configurable) | ✅ Available | ✅ Available | ✅ Available |
| NaN Value Replacement Based on Data Type | ✅ Available | ✅ Available | ✅ Available |
| Views (ID mapping and data attachment) | ❌ Not Available | ✅ Available | ✅ Available |
| Support | ❌ Not Available | ✅ Available | ✅ Available |
| Dedicated Machine for Running the App | ❌ Not Available | ✅ Available | ✅ Available |
| Custom Schema Declarations | ❌ Not Available | ✅ Available | ✅ Available |
| Python-Based Transformations | ❌ Not Available | ✅ Available | ✅ Available |
| Permission-Based Users | ❌ Not Available | ✅ Available | ✅ Available |
| Dtype Casting | ❌ Not Available | ✅ Available | ✅ Available |
| Custom Development | ❌ Not Available | ❌ Not Available | ✅ Available |
If the features in the base version of OpenETL aren't quite cutting it for you, fear not! We're here to help. If you
require additional functionality, customizations, or have specific requirements, reach out to us.
For more information, visit [dataomnisolutions.com](https://www.dataomnisolutions.com) or contact us
at [sales.team@dataomnisolutions.com](mailto:sales.team@dataomnisolutions.com).
## Support and Feedback
If you encounter any issues or have suggestions for improving OpenETL, please don't hesitate to open an issue in the
GitHub repository. We greatly appreciate your feedback and are dedicated to enhancing the application based on user
input. For the proper way to report issues, see the [Security Section](SECURITY.md).
## License
This project is licensed under the [Apache 2.0 License](LICENSE).
Thank you for choosing OpenETL! We hope it simplifies your ETL tasks and provides a seamless experience.
Raw data
{
"_id": null,
"home_page": "https://dataomnisolutions.com",
"name": "openetl-sdk",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.12",
"maintainer_email": null,
"keywords": "etl, data, utilities, openetl, openetl-sdk, openetl-utils, openetl-csdk, openetl-connector-sdk",
"author": "Rusab Khan",
"author_email": "rusabkhan7@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/06/5a/fee1d84b80d7b06b2199d1c51db275114c0f8b7519623912bf174a7596f6/openetl_sdk-0.0.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "OpenETL backend lib with connectors.",
"version": "0.0.4",
"project_urls": {
"Documentation": "https://rusabkhan.github.io/OpenETL/",
"Homepage": "https://dataomnisolutions.com",
"Repository": "https://github.com/RusabKhan/OpenETL"
},
"split_keywords": [
"etl",
" data",
" utilities",
" openetl",
" openetl-sdk",
" openetl-utils",
" openetl-csdk",
" openetl-connector-sdk"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e032616a59235bb45b4791212b2915bc9664f7ba4c9ae97f0cce6e3f7e0b3a1f",
"md5": "306cfe462ff0c49364cb5a53e01c3bb6",
"sha256": "b23c3a7e13a6a7ae0a9752d32456e83a7f0da4f2dc90b652c12cacd219f2882d"
},
"downloads": -1,
"filename": "openetl_sdk-0.0.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "306cfe462ff0c49364cb5a53e01c3bb6",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.12",
"size": 48142,
"upload_time": "2025-10-23T13:22:44",
"upload_time_iso_8601": "2025-10-23T13:22:44.497151Z",
"url": "https://files.pythonhosted.org/packages/e0/32/616a59235bb45b4791212b2915bc9664f7ba4c9ae97f0cce6e3f7e0b3a1f/openetl_sdk-0.0.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "065afee1d84b80d7b06b2199d1c51db275114c0f8b7519623912bf174a7596f6",
"md5": "5f4c1bd3848725a1ad6cb3ba59a38c5f",
"sha256": "b82eb3fc9f57a41167bf30a578652644613e5deda5ba4cdef10c0d5e6b2645d7"
},
"downloads": -1,
"filename": "openetl_sdk-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "5f4c1bd3848725a1ad6cb3ba59a38c5f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.12",
"size": 38270,
"upload_time": "2025-10-23T13:22:45",
"upload_time_iso_8601": "2025-10-23T13:22:45.350661Z",
"url": "https://files.pythonhosted.org/packages/06/5a/fee1d84b80d7b06b2199d1c51db275114c0f8b7519623912bf174a7596f6/openetl_sdk-0.0.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-23 13:22:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "RusabKhan",
"github_project": "OpenETL",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "openetl-sdk"
}