# sefazetllib
---
**Documentation**: [https://main.d32to2oidohzrl.amplifyapp.com/](https://main.d32to2oidohzrl.amplifyapp.com/)
**Source code**: [AWS CodeCommit](https://sa-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/jobs-lib-sefaz-ce/browse?region=sa-east-1)
---
**sefazetllib** is a library that provides a simplified and abstracted way to construct ETL/ELT pipelines.
## Features
- An easy-to-use, easy-to-understand library for constructing ETL/ELT pipelines.
- Compatible with popular data processing frameworks, such as [pandas](https://pandas.pydata.org/) and [PySpark](https://spark.apache.org/).
- Supports file formats such as CSV and Parquet.
- Extracts, transforms, and loads data with customizable configurations.
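
To make the extract, transform, load split concrete, here is a minimal sketch in plain Python that does not use sefazetllib at all. The function names (`extract_csv`, `transform_rows`, `load_csv`), the filter rule, and the sample data are illustrative assumptions, not part of the library's API.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, str]


def extract_csv(text: str) -> List[Row]:
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform_rows(rows: List[Row]) -> List[Row]:
    """Transform: keep rows with Age >= 14 and capitalize the names."""
    return [
        {"Name": row["Name"].capitalize(), "Age": row["Age"]}
        for row in rows
        if int(row["Age"]) >= 14
    ]


def load_csv(rows: List[Row]) -> str:
    """Load: serialize the rows back to CSV text (a stand-in for a file sink)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Name", "Age"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


source = "Name,Age\ntom,10\nnick,15\njuli,14\n"
result = load_csv(transform_rows(extract_csv(source)))
print(result)
```

Each stage takes the previous stage's output as its input, which is the same shape of pipeline that the library wraps behind its configuration objects.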
## Requirements
**sefazetllib** requires the following to run:
- [Python](https://www.python.org/) 3.7.1+
- [pandas](https://pandas.pydata.org/) 1.3+
- [PyArrow](https://arrow.apache.org/) 6.0+
- [PySpark](https://spark.apache.org/) 3.0+
- [PyDeequ](https://pydeequ.readthedocs.io/) 1.0+
- [Boto3](https://github.com/boto/boto3) 1.24+
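
One way to check an environment against this list is sketched below. It is not shipped with sefazetllib: the helper names are made up for illustration, the version parsing is deliberately simplified (numeric parts only), and it assumes Python 3.8+ because it relies on the standard-library `importlib.metadata` module.

```python
import sys
from importlib import metadata  # standard library in Python 3.8+
from typing import Dict, Optional, Tuple

# Minimum versions copied from the list above, keyed by PyPI distribution name.
MINIMUMS: Dict[str, str] = {
    "pandas": "1.3",
    "pyarrow": "6.0",
    "pyspark": "3.0",
    "pydeequ": "1.0",
    "boto3": "1.24",
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.24.3' into (1, 24, 3); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def pad(version: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Right-pad a version tuple with zeros so '6' compares equal to '6.0'."""
    return version + (0,) * (width - len(version))


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(installed: Optional[str], minimum: str) -> bool:
    """True when a version is installed and is at least the minimum."""
    if installed is None:
        return False
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    return pad(have, width) >= pad(need, width)


if __name__ == "__main__":
    python_ok = sys.version_info >= (3, 7, 1)
    print("Python", sys.version.split()[0], "OK" if python_ok else "too old")
    for name, minimum in MINIMUMS.items():
        found = installed_version(name)
        status = "OK" if meets_minimum(found, minimum) else "missing or too old"
        print(f"{name}: {found} (needs {minimum}+): {status}")
```

For anything beyond a quick sanity check, a full version parser such as the one in the `packaging` project handles pre-releases and other edge cases that this sketch ignores.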
## Installation
Use [pip](https://pip.pypa.io/en/stable/) to install **sefazetllib**:
```bash
pip install sefazetllib
```
## Usage
Here is an example of how to use **sefazetllib**:
```Python
from typing import Tuple

from pandas import DataFrame

from sefazetllib import Builder
from sefazetllib.etl import ETL
from sefazetllib.extract import ExtractLocal
from sefazetllib.factory.platform import PlatformFactory
from sefazetllib.load import LoadLocal
from sefazetllib.transform import Transform
from sefazetllib.utils.key import SurrogateKey


# Transform step: returns a reference name together with a pandas DataFrame.
@Builder
class TestingDataFrame(Transform):
    def execute(self) -> Tuple[str, DataFrame]:
        return (
            "dataframe",
            DataFrame(
                [["tom", 10], ["nick", 15], ["juli", 14]], columns=["Name", "Age"]
            ),
        )


# Configure the pipeline by chaining calls on the ETL object.
(
    ETL()
    .setPlatform(PlatformFactory("Pandas").create(name="test_pandas"))
    .transform(TestingDataFrame)
    .load(
        LoadLocal()
        .setFileFormat("parquet")
        .setEntity("load_test")
        .setMode("overwrite")
        .setReference("dataframe")
        .setDuplicates(True)
        .setKey(SurrogateKey().setColumns(["Name", "Age"]).setDistribute(False))
    )
    .extract(
        ExtractLocal()
        .setFileFormat("parquet")
        .setUrl("load_test.parquet")
        .setReference("extract_test")
    )
)
```
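
The chained `set...` calls in the example follow a fluent, builder-style pattern in which each setter returns the object it was called on. How sefazetllib implements this internally is not shown in this README, so the sketch below uses a hypothetical stand-in class (`LocalSink` is an invented name, as are its option keys) purely to illustrate why such chains work.

```python
from typing import Any, Dict


class LocalSink:
    """Hypothetical stand-in for a configurable step; not a sefazetllib class."""

    def __init__(self) -> None:
        self._options: Dict[str, Any] = {}

    def setFileFormat(self, file_format: str) -> "LocalSink":
        self._options["file_format"] = file_format
        return self  # returning self is what makes chaining possible

    def setEntity(self, entity: str) -> "LocalSink":
        self._options["entity"] = entity
        return self

    def setMode(self, mode: str) -> "LocalSink":
        self._options["mode"] = mode
        return self

    def options(self) -> Dict[str, Any]:
        """Return a copy of the collected configuration."""
        return dict(self._options)


sink = LocalSink().setFileFormat("parquet").setEntity("load_test").setMode("overwrite")
print(sink.options())
```

Because every setter hands back the same instance, the configuration can be written as one expression, which is why the pipeline in the example is wrapped in a single pair of parentheses.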
## Testing
To run the unit tests, run the following command (`py` is the Python launcher for Windows; on other platforms, use `python` or `python3` instead):
```bash
py -m unittest tests/main.py -v
```
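
The contents of `tests/main.py` are not shown in this README. As a rough sketch of what a `unittest` case for a transform step could look like, the example below tests a hypothetical helper (`build_rows` is an invented name) and runs the suite programmatically with the standard `unittest` loader and runner.

```python
import unittest
from typing import List, Tuple


def build_rows() -> List[Tuple[str, int]]:
    """Hypothetical function under test; returns the sample rows from the usage example."""
    return [("tom", 10), ("nick", 15), ("juli", 14)]


class BuildRowsTest(unittest.TestCase):
    def test_row_count(self) -> None:
        self.assertEqual(len(build_rows()), 3)

    def test_ages_are_integers(self) -> None:
        self.assertTrue(all(isinstance(age, int) for _, age in build_rows()))


def run_suite() -> unittest.TestResult:
    """Load the test case and run it, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BuildRowsTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_suite()
```

The `verbosity=2` argument plays the same role as the `-v` flag in the command above: it prints one line per test method.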
## License
**sefazetllib** is released under the [Apache-2.0](/LICENSE) license.
Raw data
{
"_id": null,
"home_page": "https://sa-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/jobs-lib-sefaz-ce/browse?region=sa-east-1",
"name": "sefazetllib",
"maintainer": "Bruno Santos",
"docs_url": null,
"requires_python": "<4.0.0,>=3.7.1",
"maintainer_email": "bruno.santos@elogroup.com.br",
"keywords": null,
"author": "Felipe Gochi",
"author_email": "felipe.gochi@elogroup.com.br",
"download_url": "https://files.pythonhosted.org/packages/43/60/766e1f545a939abb69f81a4b36733de79b68b6c983ca7cce64ab73ac5e41/sefazetllib-0.1.60.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "sefazetllib is a library that provides a simplified and abstracted way to construct ETL/ELT pipelines.",
"version": "0.1.60",
"project_urls": {
"Documentation": "https://main.unavailable.amplifyapp.com/",
"Homepage": "https://sa-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/jobs-lib-sefaz-ce/browse?region=sa-east-1",
"Repository": "https://sa-east-1.console.aws.amazon.com/codesuite/codecommit/repositories/jobs-lib-sefaz-ce/browse?region=sa-east-1"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "41b90625325d0a4036cfa03d0c6d55b543dce6dfe3183659e59ff04ab0369029",
"md5": "1db3554e919cd9f5d2f993fe95814452",
"sha256": "6d4873668dd13156c1998a294a3bbaf8ed548fda871f48a0cdb082d8ef60bbc7"
},
"downloads": -1,
"filename": "sefazetllib-0.1.60-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1db3554e919cd9f5d2f993fe95814452",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0.0,>=3.7.1",
"size": 59472,
"upload_time": "2024-06-25T16:59:33",
"upload_time_iso_8601": "2024-06-25T16:59:33.687468Z",
"url": "https://files.pythonhosted.org/packages/41/b9/0625325d0a4036cfa03d0c6d55b543dce6dfe3183659e59ff04ab0369029/sefazetllib-0.1.60-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "4360766e1f545a939abb69f81a4b36733de79b68b6c983ca7cce64ab73ac5e41",
"md5": "0c3ab529d419fb33e9165b1432f7d1a7",
"sha256": "5c10f728351e4b61b066d51b9958c613ce77036355cf8352f2771685e52c5a27"
},
"downloads": -1,
"filename": "sefazetllib-0.1.60.tar.gz",
"has_sig": false,
"md5_digest": "0c3ab529d419fb33e9165b1432f7d1a7",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0.0,>=3.7.1",
"size": 33152,
"upload_time": "2024-06-25T16:59:36",
"upload_time_iso_8601": "2024-06-25T16:59:36.917080Z",
"url": "https://files.pythonhosted.org/packages/43/60/766e1f545a939abb69f81a4b36733de79b68b6c983ca7cce64ab73ac5e41/sefazetllib-0.1.60.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-25 16:59:36",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "sefazetllib"
}