# spark_datax_schema_tools
[![Github License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
spark_datax_schema_tools is a Python library that provides tools for working with DataX schemas in Spark.
## Installation
The package is published on PyPI, so installation is a single command:
```sh
pip install spark-datax-schema-tools
```
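To confirm that the installation succeeded, the installed version can be read from the package metadata. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of spark_datax_schema_tools.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "spark-datax-schema-tools") -> Optional[str]:
    """Return the installed version of dist_name, or None if it is not installed.

    Hypothetical helper for illustration; it only queries pip metadata and
    does not import the package itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None
```

Calling `installed_version()` returns a version string when the package is installed and `None` otherwise.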
## Usage
The library provides wrapper functions that take DataX schemas as input.
```python
# Shared setup for all three examples
from pyspark.sql import SparkSession
from spark_datax_schema_tools import generate_components
from spark_datax_schema_tools import generate_transmission_holding

spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()

# Example 1: generate dummy data
df2 = generate_components(
    spark=spark,
    path_excel="/content/Summary RQ22021-HF1.xlsx",
    uuaa_name="NZTG",
    table_name="t_nztg_trade_core_inf_bo_eom",
)
df2.show2()

# Example 2: generate transmission detail with schema JSON
df2 = generate_transmission_holding(
    spark=spark,
    uuaa_name="NZTG",
    table_name="t_nztg_trade_core_inf_bo_eom",
    table_version="0",
    frequency="monthly",
    group="CIB",
    solution_model="CDD",
    path_excel="Summary RQ22021-HF1.xlsx",
)

# Example 3: generate transmission detail without schema JSON
df2 = generate_transmission_holding(
    spark=spark,
    uuaa_name="NZTG",
    table_name="t_nztg_trade_core_inf_bo_eom",
    table_version="0",
    frequency="monthly",
    group="CIB",
    solution_model="CDD",
)
```
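The two `generate_transmission_holding` examples differ only in whether `path_excel` is passed. One way to keep both call styles in a single code path is to build the keyword arguments first and include `path_excel` only when it is supplied. This is a sketch under that assumption; `transmission_kwargs` is a hypothetical helper, not part of the library, and it does not call Spark.

```python
from typing import Any, Dict, Optional


def transmission_kwargs(
    uuaa_name: str,
    table_name: str,
    table_version: str,
    frequency: str,
    group: str,
    solution_model: str,
    path_excel: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments, adding path_excel only when it is given.

    Hypothetical helper: the parameter names mirror the README examples,
    but nothing here is provided by spark_datax_schema_tools itself.
    """
    kwargs: Dict[str, Any] = {
        "uuaa_name": uuaa_name,
        "table_name": table_name,
        "table_version": table_version,
        "frequency": frequency,
        "group": group,
        "solution_model": solution_model,
    }
    if path_excel is not None:
        kwargs["path_excel"] = path_excel
    return kwargs
```

With a Spark session in hand, the result could then be unpacked into the call, for example `generate_transmission_holding(spark=spark, **transmission_kwargs(...))`.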
Accepted parameter values for `generate_transmission_holding`:

- `frequency`: `"daily"`, `"monthly"`
- `group`: `"CIB"`, `"CLIENT_SOLUTIONS"`, `"CORE_BANKING"`, `"GLOBAL_DATA"`, `"RISK_FINANCE"`
- `solution_model`: `"CIB"`, `"CDD"`
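The accepted values listed above can be checked before a Spark job is started, so that a typo fails fast. The sketch below is an illustration only: `ALLOWED_VALUES` and `validate_transmission_params` are hypothetical names, and whether the library performs its own validation or treats values case-sensitively is not stated in this README.

```python
# Accepted values copied from the README; the names below are hypothetical.
ALLOWED_VALUES = {
    "frequency": {"daily", "monthly"},
    "group": {"CIB", "CLIENT_SOLUTIONS", "CORE_BANKING", "GLOBAL_DATA", "RISK_FINANCE"},
    "solution_model": {"CIB", "CDD"},
}


def validate_transmission_params(frequency: str, group: str, solution_model: str) -> None:
    """Raise ValueError if any argument is outside its documented value set."""
    given = {
        "frequency": frequency,
        "group": group,
        "solution_model": solution_model,
    }
    for name, value in given.items():
        allowed = ALLOWED_VALUES[name]
        if value not in allowed:
            raise ValueError(
                f"{name}={value!r} is not one of {sorted(allowed)}"
            )
```

Calling `validate_transmission_params("monthly", "CIB", "CDD")` returns silently, while an unlisted value such as `frequency="weekly"` raises `ValueError`.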
## License
[Apache License 2.0](https://www.dropbox.com/s/8t6xtgk06o3ij61/LICENSE?dl=0).
## New features v1.0
## BugFix
- On Windows, install the Visual C++ build tools with Chocolatey: `choco install visualcpp-build-tools`
## Reference
- Jonathan Quiza [GitHub](https://github.com/jonaqp).
- Jonathan Quiza [RumiMLSpark](http://rumi-ml.herokuapp.com/).
- Jonathan Quiza [LinkedIn](https://www.linkedin.com/in/jonaqp/).
Raw data
{
"_id": null,
"home_page": "https://github.com/jonaqp/spark_datax_schema_tools/",
"name": "spark-datax-schema-tools",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "spark,datax,schema",
"author": "Jonathan Quiza",
"author_email": "jony327@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/0d/53/b337ecbfdd6e33c7ee56bffc469fb56b435b634a12d841d7e9f90e506e27/spark_datax_schema_tools-0.0.43.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "spark_datax_schema_tools",
"version": "0.0.43",
"split_keywords": [
"spark",
"datax",
"schema"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "697251a7eb050992b1bd7c43c35681c5",
"sha256": "5b27c3ce863bcd1fa5233ed3c4d2a0ed22f57f863e11653893faf9f31696f37b"
},
"downloads": -1,
"filename": "spark_datax_schema_tools-0.0.43-py3-none-any.whl",
"has_sig": false,
"md5_digest": "697251a7eb050992b1bd7c43c35681c5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 15787358,
"upload_time": "2022-12-21T16:52:48",
"upload_time_iso_8601": "2022-12-21T16:52:48.894284Z",
"url": "https://files.pythonhosted.org/packages/8b/78/89dfe8bc85584655e45ca1cdb9501d18db51149a9add38442a695d0f361e/spark_datax_schema_tools-0.0.43-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "88c62c9b36821dd09269e042d98c9aa0",
"sha256": "64c7231efd9060e4cbc607b3eabb4efd07abdf53ecd20d7fb880ce5a9d03e0f6"
},
"downloads": -1,
"filename": "spark_datax_schema_tools-0.0.43.tar.gz",
"has_sig": false,
"md5_digest": "88c62c9b36821dd09269e042d98c9aa0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 15575336,
"upload_time": "2022-12-21T16:53:02",
"upload_time_iso_8601": "2022-12-21T16:53:02.802564Z",
"url": "https://files.pythonhosted.org/packages/0d/53/b337ecbfdd6e33c7ee56bffc469fb56b435b634a12d841d7e9f90e506e27/spark_datax_schema_tools-0.0.43.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-21 16:53:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "jonaqp",
"github_project": "spark_datax_schema_tools",
"lcname": "spark-datax-schema-tools"
}