spark-datax-schema-tools


- Name: spark-datax-schema-tools
- Version: 0.0.43
- Home page: https://github.com/jonaqp/spark_datax_schema_tools/
- Summary: spark_datax_schema_tools
- Upload time: 2022-12-21 16:53:02
- Author: Jonathan Quiza
- Keywords: spark, datax, schema
- Requirements: none recorded

# spark_datax_schema_tools


[![Github License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)




spark_datax_schema_tools is a Python library that implements helper tools for DataX schemas on Spark.

## Installation

The package is published on PyPI, so installation is a single command:
```sh
pip install spark-datax-schema-tools 
```
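
To confirm that the install succeeded, the standard library can report which version of the distribution is present. This is a minimal sketch, not something the package ships: `installed_version` is a hypothetical helper name, and it only assumes the distribution name used in the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="spark-datax-schema-tools"):
    """Return the installed version string, or None if the distribution is missing.

    Hypothetical helper for illustration; it is not part of the library.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the installed version, or None when the package is not installed.
print(installed_version())
```

A `None` result usually means `pip` installed into a different environment than the one running the interpreter.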


## Usage

The package wraps DataX schemas in helper functions; the examples below call them from a local Spark session.

```python
# Example 1: generate dummy data
from spark_datax_schema_tools import generate_components
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()
df2 = generate_components(spark=spark,
                          path_excel="/content/Summary RQ22021-HF1.xlsx",
                          uuaa_name="NZTG",
                          table_name="t_nztg_trade_core_inf_bo_eom")

# show2() is kept as written in the original README;
# plain PySpark DataFrames expose show().
df2.show2()


# Example 2: generate transmission detail with schema JSON
from spark_datax_schema_tools import generate_transmission_holding
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()
df2 = generate_transmission_holding(spark=spark,
                                    uuaa_name="NZTG",
                                    table_name="t_nztg_trade_core_inf_bo_eom",
                                    table_version="0",
                                    frequency="monthly",
                                    group="CIB",
                                    solution_model="CDD",
                                    path_excel="Summary RQ22021-HF1.xlsx")


# Example 3: generate transmission detail without schema JSON
from spark_datax_schema_tools import generate_transmission_holding
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("SparkAPP").getOrCreate()
df2 = generate_transmission_holding(spark=spark,
                                    uuaa_name="NZTG",
                                    table_name="t_nztg_trade_core_inf_bo_eom",
                                    table_version="0",
                                    frequency="monthly",
                                    group="CIB",
                                    solution_model="CDD")
```
Accepted parameter values for `generate_transmission_holding`:

- `frequency`: `"daily"`, `"monthly"`
- `group`: `"CIB"`, `"CLIENT_SOLUTIONS"`, `"CORE_BANKING"`, `"GLOBAL_DATA"`, `"RISK_FINANCE"`
- `solution_model`: `"CIB"`, `"CDD"`
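
Because these three parameters only accept the values listed above, it can help to check them before starting a Spark job. The sketch below is an assumption-laden illustration: `ALLOWED_VALUES` and `check_transmission_args` are hypothetical names that do not come from the library, and the library may perform its own validation.

```python
# Allowed values copied from the parameter list above.
ALLOWED_VALUES = {
    "frequency": {"daily", "monthly"},
    "group": {"CIB", "CLIENT_SOLUTIONS", "CORE_BANKING", "GLOBAL_DATA", "RISK_FINANCE"},
    "solution_model": {"CIB", "CDD"},
}


def check_transmission_args(**kwargs):
    """Hypothetical helper (not part of the library).

    Raises ValueError if a restricted parameter has a value outside the
    documented list; returns the keyword arguments unchanged otherwise.
    """
    for name, allowed in ALLOWED_VALUES.items():
        if name in kwargs and kwargs[name] not in allowed:
            raise ValueError(
                f"{name}={kwargs[name]!r} is not one of {sorted(allowed)}"
            )
    return kwargs


# Values taken from the usage examples; other keyword arguments pass through untouched.
args = check_transmission_args(frequency="monthly", group="CIB", solution_model="CDD")
```

The checked dictionary can then be unpacked into the call, for example `generate_transmission_holding(spark=spark, uuaa_name="NZTG", table_name="t_nztg_trade_core_inf_bo_eom", table_version="0", **args)`.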


## License

[Apache License 2.0](https://www.dropbox.com/s/8t6xtgk06o3ij61/LICENSE?dl=0).


## New features v1.0

 
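## BugFix
- If the installation fails on Windows because the Visual C++ build tools are missing, install them with Chocolatey: `choco install visualcpp-build-tools`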



## Reference

 - Jonathan Quiza [github](https://github.com/jonaqp).
 - Jonathan Quiza [RumiMLSpark](http://rumi-ml.herokuapp.com/).
 - Jonathan Quiza [linkedin](https://www.linkedin.com/in/jonaqp/).

            
