pyspark-testframework

Name	pyspark-testframework
Version	2.4.1
home_page	None
Summary	Testframework for PySpark DataFrames
upload_time	2024-10-10 08:44:06
maintainer	None
docs_url	None
author	None
requires_python	>=3.9.5
license	MIT License Copyright (c) 2024 Woonstad Rotterdam Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
keywords	pyspark dataframe test testframework
requirements	No requirements were recorded.

![](https://img.shields.io/pypi/pyversions/pyspark-testframework)
![Build Status](https://github.com/woonstadrotterdam/pyspark-testframework/actions/workflows/cicd.yml/badge.svg)
[![Version](https://img.shields.io/pypi/v/pyspark-testframework)](https://pypi.org/project/pyspark-testframework/)
![](https://img.shields.io/github/license/woonstadrotterdam/pyspark-testframework)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

# pyspark-testframework

The goal of the `pyspark-testframework` is to provide a simple way to create tests for PySpark DataFrames. The test results are returned in DataFrame format as well.

# Tutorial

**Let's first create an example PySpark DataFrame**

The data will contain the primary keys, street names and house numbers of some addresses.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, StringType
from pyspark.sql import functions as F
```

```python
# Initialize Spark session
spark = SparkSession.builder.appName("PySparkTestFrameworkTutorial").getOrCreate()

# Define the schema
schema = StructType(
    [
        StructField("id", IntegerType(), True),
        StructField("street", StringType(), True),
        StructField("house_number", IntegerType(), True),
    ]
)

# Define the data
data = [
    (1, "Rochussenstraat", 27),
    (2, "Coolsingel", 31),
    (3, "%Witte de Withstraat", 27),
    (4, "Lijnbaan", -3),
    (5, None, 13),
]

df = spark.createDataFrame(data, schema)

df.show(truncate=False)
```

    +---+--------------------+------------+
    |id |street              |house_number|
    +---+--------------------+------------+
    |1  |Rochussenstraat     |27          |
    |2  |Coolsingel          |31          |
    |3  |%Witte de Withstraat|27          |
    |4  |Lijnbaan            |-3          |
    |5  |null                |13          |
    +---+--------------------+------------+

**Import and initialize the `DataFrameTester`**

```python
from testframework.dataquality import DataFrameTester
```

```python
df_tester = DataFrameTester(
    df=df,
    primary_key="id",
    spark=spark,
)
```

**Import configurable tests**

```python
from testframework.dataquality.tests import ValidNumericRange, RegexTest
```

**Initialize the `RegexTest` to test for valid street names**

```python
valid_street_format = RegexTest(
    name="ValidStreetFormat",
    pattern=r"^[A-Z][a-zéèáàëï]*([ -][A-Z]?[a-zéèáàëï]*)*$",
)
```

**Run `valid_street_format` on the _street_ column using the `.test()` method of `DataFrameTester`.**

```python
df_tester.test(
    col="street",
    test=valid_street_format,
    nullable=False,  # nullable is False, hence null values are converted to False
    description="Street is in valid Dutch street format.",
).show(truncate=False)
```

    +---+--------------------+-------------------------+
    |id |street              |street__ValidStreetFormat|
    +---+--------------------+-------------------------+
    |1  |Rochussenstraat     |true                     |
    |2  |Coolsingel          |true                     |
    |3  |%Witte de Withstraat|false                    |
    |4  |Lijnbaan            |true                     |
    |5  |null                |false                    |
    +---+--------------------+-------------------------+
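To see why each row passes or fails, the same pattern can be tried outside Spark. This is a minimal sketch using Python's built-in `re` module, not the framework itself. The helper `matches_street_format` is a hypothetical name, and returning the `nullable` flag for null values is an assumption based on the comment in the call above.

```python
import re

# Same pattern as the one passed to RegexTest above.
PATTERN = re.compile(r"^[A-Z][a-zéèáàëï]*([ -][A-Z]?[a-zéèáàëï]*)*$")


def matches_street_format(street, nullable=False):
    """Hypothetical helper: nulls evaluate to the `nullable` flag (assumption)."""
    if street is None:
        return nullable
    return PATTERN.match(street) is not None


streets = {
    1: "Rochussenstraat",
    2: "Coolsingel",
    3: "%Witte de Withstraat",
    4: "Lijnbaan",
    5: None,
}

# Evaluate every street name, keyed by primary key.
checked = {pk: matches_street_format(street) for pk, street in streets.items()}
print(checked)
```

Row 3 fails only because of the leading `%`; without it, "Witte de Withstraat" matches the pattern.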

**Run the `ValidNumericRange` test on the _house_number_ column**

By setting the `return_failed_rows` parameter to `True`, we can get only the rows that failed the test.

```python
df_tester.test(
    col="house_number",
    test=ValidNumericRange(
        min_value=1,
    ),
    nullable=False,
    # description="House number is in a valid format" # optional, let's not define it for illustration purposes
    return_failed_rows=True,  # only return the failed rows
).show()
```

    +---+------------+-------------------------------+
    | id|house_number|house_number__ValidNumericRange|
    +---+------------+-------------------------------+
    |  4|          -3|                          false|
    +---+------------+-------------------------------+
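The logic of a numeric range check can be illustrated in plain Python. This is a sketch, not the framework's implementation: `in_numeric_range` is a hypothetical helper, and treating the bounds as inclusive is an assumption. The default upper bound of infinity follows the `max_value=inf` shown in the generated test description.

```python
import math


def in_numeric_range(value, min_value=-math.inf, max_value=math.inf, nullable=False):
    """Hypothetical helper; inclusive bounds are an assumption of this sketch."""
    if value is None:
        return nullable
    return min_value <= value <= max_value


house_numbers = {1: 27, 2: 31, 3: 27, 4: -3, 5: 13}

# Keep only the rows that fail, similar in spirit to return_failed_rows=True.
failed_rows = [
    (pk, number)
    for pk, number in house_numbers.items()
    if not in_numeric_range(number, min_value=1)
]
print(failed_rows)
```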

**Let's take a look at the test results of the DataFrame using the `.results` attribute.**

```python
df_tester.results.show(truncate=False)
```

    +---+-------------------------+-------------------------------+
    |id |street__ValidStreetFormat|house_number__ValidNumericRange|
    +---+-------------------------+-------------------------------+
    |1  |true                     |true                           |
    |2  |true                     |true                           |
    |3  |false                    |true                           |
    |4  |true                     |false                          |
    |5  |false                    |true                           |
    +---+-------------------------+-------------------------------+

**We can use `.descriptions` or `.description_df` to get the descriptions of the tests.**

This can be useful for reporting purposes, for example to create reports for the business with more detailed information than just the column name and the test name.

```python
df_tester.descriptions
```

    {'street__ValidStreetFormat': 'Street is in valid Dutch street format.',
     'house_number__ValidNumericRange': 'house_number__ValidNumericRange(min_value=1.0, max_value=inf)'}

```python
df_tester.description_df.show(truncate=False)
```

    +-------------------------------+-------------------------------------------------------------+
    |test                           |description                                                  |
    +-------------------------------+-------------------------------------------------------------+
    |street__ValidStreetFormat      |Street is in valid Dutch street format.                      |
    |house_number__ValidNumericRange|house_number__ValidNumericRange(min_value=1.0, max_value=inf)|
    +-------------------------------+-------------------------------------------------------------+

### Custom tests

Sometimes tests are too specific or complex to be covered by the configurable tests. That's why we can create custom tests and add them to the `DataFrameTester` object.

Let's do this with a custom test that checks whether every house has a bathroom. We'll start by creating a new DataFrame with rooms rather than houses.

```python
rooms = [
    (1, 1, "living room"),
    (2, 1, "bathroom"),
    (3, 1, "kitchen"),
    (4, 1, "bed room"),
    (5, 2, "living room"),
    (6, 2, "bed room"),
    (7, 2, "kitchen"),
]

schema_rooms = StructType(
    [
        StructField("id", IntegerType(), True),
        StructField("house_id", IntegerType(), True),
        StructField("room", StringType(), True),
    ]
)

room_df = spark.createDataFrame(rooms, schema=schema_rooms)

room_df.show(truncate=False)
```

    +---+--------+-----------+
    |id |house_id|room       |
    +---+--------+-----------+
    |1  |1       |living room|
    |2  |1       |bathroom   |
    |3  |1       |kitchen    |
    |4  |1       |bed room   |
    |5  |2       |living room|
    |6  |2       |bed room   |
    |7  |2       |kitchen    |
    +---+--------+-----------+

To create a custom test, we create a PySpark DataFrame that contains the same primary key column as the DataFrame being tested by the `DataFrameTester`.

Let's create a boolean column that indicates whether the house has a bathroom.

```python
house_has_bathroom = room_df.groupBy("house_id").agg(
    F.max(F.when(F.col("room") == "bathroom", True).otherwise(False)).alias(
        "has_bathroom"
    )
)

house_has_bathroom.show(truncate=False)
```

    +--------+------------+
    |house_id|has_bathroom|
    +--------+------------+
    |1       |true        |
    |2       |false       |
    +--------+------------+
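Taking the maximum of a boolean column per group yields `true` as soon as one row in the group is `true`, so the aggregation above behaves like an "any" over the rooms of each house. The plain-Python sketch below shows the same grouping logic on the example data; it is only an illustration and does not use Spark.

```python
from collections import defaultdict

rooms = [
    (1, 1, "living room"),
    (2, 1, "bathroom"),
    (3, 1, "kitchen"),
    (4, 1, "bed room"),
    (5, 2, "living room"),
    (6, 2, "bed room"),
    (7, 2, "kitchen"),
]

# Start every house at False and flip to True once a bathroom is seen.
has_bathroom = defaultdict(bool)
for _room_id, house_id, room in rooms:
    has_bathroom[house_id] = has_bathroom[house_id] or room == "bathroom"

print(dict(has_bathroom))
```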

**We can add this 'custom test' to the `DataFrameTester` using `add_custom_test_result`.**

Behind the scenes, `DataFrameTester` validates the custom result to make sure it meets the requirements for being added to the other test results.

```python
df_tester.add_custom_test_result(
    result=house_has_bathroom.withColumnRenamed("house_id", "id"),
    name="has_bathroom",
    description="House has a bathroom",
    # fillna_value=0, # optional; by default null.
).show(truncate=False)
```

    +---+------------+
    |id |has_bathroom|
    +---+------------+
    |1  |true        |
    |2  |false       |
    |3  |null        |
    |4  |null        |
    |5  |null        |
    +---+------------+
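The output above is consistent with a left join of the custom result onto the primary keys of the tested DataFrame: keys without a match get null, or the fill value if one is given. How the framework does this internally is an assumption; the sketch below only reproduces the observed behaviour in plain Python, with `attach_custom_result` as a hypothetical helper.

```python
def attach_custom_result(primary_keys, result, fillna_value=None):
    """Hypothetical helper: left-join `result` onto `primary_keys`.

    Keys that are missing from `result` receive `fillna_value`.
    """
    return [(pk, result.get(pk, fillna_value)) for pk in primary_keys]


house_ids = [1, 2, 3, 4, 5]
house_has_bathroom = {1: True, 2: False}

joined = attach_custom_result(house_ids, house_has_bathroom)
print(joined)

# With a fill value, unmatched houses get that value instead of None.
joined_filled = attach_custom_result(house_ids, house_has_bathroom, fillna_value=False)
print(joined_filled)
```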

**Even though the house DataFrame contains no information about bathrooms, we can still add the custom test to the `DataFrameTester` object.**

```python
df_tester.results.show(truncate=False)
```

    +---+-------------------------+-------------------------------+------------+
    |id |street__ValidStreetFormat|house_number__ValidNumericRange|has_bathroom|
    +---+-------------------------+-------------------------------+------------+
    |1  |true                     |true                           |true        |
    |2  |true                     |true                           |false       |
    |3  |false                    |true                           |null        |
    |4  |true                     |false                          |null        |
    |5  |false                    |true                           |null        |
    +---+-------------------------+-------------------------------+------------+

```python
df_tester.descriptions
```

    {'street__ValidStreetFormat': 'Street is in valid Dutch street format.',
     'house_number__ValidNumericRange': 'house_number__ValidNumericRange(min_value=1.0, max_value=inf)',
     'has_bathroom': 'House has a bathroom'}

**We can also get a summary of the test results using the `.summary` attribute.**

```python
df_tester.summary.show(truncate=False)
```

    +-------------------------------+-------------------------------------------------------------+-------+--------+-----------------+--------+-----------------+
    |test                           |description                                                  |n_tests|n_passed|percentage_passed|n_failed|percentage_failed|
    +-------------------------------+-------------------------------------------------------------+-------+--------+-----------------+--------+-----------------+
    |street__ValidStreetFormat      |Street is in valid Dutch street format.                      |5      |3       |60.0             |2       |40.0             |
    |house_number__ValidNumericRange|house_number__ValidNumericRange(min_value=1.0, max_value=inf)|5      |4       |80.0             |1       |20.0             |
    |has_bathroom                   |House has a bathroom                                         |2      |1       |50.0             |1       |50.0             |
    +-------------------------------+-------------------------------------------------------------+-------+--------+-----------------+--------+-----------------+
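The numbers in the summary can be reproduced by hand. Judging from the `has_bathroom` row, null results are not counted as tests, which is why `n_tests` is 2 there. The sketch below recomputes the table in plain Python under that assumption; `summarize` is a hypothetical helper, not part of the framework.

```python
results = {
    "street__ValidStreetFormat": [True, True, False, True, False],
    "house_number__ValidNumericRange": [True, True, True, False, True],
    "has_bathroom": [True, False, None, None, None],
}


def summarize(values):
    """Hypothetical helper: count passes and failures, ignoring nulls."""
    tested = [v for v in values if v is not None]
    n_tests = len(tested)
    n_passed = sum(tested)  # True counts as 1, False as 0
    n_failed = n_tests - n_passed

    def percentage(n):
        # Avoid dividing by zero when every result is null.
        return round(100 * n / n_tests, 1) if n_tests else None

    return {
        "n_tests": n_tests,
        "n_passed": n_passed,
        "percentage_passed": percentage(n_passed),
        "n_failed": n_failed,
        "percentage_failed": percentage(n_failed),
    }


summary = {test: summarize(values) for test, values in results.items()}
for test, row in summary.items():
    print(test, row)
```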

**If you want to see all rows that failed any of the tests, you can use the `.failed_tests` attribute.**

```python
df_tester.failed_tests.show(truncate=False)
```

    +---+-------------------------+-------------------------------+------------+
    |id |street__ValidStreetFormat|house_number__ValidNumericRange|has_bathroom|
    +---+-------------------------+-------------------------------+------------+
    |2  |true                     |true                           |false       |
    |3  |false                    |true                           |null        |
    |4  |true                     |false                          |null        |
    |5  |false                    |true                           |null        |
    +---+-------------------------+-------------------------------+------------+

**Of course, you can also see all rows that passed all tests using the `.passed_tests` attribute.**

```python
df_tester.passed_tests.show(truncate=False)
```

    +---+-------------------------+-------------------------------+------------+
    |id |street__ValidStreetFormat|house_number__ValidNumericRange|has_bathroom|
    +---+-------------------------+-------------------------------+------------+
    |1  |true                     |true                           |true        |
    +---+-------------------------+-------------------------------+------------+
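The two outputs above can be reproduced with a simple row filter: a row counts as failed if any test is `false`, and as passed if every test is `true`. This is a plain-Python sketch under that assumption. The example data does not show how the framework treats a row that has only `true` and null values; in this sketch such a row would appear in neither list.

```python
# Test results per primary key, in the column order shown above.
rows = {
    1: (True, True, True),
    2: (True, True, False),
    3: (False, True, None),
    4: (True, False, None),
    5: (False, True, None),
}

# Failed: at least one test result is explicitly False.
failed_ids = [pk for pk, values in rows.items() if any(v is False for v in values)]

# Passed: every test result is explicitly True.
passed_ids = [pk for pk, values in rows.items() if all(v is True for v in values)]

print(failed_ids)
print(passed_ids)
```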

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "pyspark-testframework",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9.5",
    "maintainer_email": null,
    "keywords": "pyspark, dataframe, test, testframework",
    "author": null,
    "author_email": "Woonstad Rotterdam <info@woonstadrotterdam.nl>, Tomer Gabay <tomer.gabay@woonstadrotterdam.nl>, Vincent van der Meij <vincent.van.der.meij@woonstadrotterdam.nl>, Tiddo Loos <tiddo.loos@woonstadrotterdam.nl>, Ben Verhees <ben.verhees@woonstadrotterdam.nl>",
    "download_url": "https://files.pythonhosted.org/packages/4b/29/262bb12e303772340da75567394f6575123e550b105d3994dac60e1d4e68/pyspark_testframework-2.4.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2024 Woonstad Rotterdam  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Testframework for PySpark DataFrames",
    "version": "2.4.1",
    "project_urls": {
        "Homepage": "https://github.com/woonstadrotterdam/pyspark-testframework",
        "Issues": "https://github.com/woonstadrotterdam/pyspark-testframework/issues"
    },
    "split_keywords": [
        "pyspark",
        " dataframe",
        " test",
        " testframework"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "95696e8828639daaa24071da96c67e13635eece3d2c38f3bfb07d9ea59073278",
                "md5": "d6cb44cdec390c66095e613f3044eb27",
                "sha256": "e39b0764cadff1c0c1650e169cc589f66072c44cf847a0e62ab5bd606ac85b49"
            },
            "downloads": -1,
            "filename": "pyspark_testframework-2.4.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d6cb44cdec390c66095e613f3044eb27",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9.5",
            "size": 18426,
            "upload_time": "2024-10-10T08:44:05",
            "upload_time_iso_8601": "2024-10-10T08:44:05.307065Z",
            "url": "https://files.pythonhosted.org/packages/95/69/6e8828639daaa24071da96c67e13635eece3d2c38f3bfb07d9ea59073278/pyspark_testframework-2.4.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4b29262bb12e303772340da75567394f6575123e550b105d3994dac60e1d4e68",
                "md5": "58a21ce5f78af14c544da94f4ab0cf89",
                "sha256": "6b4aaa18dbce01fdbb99d6b406841a75f2b78daff86b74e4216c3027e2fbd553"
            },
            "downloads": -1,
            "filename": "pyspark_testframework-2.4.1.tar.gz",
            "has_sig": false,
            "md5_digest": "58a21ce5f78af14c544da94f4ab0cf89",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9.5",
            "size": 28291,
            "upload_time": "2024-10-10T08:44:06",
            "upload_time_iso_8601": "2024-10-10T08:44:06.298248Z",
            "url": "https://files.pythonhosted.org/packages/4b/29/262bb12e303772340da75567394f6575123e550b105d3994dac60e1d4e68/pyspark_testframework-2.4.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-10 08:44:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "woonstadrotterdam",
    "github_project": "pyspark-testframework",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pyspark-testframework"
}
