ophelian


Name: ophelian
Version: 0.1.4
home_page: https://github.com/LuisFalva/ophelia
Summary: Ophelian is a go-to framework for seamlessly putting ML & AI prototypes into production.
upload_time: 2024-07-23 09:15:45
maintainer: None
docs_url: None
author: Luis Falva
requires_python: <3.12,>=3.9
license: Free for non-commercial use
keywords: ophelian, ophelian-on-mars
requirements: absl-py astunparse certifi charset-normalizer click cloudpickle colorama dask-expr dask flatbuffers fsspec gast google-pasta grpcio h5py idna importlib-metadata joblib keras libclang llvmlite locket markdown-it-py markdown markupsafe mdurl ml-dtypes namex numba numpy opt-einsum optree packaging pandas partd protobuf py4j pyarrow pygments pyhocon pyparsing pyspark python-dateutil pytz pyyaml quadprog requests rich scikit-learn scipy setuptools shap six slicer tensorboard-data-server tensorboard tensorflow-io-gcs-filesystem tensorflow termcolor threadpoolctl toolz tqdm typing-extensions tzdata urllib3 werkzeug wheel wrapt zipp
Travis-CI: No Travis.
coveralls test coverage: No coveralls.
            <div align="center" style="padding: 20px;">
  <img src="docs/img/ophelian-ai-sticker.png" alt="Ophelian AI Sticker" width="200" style="margin-top: 20px;">

  <p style="color: #FFF; font-family: 'Helvetica Neue', Arial, sans-serif; text-align: center; max-width: 600px; margin: 20px auto; font-weight: bold;">
  </p>
  
  <div style="margin-top: 20px;">
    <a href="https://pypi.org/project/ophelian/" style="margin-right: 10px;">
      <img src="https://img.shields.io/pypi/v/ophelian.svg" alt="PyPI">
    </a>
    <a href="https://hub.docker.com/r/luisfalva/ophelian" style="margin-right: 10px;">
      <img src="https://img.shields.io/docker/v/luisfalva/ophelian?sort=semver" alt="Docker Hub">
    </a>
    <a href="https://github.com/LuisFalva/ophelia/actions/workflows/release.yml" style="margin-right: 10px;">
      <img src="https://github.com/LuisFalva/ophelia/actions/workflows/release.yml/badge.svg" alt="Release Build Status">
    </a>
    <a href="https://github.com/LuisFalva/ophelia/actions/workflows/docker-image.yml" style="margin-right: 10px;">
      <img src="https://github.com/LuisFalva/ophelia/actions/workflows/docker-image.yml/badge.svg" alt="Docker Image Build Status">
    </a>
    <a href="https://ophelian.readme.io/">
      <img src="https://img.shields.io/badge/docs-Documentation-orange.svg" alt="Docs">
    </a>
  </div>
</div>

# Ophelian On Mars

---

'*Ophelian On Mars*' 👽, the ultimate destination for ML, Data Science, and AI professionals. Your go-to framework for seamlessly putting ML prototypes into production, where everyone wants to be, but only a few succeed.

# 🚀 Motivations

As data professionals, we aim to minimize the time spent deciphering the intricacies of PySpark's framework. Often, we seek a straightforward, Pandas-style approach to compute tasks without delving into highly optimized Spark code. 

To address this need, Ophelian was created with the following goals:

- **Simplicity**: Provide a simple and intuitive way to perform data computations, emulating the ease of Pandas.
- **Efficiency**: Wrap common patterns for data extraction and transformation in a single entry function that ensures Spark-optimized performance.
- **Code Reduction**: Significantly reduce the amount of code required by leveraging a set of Spark optimization techniques for query execution.
- **Streamlined ML Pipelines**: Facilitate the lifecycle of any PySpark ML pipeline by incorporating optimized methods and reducing redundant coding efforts.

By focusing on these motivations, Ophelian aims to enhance productivity and efficiency for data engineers and scientists, allowing them to concentrate on their core tasks without worrying about underlying Spark optimizations.

# 📝 Generalized ML Features

Ophelian focuses on creating robust and efficient machine learning (ML) pipelines, making them easily replicable and secure for various ML tasks. Key features include optimized techniques for handling data skewness, user-friendly interfaces for building custom models, and streamlined data mining pipelines with Ophelian pipeline wrappers. Additionally, it functions as an emulator of NumPy and Pandas, offering similar functionalities for a seamless user experience. Below are the detailed features:

- **Framework for Building ML Pipelines**: Simplified and secure methods to construct ML pipelines using PySpark, ensuring replication and robustness.
- **Optimized Techniques for Data Skewness and Partitioning**: Embedded strategies to address and mitigate data skewness issues, improving model performance and accuracy.
- **Build Your Own Models (BYOM)**: User-friendly software for constructing custom models and data mining pipelines, leveraging frameworks like PySpark, Beam, Flink, PyTorch, and more, with Ophelian native wrappers for enhanced syntax flexibility and efficiency.
- **NumPy and Pandas Functionality Syntax Emulation**: Emulates the functions and features of NumPy and Pandas, making it intuitive and easy for users familiar with these libraries to transition and utilize similar functionalities within an ML pipeline.

These features give users the tools they need to handle complex ML tasks effectively, from data processing to model deployment.

# Getting Started:

### 📜 Requirements

Before starting, you'll need to have installed: 
- pyspark >= 3.0.x
- pandas >= 1.1.3
- numpy >= 1.19.1
- dask >= 2.30.x
- scikit-learn >= 0.23.x

Additionally, to use the Ophelian packages you'll need Python and pip installed. The package metadata for this release declares `requires_python` as `<3.12,>=3.9`, so use Python 3.9, 3.10, or 3.11.
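
Before installing, it can help to confirm that the interpreter and the libraries listed above are in place. The snippet below is a minimal sketch that uses only the standard library; the helper names `python_supported` and `installed_version` are made up for this example and are not part of Ophelian. The version bounds are taken from the `requires_python` field of the package metadata.

```python
import sys
from importlib import metadata

# Bounds taken from the package metadata: requires_python "<3.12,>=3.9".
MIN_PYTHON = (3, 9)
MAX_PYTHON_EXCLUSIVE = (3, 12)

# Distribution names from the requirements list above.
REQUIRED = ["pyspark", "pandas", "numpy", "dask", "scikit-learn"]


def python_supported(version_info=None):
    """Return True if the (major, minor, ...) version is inside the declared range."""
    major_minor = tuple((version_info or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor < MAX_PYTHON_EXCLUSIVE


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_supported())
for name in REQUIRED:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

The script only reports what it finds; comparing each installed version against the minimums listed above is left to the reader.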

### 🛠 Install Ophelian pypi package

Install `ophelian` from the PyPI registry with pip:
```sh
pip install ophelian==0.1.4
```

### 📦 Importing and initializing Ophelian

To initialize `Ophelian` with an embedded Spark session, use:

```python
from ophelian.start import OphelianSession
ophelian = OphelianSession("Spark App Name")
sc = ophelian.Spark.build_spark_context()
  ____          _            _  _               
 / __ \        | |          | |(_)              
| |  | | _ __  | |__    ___ | | _   __ _  _ __  
| |  | || '_ \ | '_ \  / _ \| || | / _` || '_ \ 
| |__| || |_) || | | ||  __/| || || (_| || | | |
 \____/ | .__/ |_| |_| \___||_||_| \__,_||_| |_|
        | |                                     
        |_|                                     
  ____         
 / __ \        
| |  | | _ __  
| |  | || '_ \ 
| |__| || | | |
 \____/ |_| |_|       
               
 __  __                    _ 
|  \/  |                  | |
| \  / |  __ _  _ __  ___ | |
| |\/| | / _` || '__|/ __|| |
| |  | || (_| || |   \__ \|_|
|_|  |_| \__,_||_|   |___/(_)

```
Main class objects available after initializing an Ophelian session:

- `read` & `write`

```python
from ophelian.ophelian_spark.read.spark_read import Read
from ophelian.ophelian_spark.write.spark_write import Write
```
- `generic` & `functions`

```python
from ophelian.ophelian_spark.functions import (
  Shape, Rolling, Reshape, CorrMat, CrossTabular, 
  PctChange, Selects, DynamicSampling
)
from ophelian.ophelian_spark.generic import (
  split_date, row_index, lag_min_max_data, regex_expr, remove_duplicate_element,
  year_array, dates_index, sorted_date_list, feature_pick, binary_search,
  century_from_year, simple_average, delta_series, simple_moving_average, average,
  weight_moving_average, single_exp_smooth, double_exp_smooth, initial_seasonal_components,
  triple_exp_smooth, row_indexing, string_match
)
```
- ML package for `unsupervised`, `sampling` and `feature_miner` objects

```python
from ophelian.ophelian_spark.ml.sampling.synthetic_sample import SyntheticSample
from ophelian.ophelian_spark.ml.unsupervised.feature import PCAnalysis, SingularVD
from ophelian.ophelian_spark.ml.feature_miner import (
  BuildStringIndex, BuildOneHotEncoder, 
  BuildVectorAssembler, BuildStandardScaler,
  SparkToNumpy, NumpyToVector
)
```

Let me show you some application examples:

The `Read` class provides a Spark reader that supports multiple formats: `{'csv', 'parquet', 'excel', 'json'}`.

```python
from ophelian.ophelian_spark.read.spark_read import Read

spark_df = spark.readFile(path, 'csv', header=True, infer_schema=True)
```

You can also import the `Shape` class from the `functions` module to get the dimensions of a Spark DataFrame, NumPy style.

```python
import pandas as pd

from ophelian.ophelian_spark.functions import Shape

# Assumes `spark` is an active SparkSession.

dic = {
    'Product': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'Year': [2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2012],
    'Revenue': [100, 200, 300, 110, 190, 320, 120, 220, 350]
}
dic_to_df = spark.createDataFrame(pd.DataFrame(data=dic))
dic_to_df.show(10, False)

+---------+------------+-----------+
| Product |    Year    |  Revenue  |
+---------+------------+-----------+
|    A    |    2010    |    100    |
|    B    |    2010    |    200    |
|    C    |    2010    |    300    |
|    A    |    2011    |    110    |
|    B    |    2011    |    190    |
|    C    |    2011    |    320    |
|    A    |    2012    |    120    |
|    B    |    2012    |    220    |
|    C    |    2012    |    350    |
+---------+------------+-----------+

dic_to_df.Shape
(9, 3)
```
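
For comparison, the same `(rows, columns)` pair can be obtained from a plain PySpark DataFrame with `count()` and `columns`. The sketch below shows that idea; it is not Ophelian's implementation, which is not shown in this README. The `FakeFrame` class is a hypothetical stand-in so the example runs without a Spark cluster; with PySpark you would pass a real DataFrame instead.

```python
def shape(df):
    """Return (n_rows, n_cols) for a PySpark-style DataFrame.

    Relies only on df.count() and df.columns, both part of the public
    PySpark DataFrame API. On a real DataFrame, count() triggers a Spark job.
    """
    return (df.count(), len(df.columns))


class FakeFrame:
    """Hypothetical stand-in exposing just count() and columns."""

    def __init__(self, rows, columns):
        self._rows = rows
        self.columns = columns

    def count(self):
        return len(self._rows)


rows = [("A", 2010, 100), ("B", 2010, 200), ("C", 2010, 300)]
demo = FakeFrame(rows, ["Product", "Year", "Revenue"])
print(shape(demo))
```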

The `pct_change` wrapper is added to the Spark `DataFrame` class (as `pctChange`) to mirror one of the most commonly used Pandas methods. It returns the relative percentage change from one observation to the next, ordered by a date-type column and computed over a numeric-type column.

```python
from ophelian.ophelian_spark.functions import PctChange

dic_to_df.pctChange().show(10, False)

+---------------------+
|       Revenue       |
+---------------------+
| null                |
| 1.0                 |
| 0.5                 |
| -0.6333333333333333 |
| 0.7272727272727273  |
| 0.6842105263157894  |
| -0.625              |
| 0.8333333333333333  |
| 0.5909090909090908  |
+---------------------+
```

You can also configure the function's parameters:
- `periods`: controls the offset of the lag. The default value is 1, so calling the function without it compares each row with the one immediately before it.
- `partition_by`: the column used to partition the DataFrame, e.g. 'bank_segment', 'assurance_product_type'.
- `order_by`: the column that orders the sequential observations, e.g. 'balance_date', 'trade_close_date', 'contract_date'.
- `pct_cols`: the column (or columns) to lag over, returning the relative change from one element to the next, e.g. `x_t / x_{t-1} - 1`.

In this case, we specify only the `periods` parameter to compare each row with the one 2 periods earlier.
```python
dic_to_df.pctChange(periods=2).na.fill(0).show(5, False)

+--------------------+
|Revenue             |
+--------------------+
|0.0                 |
|0.0                 |
|2.0                 |
|-0.44999999999999996|
|-0.3666666666666667 |
+--------------------+
only showing top 5 rows
```
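
To make the numbers in the two outputs above easier to follow, here is a pure-Python sketch of the calculation on the `Revenue` column. It is an illustration of the arithmetic only, not Ophelian's Spark implementation; the function name `pct_change` and its signature are chosen for this example.

```python
def pct_change(values, periods=1):
    """Relative change of each value versus the value `periods` rows earlier.

    Rows without a predecessor yield None, mirroring the nulls Spark returns.
    A predecessor equal to zero would raise ZeroDivisionError in this sketch.
    """
    out = []
    for i, current in enumerate(values):
        if i < periods:
            out.append(None)
        else:
            previous = values[i - periods]
            out.append(current / previous - 1)
    return out


revenue = [100, 200, 300, 110, 190, 320, 120, 220, 350]

lag1 = pct_change(revenue)             # default: compare with the previous row
lag2 = pct_change(revenue, periods=2)  # compare with the row two steps back

print(lag1[:4])
# Equivalent of .na.fill(0): replace missing values with 0.0
print([0.0 if v is None else v for v in lag2[:5]])
```

With `periods=2` the first two rows have no predecessor, which is why the Spark output starts with two zeros after `na.fill(0)`.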

Adding parameters: `partition_by`, `order_by` & `pct_cols`
```python
dic_to_df.pctChange(partition_by="Product", order_by="Year", pct_cols="Revenue").na.fill(0).show(5, False)

+---------------------+
|Revenue              |
+---------------------+
|0.0                  |
|-0.050000000000000044|
|0.1578947368421053   |
|0.0                  |
|0.06666666666666665  |
+---------------------+
only showing top 5 rows
```
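
The partitioned call above restarts the calculation inside each `Product` group. The following pure-Python sketch reproduces that logic on the same data: group the rows, sort each group by `Year`, then apply the lag within the group. The helper `pct_change_by_group` is hypothetical and only illustrates the semantics; it is not how Ophelian computes the result in Spark.

```python
from collections import defaultdict


def pct_change_by_group(rows, partition_by, order_by, pct_col, periods=1):
    """Per-group relative change; rows are dicts keyed by column name.

    Returns {group_key: [change, ...]} with each group sorted by `order_by`.
    Rows lacking a predecessor inside their own group yield None.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_by]].append(row)

    result = {}
    for key, members in groups.items():
        ordered = sorted(members, key=lambda r: r[order_by])
        values = [r[pct_col] for r in ordered]
        result[key] = [
            None if i < periods else values[i] / values[i - periods] - 1
            for i in range(len(values))
        ]
    return result


products = ["A", "B", "C"] * 3
years = [2010] * 3 + [2011] * 3 + [2012] * 3
revenue = [100, 200, 300, 110, 190, 320, 120, 220, 350]
rows = [
    {"Product": p, "Year": y, "Revenue": r}
    for p, y, r in zip(products, years, revenue)
]

changes = pct_change_by_group(rows, "Product", "Year", "Revenue")
for product in sorted(changes):
    print(product, changes[product])
```

The first five rows of the Spark output correspond to products B and C here. Spark does not guarantee the order in which partitions are returned unless the result is sorted explicitly, so the group that appears first may differ between runs.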

You may also lag more than one column at a time by simply adding a list with string column names:
```python
dic_to_df.pctChange(partition_by="Product", order_by="Year", pct_cols=["Year", "Revenue"]).na.fill(0).show(5, False)

+--------------------+---------------------+
|Year                |Revenue              |
+--------------------+---------------------+
|0.0                 |0.0                  |
|4.975124378110429E-4|-0.050000000000000044|
|4.972650422674363E-4|0.1578947368421053   |
|0.0                 |0.0                  |
|4.975124378110429E-4|0.06666666666666665  |
+--------------------+---------------------+
only showing top 5 rows
```
 
## 🤔 Contributing to Ophelian On Mars

We welcome contributions from everyone! If you have an idea, a question, or if you've found a bug that needs fixing, please open an [issue ticket](https://github.com/LuisFalva/ophelian/issues).

You can find guidelines for submitting an issue request in our repository. Additionally, you can refer to the [Open Source Contribution Guide best practices](https://opensource.guide/) to get started.

## 📠 Support or Contact

Having trouble with Ophelian? You can email me at [falvaluis@gmail.com](https://mail.google.com/mail/u/0/?tab=rm&ogbl#inbox?compose=CllgCJZZQVJHBJKmdjtXgzlrRcRktFLwFQsvWKqcTRtvQTVcHvgTNSxVzjZqjvDFhZlVJlPKqtg) and I'll help you sort it out.

## 📃 License

Released under the Apache License, version 2.0.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/LuisFalva/ophelia",
    "name": "ophelian",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.12,>=3.9",
    "maintainer_email": null,
    "keywords": "ophelian, ophelian-on-mars",
    "author": "Luis Falva",
    "author_email": "falvaluis@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/de/78/7f522843e7f90d35f1091a870e6e9e5af6fc53092131b1bf8526ee62d88f/ophelian-0.1.4.tar.gz",
    "platform": null,
    "description": "<div align=\"center\" style=\"padding: 20px;\">\n  <img src=\"docs/img/ophelian-ai-sticker.png\" alt=\"Ophelian AI Sticker\" width=\"200\" style=\"margin-top: 20px;\">\n\n  <p style=\"color: #FFF; font-family: 'Helvetica Neue', Arial, sans-serif; text-align: center; max-width: 600px; margin: 20px auto; font-weight: bold;\">\n  </p>\n  \n  <div style=\"margin-top: 20px;\">\n    <a href=\"https://pypi.org/project/ophelian/\" style=\"margin-right: 10px;\">\n      <img src=\"https://img.shields.io/pypi/v/ophelian.svg\" alt=\"PyPI\">\n    </a>\n    <a href=\"https://hub.docker.com/r/luisfalva/ophelian\" style=\"margin-right: 10px;\">\n      <img src=\"https://img.shields.io/docker/v/luisfalva/ophelian?sort=semver\" alt=\"Docker Hub\">\n    </a>\n    <a href=\"https://github.com/LuisFalva/ophelia/actions/workflows/release.yml\" style=\"margin-right: 10px;\">\n      <img src=\"https://github.com/LuisFalva/ophelia/actions/workflows/release.yml/badge.svg\" alt=\"Release Build Status\">\n    </a>\n    <a href=\"https://github.com/LuisFalva/ophelia/actions/workflows/docker-image.yml\" style=\"margin-right: 10px;\">\n      <img src=\"https://github.com/LuisFalva/ophelia/actions/workflows/docker-image.yml/badge.svg\" alt=\"Docker Image Build Status\">\n    </a>\n    <a href=\"https://ophelian.readme.io/\">\n      <img src=\"https://img.shields.io/badge/docs-Documentation-orange.svg\" alt=\"Docs\">\n    </a>\n  </div>\n</div>\n\n# Ophelian On Mars\n\n---\n\n'*Ophelian On Mars*' \ud83d\udc7d, the ultimate destination for ML, Data Science, and AI professionals. Your go-to framework for seamlessly putting ML prototypes into production\u2014where everyone wants to be, but only a few succeed.\n\n# \ud83d\ude80 Motivations\n\nAs data professionals, we aim to minimize the time spent deciphering the intricacies of PySpark's framework. Often, we seek a straightforward, Pandas-style approach to compute tasks without delving into highly optimized Spark code. 
\n\nTo address this need, Ophelian was created with the following goals:\n\n- **Simplicity**: Provide a simple and intuitive way to perform data computations, emulating the ease of Pandas.\n- **Efficiency**: Wrap common patterns for data extraction and transformation in a single entry function that ensures Spark-optimized performance.\n- **Code Reduction**: Significantly reduce the amount of code required by leveraging a set of Spark optimization techniques for query execution.\n- **Streamlined ML Pipelines**: Facilitate the lifecycle of any PySpark ML pipeline by incorporating optimized methods and reducing redundant coding efforts.\n\nBy focusing on these motivations, Ophelian aims to enhance productivity and efficiency for data engineers and scientists, allowing them to concentrate on their core tasks without worrying about underlying Spark optimizations.\n\n# \ud83d\udcdd Generalized ML Features\n\nOphelian focuses on creating robust and efficient machine learning (ML) pipelines, making them easily replicable and secure for various ML tasks. Key features include optimized techniques for handling data skewness, user-friendly interfaces for building custom models, and streamlined data mining pipelines with Ophelian pipeline wrappers. Additionally, it functions as an emulator of NumPy and Pandas, offering similar functionalities for a seamless user experience. 
Below are the detailed features:\n\n- **Framework for Building ML Pipelines**: Simplified and secure methods to construct ML pipelines using PySpark, ensuring replication and robustness.\n- **Optimized Techniques for Data Skewness and Partitioning**: Embedded strategies to address and mitigate data skewness issues, improving model performance and accuracy.\n- **Build Your Own Models (BYOM)**: User-friendly software for constructing custom models and data mining pipelines, leveraging frameworks like PySpark, Beam, Flink, PyTorch, and more, with Ophelian native wrappers for enhanced syntax flexibility and efficiency.\n- **NumPy and Pandas Functionality Syntax Emulation**: Emulates the functions and features of NumPy and Pandas, making it intuitive and easy for users familiar with these libraries to transition and utilize similar functionalities within an ML pipeline.\n\nThese features empower users with the tools they need to handle complex ML tasks effectively, ensuring a seamless experience from data processing to model deployment. 
users with the tools they need to handle complex machine learning tasks effectively, ensuring a seamless experience from data processing to model deployment.\n\n# Getting Started:\n\n### \ud83d\udcdc Requirements\n\nBefore starting, you'll need to have installed: \n- pyspark >= 3.0.x\n- pandas >= 1.1.3\n- numpy >= 1.19.1\n- dask >= 2.30.x\n- scikit-learn >= 0.23.x\n\nAdditionally, if you want to use the Ophelian packages, you'll also need Python (supported 3.7 and 3.8 versions) and pip installed.\n\n### \ud83d\udee0 Install Ophelian pypi package\n\nJust drop a pip install to the `Ophelian` pypi registry and import `Ophelian`:\n```sh\npip install ophelian==0.1.4\n```\n\n### \ud83d\udce6 Importing and initializing Ophelian\n\nTo initialize `Ophelian` with Spark embedded session use:\n\n```python\nfrom ophelian.start import OphelianSession\nophelian = OphelianSession(\"Spark App Name\")\nsc = ophelian.Spark.build_spark_context()\n  ____          _            _  _               \n / __ \\        | |          | |(_)              \n| |  | | _ __  | |__    ___ | | _   __ _  _ __  \n| |  | || '_ \\ | '_ \\  / _ \\| || | / _` || '_ \\ \n| |__| || |_) || | | ||  __/| || || (_| || | | |\n \\____/ | .__/ |_| |_| \\___||_||_| \\__,_||_| |_|\n        | |                                     \n        |_|                                     \n  ____         \n / __ \\        \n| |  | | _ __  \n| |  | || '_ \\ \n| |__| || | | |\n \\____/ |_| |_|       \n               \n __  __                    _ \n|  \\/  |                  | |\n| \\  / |  __ _  _ __  ___ | |\n| |\\/| | / _` || '__|/ __|| |\n| |  | || (_| || |   \\__ \\|_|\n|_|  |_| \\__,_||_|   |___/(_)\n\n```\nMain class objects provided by initializing Ophelia session:\n\n- `read` & `write`\n\n```python\nfrom ophelian.ophelian_spark.read.spark_read import Read\nfrom ophelian.ophelian_spark.write.spark_write import Write\n```\n- `generic` & `functions`\n\n```python\nfrom ophelian.ophelian_spark.functions import (\n  Shape, 
Rolling, Reshape, CorrMat, CrossTabular, \n  PctChange, Selects, DynamicSampling\n)\nfrom ophelian.ophelian_spark.generic import (\n  split_date, row_index, lag_min_max_data, regex_expr, remove_duplicate_element,\n  year_array, dates_index, sorted_date_list, feature_pick, binary_search,\n  century_from_year, simple_average, delta_series, simple_moving_average, average,\n  weight_moving_average, single_exp_smooth, double_exp_smooth, initial_seasonal_components,\n  triple_exp_smooth, row_indexing, string_match\n)\n```\n- ML package for `unsupervised`, `sampling` and `feature_miner` objects\n\n```python\nfrom ophelian.ophelian_spark.ml.sampling.synthetic_sample import SyntheticSample\nfrom ophelian.ophelian_spark.ml.unsupervised.feature import PCAnalysis, SingularVD\nfrom ophelian.ophelian_spark.ml.feature_miner import (\n  BuildStringIndex, BuildOneHotEncoder, \n  BuildVectorAssembler, BuildStandardScaler,\n  SparkToNumpy, NumpyToVector\n)\n```\n\nLet me show you some application examples:\n\nThe `Read` class implements Spark reading object in multiple formats `{'csv', 'parquet', 'excel', 'json'}`\n\n```python\nfrom ophelian.ophelian_spark.read.spark_read import Read\n\nspark_df = spark.readFile(path, 'csv', header=True, infer_schema=True)\n```\n\nAlso, you may import class `Shape` from factory `functions` in order to see the dimension of our spark DataFrame such as numpy style.\n\n```python\nfrom ophelian.ophelian_spark.functions import Shape\n\ndic = {\n    'Product': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],\n    'Year': [2010, 2010, 2010, 2011, 2011, 2011, 2012, 2012, 2012],\n    'Revenue': [100, 200, 300, 110, 190, 320, 120, 220, 350]\n}\ndic_to_df = spark.createDataFrame(pd.DataFrame(data=dic))\ndic_to_df.show(10, False)\n\n+---------+------------+-----------+\n| Product |    Year    |  Revenue  |\n+---------+------------+-----------+\n|    A    |    2010    |    100    |\n|    B    |    2010    |    200    |\n|    C    |    2010    |    300    |\n|    A   
 |    2011    |    110    |\n|    B    |    2011    |    190    |\n|    C    |    2011    |    320    |\n|    A    |    2012    |    120    |\n|    B    |    2012    |    220    |\n|    C    |    2012    |    350    |\n+---------+------------+-----------+\n\ndic_to_df.Shape\n(9, 3)\n```\n\nThe `pct_change` wrapper is added to the Spark `DataFrame` class in order to have the most commonly used method in Pandas\nobjects to get the relative percentage change from one observation to another, sorted by a date-type column and lagged by a numeric-type column.\n\n```python\nfrom ophelian.ophelian_spark.functions import PctChange\n\ndic_to_df.pctChange().show(10, False)\n\n+---------------------+\n|       Revenue       |\n+---------------------+\n| null                |\n| 1.0                 |\n| 0.5                 |\n| -0.6333333333333333 |\n| 0.7272727272727273  |\n| 0.6842105263157894  |\n| -0.625              |\n| 0.8333333333333333  |\n| 0.5909090909090908  |\n+---------------------+\n```\n\nAnother option is to configure all receiving parameters from the function, as follows:\n- `periods`; this parameter will control the offset of the lag periods. Since the default value is 1, this will always return a lag-1 information DataFrame.\n- `partition_by`; this parameter will fix the partition column over the DataFrame, e.g. 'bank_segment', 'assurance_product_type'.\n- `order_by`; order by parameter will be the specific column to order the sequential observations, e.g. 'balance_date', 'trade_close_date', 'contract_date'.\n- `pct_cols`; percentage change col (pct_cols) will be the specific column to lag-over giving back the relative change between one element to other, e.g. 
\ud835\udc65\ud835\udc61 \u00f7 \ud835\udc65\ud835\udc61 \u2212 1\n\nIn this case, we will specify only the `periods` parameter to yield a lag of -2 days over the DataFrame.\n```python\ndic_to_df.pctChange(periods=2).na.fill(0).show(5, False)\n\n+--------------------+\n|Revenue             |\n+--------------------+\n|0.0                 |\n|0.0                 |\n|2.0                 |\n|-0.44999999999999996|\n|-0.3666666666666667 |\n+--------------------+\nonly showing top 5 rows\n```\n\nAdding parameters: `partition_by`, `order_by` & `pct_cols`\n```python\ndic_to_df.pctChange(partition_by=\"Product\", order_by=\"Year\", pct_cols=\"Revenue\").na.fill(0).show(5, False)\n\n+---------------------+\n|Revenue              |\n+---------------------+\n|0.0                  |\n|-0.050000000000000044|\n|0.1578947368421053   |\n|0.0                  |\n|0.06666666666666665  |\n+---------------------+\nonly showing top 5 rows\n```\n\nYou may also lag more than one column at a time by simply adding a list with string column names:\n```python\ndic_to_df.pctChange(partition_by=\"Product\", order_by=\"Year\", pct_cols=[\"Year\", \"Revenue\"]).na.fill(0).show(5, False)\n\n+--------------------+---------------------+\n|Year                |Revenue              |\n+--------------------+---------------------+\n|0.0                 |0.0                  |\n|4.975124378110429E-4|-0.050000000000000044|\n|4.972650422674363E-4|0.1578947368421053   |\n|0.0                 |0.0                  |\n|4.975124378110429E-4|0.06666666666666665  |\n+--------------------+---------------------+\nonly showing top 5 rows\n```\n \n## \ud83e\udd14 Contributing to Ophelian On Mars\n\nWe welcome contributions from everyone! If you have an idea, a question, or if you've found a bug that needs fixing, please open an [issue ticket](https://github.com/LuisFalva/ophelian/issues).\n\nYou can find guidelines for submitting an issue request in our repository. 
Additionally, you can refer to the [Open Source Contribution Guide best practices](https://opensource.guide/) to get started.\n\n## \ud83d\udce0 Support or Contact\n\nHaving trouble with Ophelian? Yo can DM me at [falvaluis@gmail.com](https://mail.google.com/mail/u/0/?tab=rm&ogbl#inbox?compose=CllgCJZZQVJHBJKmdjtXgzlrRcRktFLwFQsvWKqcTRtvQTVcHvgTNSxVzjZqjvDFhZlVJlPKqtg) and I\u2019ll help you sort it out.\n\n## \ud83d\udcc3 License\n\nReleased under the Apache License, version 2.0.\n",
    "bugtrack_url": null,
    "license": "Free for non-commercial use",
    "summary": "Ophelian is a go-to framework for seamlessly putting ML & AI prototypes into production.",
    "version": "0.1.4",
    "project_urls": {
        "Documentation": "https://github.com/LuisFalva/ophelia",
        "Homepage": "https://github.com/LuisFalva/ophelia",
        "Repository": "https://github.com/LuisFalva/ophelia"
    },
    "split_keywords": [
        "ophelian",
        " ophelian-on-mars"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "30a5ae0227ce21268612fe7d5ae023828947727887e9067a109b443c87628415",
                "md5": "5d7f818a39489d12b1c0996652018b87",
                "sha256": "9cc9d39daa5670f5ab836073438b05d6c3e85bc4c31b1578a4e5fbb3efb8bb3f"
            },
            "downloads": -1,
            "filename": "ophelian-0.1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5d7f818a39489d12b1c0996652018b87",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.12,>=3.9",
            "size": 57641,
            "upload_time": "2024-07-23T09:15:44",
            "upload_time_iso_8601": "2024-07-23T09:15:44.443343Z",
            "url": "https://files.pythonhosted.org/packages/30/a5/ae0227ce21268612fe7d5ae023828947727887e9067a109b443c87628415/ophelian-0.1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "de787f522843e7f90d35f1091a870e6e9e5af6fc53092131b1bf8526ee62d88f",
                "md5": "ef3ae0312383e967525496976e8be6c1",
                "sha256": "f82853e07e29b2539970563fa65d14a1194ded9b6b7b026b2ffefc2f7b2642ef"
            },
            "downloads": -1,
            "filename": "ophelian-0.1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "ef3ae0312383e967525496976e8be6c1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12,>=3.9",
            "size": 53209,
            "upload_time": "2024-07-23T09:15:45",
            "upload_time_iso_8601": "2024-07-23T09:15:45.629680Z",
            "url": "https://files.pythonhosted.org/packages/de/78/7f522843e7f90d35f1091a870e6e9e5af6fc53092131b1bf8526ee62d88f/ophelian-0.1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-23 09:15:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "LuisFalva",
    "github_project": "ophelia",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "absl-py",
            "specs": [
                [
                    "==",
                    "2.1.0"
                ]
            ]
        },
        {
            "name": "astunparse",
            "specs": [
                [
                    "==",
                    "1.6.3"
                ]
            ]
        },
        {
            "name": "certifi",
            "specs": [
                [
                    "==",
                    "2024.7.4"
                ]
            ]
        },
        {
            "name": "charset-normalizer",
            "specs": [
                [
                    "==",
                    "3.3.2"
                ]
            ]
        },
        {
            "name": "click",
            "specs": [
                [
                    "==",
                    "8.1.7"
                ]
            ]
        },
        {
            "name": "cloudpickle",
            "specs": [
                [
                    "==",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "colorama",
            "specs": [
                [
                    "==",
                    "0.4.6"
                ]
            ]
        },
        {
            "name": "dask-expr",
            "specs": [
                [
                    "==",
                    "1.1.9"
                ]
            ]
        },
        {
            "name": "dask",
            "specs": [
                [
                    "==",
                    "2024.7.1"
                ]
            ]
        },
        {
            "name": "dask",
            "specs": [
                [
                    "==",
                    "2024.7.1"
                ]
            ]
        },
        {
            "name": "flatbuffers",
            "specs": [
                [
                    "==",
                    "24.3.25"
                ]
            ]
        },
        {
            "name": "fsspec",
            "specs": [
                [
                    "==",
                    "2024.6.1"
                ]
            ]
        },
        {
            "name": "gast",
            "specs": [
                [
                    "==",
                    "0.6.0"
                ]
            ]
        },
        {
            "name": "google-pasta",
            "specs": [
                [
                    "==",
                    "0.2.0"
                ]
            ]
        },
        {
            "name": "grpcio",
            "specs": [
                [
                    "==",
                    "1.65.1"
                ]
            ]
        },
        {
            "name": "h5py",
            "specs": [
                [
                    "==",
                    "3.11.0"
                ]
            ]
        },
        {
            "name": "idna",
            "specs": [
                [
                    "==",
                    "3.7"
                ]
            ]
        },
        {
            "name": "importlib-metadata",
            "specs": [
                [
                    "==",
                    "8.0.0"
                ]
            ]
        },
        {
            "name": "joblib",
            "specs": [
                [
                    "==",
                    "1.4.2"
                ]
            ]
        },
        {
            "name": "keras",
            "specs": [
                [
                    "==",
                    "3.4.1"
                ]
            ]
        },
        {
            "name": "libclang",
            "specs": [
                [
                    "==",
                    "18.1.1"
                ]
            ]
        },
        {
            "name": "llvmlite",
            "specs": [
                [
                    "==",
                    "0.43.0"
                ]
            ]
        },
        {
            "name": "locket",
            "specs": [
                [
                    "==",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "markdown-it-py",
            "specs": [
                [
                    "==",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "markdown",
            "specs": [
                [
                    "==",
                    "3.6"
                ]
            ]
        },
        {
            "name": "markupsafe",
            "specs": [
                [
                    "==",
                    "2.1.5"
                ]
            ]
        },
        {
            "name": "mdurl",
            "specs": [
                [
                    "==",
                    "0.1.2"
                ]
            ]
        },
        {
            "name": "ml-dtypes",
            "specs": [
                [
                    "==",
                    "0.4.0"
                ]
            ]
        },
        {
            "name": "namex",
            "specs": [
                [
                    "==",
                    "0.0.8"
                ]
            ]
        },
        {
            "name": "numba",
            "specs": [
                [
                    "==",
                    "0.60.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.26.4"
                ]
            ]
        },
        {
            "name": "opt-einsum",
            "specs": [
                [
                    "==",
                    "3.3.0"
                ]
            ]
        },
        {
            "name": "optree",
            "specs": [
                [
                    "==",
                    "0.12.1"
                ]
            ]
        },
        {
            "name": "packaging",
            "specs": [
                [
                    "==",
                    "24.1"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.2.2"
                ]
            ]
        },
        {
            "name": "partd",
            "specs": [
                [
                    "==",
                    "1.4.2"
                ]
            ]
        },
        {
            "name": "protobuf",
            "specs": [
                [
                    "==",
                    "4.25.3"
                ]
            ]
        },
        {
            "name": "py4j",
            "specs": [
                [
                    "==",
                    "0.10.9.5"
                ]
            ]
        },
        {
            "name": "pyarrow",
            "specs": [
                [
                    "==",
                    "16.0.0"
                ]
            ]
        },
        {
            "name": "pygments",
            "specs": [
                [
                    "==",
                    "2.18.0"
                ]
            ]
        },
        {
            "name": "pyhocon",
            "specs": [
                [
                    "==",
                    "0.3.45"
                ]
            ]
        },
        {
            "name": "pyparsing",
            "specs": [
                [
                    "==",
                    "3.1.2"
                ]
            ]
        },
        {
            "name": "pyspark",
            "specs": [
                [
                    "==",
                    "3.2.2"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    "==",
                    "2.9.0.post0"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    "==",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "pyyaml",
            "specs": [
                [
                    "==",
                    "6.0.1"
                ]
            ]
        },
        {
            "name": "quadprog",
            "specs": [
                [
                    "==",
                    "0.1.12"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    "==",
                    "2.32.3"
                ]
            ]
        },
        {
            "name": "rich",
            "specs": [
                [
                    "==",
                    "13.7.1"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    "==",
                    "1.5.1"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.13.1"
                ]
            ]
        },
        {
            "name": "setuptools",
            "specs": [
                [
                    "==",
                    "71.1.0"
                ]
            ]
        },
        {
            "name": "shap",
            "specs": [
                [
                    "==",
                    "0.46.0"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    "==",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "slicer",
            "specs": [
                [
                    "==",
                    "0.0.8"
                ]
            ]
        },
        {
            "name": "tensorboard-data-server",
            "specs": [
                [
                    "==",
                    "0.7.2"
                ]
            ]
        },
        {
            "name": "tensorboard",
            "specs": [
                [
                    "==",
                    "2.17.0"
                ]
            ]
        },
        {
            "name": "tensorflow-io-gcs-filesystem",
            "specs": [
                [
                    "==",
                    "0.37.1"
                ]
            ]
        },
        {
            "name": "tensorflow",
            "specs": [
                [
                    "==",
                    "2.17.0"
                ]
            ]
        },
        {
            "name": "termcolor",
            "specs": [
                [
                    "==",
                    "2.4.0"
                ]
            ]
        },
        {
            "name": "threadpoolctl",
            "specs": [
                [
                    "==",
                    "3.5.0"
                ]
            ]
        },
        {
            "name": "toolz",
            "specs": [
                [
                    "==",
                    "0.12.1"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.66.4"
                ]
            ]
        },
        {
            "name": "typing-extensions",
            "specs": [
                [
                    "==",
                    "4.12.2"
                ]
            ]
        },
        {
            "name": "tzdata",
            "specs": [
                [
                    "==",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    "==",
                    "2.2.2"
                ]
            ]
        },
        {
            "name": "werkzeug",
            "specs": [
                [
                    "==",
                    "3.0.3"
                ]
            ]
        },
        {
            "name": "wheel",
            "specs": [
                [
                    "==",
                    "0.38.1"
                ]
            ]
        },
        {
            "name": "wrapt",
            "specs": [
                [
                    "==",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "zipp",
            "specs": [
                [
                    "==",
                    "3.19.2"
                ]
            ]
        }
    ],
    "lcname": "ophelian"
}
        