tsforecasting


| Field | Value |
| --- | --- |
| Name | tsforecasting |
| Version | 1.4.10 |
| Home page | https://github.com/TsLu1s/TSForecasting |
| Summary | TSForecasting: Automated Time Series Forecasting Framework |
| Upload time | 2024-06-15 22:40:58 |
| Maintainer | None |
| Docs URL | None |
| Author | Luís Santos |
| Requires Python | None |
| License | MIT |
| Keywords | data science, machine learning, time series forecasting, automated time series, multivariate time series, univariate time series, automated machine learning, automl |
| Requirements | pandas, numpy, python-dateutil, scikit-learn, autogluon, lightgbm, xgboost, catboost, tqdm |

            <br>
<p align="center">
  <h2 align="center"> TSForecasting - Automated Time Series Forecasting Framework
  <br>
  
## Framework Contextualization <a name = "ta"></a>

The `TSForecasting` project offers a comprehensive and integrated pipeline for automating Time Series Forecasting applications. It implements multivariate approaches built on multiple regression models, combining modules such as `SKLearn`, `AutoGluon`, `CatBoost`, `XGBoost` and `LightGBM`, and follows an `Expanding Window` approach for performance evaluation, aiming at a robust, scalable and optimized forecasting solution.

The architecture comprises five main sections: data preprocessing, feature engineering, hyperparameter optimization, forecast ensembling and forecasting method selection, all organized and customizable in a pipeline structure.

This project aims to provide the following capabilities:

* General applicability to tabular datasets: The forecasting procedures can be applied to any data table within a Time Series Forecasting scope.

* Hyperparameter optimization and customization: Every model hyperparameter can be configured through the `model_configurations` dictionary, so that performance can be tuned for each use case.
    
* Robustness and improvement of predictive results: The TSForecasting pipeline aims to improve predictive performance by applying the best performing forecasting method.
   
#### Main Development Tools <a name = "pre1"></a>

Major frameworks used to build this project: 

* [Sklearn](https://scikit-learn.org/stable/)
* [AutoGluon](https://auto.gluon.ai/stable/index.html)
* [CatBoost](https://catboost.ai/)
* [XGBoost](https://xgboost.readthedocs.io/en/stable/)
* [LightGBM](https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html)

    
## Performance Evaluation Structure <a name = "ta"></a>

<p align="center">
  <img src="https://i.ibb.co/ctYj6tt/Expanding-Window-TSF.png" align="center" width="450" height="350" />
</p>  
    
The Expanding Window evaluation technique approximates, over time, how forecasts compare with the real values of the time series.
The first test segment starts where the initial training data (set by the train length) ends, and it is forecasted according to the forecast size.
The starting position of each subsequent segment is determined by the sliding window size: if the
sliding size equals the forecast size, each new segment starts where the previous one ended.
This process is repeated until the whole time series has been segmented, and the results of all iterations and observations
are aggregated into a robust performance analysis for each predicted point.
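
The segmentation described above can be sketched in a few lines of plain Python. This is only an illustration of the general expanding window idea, not the package's internal implementation: the helper `expanding_window_splits` and its exact boundary handling are assumptions, and only the parameter names mirror those used later in this README.

```python
from typing import Iterator, Tuple


def expanding_window_splits(
    n_obs: int, train_size: float, horizon: int, sliding_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_idx, test_idx) index ranges for an expanding window.

    Hypothetical helper for illustration; not part of the tsforecasting API.
    """
    if not 0 < train_size < 1:
        raise ValueError("train_size must be a fraction in (0, 1)")
    if horizon < 1 or sliding_size < 1:
        raise ValueError("horizon and sliding_size must be positive")

    # The first training segment covers the initial fraction of the series.
    train_end = int(n_obs * train_size)

    # Train on everything before train_end, test on the next `horizon` points,
    # then grow the training segment by `sliding_size` and repeat.
    while train_end + horizon <= n_obs:
        yield range(0, train_end), range(train_end, train_end + horizon)
        train_end += sliding_size


splits = list(
    expanding_window_splits(n_obs=100, train_size=0.75, horizon=5, sliding_size=5)
)
for train_idx, test_idx in splits:
    print(f"train [0, {train_idx.stop}) -> test [{test_idx.start}, {test_idx.stop})")
```

Because `sliding_size` equals `horizon` in this call, every test segment starts exactly where the previous one ended; a larger `sliding_size` would leave gaps between test segments and produce fewer iterations.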

## Where to get it <a name = "ta"></a>

The source code is currently hosted on GitHub at: https://github.com/TsLu1s/TSForecasting

A binary installer for the latest released version is available at the Python Package Index (PyPI).   

## Installation  

To install this package from the PyPI repository, run the following command:

```
pip install tsforecasting
```

# Usage Examples
    
## 1. TSForecasting - Automated Time Series Forecasting
    
After importing the package, load a dataset and identify its DateTime column (`datetime64[ns]` type) and the Target column to be predicted, then rename them to `Date` and `y`, respectively.
Next, define the pipeline parameters:
* train_size: Length of the training data used in the first Expanding Window iteration;
* lags: The number of time steps in each window, i.e. how many past observations each input sample includes;
* horizon: Full length of the test/future predictions;
* sliding_size: Length of the sliding window; sliding_size>=horizon is suggested;
* models: All models selected to be ensembled for evaluation. To fit and compare the predictive performance of the available models, pass them in the `models` parameter (a list); the options are:
  * `RandomForest`
  * `ExtraTrees`
  * `GBR`
  * `KNN`
  * `GeneralizedLR`
  * `XGBoost`
  * `LightGBM`
  * `Catboost`
  * `AutoGluon`

* hparameters: Nested dictionary containing every model and its specific hyperparameter configuration. Feel free to customize each model as you see fit (a customization example is shown below); 
* granularity: Valid period interval matching the data -> 1m,30m,1h,1d,1wk,1mo (default='1d');
* metric: The default predictive evaluation metric is `MAE` (Mean Absolute Error); other options are `MAPE` (Mean Absolute Percentage Error) and `MSE`
(Mean Squared Error).
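
For reference, the three metrics named in the list above follow their standard definitions. The sketch below computes them with NumPy. The function names are illustrative and not part of the `tsforecasting` API, and MAPE is returned here as a fraction, since this document does not state whether the package reports a fraction or a percentage.

```python
import numpy as np


def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Error: average magnitude of the errors."""
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Absolute Percentage Error as a fraction; undefined if y_true has zeros."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean Squared Error: penalizes large errors more heavily than MAE."""
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up values, purely for illustration.
y_true = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 190.0, 380.0])

print("MAE :", mae(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("MSE :", mse(y_true, y_pred))
```

MAE and MSE are expressed in the units of the target (or its square), while MAPE is scale-free, which makes it easier to compare across series but unusable when the target contains zeros.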
 
The `fit_forecast` method fits and compares every selected and configured model across all segmented windows. Afterwards, the `history` method returns the aggregated results (stored in the variable `fit_performance` in the example below), containing the forecasted values of each window iteration and the performance of all segmented iterations.

The `forecast` method forecasts future values using the best performing model identified during fitting.
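
Conceptually, choosing the best performing model amounts to aggregating each model's error over all window iterations and keeping the model with the lowest value. The pandas sketch below illustrates that idea under stated assumptions: the column names (`Model`, `Window`, `MAE`) and the numbers are made up, and the actual structure returned by `history` may differ.

```python
import pandas as pd

# Hypothetical per-window errors; the column names and values are assumptions,
# not the actual layout of the package's fit history.
results = pd.DataFrame(
    {
        "Model": ["RandomForest", "RandomForest", "KNN", "KNN", "XGBoost", "XGBoost"],
        "Window": [1, 2, 1, 2, 1, 2],
        "MAE": [4.0, 6.0, 7.0, 5.0, 3.0, 4.0],
    }
)

# Average the error of each model across windows; lower is better.
leaderboard = results.groupby("Model")["MAE"].mean().sort_values()
best_model = leaderboard.index[0]

print(leaderboard)
print("Best model:", best_model)
```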
        
```py

from tsforecasting.forecasting import (TSForecasting,
                                       model_configurations)
import pandas as pd
import warnings
warnings.filterwarnings("ignore", category=Warning) #-> For a clean console

## Dataframe Loading
data = pd.read_csv('csv_directory_path') 
data = data.rename(columns={'DateTime_Column': 'Date','Target_Name_Column':'y'})
data = data[['Date',"y"]]
    
## Get Models Hyperparameters Configurations
# Store the defaults in `hparameters`, the name customized and passed on below
hparameters = model_configurations()
print(hparameters)

# Customization Hyperparameters Example
hparameters["RandomForest"]["n_estimators"] = 50
hparameters["KNN"]["n_neighbors"] = 5
hparameters["Catboost"]["iterations"] = 150
hparameters["AutoGluon"]["time_limit"] = 50

## Fit Forecasting Evaluation
tsf = TSForecasting(train_size = 0.90,
                    lags = 10,
                    horizon = 10,
                    sliding_size = 30,
                    models = ['RandomForest', 'ExtraTrees', 'GBR', 'KNN', 'GeneralizedLR',
                              'XGBoost', 'LightGBM', 'Catboost', 'AutoGluon'],
                    hparameters = hparameters,
                    granularity = '1h',
                    metric = 'MAE'
                    )
tsf = tsf.fit_forecast(dataset = data)

# Get Fit History
fit_performance = tsf.history()

## Forecast
forecast = tsf.forecast()

```  
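
The example above assumes that the `Date` column already has the `datetime64[ns]` type. When a CSV stores timestamps as text, a preparation step along the following lines may be needed first. This is a sketch using only standard pandas calls; the source column names `timestamp` and `sales` and the inline CSV content are placeholders.

```python
import io

import pandas as pd

# Inline CSV standing in for a real file; the column names are placeholders.
raw_csv = io.StringIO(
    "timestamp,sales\n"
    "2024-01-01 02:00:00,12.0\n"
    "2024-01-01 00:00:00,10.0\n"
    "2024-01-01 01:00:00,11.0\n"
)

data = pd.read_csv(raw_csv)
data = data.rename(columns={"timestamp": "Date", "sales": "y"})

# Parse the text timestamps and force nanosecond resolution.
data["Date"] = pd.to_datetime(data["Date"]).astype("datetime64[ns]")

# Keep only the two expected columns, ordered chronologically.
data = data[["Date", "y"]].sort_values("Date").reset_index(drop=True)

print(data.dtypes)
print(data)
```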

## 2. TSForecasting - Extra Auxiliary Methods

The `make_timeseries` method transforms a DataFrame into a format ready for time series analysis. It prepares the dataset for forecasting future values from historical data, shaping the input for subsequent model training and analysis while taking into account both the recency of the data and the prediction horizon.

* window_size: Determines how many past observations each sample in the DataFrame includes. This creates the basis for learning from historical data.
* horizon: Defines the number of future time steps to forecast. These steps provide direct targets for prediction models.
* granularity: Sets the temporal detail, from minutes to months, making the method suitable for diverse time series datasets (options -> 1m,30m,1h,1d,1wk,1mo).
* datetime_engineering: When activated, enriches the dataset with extra date-time features, such as year, month and day of the week, potentially enhancing the predictive capabilities of the model.
 
```py   

from tsforecasting.forecasting import Processing

pr = Processing()

data = pr.make_timeseries(dataset = data,
                          window_size = 10,
                          horizon = 2,
                          granularity = '1h',
                          datetime_engineering = True)

```
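
To make the roles of `window_size` and `horizon` concrete, the sketch below builds a comparable supervised table with `pandas.Series.shift`. It illustrates the general lag/horizon framing only: the helper `make_supervised` and the column names it produces are assumptions, not the output format of `make_timeseries`.

```python
import pandas as pd


def make_supervised(series: pd.Series, window_size: int, horizon: int) -> pd.DataFrame:
    """Frame a series as lagged inputs plus future targets.

    Hypothetical helper for illustration; not part of the tsforecasting API.
    """
    columns = {}
    # Inputs: the `window_size` observations preceding each time step.
    for lag in range(window_size, 0, -1):
        columns[f"y_lag_{lag}"] = series.shift(lag)
    # Targets: the current value and the following horizon - 1 values.
    for step in range(horizon):
        columns[f"y_step_{step + 1}"] = series.shift(-step)
    # Rows without a full window of history or a full horizon are dropped.
    return pd.DataFrame(columns).dropna().reset_index(drop=True)


series = pd.Series(range(1, 9), dtype=float)  # toy series: 1.0, 2.0, ..., 8.0
frame = make_supervised(series, window_size=3, horizon=2)
print(frame)
```

Each row pairs three past observations with the two values to be predicted, so a series of 8 points yields 4 complete samples.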
    
## License

Distributed under the MIT License. See [LICENSE](https://github.com/TsLu1s/TSForecasting/blob/main/LICENSE) for more information.

## Contact 
 
Luis Santos - [LinkedIn](https://www.linkedin.com/in/lu%C3%ADsfssantos/)

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/TsLu1s/TSForecasting",
    "name": "tsforecasting",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "data science, machine learning, time series forecasting, automated time series, multivariate time series, univariate time series, automated machine learning, automl",
    "author": "Lu\u00eds Santos",
    "author_email": "luisf_ssantos@hotmail.com",
    "download_url": null,
    "platform": null,
    "description": "<br>\r\n<p align=\"center\">\r\n  <h2 align=\"center\"> TSForecasting - Automated Time Series Forecasting Framework\r\n  <br>\r\n  \r\n## Framework Contextualization <a name = \"ta\"></a>\r\n\r\nThe `TSForecasting` project offers a comprehensive and integrated pipeline designed to Automate Time Series Forecasting applications. By implementing multivariate approaches that incorporate multiple regression models, it combines varied relevant modules such as `SKLearn`, `AutoGluon`, `CatBoost` and `XGBoost`, following an `Expanding Window` structured approach for performance evaluation ensuring a robust, scalable and optimized forecasting solution.\r\n\r\nThe architecture design includes five main sections, these being: data preprocessing, feature engineering, hyperparameter optimization, forecast ensembling and forecasting method selection which are organized and customizable in a pipeline structure.\r\n\r\nThis project aims at providing the following application capabilities:\r\n\r\n* General applicability on tabular datasets: The developed forecasting procedures are applicable on any data table associated with any Time Series Forecasting scopes.\r\n\r\n* Hyperparameter optimization and customization: It provides full configuration for each model hyperparameter through the customization of `model_configurations` dictionary, allowing optimal performance to be obtained for any use case.\r\n    \r\n* Robustness and improvement of predictive results: The implementation of the TSForecasting pipeline aims to improve the predictive performance directly associated with the application of the best performing forecasting method. 
\r\n   \r\n#### Main Development Tools <a name = \"pre1\"></a>\r\n\r\nMajor frameworks used to built this project: \r\n\r\n* [Sklearn](https://scikit-learn.org/stable/)\r\n* [AutoGluon](https://auto.gluon.ai/stable/index.html)\r\n* [CatBoost](https://catboost.ai/)\r\n* [XGBoost](https://xgboost.readthedocs.io/en/stable/)\r\n* [LightGBM](https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html)\r\n\r\n    \r\n## Performance Evaluation Structure <a name = \"ta\"></a>\r\n\r\n<p align=\"center\">\r\n  <img src=\"https://i.ibb.co/ctYj6tt/Expanding-Window-TSF.png\" align=\"center\" width=\"450\" height=\"350\" />\r\n</p>  \r\n    \r\nThe Expanding Window evaluation technique is a temporary approximation on the real value of the time series data. \r\nThe first test segment is selected according to the train length and then it's forecasted in accordance with forecast size.\r\nThe starting position of the subsequent segment is set in direct relation to the sliding window size, this meaning, if the\r\nsliding size is equal to the forecast size, each next segment starts at the end of the previous.\r\nThis process is repeated until all time series data gets segmented and it uses all the iterations and observations\r\nto construct an aggregated and robust performance analysis to each predicted point.\r\n\r\n## Where to get it <a name = \"ta\"></a>\r\n\r\nThe source code is currently hosted on GitHub at: https://github.com/TsLu1s/TSForecasting\r\n\r\nBinary installer for the latest released version is available at the Python Package Index (PyPI).   \r\n\r\n## Installation  \r\n\r\nTo install this package from Pypi repository run the following command:\r\n\r\n```\r\npip install tsforecasting\r\n```\r\n\r\n# Usage Examples\r\n    \r\n## 1. 
TSForecasting - Automated Time Series Forecasting\r\n    \r\nThe first needed step after importing the package is to load a dataset and define your DataTime (`datetime64[ns]` type) and Target column to be predicted, then rename them to `Date` and `y`, respectively.\r\nThe following step is to define your future running pipeline parameters variables, these being:\r\n* train_size: Length of Train data in which will be applied the first Expanding Window iteration;\r\n* lags: The number of time steps in each window, indicating how many past observations each input sample includes;\r\n* horizon: Full length of test/future ahead predictions;\r\n* sliding_size: Length of sliding window, sliding_size>=horizon is suggested;\r\n* models: All selected models intented to be ensembled for evaluation. To fit and compare predictive performance of available models set them in paramater `models:list`, options are the following:\r\n  * `RandomForest`\r\n  * `ExtraTrees`\r\n  * `GBR`\r\n  * `KNN`\r\n  * `GeneralizedLR`\r\n  * `XGBoost`\r\n  * `LightGBM`\r\n  * `Catboost`\r\n  * `AutoGluon`\r\n\r\n* hparameters: Nested dictionary in which are contained all models and specific hyperparameters configurations. Feel free to customize each model as you see fit (customization example shown bellow); \r\n* granularity: Valid interval of periods correlated to data -> 1m,30m,1h,1d,1wk,1mo (default='1d');\r\n* metric: Default predictive evaluation metric is `MAE` (Mean Absolute Error), other options are `MAPE` (Mean Absolute Percentage Error) and `MSE`\r\n(Mean Squared Error);\r\n \r\nThe `fit_forecast` method set the default parameters for fitting and comparison of all segmented windows for each selected and configurated model. 
After implementation, the `history` method agregates the returning the variable `fit_performance` containing the detailed measures of each window iteration forecasted value and all segmented iterations performance.\r\n\r\nThe `forecast` method forecasts the future values based on the previously predefined best performing model.\r\n        \r\n```py\r\n\r\nfrom tsforecasting.forecasting import (TSForecasting,\r\n                                       model_configurations)\r\nimport pandas as pd\r\nimport warnings\r\nwarnings.filterwarnings(\"ignore\", category=Warning) #-> For a clean console\r\n\r\n## Dataframe Loading\r\ndata = pd.read_csv('csv_directory_path') \r\ndata = data.rename(columns={'DateTime_Column': 'Date','Target_Name_Column':'y'})\r\ndata = data[['Date',\"y\"]]\r\n    \r\n## Get Models Hyperparameters Configurations\r\nparameters = model_configurations()\r\nprint(parameters)\r\n\r\n# Customization Hyperparameters Example\r\nhparameters[\"RandomForest\"][\"n_estimators\"] = 50\r\nhparameters[\"KNN\"][\"n_neighbors\"] = 5\r\nhparameters[\"Catboost\"][\"iterations\"] = 150\r\nhparameters[\"AutoGluon\"][\"time_limit\"] = 50\r\n\r\n## Fit Forecasting Evaluation\r\ntsf = TSForecasting(train_size = 0.90,\r\n                    lags = 10,\r\n                    horizon = 10,\r\n                    sliding_size = 30,\r\n                    models = ['RandomForest', 'ExtraTrees', 'GBR', 'KNN', 'GeneralizedLR',\r\n                              'XGBoost', 'LightGBM', 'Catboost', 'AutoGluon'],\r\n                    hparameters = hparameters,\r\n                    granularity = '1h',\r\n                    metric = 'MAE'\r\n                    )\r\ntsf = tsf.fit_forecast(dataset = data)\r\n\r\n# Get Fit History\r\nfit_performance = tsf.history()\r\n\r\n## Forecast\r\nforecast = tsf.forecast()\r\n\r\n```  \r\n\r\n## 2. TSForecasting - Extra Auxiliar Methods\r\n\r\nThe `make_timeseries` method transforms a DataFrame into a format ready for time series analysis. 
This transformation prepares data sets for forecasting future values based on historical data, optimizing the input for subsequent model training and analysis, taking into consideration both the recency of data and the horizon of the prediction.\r\n\r\n* window_size: Determinates how many past observations each sample in the DataFrame should include. This creates a basis for learning from historical data.\r\n* horizon: Defines the number of future time steps to forecast. This addition provides direct targets for prediction models.\r\n* granularity: Adjusts the temporal detail from minutes to months, making the method suitable for diverse time series datasets (options -> 1m,30m,1h,1d,1wk,1mo).\r\n* datetime_engineering: When activated enriches the dataset with extra date-time features, such as year, month, and day of the week, potentialy enhancing the predictive capabilities of the model.\r\n \r\n```py   \r\n\r\nfrom tsforecasting.forecasting import Processing\r\n\r\npr = Processing()\r\n\r\ndata = pr.make_timeseries(dataset = data,\r\n\t\t\t  window_size = 10, \r\n\t\t\t  horizon = 2, \r\n\t\t\t  granularity = '1h',\r\n\t\t\t  datetime_engineering = True)\r\n\r\n```\r\n    \r\n## License\r\n\r\nDistributed under the MIT License. See [LICENSE](https://github.com/TsLu1s/TSForecasting/blob/main/LICENSE) for more information.\r\n\r\n## Contact \r\n \r\nLuis Santos - [LinkedIn](https://www.linkedin.com/in/lu%C3%ADsfssantos/)\r\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "TSForecasting: Automated Time Series Forecasting Framework",
    "version": "1.4.10",
    "project_urls": {
        "Homepage": "https://github.com/TsLu1s/TSForecasting"
    },
    "split_keywords": [
        "data science",
        " machine learning",
        " time series forecasting",
        " automated time series",
        " multivariate time series",
        " univariate time series",
        " automated machine learning",
        " automl"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "98418d542f1bbaacf09937640ee5c28e9020edd9b76346cecf6e637ad65f82b3",
                "md5": "f36d0bb32750a8c344bb6d8261a47c4b",
                "sha256": "b8d071c8d4e94e5a417d68502c6ecb1003f95d1953016fbd361463607296d34f"
            },
            "downloads": -1,
            "filename": "tsforecasting-1.4.10-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f36d0bb32750a8c344bb6d8261a47c4b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 24402,
            "upload_time": "2024-06-15T22:40:58",
            "upload_time_iso_8601": "2024-06-15T22:40:58.511539Z",
            "url": "https://files.pythonhosted.org/packages/98/41/8d542f1bbaacf09937640ee5c28e9020edd9b76346cecf6e637ad65f82b3/tsforecasting-1.4.10-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-15 22:40:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "TsLu1s",
    "github_project": "TSForecasting",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.4.4"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.23.5"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    ">=",
                    "2.8.2"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.2.2"
                ]
            ]
        },
        {
            "name": "autogluon",
            "specs": [
                [
                    ">=",
                    "1.1.0"
                ]
            ]
        },
        {
            "name": "lightgbm",
            "specs": [
                [
                    ">=",
                    "4.3.0"
                ]
            ]
        },
        {
            "name": "xgboost",
            "specs": [
                [
                    ">=",
                    "1.7.4"
                ]
            ]
        },
        {
            "name": "catboost",
            "specs": [
                [
                    ">=",
                    "1.2.2"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.65.2"
                ]
            ]
        }
    ],
    "lcname": "tsforecasting"
}
        