pasts

Version: 2023.3.1
Summary: unified collections of models to analyze a time series
Author: Rawya Zreik
Upload time: 2023-11-15 08:52:34
Keywords: times series, forcasting, analysis, machine learning
# Python AnalySis for Time Series

[![pytest](https://github.com/eurobios-mews-labs/pasts/actions/workflows/pytest.yml/badge.svg?event=push)](https://docs.pytest.org)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://GitHub.com/eurobios-mews-labs/pasts/graphs/commit-activity)


This package aims to structure the way time series analysis and forecasting are done.

#### Purpose of the Package
+ Provide a collection of forecasting models and analysis methods for time series in one unified library.

#### Features
+ Collection of analysis methods:
  - SciPy and statsmodels for statistical testing
  - Time series processing
  - Statistical tests (stationarity, seasonality checks, ...)
  - Visualization
+ Collection of forecasting models using Darts, which is itself an aggregator of
   - scikit-learn
   - TensorFlow
   - Prophet
   - autoregressive models (ARIMA, SARIMA, ...)
   - etc.


#### Installation
The package can be installed with:
```bash
python3 -m pip install git+https://github.com/eurobios-mews-labs/pasts
```


#### Building the documentation

First, make sure you have Sphinx and the Read the Docs theme installed.

If you use pip, open a terminal and enter the following commands:
```shell script
pip install sphinx
pip install sphinx_rtd_theme
```

If you use conda, open an Anaconda Powershell Prompt and enter the following commands:
```shell script
conda install sphinx
conda install sphinx_rtd_theme
```

Then, in the same terminal or anaconda prompt, build the doc with:
```shell script
cd doc
make html
```

The documentation can then be accessed from `doc/_build/html/index.html`.


## Usage and example
You can find examples for the `Signal` class for univariate and multivariate series here: `examples/ex_model.py`

The `Operation` class can be used on its own. Find an example here: `examples/ex_operations.py`

### Start project
To start using the package, import your data as a pandas DataFrame with a temporal index and use the `Signal` class.
```python
import pandas as pd

from darts.datasets import AirPassengersDataset
from darts.models import AutoARIMA, Prophet, ExponentialSmoothing, XGBModel, VARIMA
from darts.utils.utils import ModelMode, SeasonalityMode

from pasts.signal import Signal
from pasts.visualization import Visualization

series = AirPassengersDataset().load()
dt = pd.DataFrame(series.values())
dt.rename(columns={0: 'passengers'}, inplace=True)
dt.index = series.time_index
signal = Signal(dt)
```
### Visualize and analyze data
The `properties` attribute contains some information about the data.
Use the `Visualization` class to generate various types of plots.
```python
print(signal.properties)
```
Output:
```python
>>> {'shape': (144, 1), 'types': passengers    float64
dtype: object, 
'is_univariate': True, 
'nanSum': passengers   0
dtype: int64, 
'quantiles':   0.00   0.05   0.50    0.95    0.99   1.00
passengers     104.0  121.6  265.5  488.15  585.79  622.0}
```
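The `quantiles` entry can be reproduced in the same style with plain pandas. This is a minimal sketch with abbreviated sample values, not the package's own computation (pasts' exact interpolation settings are an assumption here):

```python
import pandas as pd

# Reproduce the style of the 'quantiles' entry with plain pandas.
df = pd.DataFrame({"passengers": [104.0, 121.6, 265.5, 488.15, 622.0]})
quantiles = df["passengers"].quantile([0.0, 0.05, 0.5, 0.95, 1.0])
print(quantiles)
```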
```python
Visualization(signal).plot_signal()
Visualization(signal).acf_plot()
```
Yields:

<img src="examples/ex_plot1.png" alt="drawing" width="700"/>
<img src="examples/ex_acf.png" alt="drawing" width="700"/>

You can also perform some statistical tests specific to time series.
```python
signal.apply_stat_test('stationary')
signal.apply_stat_test('stationary', 'kpss')
signal.apply_stat_test('seasonality')
print(signal.tests_stat)
```
Output: for the stationarity tests, whether the null hypothesis is rejected and the p-value; for the seasonality test, the detected period.
```python
>>> {'stationary: adfuller': (False, 0.9918802434376409),
'stationary: kpss': (False, 0.01),
'seasonality: check_seasonality': (<function check_seasonality at 0x000001D1D62EE310>, (True, 12))}
```
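Seasonality detection of this kind typically looks for peaks in the autocorrelation function. A minimal pure-NumPy sketch of the idea (not the actual `check_seasonality` implementation, which comes from Darts):

```python
import numpy as np

def dominant_period(y, max_lag=40):
    """Return the lag with the highest autocorrelation (candidate seasonal period)."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    acf = np.correlate(y, y, mode="full")[len(y) - 1:]
    acf = acf / acf[0]  # normalize so acf[0] == 1
    # Skip lags 0 and 1, which are trivially high.
    return int(np.argmax(acf[2:max_lag + 1]) + 2)

# A monthly-style signal with period 12, like AirPassengers seasonality.
t = np.arange(144)
y = 10 * np.sin(2 * np.pi * t / 12)
print(dominant_period(y))  # 12
```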

### Machine Learning
Choose a date to split the series between train and test.
```python
timestamp = '1958-12-01'
signal.validation_split(timestamp=timestamp)
```
The library provides some operations to apply before using forecasting models.
In this example, both the linear trend and the seasonality are removed. Machine learning models will be trained on the remaining series, then the inverse operations will be applied to the predicted series.
```python
signal.apply_operations(['trend', 'seasonality'])
Visualization(signal).plot_signal()
```
<img src="examples/ex_op.png" alt="drawing" width="700"/>
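To picture what removing a linear trend and inverting the operation looks like, here is a minimal NumPy sketch, independent of pasts (the internals of its `Operation` class may differ):

```python
import numpy as np

# Fit and remove a linear trend, then invert the operation on a "prediction".
rng = np.random.default_rng(1)
t = np.arange(100, dtype=float)
y = 2.0 * t + 5.0 + rng.normal(scale=0.1, size=100)

slope, intercept = np.polyfit(t, y, deg=1)
residual = y - (slope * t + intercept)   # series a model would be trained on

# Inverse transform: add the extrapolated trend back to future predictions.
t_future = np.arange(100, 112, dtype=float)
pred_residual = np.zeros_like(t_future)  # stand-in for a model's forecast
forecast = pred_residual + (slope * t_future + intercept)
```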

Use the method `apply_model` to apply models of your choice. If the parameters `gridsearch` and `parameters` are passed, a grid search will be performed.
```python
signal.apply_model(ExponentialSmoothing())

signal.apply_model(AutoARIMA())
signal.apply_model(Prophet())

# Be careful: if trend and seasonality were removed, this specific grid search cannot be performed.
param_grid = {'trend': [ModelMode.ADDITIVE, ModelMode.MULTIPLICATIVE, ModelMode.NONE],
              'seasonal': [SeasonalityMode.ADDITIVE, SeasonalityMode.MULTIPLICATIVE, SeasonalityMode.NONE],
              }
signal.apply_model(ExponentialSmoothing(), gridsearch=True, parameters=param_grid)
```
You can pass a list of metrics to the method `compute_scores`. By default, it tries to compute R², MSE, RMSE, MAPE, SMAPE and MAE; warnings are raised if some metrics cannot be computed for this type of data.
You can choose to compute scores time-wise or unit-wise with the parameter `axis`. However, with univariate data it is preferable to keep the default value (`axis=1`, unit-wise).
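For reference, the standard definitions of three of these metrics, sketched with NumPy (this is not pasts' own implementation, only the textbook formulas):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined if y_true contains zeros."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def smape(y_true, y_pred):
    """Symmetric MAPE, bounded in [0, 200]."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(2 * np.abs(y_pred - y_true)
                         / (np.abs(y_true) + np.abs(y_pred))) * 100)

y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]
print(rmse(y_true, y_pred), mape(y_true, y_pred), smape(y_true, y_pred))
```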
```python
signal.compute_scores(list_metrics=['rmse', 'r2'])
signal.compute_scores()
print(signal.models['Prophet']['scores'])
```
Output:
```python
>>> {'unit_wise':      
                      r2        mse       rmse      mape     smape        mae
component                                                                
passengers        0.866962  590.08557  24.291677  3.694311  3.743241  18.008002, 
'time_wise': {}}
```
```python
print(signal.performance_models['unit_wise']['rmse'])
```
Output:
```python
>>>               Prophet ExponentialSmoothing  AutoARIMA
component                                            
passengers     24.291677            40.306771   26.718103

```
Visualize predictions with the `Visualization` class:
```python
Visualization(signal).show_predictions()
```
<img src="examples/ex_pred.png" alt="drawing" width="700"/>

Once models have been trained, you can compute predictions for future dates using the `forecast` method, passing it the name of a trained model and the forecast horizon.
```python
signal.forecast("Prophet", 100)
signal.forecast("AutoARIMA", 100)
signal.forecast("ExponentialSmoothing", 100)
Visualization(signal).show_forecast()
```
<img src="examples/ex_fc.png" alt="drawing" width="700"/>

#### Aggregation of models
The method `apply_aggregated_model` aggregates the passed list of trained estimators according to their RMSE on the train data. All passed models are kept, so make sure to exclude models that consistently underperform. The better a model performs compared to the others, the greater its weight in the aggregation.

```python
signal.apply_aggregated_model([AutoARIMA(), Prophet()])
signal.compute_scores(axis=1)
Visualization(signal).show_predictions()
signal.forecast("AggregatedModel", 100)
```
<img src="examples/ex_fc_ag.png" alt="drawing" width="700"/>
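Such an aggregation can be pictured as a weighted average of per-model forecasts. The inverse-RMSE weighting and the sample numbers below are assumptions for illustration; the exact weights computed by `apply_aggregated_model` may differ:

```python
import numpy as np

# Hypothetical train RMSEs: weights proportional to 1/RMSE, so the
# better-performing model contributes more to the aggregated forecast.
rmse_train = {"AutoARIMA": 26.7, "Prophet": 24.3}
inv = {name: 1.0 / v for name, v in rmse_train.items()}
total = sum(inv.values())
weights = {name: v / total for name, v in inv.items()}

# Illustrative two-step forecasts from each model.
preds = {"AutoARIMA": np.array([410.0, 420.0]),
         "Prophet": np.array([400.0, 415.0])}
aggregated = sum(weights[m] * preds[m] for m in preds)
print(weights, aggregated)
```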

### Author
<img src="logoEurobiosMewsLabs.png" alt="drawing" width="400"/>


            
