mlforecast


- Name: mlforecast
- Version: 0.14.0
- Home page: https://github.com/Nixtla/mlforecast
- Summary: Scalable machine learning based time series forecasting
- Upload time: 2024-11-11 19:22:44
- Author: José Morales
- Requires Python: >=3.8
- License: Apache Software License 2.0
- Keywords: python, forecast, forecasting, machine-learning, dask

# mlforecast
[![Tweet](https://img.shields.io/twitter/url/http/shields.io.svg?style=social)](https://twitter.com/intent/tweet?text=Statistical%20Forecasting%20Algorithms%20by%20Nixtla%20&url=https://github.com/Nixtla/statsforecast&via=nixtlainc&hashtags=StatisticalModels,TimeSeries,Forecasting)
[![Slack](https://img.shields.io/badge/Slack-4A154B?&logo=slack&logoColor=white.png)](https://join.slack.com/t/nixtlacommunity/shared_invite/zt-1pmhan9j5-F54XR20edHk0UtYAPcW4KQ)

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

<div align="center">

<center>
<img src="https://raw.githubusercontent.com/Nixtla/mlforecast/main/nbs/figs/logo.png" />
</center>
<h1 align="center">
Machine Learning 🤖 Forecast
</h1>
<h3 align="center">
Scalable machine learning for time series forecasting
</h3>

[![CI](https://github.com/Nixtla/mlforecast/actions/workflows/ci.yaml/badge.svg)](https://github.com/Nixtla/mlforecast/actions/workflows/ci.yaml)
[![Python](https://img.shields.io/pypi/pyversions/mlforecast.png)](https://pypi.org/project/mlforecast/)
[![PyPi](https://img.shields.io/pypi/v/mlforecast?color=blue.png)](https://pypi.org/project/mlforecast/)
[![conda-forge](https://img.shields.io/conda/vn/conda-forge/mlforecast?color=blue.png)](https://anaconda.org/conda-forge/mlforecast)
[![License](https://img.shields.io/github/license/Nixtla/mlforecast.png)](https://github.com/Nixtla/mlforecast/blob/main/LICENSE)

**mlforecast** is a framework to perform time series forecasting using
machine learning models, with the option to scale to massive amounts of
data using remote clusters.

</div>

## Install

### PyPI

`pip install mlforecast`

### conda-forge

`conda install -c conda-forge mlforecast`

For more detailed instructions you can refer to the [installation
page](https://nixtla.github.io/mlforecast/docs/getting-started/install.html).

## Quick Start

**Get Started with this [quick
guide](https://nixtla.github.io/mlforecast/docs/getting-started/quick_start_local.html).**

**Follow this [end-to-end
walkthrough](https://nixtla.github.io/mlforecast/docs/getting-started/end_to_end_walkthrough.html)
for best practices.**

### Sample notebooks

- [m5](https://www.kaggle.com/code/lemuz90/m5-mlforecast-eval)
- [m5-polars](https://www.kaggle.com/code/lemuz90/m5-mlforecast-eval-polars)
- [m4](https://www.kaggle.com/code/lemuz90/m4-competition)
- [m4-cv](https://www.kaggle.com/code/lemuz90/m4-competition-cv)
- [favorita](https://www.kaggle.com/code/lemuz90/mlforecast-favorita)

## Why?

Existing Python tools for forecasting with machine learning models tend
to be slow, inaccurate, and hard to scale, so we created a library that
can be used to forecast in production environments.
[`MLForecast`](https://Nixtla.github.io/mlforecast/forecast.html#mlforecast)
provides efficient feature engineering for training any machine learning
model that exposes `fit` and `predict` methods (such as those in
[`sklearn`](https://scikit-learn.org/stable/)) on millions of time
series.

## Features

- Fastest implementations of feature engineering for time series
  forecasting in Python.
- Out-of-the-box compatibility with pandas, polars, spark, dask, and
  ray.
- Probabilistic Forecasting with Conformal Prediction.
- Support for exogenous variables and static covariates.
- Familiar `sklearn` syntax: `.fit` and `.predict`.

Missing something? Please open an issue or write to us on
[![Slack](https://img.shields.io/badge/Slack-4A154B?&logo=slack&logoColor=white.png)](https://join.slack.com/t/nixtlaworkspace/shared_invite/zt-135dssye9-fWTzMpv2WBthq8NK0Yvu6A)

## Examples and Guides

📚 [End to End
Walkthrough](https://nixtla.github.io/mlforecast/docs/getting-started/end_to_end_walkthrough.html):
model training, evaluation and selection for multiple time series.

🔎 [Probabilistic
Forecasting](https://nixtla.github.io/mlforecast/docs/how-to-guides/prediction_intervals.html):
use Conformal Prediction to produce prediction intervals.

👩‍🔬 [Cross
Validation](https://nixtla.github.io/mlforecast/docs/how-to-guides/cross_validation.html):
robust evaluation of model performance.

🔌 [Predict Demand
Peaks](https://nixtla.github.io/mlforecast/docs/tutorials/electricity_peak_forecasting.html):
electricity load forecasting for detecting daily peaks and reducing
electric bills.

📈 [Transfer
Learning](https://nixtla.github.io/mlforecast/docs/how-to-guides/transfer_learning.html):
pretrain a model using a set of time series and then predict another one
using that pretrained model.

🌡️ [Distributed
Training](https://nixtla.github.io/mlforecast/docs/getting-started/quick_start_distributed.html):
use a Dask, Ray or Spark cluster to train models at scale.

## How to use

The following is a very basic overview. For a more detailed
description, see the
[documentation](https://nixtla.github.io/mlforecast/).

### Data setup

Store your time series in a pandas dataframe in long format, that is,
each row represents an observation for a specific series and timestamp.

``` python
from mlforecast.utils import generate_daily_series

series = generate_daily_series(
    n_series=20,
    max_length=100,
    n_static_features=1,
    static_as_categorical=False,
    with_trend=True
)
series.head()
```

<div>

|     | unique_id | ds         | y          | static_0 |
|-----|-----------|------------|------------|----------|
| 0   | id_00     | 2000-01-01 | 17.519167  | 72       |
| 1   | id_00     | 2000-01-02 | 87.799695  | 72       |
| 2   | id_00     | 2000-01-03 | 177.442975 | 72       |
| 3   | id_00     | 2000-01-04 | 232.704110 | 72       |
| 4   | id_00     | 2000-01-05 | 317.510474 | 72       |

</div>

> Note: The `unique_id` column serves as an identifier for each distinct
> time series in your dataset. If your dataset contains only a single
> time series, set this column to a constant value.
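
If your data starts out in wide format (one column per series), it needs
to be reshaped first. The following is a minimal sketch using plain
pandas; the `wide` table and the `store_a` / `store_b` series names are
made up for illustration, while the `unique_id`, `ds` and `y` column
names match the table above.

``` python
import pandas as pd

# Hypothetical wide table: one row per date, one column per series.
wide = pd.DataFrame(
    {
        "ds": pd.date_range("2000-01-01", periods=3, freq="D"),
        "store_a": [10.0, 12.0, 11.0],
        "store_b": [5.0, 6.0, 7.0],
    }
)

# Reshape to long format: one row per (series, timestamp) observation.
long_df = (
    wide.melt(id_vars="ds", var_name="unique_id", value_name="y")
    .sort_values(["unique_id", "ds"])
    .reset_index(drop=True)
)[["unique_id", "ds", "y"]]

print(long_df)
```

Each former column name becomes a value of `unique_id`, so the two
series end up stacked on top of each other.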

### Models

Next, define your models. Each one will be trained on all series. These
can be any regressor that follows the scikit-learn API.

``` python
import lightgbm as lgb
from sklearn.linear_model import LinearRegression
```

``` python
models = [
    lgb.LGBMRegressor(random_state=0, verbosity=-1),
    LinearRegression(),
]
```
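
To illustrate what "follows the scikit-learn API" means in practice,
here is a rough sketch of a hand-written regressor. The class
`LeastSquaresRegressor` is hypothetical and not part of mlforecast or
scikit-learn; it only shows the `fit` / `predict` contract, and
inheriting from `BaseEstimator` gives it `get_params` / `set_params`,
which scikit-learn's `clone` relies on to copy an unfitted estimator.

``` python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone


class LeastSquaresRegressor(BaseEstimator, RegressorMixin):
    """Hypothetical example: ordinary least squares via numpy."""

    def __init__(self, fit_intercept=True):
        # Constructor arguments are stored unchanged so that clone() works.
        self.fit_intercept = fit_intercept

    def _design(self, X):
        X = np.asarray(X, dtype=float)
        if self.fit_intercept:
            # Prepend a column of ones for the intercept term.
            X = np.column_stack([np.ones(len(X)), X])
        return X

    def fit(self, X, y):
        coef, *_ = np.linalg.lstsq(
            self._design(X), np.asarray(y, dtype=float), rcond=None
        )
        self.coef_ = coef
        return self  # fit returns the estimator itself

    def predict(self, X):
        return self._design(X) @ self.coef_


model = LeastSquaresRegressor()
model.fit([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])  # y = 1 + 2x
prediction = model.predict([[3.0]])
copy = clone(LeastSquaresRegressor(fit_intercept=False))
```

An instance of such a class should then be usable in the `models` list
alongside the estimators above, assuming it behaves like any other
scikit-learn regressor.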

### Forecast object

Now instantiate an
[`MLForecast`](https://Nixtla.github.io/mlforecast/forecast.html#mlforecast)
object with the models and the features that you want to use. The
features can be lags, transformations on the lags and date features. You
can also define transformations to apply to the target before fitting,
which will be restored when predicting.

``` python
from mlforecast import MLForecast
from mlforecast.lag_transforms import ExpandingMean, RollingMean
from mlforecast.target_transforms import Differences
```

``` python
fcst = MLForecast(
    models=models,
    freq='D',
    lags=[7, 14],
    lag_transforms={
        1: [ExpandingMean()],
        7: [RollingMean(window_size=28)]
    },
    date_features=['dayofweek'],
    target_transforms=[Differences([1])],
)
```
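
To make these feature types concrete, the sketch below computes
comparable quantities by hand with plain pandas on a tiny made-up
dataset. It illustrates the ideas only (it is not how the library
computes them internally), and it uses shorter lags and windows than
the configuration above so the numbers are easy to check.

``` python
import pandas as pd

# Tiny made-up dataset with two series of five observations each.
df = pd.DataFrame(
    {
        "unique_id": ["a"] * 5 + ["b"] * 5,
        "ds": list(pd.date_range("2000-01-01", periods=5, freq="D")) * 2,
        "y": [1.0, 2.0, 4.0, 7.0, 11.0, 10.0, 10.0, 12.0, 12.0, 15.0],
    }
)
grouped = df.groupby("unique_id")["y"]

# Lag 2: the value observed two steps earlier within the same series.
df["lag2"] = grouped.shift(2)

# Expanding mean of lag 1: mean of everything up to the previous step.
df["expanding_mean_lag1"] = grouped.transform(
    lambda s: s.shift(1).expanding().mean()
)

# Rolling mean of lag 1 over a window of 2 observations.
df["rolling_mean_lag1_window_size2"] = grouped.transform(
    lambda s: s.shift(1).rolling(2).mean()
)

# Target transformation: first differences within each series.
df["y_diff1"] = grouped.diff(1)

# Restoring the original scale: first value plus the cumulative differences.
cumulative = df["y_diff1"].fillna(0.0).groupby(df["unique_id"]).cumsum()
df["y_restored"] = grouped.transform("first") + cumulative

print(df)
```

All lag-based features are shifted so that a row only ever uses values
from earlier timestamps of the same series, which is what keeps future
information out of the training data.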

### Training

To compute the features and train the models, call `fit` on your
`MLForecast` object.

``` python
fcst.fit(series)
```

    MLForecast(models=[LGBMRegressor, LinearRegression], freq=D, lag_features=['lag7', 'lag14', 'expanding_mean_lag1', 'rolling_mean_lag7_window_size28'], date_features=['dayofweek'], num_threads=1)

### Predicting

To get forecasts for the next `n` days, call `predict(n)` on the
forecast object. The features are updated automatically at each step
using a recursive strategy.

``` python
predictions = fcst.predict(14)
predictions
```

<div>

|     | unique_id | ds         | LGBMRegressor | LinearRegression |
|-----|-----------|------------|---------------|------------------|
| 0   | id_00     | 2000-04-04 | 299.923771    | 311.432371       |
| 1   | id_00     | 2000-04-05 | 365.424147    | 379.466214       |
| 2   | id_00     | 2000-04-06 | 432.562441    | 460.234028       |
| 3   | id_00     | 2000-04-07 | 495.628000    | 524.278924       |
| 4   | id_00     | 2000-04-08 | 60.786223     | 79.828767        |
| ... | ...       | ...        | ...           | ...              |
| 275 | id_19     | 2000-03-23 | 36.266780     | 28.333215        |
| 276 | id_19     | 2000-03-24 | 44.370984     | 33.368228        |
| 277 | id_19     | 2000-03-25 | 50.746222     | 38.613001        |
| 278 | id_19     | 2000-03-26 | 58.906524     | 43.447398        |
| 279 | id_19     | 2000-03-27 | 63.073949     | 48.666783        |

<p>280 rows × 4 columns</p>
</div>
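
The recursive strategy can be pictured as a loop in which each
prediction is appended to the history and then used to build the
features for the next step. The sketch below shows the idea with numpy
and a fixed linear model; `recursive_forecast`, its parameters and the
coefficients are hypothetical and chosen only to keep the arithmetic
simple.

``` python
import numpy as np


def recursive_forecast(history, coefs, intercept, lags, horizon):
    """Forecast `horizon` steps, feeding each prediction back as history."""
    values = list(history)
    preds = []
    for _ in range(horizon):
        # Build lag features from the most recent values (observed or predicted).
        features = np.array([values[-lag] for lag in lags])
        y_hat = intercept + float(features @ coefs)
        preds.append(y_hat)
        # The prediction becomes the newest "observation" for the next step.
        values.append(y_hat)
    return preds


history = [1.0, 2.0, 3.0, 4.0]
# Model: y_t = 2 * y_{t-1} - y_{t-2}, which continues a linear trend.
preds = recursive_forecast(
    history, coefs=np.array([2.0, -1.0]), intercept=0.0, lags=[1, 2], horizon=3
)
print(preds)  # [5.0, 6.0, 7.0]
```

Because later steps depend on earlier predictions, errors can
accumulate as the horizon grows.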

### Visualize results

``` python
from utilsforecast.plotting import plot_series
```

``` python
fig = plot_series(series, predictions, max_ids=4, plot_random=False)
```

![](https://raw.githubusercontent.com/Nixtla/mlforecast/main/nbs/figs/index.png)

## How to contribute

See
[CONTRIBUTING.md](https://github.com/Nixtla/mlforecast/blob/main/CONTRIBUTING.md).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Nixtla/mlforecast",
    "name": "mlforecast",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "python forecast forecasting machine-learning dask",
    "author": "Jos\u00e9 Morales",
    "author_email": "jmoralz92@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/07/0a/77e006902069864d3f658e32e68de77ad756a49ebec524fa87181750bb3f/mlforecast-0.14.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "Scalable machine learning based time series forecasting",
    "version": "0.14.0",
    "project_urls": {
        "Homepage": "https://github.com/Nixtla/mlforecast"
    },
    "split_keywords": [
        "python",
        "forecast",
        "forecasting",
        "machine-learning",
        "dask"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a7a6e6a02d4f9da35cf1df821f54b5168e9f325578ce4c7d609e3f40d99e2501",
                "md5": "fc93bb15d1a7376db7275e69a4cff3ab",
                "sha256": "7418fc8a97cf505e2b33f859bdc0b336da64cc034f10acab260a5a19f47fa9e5"
            },
            "downloads": -1,
            "filename": "mlforecast-0.14.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "fc93bb15d1a7376db7275e69a4cff3ab",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 71964,
            "upload_time": "2024-11-11T19:22:42",
            "upload_time_iso_8601": "2024-11-11T19:22:42.420560Z",
            "url": "https://files.pythonhosted.org/packages/a7/a6/e6a02d4f9da35cf1df821f54b5168e9f325578ce4c7d609e3f40d99e2501/mlforecast-0.14.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "070a77e006902069864d3f658e32e68de77ad756a49ebec524fa87181750bb3f",
                "md5": "4b86bf669a62dd0f3a78417a9d97ec36",
                "sha256": "2bc11ad747d82e08107444939bf292c7eb20d6d951ad3109697233588e7be4d8"
            },
            "downloads": -1,
            "filename": "mlforecast-0.14.0.tar.gz",
            "has_sig": false,
            "md5_digest": "4b86bf669a62dd0f3a78417a9d97ec36",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 71787,
            "upload_time": "2024-11-11T19:22:44",
            "upload_time_iso_8601": "2024-11-11T19:22:44.774111Z",
            "url": "https://files.pythonhosted.org/packages/07/0a/77e006902069864d3f658e32e68de77ad756a49ebec524fa87181750bb3f/mlforecast-0.14.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-11 19:22:44",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Nixtla",
    "github_project": "mlforecast",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "mlforecast"
}
        