timetomodel


* Name: timetomodel
* Version: 0.7.3
* Home page: https://github.com/seitabv/timetomodel
* Summary: Sane handling of time series data for forecast modelling - with production usage in mind.
* Upload time: 2023-06-07 13:37:05
* Author: Seita BV
* Keywords: time series, forecasting
* Requirements: No requirements were recorded.

# timetomodel

Time series forecasting is a modern data science & engineering challenge.

We noticed that these two worlds, data science and engineering of time series forecasting, are not very compatible.
Often, work from the data scientist has to be re-implemented by engineers to be used in production.

`timetomodel` was created to change that. It describes the data treatment of a model, and also automates common data treatment tasks like building data for training and testing.

As a *data scientist*, experiment with a model in your notebook.
Load data from static files (e.g. CSV) and try out lags, regressors and so on.
Compare plots and mean square errors of the models you developed.

As an *engineer*, take over the model description and use it in your production code.
Often, this entails little more than changing the data source (e.g. from a CSV file to a column in the database).

`timetomodel` is designed to wrap any model with a fit/predict interface, e.g. from statsmodels or scikit-learn (some work is still needed to ensure support).


## Features

Here are some features for both data scientists and engineers to enjoy:

* Describe how to load data for outcome and regressor variables. Load from Pandas objects, CSV files, Pandas pickles or databases via SQLAlchemy.
* Create train & test data, including lags.
* Timezone awareness support.
* Custom data transformations, applied after loading (e.g. to remove duplicates) or only for forecasting (e.g. to apply a Box-Cox transformation).
* Evaluate a model by RMSE, and plot the cumulative error.
* Support for creating rolling forecasts.
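
To make the idea of train and test data with lags concrete, here is a minimal, library-independent sketch in plain Python. It is not how `timetomodel` implements this; the helper names `build_lagged_rows` and `split_chronologically` and the toy hourly series are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_lagged_rows(times, values, lags):
    """Return (time, target, [lagged values]) rows for each step that has all lags available."""
    max_lag = max(lags)
    rows = []
    for i in range(max_lag, len(values)):
        rows.append((times[i], values[i], [values[i - lag] for lag in lags]))
    return rows


def split_chronologically(rows, ratio_training):
    """Split rows into an earlier training part and a later testing part (no shuffling)."""
    cut = round(len(rows) * ratio_training)
    return rows[:cut], rows[cut:]


# Hypothetical hourly series: 12 timezone-aware (UTC) observations
start = datetime(2015, 3, 1, tzinfo=timezone.utc)
times = [start + timedelta(hours=h) for h in range(12)]
values = [float(h) for h in range(12)]

# Use the values from 1 and 3 steps back as features
rows = build_lagged_rows(times, values, lags=[1, 3])
train, test = split_chronologically(rows, ratio_training=2 / 3)
```

Splitting by position rather than at random keeps every test observation later than the training observations, which is the usual choice for time series.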


## Installation

``pip install timetomodel``

## Example

Here is an example where we describe a solar time series problem and use ``statsmodels.api.OLS``, a linear regression model, to forecast one hour ahead:

    import pandas as pd
    import pytz
    from datetime import datetime, timedelta
    from statsmodels.api import OLS
    from timetomodel import speccing, ModelState, create_fitted_model, evaluate_models
    from timetomodel.transforming import BoxCoxTransformation
    from timetomodel.forecasting import make_rolling_forecasts

    data_start = datetime(2015, 3, 1, tzinfo=pytz.utc)
    data_end = datetime(2015, 10, 31, tzinfo=pytz.utc)

    #### Solar model - 1h ahead  ####

    # spec outcome variable
    solar_outcome_var_spec = speccing.CSVFileSeriesSpecs(
        file_path="data.csv",
        time_column="datetime",
        value_column="solar_power",
        name="solar power",
        feature_transformation=BoxCoxTransformation(lambda2=0.1)
    )
    # spec regressor variable
    regressor_spec_1h = speccing.CSVFileSeriesSpecs(
        file_path="data.csv",
        time_column="datetime",
        value_column="irradiation_forecast1h",
        name="irradiation forecast",
        feature_transformation=BoxCoxTransformation(lambda2=0.1)
    )
    # spec whole model treatment
    solar_model1h_specs = speccing.ModelSpecs(
        outcome_var=solar_outcome_var_spec,
        model=OLS,
        frequency=timedelta(minutes=15),
        horizon=timedelta(hours=1),
        lags=[lag * 96 for lag in range(1, 8)],  # 7 days (data has daily seasonality)
        regressors=[regressor_spec_1h],
        start_of_training=data_start + timedelta(days=30),
        end_of_testing=data_end,
        ratio_training_testing_data=2/3,
        remodel_frequency=timedelta(days=14)  # re-train model every two weeks
    )

    solar_model1h = create_fitted_model(solar_model1h_specs, "Linear Regression Solar Horizon 1h")
    # solar_model1h is now a fitted OLS model object which can be pickled and re-used.
    # With the solar_model1h_specs in hand, your production code could always re-train a new one,
    # if the model has become outdated.

    # For data scientists: evaluate model
    evaluate_models(m1=ModelState(solar_model1h, solar_model1h_specs))

![Evaluation result](https://raw.githubusercontent.com/SeitaBV/timetomodel/master/img/solar-forecast-evaluation.png)
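
As a rough illustration of the two evaluation outputs named in the feature list (RMSE and cumulative error), the following sketch computes both for toy numbers. It assumes that cumulative error means the running sum of absolute errors; the definition used by `evaluate_models` may differ, and the function names here are hypothetical.

```python
import math
from itertools import accumulate


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def cumulative_abs_error(actual, forecast):
    """Running sum of absolute errors, one value per time step."""
    return list(accumulate(abs(a - f) for a, f in zip(actual, forecast)))


# Toy data, not taken from the solar example
actual = [0.0, 2.0, 4.0, 3.0]
forecast = [0.0, 1.0, 6.0, 3.0]

score = rmse(actual, forecast)
running = cumulative_abs_error(actual, forecast)
```

A flat stretch in the running sum marks a period where the forecast matched the data, while a steep stretch marks a period where errors piled up.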

    # For engineers a): Change data sources to use database (hinted)
    solar_model1h_specs.outcome_var = speccing.DBSeriesSpecs(query=...)
    solar_model1h_specs.regressors[0] = speccing.DBSeriesSpecs(query=...)

    # For engineers b): Use model to make forecasts for an hour
    forecasts, model_state = make_rolling_forecasts(
        start=datetime(2015, 11, 1, tzinfo=pytz.utc),
        end=datetime(2015, 11, 1, 1, tzinfo=pytz.utc),
        model_specs=solar_model1h_specs
    )
    # model_state might have re-trained a new model automatically, by honoring the remodel_frequency
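
The effect of `remodel_frequency` during rolling forecasts can be pictured with the sketch below. It only walks through the forecast times and flags when a re-train would be due; it illustrates the general technique and is not the code behind `make_rolling_forecasts`. The function `rolling_forecast_times`, its `last_trained` argument and the exclusive `end` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def rolling_forecast_times(start, end, frequency, remodel_frequency, last_trained):
    """Yield (time, retrain) pairs; retrain is True once the model is older than remodel_frequency."""
    current = start
    while current < end:
        retrain = current - last_trained >= remodel_frequency
        if retrain:
            # A real implementation would fit a new model here
            last_trained = current
        yield current, retrain
        current += frequency


steps = list(
    rolling_forecast_times(
        start=datetime(2015, 11, 1, tzinfo=timezone.utc),
        end=datetime(2015, 11, 1, 1, tzinfo=timezone.utc),
        frequency=timedelta(minutes=15),
        remodel_frequency=timedelta(days=14),
        # Assumed time of the last training run, exactly 14 days before 00:30
        last_trained=datetime(2015, 10, 18, 0, 30, tzinfo=timezone.utc),
    )
)
retrain_flags = [retrain for _, retrain in steps]
```

With these assumed inputs, the one-hour window contains four 15-minute steps, and the model becomes due for re-training at the third one.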




            
