# 🧭 dynamic-sarimax
---
**Delay-aware SARIMAX wrapper** that fixes the common pitfalls of `statsmodels.SARIMAX`:
proper lag alignment for exogenous variables, train-only scaling, and safe rolling-origin
evaluation, all built in.
---
## ✨ Why this exists
Plain SARIMAX requires you to hand-align exogenous regressors (e.g. lagged mobility, weather),
risking leakage or off-by-one bugs.
`dynamic-sarimax` makes this safe by construction.
**Key guarantees**
* ✅ For delay `b`, trains only on valid pairs `(y_t, x_{t-b})`; missing lags are never imputed.
* ✅ Scalers are fit *only on training windows* during CV.
* ✅ Forecasting refuses to run if required future exogenous rows are missing.
* ✅ Rolling-origin evaluation and AIC-based delay selection included.
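
To make the first guarantee concrete, here is a minimal pure-Python sketch of delay alignment. The helper name `align_with_delay` is hypothetical and is not part of the `dynamic-sarimax` API; it only illustrates pairing `y_t` with `x_{t-b}` and dropping the rows where the lag does not exist.

```python
def align_with_delay(y, x, b):
    """Return (y_t, x_{t-b}) pairs for every t where the lagged x exists.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if b < 0:
        raise ValueError("delay b must be non-negative")
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    # The first b targets have no x_{t-b}; they are dropped, not imputed.
    return [(y[t], x[t - b]) for t in range(b, len(y))]


pairs = align_with_delay([10, 11, 12, 13], [1, 2, 3, 4], b=2)
print(pairs)  # [(12, 1), (13, 2)]
```

With `b=2`, the first two targets are discarded because no `x` from two steps earlier exists for them.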
---
## 🚀 Quickstart
```bash
# create venv and install deps
poetry install
# run example (uses example CSV under examples/)
poetry run python examples/ili_quickstart.py
```
```python
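from dynamic_sarimax import (
    SarimaxConfig,
    select_delay_by_aic,
    rolling_evaluate,
)

# y, x: full target and exogenous series; y_train, x_train: their training portions.
# Load these yourself beforehand (e.g. from the example CSV under examples/).
cfg = SarimaxConfig(order=(5, 0, 2), seasonal_order=(1, 0, 0, 52))

# Pick the exogenous delay with the lowest AIC on the training data.
best_b, best_aic = select_delay_by_aic(y_train, x_train, delays=[1, 2, 3], cfg=cfg)
print(f"Best lag = {best_b} | AIC = {best_aic:.2f}")

# Rolling-origin evaluation on the held-out tail of the series.
res = rolling_evaluate(y, x, cfg, delay=best_b, horizons=24, train_frac=0.8)
print(res.head())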
```
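
The selection step in the quickstart boils down to "fit once per candidate delay, keep the lowest AIC". The sketch below shows that rule in isolation. `pick_delay_by_aic` and the `fit_aic` callable are hypothetical stand-ins, and the AIC numbers are placeholders rather than real results; how `select_delay_by_aic` fits each candidate internally is not shown here.

```python
def pick_delay_by_aic(delays, fit_aic):
    """Return (best_delay, best_aic) for the candidate with the lowest AIC.

    `fit_aic` is a hypothetical callable mapping a delay to the AIC of a
    model fitted with that delay.
    """
    scores = {b: fit_aic(b) for b in delays}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Placeholder AIC values standing in for real model fits.
fake_aic = {1: 1250.30, 2: 1234.56, 3: 1241.00}
best_b, best_aic = pick_delay_by_aic([1, 2, 3], fake_aic.__getitem__)
print(best_b, best_aic)
```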
---
## 📈 Example output
```
Chosen delay b (on 80% train): 2 | Train AIC: 1234.56

Per-horizon scores (rolling validation on last 20%):
 h  n_origins    MSE  sMAPE
 1         52  0.103   8.12
 2         51  0.109   8.54
...

Average MSE = 0.124
Average sMAPE = 8.77 %
```
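
The `n_origins` column shrinks by one per horizon because later origins run out of observed targets to score against. The sketch below reproduces that pattern under two assumptions that are not confirmed by the source: origins start at `int(n_obs * train_frac)`, and a forecast at horizon `h` from origin `o` targets index `o + h - 1`. The function name and the series length of 260 are illustrative only.

```python
def count_origins(n_obs, train_frac, horizons):
    """Count, per horizon, how many origins have an observed target to score."""
    first_origin = int(n_obs * train_frac)
    counts = {}
    for h in range(1, horizons + 1):
        # Origin o forecasts index o + h - 1, which must lie inside the series.
        counts[h] = sum(1 for o in range(first_origin, n_obs) if o + h - 1 < n_obs)
    return counts


print(count_origins(n_obs=260, train_frac=0.8, horizons=3))
# {1: 52, 2: 51, 3: 50}
```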
---
## ⚙️ Installation
```bash
pip install dynamic-sarimax
# or
poetry add dynamic-sarimax
```
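Python ≥ 3.10, tested on 3.10–3.12. Note that the 1.0.0 release metadata on PyPI declares `requires_python` as `>=3.12,<4.0`, so that release installs only on Python 3.12 or newer.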
---
## 🧩 Components
| Module | Purpose |
| :-------------- | :--------------------------------------------- |
| `config.py` | Parameter dataclasses for SARIMAX and lag spec |
| `features.py` | Safe lagging + scaling transformer |
| `model.py` | Wrapper around `statsmodels.SARIMAX` |
| `selection.py` | Delay (lag) selection via AIC |
| `evaluation.py` | Rolling-origin cross-validation (new v1.2) |
| `metrics.py` | MSE & sMAPE helpers |
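
As an illustration of the "safe scaling" idea behind `features.py`, the sketch below fits standardization statistics on the training window only and then applies them unchanged to held-out values. The function names are hypothetical and use only the standard library; the actual transformer in the package may differ.

```python
from statistics import mean, pstdev


def fit_scaler(train):
    """Estimate mean and standard deviation from the training window only."""
    mu = mean(train)
    sigma = pstdev(train) or 1.0  # guard against a constant training window
    return mu, sigma


def transform(values, mu, sigma):
    """Apply previously fitted statistics; never re-estimate on new data."""
    return [(v - mu) / sigma for v in values]


train, holdout = [2.0, 4.0, 6.0], [8.0]
mu, sigma = fit_scaler(train)
print(transform(train, mu, sigma))
print(transform(holdout, mu, sigma))  # scaled with train statistics only
```

Because the held-out value never enters `fit_scaler`, it cannot leak into the statistics used during training.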
---
## 🔁 Rolling validation: strategies & knobs
`rolling_evaluate` is the batteries-included, safe rolling-origin evaluator.
### **Signature**
```python
def rolling_evaluate(
    y, X, cfg,
    delay,                    # int or None
    horizons,                 # int > 0
    train_frac=0.8,
    min_train=30,
    *,
    # exogenous policy
    allow_future_exog=False,
    X_future_manual=None,
    # window strategy
    strategy="expanding",     # "expanding" | "sliding"
    window=None,              # required if strategy="sliding"
    refit_every=1,            # >1 = refit every k origins
    return_details=False,     # if True returns (agg, details), else agg
): ...
```
---
### 🧱 Strategies
| Strategy | Description |
| ------------- | -------------------------------------------------------------------------------------- |
| `"expanding"` | Default. Train on `[0..o-1]` for origin `o`. The training window grows over time. |
| `"sliding"`   | Train on last `window` observations `[o-window..o-1]`. `window` must be ≥ `min_train`. |
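
The two strategies differ only in where the training window starts. A minimal sketch of the index ranges from the table, using a hypothetical helper `train_range` that is not part of the package:

```python
def train_range(origin, strategy="expanding", window=None):
    """Return the training indices used for a given origin."""
    if strategy == "expanding":
        # Everything before the origin: [0 .. origin-1].
        return range(0, origin)
    if strategy == "sliding":
        if window is None:
            raise ValueError("window must be provided when strategy='sliding'")
        # Only the last `window` observations: [origin-window .. origin-1].
        return range(max(0, origin - window), origin)
    raise ValueError(f"unknown strategy: {strategy!r}")


print(list(train_range(5)))                               # [0, 1, 2, 3, 4]
print(list(train_range(5, strategy="sliding", window=3)))  # [2, 3, 4]
```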
---
### 🔁 Refitting cadence
| `refit_every` | Behavior |
| ------------- | ------------------------------------------------------------------ |
| `1` (default) | Refit at every origin (fully independent fits). |
| `k>1` | Refit every `k` origins; reuse parameters between refits. (Faster) |
> **Future v2 roadmap:** optional *state reconditioning* for partial re-use without full re-fit.
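
To see how much work `refit_every` saves, the sketch below marks which origins would trigger a full fit. It assumes the first origin always fits and every `k`-th origin after it refits; that schedule is an assumption for illustration, and `refit_schedule` is a hypothetical name.

```python
def refit_schedule(n_origins, refit_every=1):
    """Return one flag per origin: True where a full refit would happen."""
    if refit_every < 1:
        raise ValueError("refit_every must be >= 1")
    return [i % refit_every == 0 for i in range(n_origins)]


print(refit_schedule(6, refit_every=4))
# [True, False, False, False, True, False]
print(sum(refit_schedule(52, refit_every=4)), "fits instead of 52")
```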
---
### ⚖️ Exogenous policy (no-peek by default)
| Case | Behavior |
| -------------------------------------- | ------------------------------------------------------------------------------------------------- |
| `delay=None` | Univariate SARIMAX; forecasts all `horizons`. |
| `delay=int`, `allow_future_exog=False` | Evaluate at most `steps_eff = min(horizons, delay)` per origin, which prevents future-X leakage.  |
| `delay=int`, `allow_future_exog=True` | Requires passing `X_future_manual` with the same columns as `X`. Allows full-horizon forecasting. |
> If `delay=0` and `allow_future_exog=False`, no valid horizon exists, so a `RuntimeError` is raised (explicitly, to prevent silent misuse).
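
The horizon cap follows from the alignment: forecasting `y_{T+h}` needs `x_{T+h-b}`, which has already been observed only when `h <= b`. The sketch below encodes the three rows of the table; `effective_steps` is a hypothetical helper, not the library's code.

```python
def effective_steps(horizons, delay, allow_future_exog=False):
    """Number of horizons that can be evaluated per origin without peeking."""
    if horizons <= 0:
        raise ValueError("horizons must be positive")
    if delay is None or allow_future_exog:
        # Univariate model, or the caller supplies future X explicitly.
        return horizons
    # x_{T+h-delay} is observed only while h <= delay.
    return min(horizons, delay)


print(effective_steps(12, delay=None))                       # 12
print(effective_steps(12, delay=2))                          # 2
print(effective_steps(12, delay=2, allow_future_exog=True))  # 12
print(effective_steps(12, delay=0))                          # 0, nothing to evaluate
```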
---
### 📤 Return values
| Mode | Description |
| -------------------------- | ----------------------------------------------------------------------------------------- |
| Default | Returns aggregate DataFrame (`agg`) with columns `["h", "n_origins", "MSE", "sMAPE"]`. |
| With `return_details=True` | Returns tuple `(agg, details)`, where `details` has `["origin", "h", "y_true", "y_hat"]`. |
`agg.attrs` always contains:
```python
{
"macro_MSE": float,
"macro_sMAPE": float
}
```
---
## 🧪 Usage patterns
### 1️⃣ Univariate (default expanding window)
```python
cfg = SarimaxConfig(order=(2,0,1), seasonal_order=(0,0,0,0))
agg = rolling_evaluate(y, X=None, cfg=cfg, delay=None, horizons=12, train_frac=0.8)
```
### 2️⃣ With exogenous (no-peek, delay-limited)
```python
cfg = SarimaxConfig(order=(1,0,1), seasonal_order=(0,0,0,0))
agg = rolling_evaluate(y, X, cfg, delay=2, horizons=12, allow_future_exog=False)
# => Evaluates only h=1..2 per origin
```
### 3️⃣ With exogenous (opt-in future X)
```python
import pandas as pd

# Future exogenous block; must have the same columns as X.
X_future_manual = pd.DataFrame({...})
agg = rolling_evaluate(
    y, X, cfg,
    delay=2, horizons=12,
    allow_future_exog=True,
    X_future_manual=X_future_manual,
)
```
### 4️⃣ Sliding window with refit cadence
```python
agg = rolling_evaluate(
    y, X, cfg,
    delay=1, horizons=6,
    strategy="sliding",
    window=96,
    refit_every=4,
)
```
### 5️⃣ Detailed results for plotting
```python
agg, details = rolling_evaluate(
    y, X=None, cfg=cfg,
    delay=None, horizons=8,
    return_details=True,
)
# details has origin, h, y_true, y_hat
```
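
The `details` rows can be rolled up into per-horizon scores by hand, which is handy for custom plots. The sketch below works on plain dictionaries with the documented column names. The sMAPE formula (percent, with the mean of absolute values in the denominator) is an assumed, commonly used definition; the package's `metrics.py` may use a different variant. The sample rows are made up.

```python
from collections import defaultdict


def smape(y_true, y_hat):
    """Symmetric MAPE in percent (assumed definition, see lead-in)."""
    terms = []
    for a, f in zip(y_true, y_hat):
        denom = (abs(a) + abs(f)) / 2
        terms.append(0.0 if denom == 0 else abs(a - f) / denom)
    return 100 * sum(terms) / len(terms)


def aggregate(details):
    """Group detail rows by horizon and compute n_origins, MSE and sMAPE."""
    by_h = defaultdict(list)
    for row in details:
        by_h[row["h"]].append((row["y_true"], row["y_hat"]))
    out = []
    for h in sorted(by_h):
        truth, pred = zip(*by_h[h])
        mse = sum((a - f) ** 2 for a, f in zip(truth, pred)) / len(truth)
        out.append({"h": h, "n_origins": len(truth),
                    "MSE": mse, "sMAPE": smape(truth, pred)})
    return out


# Made-up rows in the documented layout: origin, h, y_true, y_hat.
sample_details = [
    {"origin": 10, "h": 1, "y_true": 100.0, "y_hat": 110.0},
    {"origin": 11, "h": 1, "y_true": 100.0, "y_hat": 90.0},
    {"origin": 10, "h": 2, "y_true": 50.0, "y_hat": 50.0},
]
agg_rows = aggregate(sample_details)
for row in agg_rows:
    print(row)
```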
---
## ⚠️ Common errors (by design)
| Error | Reason |
| ---------------------------------------------------------------------------- | ------------------------------------------------- |
| `ValueError("horizons must be positive")` | Invalid `horizons`. |
| `ValueError("window must be provided when strategy='sliding'")` | Missing window for sliding mode. |
| `ValueError("allow_future_exog=True but X_future_manual was not provided.")` | Required future exog missing. |
| `ValueError("Exogenous columns mismatch...")` | Column mismatch between X and X_future_manual. |
| `RuntimeError("No evaluations produced...")` | All origins skipped (e.g., delay=0 with no-peek). |
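
The argument checks in the table can be pictured as a small up-front validation step. The sketch below mirrors the first three messages; `validate_args` and `error_message` are hypothetical helpers written for illustration, and the order in which the real function performs its checks is an assumption.

```python
def validate_args(horizons, strategy="expanding", window=None,
                  allow_future_exog=False, X_future_manual=None):
    """Raise ValueError with the documented messages for invalid inputs."""
    if horizons <= 0:
        raise ValueError("horizons must be positive")
    if strategy == "sliding" and window is None:
        raise ValueError("window must be provided when strategy='sliding'")
    if allow_future_exog and X_future_manual is None:
        raise ValueError(
            "allow_future_exog=True but X_future_manual was not provided."
        )


def error_message(**kwargs):
    """Return the validation error text, or None when the inputs pass."""
    try:
        validate_args(**kwargs)
    except ValueError as exc:
        return str(exc)
    return None


print(error_message(horizons=0))
print(error_message(horizons=6, strategy="sliding"))
print(error_message(horizons=6, allow_future_exog=True))
print(error_message(horizons=6, strategy="sliding", window=96))  # None
```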
---
## 📊 Example: Comparing rolling strategies
```python
cfg = SarimaxConfig(order=(2,0,1), seasonal_order=(0,0,0,0))
agg1 = rolling_evaluate(y, X, cfg, delay=1, horizons=6, strategy="expanding")
agg2 = rolling_evaluate(y, X, cfg, delay=1, horizons=6, strategy="sliding", window=80)
agg3 = rolling_evaluate(y, X, cfg, delay=1, horizons=6, strategy="expanding", refit_every=4)
```
Plot macro averages or per-horizon curves to compare trade-offs between accuracy and runtime.
---
## 🧯 Testing
```bash
poetry run pytest -q
```
Comprehensive tests cover:
* expanding vs sliding windows
* refit cadence (`refit_every`)
* no-peek & future-exog modes
* input validation and error cases
* optional return-details branch
---
## 🗺️ Roadmap (v2)
* **State reconditioning** between refits (partial parameter reuse).
* **Parallel rolling origins** for large datasets.
* **Custom metric hooks** and progress callbacks.
---
## 🪞 Project links
* [Repository](https://github.com/NefariousNiru/dynamic-sarimax)
* [Contributing guide](https://github.com/NefariousNiru/dynamic-sarimax/blob/master/CONTRIBUTING.md)
* [License](https://github.com/NefariousNiru/dynamic-sarimax/blob/master/LICENSE)
* [Issues](https://github.com/NefariousNiru/dynamic-sarimax/issues)
* [PyPI package](https://pypi.org/project/dynamic-sarimax/)
---
## 📜 License
Apache-2.0 © 2025 **Nirupom Bose Roy**
Contributions welcome!
Raw data
{
"_id": null,
"home_page": "https://github.com/NefariousNiru/dynamic-sarimax",
"name": "dynamic-sarimax",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.12",
"maintainer_email": null,
"keywords": "time-series, sarimax, arima, forecasting, exogenous",
"author": "Nirupom Bose Roy",
"author_email": "nirupomboseroy@uga.edu",
"download_url": "https://files.pythonhosted.org/packages/36/36/92c31f6932081432bc0df88b8c7f6a2bf6109887b0ddb7723f8282ac7e67/dynamic_sarimax-1.0.0.tar.gz",
"platform": null,
"description": "(README text, identical to the document above)",
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Safe, delay-aware SARIMAX with rolling evaluation and AIC-based lag selection",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/NefariousNiru/dynamic-sarimax",
"Repository": "https://github.com/NefariousNiru/dynamic-sarimax"
},
"split_keywords": [
"time-series",
" sarimax",
" arima",
" forecasting",
" exogenous"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9ef7cddd58fa3812bf81506b2fc0ba33a180d1fb472cc2722a10d60e6afa237d",
"md5": "e18138b49293ea1cb404b71cd94a7fff",
"sha256": "299b16dbe9be10e824206098e93ef8c59e929d9c5e847a7b16304bf889a598c2"
},
"downloads": -1,
"filename": "dynamic_sarimax-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e18138b49293ea1cb404b71cd94a7fff",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.12",
"size": 21983,
"upload_time": "2025-10-10T02:26:20",
"upload_time_iso_8601": "2025-10-10T02:26:20.039188Z",
"url": "https://files.pythonhosted.org/packages/9e/f7/cddd58fa3812bf81506b2fc0ba33a180d1fb472cc2722a10d60e6afa237d/dynamic_sarimax-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "363692c31f6932081432bc0df88b8c7f6a2bf6109887b0ddb7723f8282ac7e67",
"md5": "b8e769942a1647da5463f5bd1b9cddac",
"sha256": "f8b66ecf46756d3a778f76adff6f96e20ef6a3b220805c0e80021361b2ec99bb"
},
"downloads": -1,
"filename": "dynamic_sarimax-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "b8e769942a1647da5463f5bd1b9cddac",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.12",
"size": 21111,
"upload_time": "2025-10-10T02:26:21",
"upload_time_iso_8601": "2025-10-10T02:26:21.574751Z",
"url": "https://files.pythonhosted.org/packages/36/36/92c31f6932081432bc0df88b8c7f6a2bf6109887b0ddb7723f8282ac7e67/dynamic_sarimax-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-10 02:26:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "NefariousNiru",
"github_project": "dynamic-sarimax",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "dynamic-sarimax"
}