apdtflow

| Field | Value |
|-------|-------|
| Name | apdtflow |
| Version | 0.2.1 |
| Home page | https://github.com/yotambraun/APDTFlow |
| Summary | APDTFlow: A modular forecasting framework for time series data |
| Upload time | 2025-10-18 11:14:32 |
| Maintainer | None |
| Docs URL | None |
| Author | Yotam Barun |
| Requires Python | >=3.10 |
| License | MIT |
| Keywords | time-series, forecasting, neural-ode, deep-learning, transformer, tcn, timeseries-analysis, time-series-forecasting, neural-differential-equations, probabilistic-forecasting, uncertainty-quantification, multi-scale-decomposition, pytorch, machine-learning |
| Requirements | torch, torchdiffeq, pandas, numpy, matplotlib, tensorboard, pyyaml, scikit-learn, statsmodels |

# APDTFlow: A Modular Forecasting Framework for Time Series Data
<p align="center">
  <img src="assets/images/my_logo_framework.png" alt="APDTFlow Logo" width="300">
</p>

[![PyPI version](https://img.shields.io/pypi/v/apdtflow.svg)](https://pypi.org/project/apdtflow)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Downloads](https://pepy.tech/badge/apdtflow)](https://pepy.tech/project/apdtflow)
[![Python Versions](https://img.shields.io/pypi/pyversions/apdtflow.svg)](https://pypi.org/project/apdtflow/)
[![CI](https://github.com/yotambraun/APDTFlow/actions/workflows/ci.yml/badge.svg)](https://github.com/yotambraun/APDTFlow/actions/workflows/ci.yml)
[![Coverage](https://codecov.io/gh/yotambraun/APDTFlow/branch/main/graph/badge.svg)](https://codecov.io/gh/yotambraun/APDTFlow)


APDTFlow is a modern and extensible forecasting framework for time series data that leverages advanced techniques including neural ordinary differential equations (Neural ODEs), transformer-based components, and probabilistic modeling. Its modular design allows researchers and practitioners to experiment with multiple forecasting models and easily extend the framework for new methods.

## 🚀 New in v0.2.0: Advanced Features!

APDTFlow v0.2.0 introduces **cutting-edge forecasting capabilities** based on 2025 research:

### 🌟 Exogenous Variables Support
**Boost accuracy by 30-50%** with external features:

```python
import pandas as pd
from apdtflow import APDTFlowForecaster

# Use external features like temperature, holidays, promotions
model = APDTFlowForecaster(
    forecast_horizon=14,
    exog_fusion_type='gated'  # 3 fusion strategies available
)

model.fit(
    df,
    target_col='sales',
    date_col='date',
    exog_cols=['temperature', 'is_holiday', 'promotion'],
    future_exog_cols=['is_holiday', 'promotion']  # Known in advance
)

# Predict with future exogenous data
future_exog = pd.DataFrame({'is_holiday': [0, 1, ...], 'promotion': [1, 0, ...]})
predictions = model.predict(exog_future=future_exog)
```

### 📊 Conformal Prediction
**Rigorous uncertainty quantification** with coverage guarantees:

```python
from apdtflow import APDTFlowForecaster

model = APDTFlowForecaster(
    forecast_horizon=14,
    use_conformal=True,  # Enable conformal prediction
    conformal_method='adaptive'  # Adapts to changing data
)

model.fit(df, target_col='sales')

# Get calibrated 95% prediction intervals
lower, pred, upper = model.predict(
    alpha=0.05,  # 95% coverage guarantee
    return_intervals='conformal'
)
```
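
For intuition about what a conformal interval is, here is a minimal sketch of generic split conformal prediction in plain Python. It is not APDTFlow's implementation and does not reproduce the `'adaptive'` method; the helper name `split_conformal_interval` and the toy calibration data are assumptions made for illustration.

```python
import math


def split_conformal_interval(calib_true, calib_pred, new_pred, alpha=0.05):
    """Symmetric split-conformal interval around a point forecast.

    Hypothetical helper for illustration; not part of the APDTFlow API.
    """
    # Nonconformity scores: absolute residuals on a held-out calibration set.
    scores = sorted(abs(y - p) for y, p in zip(calib_true, calib_pred))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        q = math.inf
    else:
        q = scores[rank - 1]
    return new_pred - q, new_pred, new_pred + q


# Toy calibration data (made up for this example).
calib_true = [10.0, 12.0, 11.0, 13.0, 12.5, 11.5, 10.5, 12.0, 13.5]
calib_pred = [10.5, 11.0, 11.2, 12.0, 12.5, 12.5, 10.0, 12.3, 13.0]

lower, pred, upper = split_conformal_interval(calib_true, calib_pred, 12.0, alpha=0.2)
print(lower, pred, upper)
```

The coverage guarantee of this basic variant relies on the calibration and test points being exchangeable, which time series often violate; adaptive variants exist to address that.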

### ⚡ Simple & Powerful API
**5 lines to production-ready forecasts:**

```python
from apdtflow import APDTFlowForecaster

model = APDTFlowForecaster(forecast_horizon=14)
model.fit(df, target_col='sales', date_col='date')
predictions = model.predict()
```

### Why APDTFlow?

| Feature | APDTFlow v0.2.0 | Other Libraries |
|---------|----------|-----------------|
| **Neural ODEs** | ✅ Continuous-time modeling | ❌ Discrete-time only |
| **Exogenous Variables** | ✅ 3 fusion strategies | ⚠️ Limited |
| **Conformal Prediction** | ✅ Rigorous uncertainty | ❌ Not available |
| **Multi-Scale Decomposition** | ✅ Trends + seasonality | ⚠️ Limited |
| **Simple `fit()/predict()` API** | ✅ 5 lines of code | ⚠️ Varies |
| **Multiple Architectures** | ✅ ODE/Transformer/TCN/Ensemble | ⚠️ Usually one |
| **PyTorch-based** | ✅ GPU acceleration | ✅ Most |

![APDTFlow Forecast](https://github.com/yotambraun/APDTFlow/blob/main/experiments/results_plots/APDTFlow_Forecast_Horizon_14_CV2.png)

## Experiment Results
In our large-scale experiment, we compared multiple forecasting models across different forecast horizons using 3-fold cross‑validation. For brevity, we show three key plots below:

1. **Validation Loss Comparison:** A bar plot comparing the average validation losses of the models (APDTFlow, TransformerForecaster, TCNForecaster, and EnsembleForecaster) across forecast horizons.
2. **Performance vs. Forecast Horizon:** A line plot of each model's average validation loss as the forecast horizon increases.
3. **Example Forecast (Horizon 7, CV Split 3):** A forecast plot for the APDTFlow model for a 7-step forecast from CV split 3.

## Validation Loss Comparison

The bar plot below summarizes the average validation losses (lower is better) for the different models across the forecast horizons (7, 10, and 30 time steps):

![Validation Loss Comparison](experiments/results_plots/Validation_Loss_Comparison.png)

**Discussion:**  
The plot shows that the APDTFlow model (and, in some cases, the ensemble) generally achieves lower validation losses than the other models, particularly as the forecast horizon increases. This suggests that its multi-scale decomposition and neural ODE dynamics are well suited to capturing the trends and seasonal patterns in the dataset.

## Performance vs. Forecast Horizon

The following line plot illustrates how the performance (average validation loss) of each model changes with different forecast horizons. This visualization helps to assess which models maintain consistent performance as the forecast horizon increases.

![Performance vs. Horizon](experiments/results_plots/Performance_vs_Horizon.png)

**Discussion:**  
The line plot reveals the trend in model performance across forecast horizons. It helps us understand which models degrade gracefully (or even improve) as the forecast horizon lengthens.

## Example Forecast (Horizon 7, CV Split 3)

Below is an example forecast produced by the APDTFlow model for a forecast horizon of 7 time steps on the third cross-validation split.

![APDTFlow Forecast Horizon 7, CV3](experiments/results_plots/APDTFlow_Forecast_Horizon_7_CV3.png)

**Discussion:** 
- **Input Sequence (Blue):** The historical data (last 30 time steps) used as input.
- **True Future (Dashed Orange):** The actual future values for the next 7 time steps.
- **Predicted Future (Dotted Line):** The forecast generated by the model.

---

*For a detailed explanation, more plots, and **additional analysis of these results**, please see our [Experiment Results and Analysis](docs/experiment_results.md) document.*


## Table of Contents

1. [Installation](#1-installation)
2. [Quick Start](#2-quick-start)
   - [Simple API](#simple-api-recommended-for-most-users)
   - [Advanced API](#advanced-api-for-custom-workflows)
   - [Inference](#inference)
3. [Data Processing and Augmentation](#3-data-processing-and-augmentation)
4. [Forecasting Approaches](#4-forecasting-approaches)
   - [APDTFlow](#apdtflow)
   - [TransformerForecaster](#transformerforecaster)
   - [TCNForecaster](#tcnforecaster)
   - [EnsembleForecaster](#ensembleforecaster)
5. [Evaluation and Metrics](#5-evaluation-and-metrics)
6. [Command-Line Interface (CLI)](#6-command-line-interface-cli)
7. [Cross-Validation Strategies](#7-cross-validation-strategies)
8. [Documentation and Examples](#8-documentation-and-examples)
9. [License](#9-license)

## 1. Installation

APDTFlow is published on [PyPI](https://pypi.org/project/apdtflow). To install:

```bash
pip install apdtflow
```
For development, clone the repository and install in editable mode:

```bash
git clone https://github.com/yotambraun/APDTFlow.git
cd APDTFlow
pip install -e .
```
## ✨ New Features in v0.1.24

### 🚀 Easy-to-Use High-Level API (MAJOR UPDATE!)
- **Simple `fit()`/`predict()` interface** - No more complex DataLoaders or manual preprocessing
- **Works directly with pandas DataFrames** - Natural integration with your workflow
- **Automatic normalization and preprocessing** - Just pass your data and go
- **Built-in visualization** - `plot_forecast()` with uncertainty bands
- **Multiple model types in one API** - Switch between ODE/Transformer/TCN/Ensemble with one parameter

### 📈 Enhanced Package Discoverability
- **Comprehensive PyPI keywords** - Better search visibility for time-series, forecasting, neural-ode, etc.
- **Improved classifiers** - Clearer package categorization
- **CHANGELOG.md** - Track all improvements and updates

### Previous Features (v0.1.23)

- **Learnable Time Series Embedding:**
  APDTFlow includes a `TimeSeriesEmbedding` module that learns to encode temporal information using gated residual networks. This module processes both raw time indices and periodic signals (and optionally calendar features) to produce a rich embedding that improves the subsequent forecasting performance.

- **Configuration Options:**
  In `apdtflow/config/config.yaml`, you can specify:
  - `use_embedding`: Set to `true` to enable the new embedding.
  - `embed_dim`: The embedding dimension (recommended to match `hidden_dim`).

## 2. Quick Start

### Simple API (Recommended for Most Users)

The easiest way to get started with APDTFlow:

```python
import pandas as pd
from apdtflow import APDTFlowForecaster

# Load your time series data
df = pd.read_csv("dataset_examples/Electric_Production.csv", parse_dates=['DATE'])

# Create and train the forecaster
model = APDTFlowForecaster(
    forecast_horizon=14,     # Predict 14 steps ahead
    history_length=30,       # Use 30 historical points
    num_epochs=50           # Training epochs
)

# Fit the model (handles preprocessing automatically)
model.fit(df, target_col='IPG2211A2N', date_col='DATE')

# Make predictions with uncertainty estimates
predictions, uncertainty = model.predict(return_uncertainty=True)

# Visualize the forecast
model.plot_forecast(with_history=100, show_uncertainty=True)
```

**Try different models** by changing `model_type`:
```python
# Use Transformer instead of Neural ODE
model = APDTFlowForecaster(model_type='transformer', forecast_horizon=14)

# Or try Temporal Convolutional Network
model = APDTFlowForecaster(model_type='tcn', forecast_horizon=14)

# Or use Ensemble for maximum robustness
model = APDTFlowForecaster(model_type='ensemble', forecast_horizon=14)
```

### Advanced API (For Custom Workflows)

For advanced users who need more control:

```python
import torch
from torch.utils.data import DataLoader
from apdtflow.data import TimeSeriesWindowDataset
from apdtflow.models.apdtflow import APDTFlow

csv_file = "dataset_examples/Electric_Production.csv"
dataset = TimeSeriesWindowDataset(csv_file, date_col="DATE", value_col="IPG2211A2N", T_in=12, T_out=3)
train_loader = DataLoader(dataset, batch_size=16, shuffle=True)

model = APDTFlow(
    num_scales=3,
    input_channels=1,
    filter_size=5,
    hidden_dim=16,
    output_dim=1,
    forecast_horizon=3,
    use_embedding=True
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

model.train_model(
    train_loader=train_loader,
    num_epochs=15,
    learning_rate=0.001,
    device=device
)
```

### Inference
Use the following example to run inference on new data:

```python
import torch
from torch.utils.data import DataLoader
from apdtflow.data import TimeSeriesWindowDataset
from apdtflow.models.apdtflow import APDTFlow

test_dataset = TimeSeriesWindowDataset(
    csv_file="path/to/dataset.csv",
    date_col="DATE",
    value_col="VALUE",
    T_in=12,
    T_out=3
)
test_loader = DataLoader(test_dataset, batch_size=16, shuffle=False)

model = APDTFlow(
    num_scales=3,
    input_channels=1,
    filter_size=5,
    hidden_dim=16,
    output_dim=1,
    forecast_horizon=3,
    use_embedding=True
)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

checkpoint_path = "path/to/checkpoint.pt"
model.load_state_dict(torch.load(checkpoint_path, map_location=device))

metrics = model.evaluate(test_loader, device, metrics=["MSE", "MAE", "RMSE", "MAPE"])
print("Evaluation Metrics:", metrics)
```

## 3. Data Processing and Augmentation
APDTFlow provides robust functions to process and augment your time series data. Key features include:
* **Date Conversion:** Automatically converts date columns to **datetime** objects.
* **Gap Filling:** Reindexes data to ensure a consistent time frequency.
* **Missing Value Imputation:** Supports methods such as forward-fill, backward-fill, mean substitution, and interpolation.
* **Feature Engineering:** Generates lag features and rolling statistics to enhance predictive performance.
* **Data Augmentation:** Offers techniques like jittering, scaling, and time warping to improve model robustness.
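
The steps listed above can be sketched with plain pandas and NumPy. The README does not name APDTFlow's own preprocessing functions, so this example deliberately uses only generic library calls; the toy data and variable names are assumptions.

```python
import numpy as np
import pandas as pd

# Toy daily series with one missing date (2024-01-03) and one missing value.
raw = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05", "2024-01-06"],
    "value": [10.0, 12.0, None, 18.0, 20.0],
})

# Date conversion: parse strings into datetime objects.
raw["date"] = pd.to_datetime(raw["date"])

# Gap filling: reindex to a regular daily frequency (missing dates become NaN).
series = raw.set_index("date")["value"].asfreq("D")

# Missing value imputation: linear interpolation between known neighbours.
filled = series.interpolate(method="linear")

# Feature engineering: a one-step lag and a 3-step rolling mean.
features = pd.DataFrame({
    "value": filled,
    "lag_1": filled.shift(1),
    "rolling_mean_3": filled.rolling(window=3).mean(),
})

# Data augmentation: jittering adds small Gaussian noise to the series.
rng = np.random.default_rng(seed=0)
jittered = filled + rng.normal(loc=0.0, scale=0.1, size=len(filled))

print(features)
```

Forward-fill (`ffill()`), backward-fill (`bfill()`), or mean substitution (`fillna(series.mean())`) can be swapped in for the interpolation step, depending on how the gaps arise.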

---

## 4. Forecasting Approaches
APDTFlow includes several advanced forecasting strategies:

### APDTFlow
The **APDTFlow** model integrates:
- **Multi-Scale Decomposition:** Decomposes the input signal into multiple resolutions.
- **Neural ODE Dynamics:** Models continuous latent state evolution using Neural ODEs.
- **Probabilistic Fusion:** Merges latent representations while quantifying uncertainty.
- **Transformer-Based Decoding:** Generates forecasts using a time-aware attention mechanism.

Key parameters include **T_in**, **T_out**, **num_scales**, **filter_size**, **hidden_dim**, and **forecast_horizon**.

### TransformerForecaster
Leverages the Transformer architecture to capture long-range dependencies using self‑attention. This approach is ideal for complex temporal patterns where context from many time steps is relevant.

### TCNForecaster
Based on Temporal Convolutional Networks, the TCNForecaster uses dilated convolutions and residual connections to efficiently capture local and medium-range dependencies.
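
To see why dilated convolutions reach medium-range dependencies with few layers, the sketch below computes the receptive field of a stack of dilated causal convolutions. It assumes dilation `2**i` at level `i` and a configurable number of convolutions per residual block; these are common TCN conventions, not confirmed details of `TCNForecaster`, and the function name is hypothetical.

```python
def tcn_receptive_field(kernel_size, num_levels, convs_per_level=2):
    """Receptive field (in time steps) of stacked dilated causal convolutions.

    Assumes dilation 2**level at each level; illustrative only, not
    TCNForecaster's actual configuration.
    """
    receptive_field = 1
    for level in range(num_levels):
        dilation = 2 ** level
        # Each convolution widens the field by (kernel_size - 1) * dilation.
        receptive_field += convs_per_level * (kernel_size - 1) * dilation
    return receptive_field


print(tcn_receptive_field(kernel_size=3, num_levels=4))                      # 61
print(tcn_receptive_field(kernel_size=3, num_levels=4, convs_per_level=1))   # 31
```

Because the dilation doubles at each level, the receptive field grows exponentially with depth, while the parameter count grows only linearly.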

### EnsembleForecaster
Combines predictions from multiple forecasting models (such as APDTFlow, TransformerForecaster, and TCNForecaster) using aggregation strategies (e.g., weighted averaging) to improve overall forecast robustness and accuracy.
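
As an illustration of weighted averaging, the following sketch combines per-model forecasts step by step. The function `weighted_ensemble`, the weights, and the forecast values are assumptions for this example and do not reflect how `EnsembleForecaster` is implemented.

```python
def weighted_ensemble(forecasts, weights):
    """Combine per-model forecasts with a normalized weighted average.

    Hypothetical helper; `forecasts` is a list of equal-length lists,
    one per model, and `weights` holds one non-negative weight per model.
    """
    if len(forecasts) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    horizon = len(forecasts[0])
    combined = []
    for step in range(horizon):
        weighted_sum = sum(w * f[step] for w, f in zip(weights, forecasts))
        combined.append(weighted_sum / total)
    return combined


# Made-up 3-step forecasts from three models.
forecasts = [
    [100.0, 102.0, 104.0],
    [98.0, 101.0, 106.0],
    [102.0, 103.0, 102.0],
]
weights = [2.0, 1.0, 1.0]  # the first model counts twice as much

combined = weighted_ensemble(forecasts, weights)
print(combined)  # [100.0, 102.0, 104.0]
```

One plausible way to choose the weights is in inverse proportion to each model's validation loss, so that models with lower loss contribute more.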

#### Core Model Parameters Explained:
For a comprehensive description of each model's architecture and additional details, please see the [Model Architectures Documentation](docs/models.md).
When configuring APDTFlow, several parameters play key roles in how the model processes and forecasts time series data. Here’s what they mean:

* **T_in (Input Sequence Length):** This parameter specifies the number of past time steps the model will use as input. For example, if T_in=12, the model will use the previous 12 observations to make a forecast.
* **T_out (Forecast Horizon):** This parameter defines the number of future time steps to predict. For instance, if T_out=3, the model will output predictions for the next 3 time steps.
* **num_scales:** APDTFlow employs a multi-scale decomposition technique to capture both global and local trends in the data. The num_scales parameter determines how many scales (or resolutions) the input signal will be decomposed into. A higher number of scales may allow the model to capture more complex temporal patterns, but it could also increase computational complexity.
* **filter_size:** This parameter is used in the convolutional component (or dynamic convolution) within the model’s decomposer module. It defines the size of the convolutional filter applied to the input signal, thereby affecting the receptive field. A larger filter size allows the model to consider a broader context in the time series but may smooth out finer details.
* **forecast_horizon:** This parameter is used within the model to indicate the number of future time steps that the decoder will produce. It should match T_out to ensure consistency between the training data and the model's output.
* **hidden_dim:** The size of the hidden state in the dynamics module and decoder. This parameter controls the capacity of the model to learn complex representations. Increasing hidden_dim may improve the model’s performance, but at the cost of additional computational resources and potential overfitting if not tuned properly.
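
The relationship between `T_in` and `T_out` can be made concrete with a sliding-window sketch. It assumes a stride of one time step; the README does not specify how `TimeSeriesWindowDataset` builds its windows, so `make_windows` is a hypothetical stand-in.

```python
def make_windows(series, T_in, T_out):
    """Slice a series into (input, target) pairs with a stride of 1.

    Hypothetical helper; each input has T_in points and each target has
    the T_out points that immediately follow it.
    """
    windows = []
    last_start = len(series) - T_in - T_out
    for start in range(last_start + 1):
        x = series[start:start + T_in]
        y = series[start + T_in:start + T_in + T_out]
        windows.append((x, y))
    return windows


series = list(range(20))  # toy series: 0, 1, ..., 19
windows = make_windows(series, T_in=12, T_out=3)

print(len(windows))   # 6
print(windows[0][1])  # [12, 13, 14]
```

A series of length `N` yields `N - T_in - T_out + 1` windows under this scheme, so short series with long horizons leave few training samples.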

---

## 5. Evaluation and Metrics
APDTFlow incorporates a flexible evaluation framework that supports several performance metrics including:
- **Mean Squared Error (MSE)**
- **Mean Absolute Error (MAE)**
- **Root Mean Squared Error (RMSE)**
- **Mean Absolute Percentage Error (MAPE)**

These metrics are computed via a dedicated evaluator, which can be extended with additional metrics as needed.

**Usage Example:**
```python
from apdtflow.evaluation.regression_evaluator import RegressionEvaluator

evaluator = RegressionEvaluator(metrics=["MSE", "MAE", "RMSE", "MAPE"])
results = evaluator.evaluate(predictions, targets)
print("MSE:", results["MSE"], "MAE:", results["MAE"], "RMSE:", results["RMSE"], "MAPE:", results["MAPE"])
```
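
For reference, the four metrics can be computed by hand as in the sketch below. This follows the standard textbook definitions rather than the internals of `RegressionEvaluator`; in particular, reporting MAPE as a percentage and skipping zero targets are assumptions of this example.

```python
import math


def regression_metrics(predictions, targets):
    """Compute MSE, MAE, RMSE, and MAPE from two equal-length sequences."""
    n = len(targets)
    errors = [p - t for p, t in zip(predictions, targets)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when a target is zero; such points are skipped here.
    pct = [abs(e / t) for e, t in zip(errors, targets) if t != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "MAPE": mape}


manual_results = regression_metrics(
    predictions=[110.0, 190.0, 330.0, 400.0],
    targets=[100.0, 200.0, 300.0, 400.0],
)
print(manual_results)
```

MSE and RMSE penalize large errors more heavily than MAE, while MAPE is scale-free but unstable when targets are close to zero.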

---

## 6. Command-Line Interface (CLI)
For ease of use, APDTFlow provides a command‑line interface that allows you to run training, evaluation, and inference directly from the terminal. The CLI accepts various parameters to configure the forecasting process without modifying the code.
#### Available Commands:
- **`apdtflow train`:** Train a forecasting model.
- **`apdtflow infer`:** Run inference using a saved checkpoint.

#### Example Usage:
```bash
# Train a model (using the learnable embedding, which is enabled by default)
apdtflow train --csv_file path/to/dataset.csv --date_col DATE --value_col VALUE --T_in 12 --T_out 3 --num_scales 3 --filter_size 5 --hidden_dim 16 --batch_size 16 --learning_rate 0.001 --num_epochs 15 --checkpoint_dir ./checkpoints

# Alternatively, disable the learnable embedding by adding the flag:
apdtflow train --csv_file path/to/dataset.csv --date_col DATE --value_col VALUE --T_in 12 --T_out 3 --num_scales 3 --filter_size 5 --hidden_dim 16 --batch_size 16 --learning_rate 0.001 --num_epochs 15 --checkpoint_dir ./checkpoints --no_embedding

# Run inference (ensure that the embedding setting matches what was used during training)
apdtflow infer --csv_file path/to/dataset.csv --date_col DATE --value_col VALUE --T_in 12 --T_out 3 --checkpoint_path ./checkpoints/APDTFlow_checkpoint.pt --batch_size 16
```

---

## 7. Cross-Validation Strategies
To ensure robust forecasting evaluation, APDTFlow includes a Cross-Validation Factory that supports:
- **Rolling Splits:** Moving training and validation windows over time.
- **Expanding Splits:** Increasing training window size while keeping validation size constant.
- **Blocked Splits:** Dividing the dataset into contiguous blocks.

#### Usage Example:
```python
from apdtflow.cv_factory import TimeSeriesCVFactory
from torch.utils.data import Dataset

class SampleDataset(Dataset):
    def __init__(self, length=100):
        self.data = list(range(length))
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        return self.data[idx]

dataset = SampleDataset()
cv_factory = TimeSeriesCVFactory(dataset, method="rolling", train_size=40, val_size=10, step_size=10)
splits = cv_factory.get_splits()
print("Cross-Validation Splits:", splits)
```
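
To show how rolling and expanding splits differ, the sketch below generates index lists for both. It is not the logic or return format of `TimeSeriesCVFactory`; the function `time_series_splits` is a hypothetical illustration that reuses the same `train_size`, `val_size`, and `step_size` values as the example above.

```python
def time_series_splits(n_samples, train_size, val_size, step_size, method="rolling"):
    """Return a list of (train_indices, val_indices) pairs.

    rolling:   the training window keeps a fixed size and slides forward.
    expanding: the training window always starts at 0 and grows.
    """
    if method not in ("rolling", "expanding"):
        raise ValueError(f"unknown method: {method}")
    splits = []
    start = 0
    while start + train_size + val_size <= n_samples:
        train_end = start + train_size
        train_start = start if method == "rolling" else 0
        train_idx = list(range(train_start, train_end))
        # Validation always follows training in time, so no future data leaks.
        val_idx = list(range(train_end, train_end + val_size))
        splits.append((train_idx, val_idx))
        start += step_size
    return splits


rolling = time_series_splits(100, train_size=40, val_size=10, step_size=10)
expanding = time_series_splits(100, train_size=40, val_size=10, step_size=10,
                               method="expanding")

print(len(rolling))           # 6
print(len(expanding[-1][0]))  # 90
```

Rolling splits keep the training cost per fold constant, whereas expanding splits reuse all earlier data, which usually matches how a model is retrained in production.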

---

## 8. Documentation and Examples
For comprehensive documentation, including user guides, API references, and example notebooks, see the `docs` directory. The examples provide step-by-step instructions for data preprocessing, model training, evaluation, and inference.

[APDTFlow Documentation](https://github.com/yotambraun/APDTFlow/blob/main/docs/index.md)

[Model Architectures Documentation](https://github.com/yotambraun/APDTFlow/blob/main/docs/models.md)

[Experiment Results and Analysis](https://github.com/yotambraun/APDTFlow/blob/main/docs/experiment_results.md)

[Configuration and YAML Files](https://github.com/yotambraun/APDTFlow/blob/main/apdtflow/config/config.yaml)

---

## 9. License
APDTFlow is licensed under the MIT License. For more details, please refer to the [LICENSE](LICENSE) file.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/yotambraun/APDTFlow",
    "name": "apdtflow",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "time-series, forecasting, neural-ode, deep-learning, transformer, tcn, timeseries-analysis, time-series-forecasting, neural-differential-equations, probabilistic-forecasting, uncertainty-quantification, multi-scale-decomposition, pytorch, machine-learning",
    "author": "Yotam Barun",
    "author_email": "Yotam Barun <yotambarun93@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/47/07/03393edc156247cec5d7f7c0f606a9d825b853250a760bd06ea6f2aadb13/apdtflow-0.2.1.tar.gz",
    "platform": null,
    "description": "# APDTFlow: A Modular Forecasting Framework for Time Series Data\r\n<p align=\"center\">\r\n  <img src=\"assets/images/my_logo_framework.png\" alt=\"APDTFlow Logo\" width=\"300\">\r\n</p>\r\n\r\n[![PyPI version](https://img.shields.io/pypi/v/apdtflow.svg)](https://pypi.org/project/apdtflow)\r\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\r\n[![Downloads](https://pepy.tech/badge/apdtflow)](https://pepy.tech/project/apdtflow)\r\n[![Python Versions](https://img.shields.io/pypi/pyversions/apdtflow.svg)](https://pypi.org/project/apdtflow/)\r\n[![CI](https://github.com/yotambraun/APDTFlow/actions/workflows/ci.yml/badge.svg)](https://github.com/yotambraun/APDTFlow/actions/workflows/ci.yml)\r\n[![Coverage](https://codecov.io/gh/yotambraun/APDTFlow/branch/main/graph/badge.svg)](https://codecov.io/gh/yotambraun/APDTFlow)\r\n\r\n\r\nAPDTFlow is a modern and extensible forecasting framework for time series data that leverages advanced techniques including neural ordinary differential equations (Neural ODEs), transformer-based components, and probabilistic modeling. 
Its modular design allows researchers and practitioners to experiment with multiple forecasting models and easily extend the framework for new methods.\r\n\r\n## \ud83d\ude80 New in v0.2.0: Advanced Features!\r\n\r\nAPDTFlow v0.2.0 introduces **cutting-edge forecasting capabilities** based on 2025 research:\r\n\r\n### \ud83c\udf1f Exogenous Variables Support\r\n**Boost accuracy by 30-50%** with external features:\r\n\r\n```python\r\nfrom apdtflow import APDTFlowForecaster\r\n\r\n# Use external features like temperature, holidays, promotions\r\nmodel = APDTFlowForecaster(\r\n    forecast_horizon=14,\r\n    exog_fusion_type='gated'  # 3 fusion strategies available\r\n)\r\n\r\nmodel.fit(\r\n    df,\r\n    target_col='sales',\r\n    date_col='date',\r\n    exog_cols=['temperature', 'is_holiday', 'promotion'],\r\n    future_exog_cols=['is_holiday', 'promotion']  # Known in advance\r\n)\r\n\r\n# Predict with future exogenous data\r\nfuture_exog = pd.DataFrame({'is_holiday': [0, 1, ...], 'promotion': [1, 0, ...]})\r\npredictions = model.predict(exog_future=future_exog)\r\n```\r\n\r\n### \ud83d\udcca Conformal Prediction\r\n**Rigorous uncertainty quantification** with coverage guarantees:\r\n\r\n```python\r\nmodel = APDTFlowForecaster(\r\n    forecast_horizon=14,\r\n    use_conformal=True,  # Enable conformal prediction\r\n    conformal_method='adaptive'  # Adapts to changing data\r\n)\r\n\r\nmodel.fit(df, target_col='sales')\r\n\r\n# Get calibrated 95% prediction intervals\r\nlower, pred, upper = model.predict(\r\n    alpha=0.05,  # 95% coverage guarantee\r\n    return_intervals='conformal'\r\n)\r\n```\r\n\r\n### \u26a1 Simple & Powerful API\r\n**5 lines to production-ready forecasts:**\r\n\r\n```python\r\nfrom apdtflow import APDTFlowForecaster\r\n\r\nmodel = APDTFlowForecaster(forecast_horizon=14)\r\nmodel.fit(df, target_col='sales', date_col='date')\r\npredictions = model.predict()\r\n```\r\n\r\n### Why APDTFlow?\r\n\r\n| Feature | APDTFlow v0.2.0 | Other Libraries 
|\r\n|---------|----------|-----------------|\r\n| **Neural ODEs** | \u2705 Continuous-time modeling | \u274c Discrete-time only |\r\n| **Exogenous Variables** | \u2705 3 fusion strategies | \u26a0\ufe0f Limited |\r\n| **Conformal Prediction** | \u2705 Rigorous uncertainty | \u274c Not available |\r\n| **Multi-Scale Decomposition** | \u2705 Trends + seasonality | \u26a0\ufe0f Limited |\r\n| **Simple `fit()/predict()` API** | \u2705 5 lines of code | \u26a0\ufe0f Varies |\r\n| **Multiple Architectures** | \u2705 ODE/Transformer/TCN/Ensemble | \u26a0\ufe0f Usually one |\r\n| **PyTorch-based** | \u2705 GPU acceleration | \u2705 Most |\r\n\r\n![APDTFlow Forecast](https://github.com/yotambraun/APDTFlow/blob/main/experiments/results_plots/APDTFlow_Forecast_Horizon_14_CV2.png)\r\n\r\n## Experiment Results\r\nIn our mega experiment we compared multiple forecasting models across different forecast horizons using 3-fold cross\u2011validation. For brevity, below we show two key plots:\r\n\r\n1. **Validation Loss Comparison:** A bar plot comparing the average validation losses of the models (APDTFlow, TransformerForecaster, TCNForecaster, and EnsembleForecaster) across forecast horizons.\r\n2. **Example Forecast (Horizon 7, CV Split 3):** A forecast plot for the APDTFlow model for a 7-step forecast from CV split 3.\r\n\r\n## Validation Loss Comparison\r\n\r\nThe bar plot below summarizes the average validation losses (lower is better) for the different models across the forecast horizons (7, 10, and 30 time steps):\r\n\r\n![Validation Loss Comparison](experiments/results_plots/Validation_Loss_Comparison.png)\r\n\r\n**Explanation:**  \r\nThis plot shows that the APDTFlow model (and possibly the ensemble) generally achieved lower validation losses compared to the other models, especially for longer forecast horizons. 
This indicates that its multi-scale decomposition and neural ODE dynamics are well-suited for capturing the trends and seasonal patterns in the dataset.\r\n\r\n**Discussion:**  \r\nThe plot demonstrates that, overall, the APDTFlow model (and, in some cases, the ensemble) tend to achieve lower validation losses\u2014particularly as the forecast horizon increases.\r\n\r\n## Performance vs. Forecast Horizon\r\n\r\nThe following line plot illustrates how the performance (average validation loss) of each model changes with different forecast horizons. This visualization helps to assess which models maintain consistent performance as the forecast horizon increases.\r\n\r\n![Performance vs. Horizon](experiments/results_plots/Performance_vs_Horizon.png)\r\n\r\n**Discussion:**  \r\nThe line plot reveals the trend in model performance across forecast horizons. It helps us understand which models degrade gracefully (or even improve) as the forecast horizon lengthens.\r\n\r\n## Example Forecast (Horizon 7, CV Split 3)\r\n\r\nBelow is an example forecast produced by the APDTFlow model for a forecast horizon of 7 time steps on the third cross-validation split.\r\n\r\n![APDTFlow Forecast Horizon 7, CV3](experiments/results_plots/APDTFlow_Forecast_Horizon_7_CV3.png)\r\n\r\n**Discussion:** \r\n- **Input Sequence (Blue):** The historical data (last 30 time steps) used as input.\r\n- **True Future (Dashed Orange):** The actual future values for the next 7 time steps.\r\n- **Predicted Future (Dotted Line):** The forecast generated by the model.\r\n\r\n---\r\n\r\n*For a detailed explanation, more plots, and **additional analysis of these results**, please see our [Experiment Results and Analysis](docs/experiment_results.md) document.*\r\n\r\n\r\n## Table of Contents\r\n\r\n1. [Installation](#installation)\r\n2. [Quick Start](#quick-start)\r\n   - [Training](#training)\r\n   - [Inference](#inference)\r\n3. [Data Processing and Augmentation](#data-processing-and-augmentation)\r\n4. 
[Forecasting Approaches](#forecasting-approaches)\r\n   - [APDTFlow](#apdtflow)\r\n   - [TransformerForecaster](#transformerforecaster)\r\n   - [TCNForecaster](#tcnforecaster)\r\n   - [EnsembleForecaster](#ensembleforecaster)\r\n5. [Evaluation and Metrics](#evaluation-and-metrics)\r\n6. [Command-Line Interface (CLI)](#command-line-interface-cli)\r\n7. [Cross-Validation Strategies](#cross-validation-strategies)\r\n8. [Documentation and Examples](#documentation-and-examples)\r\n9. [License](#license)\r\n\r\n## 1. Installation\r\n\r\nAPDTFlow is published on [PyPI](https://pypi.org/project/apdtflow). To install:\r\n\r\n```bash\r\npip install apdtflow\r\n```\r\nFor development, clone the repository and install in editable mode:\r\n\r\n```bash\r\ngit clone https://github.com/yotambraun/APDTFlow.git\r\ncd APDTFlow\r\npip install -e .\r\n```\r\n## \u2728 New Features in v0.1.24\r\n\r\n### \ud83d\ude80 Easy-to-Use High-Level API (MAJOR UPDATE!)\r\n- **Simple `fit()`/`predict()` interface** - No more complex DataLoaders or manual preprocessing\r\n- **Works directly with pandas DataFrames** - Natural integration with your workflow\r\n- **Automatic normalization and preprocessing** - Just pass your data and go\r\n- **Built-in visualization** - `plot_forecast()` with uncertainty bands\r\n- **Multiple model types in one API** - Switch between ODE/Transformer/TCN/Ensemble with one parameter\r\n\r\n### \ud83d\udcc8 Enhanced Package Discoverability\r\n- **Comprehensive PyPI keywords** - Better search visibility for time-series, forecasting, neural-ode, etc.\r\n- **Improved classifiers** - Clearer package categorization\r\n- **CHANGELOG.md** - Track all improvements and updates\r\n\r\n### Previous Features (v0.1.23)\r\n\r\n- **Learnable Time Series Embedding:**\r\n  APDTFlow includes a `TimeSeriesEmbedding` module that learns to encode temporal information using gated residual networks. 
This module processes both raw time indices and periodic signals (and optionally calendar features) to produce a rich embedding that improves the subsequent forecasting performance.\r\n\r\n- **Configuration Options:**\r\n  In `apdtflow/config/config.yaml`, you can specify:\r\n  - `use_embedding`: Set to `true` to enable the new embedding.\r\n  - `embed_dim`: The embedding dimension (recommended to match `hidden_dim`).\r\n\r\n## 2. Quick Start\r\n\r\n### Simple API (Recommended for Most Users)\r\n\r\nThe easiest way to get started with APDTFlow:\r\n\r\n```python\r\nimport pandas as pd\r\nfrom apdtflow import APDTFlowForecaster\r\n\r\n# Load your time series data\r\ndf = pd.read_csv(\"dataset_examples/Electric_Production.csv\", parse_dates=['DATE'])\r\n\r\n# Create and train the forecaster\r\nmodel = APDTFlowForecaster(\r\n    forecast_horizon=14,     # Predict 14 steps ahead\r\n    history_length=30,       # Use 30 historical points\r\n    num_epochs=50           # Training epochs\r\n)\r\n\r\n# Fit the model (handles preprocessing automatically)\r\nmodel.fit(df, target_col='IPG2211A2N', date_col='DATE')\r\n\r\n# Make predictions with uncertainty estimates\r\npredictions, uncertainty = model.predict(return_uncertainty=True)\r\n\r\n# Visualize the forecast\r\nmodel.plot_forecast(with_history=100, show_uncertainty=True)\r\n```\r\n\r\n**Try different models** by changing `model_type`:\r\n```python\r\n# Use Transformer instead of Neural ODE\r\nmodel = APDTFlowForecaster(model_type='transformer', forecast_horizon=14)\r\n\r\n# Or try Temporal Convolutional Network\r\nmodel = APDTFlowForecaster(model_type='tcn', forecast_horizon=14)\r\n\r\n# Or use Ensemble for maximum robustness\r\nmodel = APDTFlowForecaster(model_type='ensemble', forecast_horizon=14)\r\n```\r\n\r\n### Advanced API (For Custom Workflows)\r\n\r\nFor advanced users who need more control:\r\n\r\n```python\r\nimport torch\r\nfrom torch.utils.data import DataLoader\r\nfrom apdtflow.data import 
TimeSeriesWindowDataset\r\nfrom apdtflow.models.apdtflow import APDTFlow\r\n\r\ncsv_file = \"dataset_examples/Electric_Production.csv\"\r\ndataset = TimeSeriesWindowDataset(csv_file, date_col=\"DATE\", value_col=\"IPG2211A2N\", T_in=12, T_out=3)\r\ntrain_loader = DataLoader(dataset, batch_size=16, shuffle=True)\r\n\r\nmodel = APDTFlow(\r\n    num_scales=3,\r\n    input_channels=1,\r\n    filter_size=5,\r\n    hidden_dim=16,\r\n    output_dim=1,\r\n    forecast_horizon=3,\r\n    use_embedding=True\r\n)\r\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\r\nmodel.to(device)\r\n\r\nmodel.train_model(\r\n    train_loader=train_loader,\r\n    num_epochs=15,\r\n    learning_rate=0.001,\r\n    device=device\r\n)\r\n```\r\n\r\n### Inference\r\nUse the following example to run inference on new data:\r\n\r\n```python\r\nimport torch\r\nfrom torch.utils.data import DataLoader\r\nfrom apdtflow.data import TimeSeriesWindowDataset\r\nfrom apdtflow.models.apdtflow import APDTFlow\r\n\r\ntest_dataset = TimeSeriesWindowDataset(\r\n    csv_file=\"path/to/dataset.csv\",\r\n    date_col=\"DATE\",\r\n    value_col=\"VALUE\",\r\n    T_in=12,\r\n    T_out=3\r\n)\r\ntest_loader = DataLoader(test_dataset, batch_size=16, shuffle=False)\r\n\r\nmodel = APDTFlow(\r\n    num_scales=3,\r\n    input_channels=1,\r\n    filter_size=5,\r\n    hidden_dim=16,\r\n    output_dim=1,\r\n    forecast_horizon=3,\r\n    use_embedding=True\r\n)\r\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\r\nmodel.to(device)\r\n\r\ncheckpoint_path = \"path/to/checkpoint.pt\"\r\nmodel.load_state_dict(torch.load(checkpoint_path, map_location=device))\r\n\r\nmetrics = model.evaluate(test_loader, device, metrics=[\"MSE\", \"MAE\", \"RMSE\", \"MAPE\"])\r\nprint(\"Evaluation Metrics:\", metrics)\r\n```\r\n\r\n## 3. Data Processing and Augmentation\r\nAPDTFlow provides robust functions to process and augment your time series data. 
Key features include:\r\n* **Date Conversion:** Automatically converts date columns to **datetime** objects.\r\n* **Gap Filling:** Reindexes data to ensure a consistent time frequency.\r\n* **Missing Value Imputation:** Supports methods such as forward-fill, backward-fill, mean substitution, and interpolation.\r\n* **Feature Engineering:** Generates lag features and rolling statistics to enhance predictive performance.\r\n* **Data Augmentation:** Offers techniques like jittering, scaling, and time warping to improve model robustness.\r\n\r\n---\r\n\r\n## 4. Forecasting Approaches\r\nAPDTFlow includes several advanced forecasting strategies:\r\n\r\n### APDTFlow\r\nThe **APDTFlow** model integrates:\r\n- **Multi-Scale Decomposition:** Decomposes the input signal into multiple resolutions.\r\n- **Neural ODE Dynamics:** Models continuous latent state evolution using Neural ODEs.\r\n- **Probabilistic Fusion:** Merges latent representations while quantifying uncertainty.\r\n- **Transformer-Based Decoding:** Generates forecasts using a time-aware attention mechanism.\r\n\r\nKey parameters include **T_in**, **T_out**, **num_scales**, **filter_size**, **hidden_dim**, and **forecast_horizon**.\r\n\r\n### TransformerForecaster\r\nLeverages the Transformer architecture to capture long-range dependencies using self\u2011attention. 
This approach is ideal for complex temporal patterns where context from many time steps is relevant.\r\n\r\n### TCNForecaster\r\nBased on Temporal Convolutional Networks, the TCNForecaster uses dilated convolutions and residual connections to efficiently capture local and medium-range dependencies.\r\n\r\n### EnsembleForecaster\r\nCombines predictions from multiple forecasting models (such as APDTFlow, TransformerForecaster, and TCNForecaster) using aggregation strategies (e.g., weighted averaging) to improve overall forecast robustness and accuracy.\r\n\r\n#### Core Model Parameters Explained:\r\nFor a comprehensive description of each model's architecture and additional details, please see the [Model Architectures Documentation](docs/models.md).\r\nWhen configuring APDTFlow, several parameters play key roles in how the model processes and forecasts time series data. Here\u2019s what they mean:\r\n\r\n* **T_in (Input Sequence Length):** This parameter specifies the number of past time steps the model will use as input. For example, if T_in=12, the model will use the previous 12 observations to make a forecast.\r\n* **T_out (Forecast Horizon):** This parameter defines the number of future time steps to predict. For instance, if T_out=3, the model will output predictions for the next 3 time steps.\r\n* **num_scales:** APDTFlow employs a multi-scale decomposition technique to capture both global and local trends in the data. The num_scales parameter determines how many scales (or resolutions) the input signal will be decomposed into. A higher number of scales may allow the model to capture more complex temporal patterns, but it could also increase computational complexity.\r\n* **filter_size:** This parameter is used in the convolutional component (or dynamic convolution) within the model\u2019s decomposer module. It defines the size of the convolutional filter applied to the input signal, thereby affecting the receptive field. 
A larger filter size allows the model to consider a broader context in the time series but may smooth out finer details.\r\n* **forecast_horizon:** This parameter is used within the model to indicate the number of future time steps that the decoder will produce. It should match T_out to ensure consistency between the training data and the model's output.\r\n* **hidden_dim:** The size of the hidden state in the dynamics module and decoder. This parameter controls the capacity of the model to learn complex representations. Increasing hidden_dim may improve the model\u2019s performance, but at the cost of additional computational resources and potential overfitting if not tuned properly.\r\n\r\n---\r\n\r\n## 5. Evaluation and Metrics\r\nAPDTFlow incorporates a flexible evaluation framework that supports several performance metrics including:\r\n- **Mean Squared Error (MSE)**\r\n- **Mean Absolute Error (MAE)**\r\n- **Root Mean Squared Error (RMSE)**\r\n- **Mean Absolute Percentage Error (MAPE)**\r\n\r\nThese metrics are computed via a dedicated evaluator, which can be extended with additional metrics as needed.\r\n**Usage Example:**\r\n```python\r\nfrom apdtflow.evaluation.regression_evaluator import RegressionEvaluator\r\n\r\nevaluator = RegressionEvaluator(metrics=[\"MSE\", \"MAE\", \"RMSE\", \"MAPE\"])\r\nresults = evaluator.evaluate(predictions, targets)\r\nprint(\"MSE:\", results[\"MSE\"], \"MAE:\", results[\"MAE\"], \"RMSE:\", results[\"RMSE\"], \"MAPE:\", results[\"MAPE\"])\r\n```\r\n\r\n---\r\n\r\n## 6. Command-Line Interface (CLI)\r\nFor ease of use, APDTFlow provides a command\u2011line interface that allows you to run training, evaluation, and inference directly from the terminal. 
The CLI accepts various parameters to configure the forecasting process without modifying the code.\r\n#### Available Commands:\r\n- **apdtflow train:** Train a forecasting model.\r\n- **apdtflow infer:** Run inference using a saved checkpoint.\r\n\r\n#### Example Usage:\r\n```bash\r\n# Train a model (using the learnable embedding, which is enabled by default)\r\napdtflow train --csv_file path/to/dataset.csv --date_col DATE --value_col VALUE --T_in 12 --T_out 3 --num_scales 3 --filter_size 5 --hidden_dim 16 --batch_size 16 --learning_rate 0.001 --num_epochs 15 --checkpoint_dir ./checkpoints\r\n\r\n# Alternatively, disable the learnable embedding by adding the flag:\r\napdtflow train --csv_file path/to/dataset.csv --date_col DATE --value_col VALUE --T_in 12 --T_out 3 --num_scales 3 --filter_size 5 --hidden_dim 16 --batch_size 16 --learning_rate 0.001 --num_epochs 15 --checkpoint_dir ./checkpoints --no_embedding\r\n\r\n# Run inference (ensure that the embedding setting matches what was used during training)\r\napdtflow infer --csv_file path/to/dataset.csv --date_col DATE --value_col VALUE --T_in 12 --T_out 3 --checkpoint_path ./checkpoints/APDTFlow_checkpoint.pt --batch_size 16\r\n```\r\n\r\n---\r\n\r\n## 7. 
Cross-Validation Strategies\r\nTo ensure robust forecasting evaluation, APDTFlow includes a Cross-Validation Factory that supports:\r\n- **Rolling Splits:** Moving training and validation windows over time.\r\n- **Expanding Splits:** Increasing training window size while keeping validation size constant.\r\n- **Blocked Splits:** Dividing the dataset into contiguous blocks.\r\n\r\n#### Usage Example:\r\n```python\r\nfrom apdtflow.cv_factory import TimeSeriesCVFactory\r\nfrom torch.utils.data import Dataset\r\n\r\nclass SampleDataset(Dataset):\r\n    def __init__(self, length=100):\r\n        self.data = list(range(length))\r\n    def __len__(self):\r\n        return len(self.data)\r\n    def __getitem__(self, idx):\r\n        return self.data[idx]\r\n\r\ndataset = SampleDataset()\r\ncv_factory = TimeSeriesCVFactory(dataset, method=\"rolling\", train_size=40, val_size=10, step_size=10)\r\nsplits = cv_factory.get_splits()\r\nprint(\"Cross-Validation Splits:\", splits)\r\n```\r\n\r\n---\r\n\r\n## 8. Documentation and Examples\r\nFor comprehensive documentation\u2014including user guides, API references, and example notebooks\u2014please visit the docs directory. The examples provide step-by-step instructions for data preprocessing, model training, evaluation, and inference.\r\n\r\n[APDTFlow Documentation](https://github.com/yotambraun/APDTFlow/blob/main/docs/index.md)\r\n\r\n[Model Architectures Documentation](https://github.com/yotambraun/APDTFlow/blob/main/docs/models.md)\r\n\r\n[Experiment Results and Analysis](https://github.com/yotambraun/APDTFlow/blob/main/docs/experiment_results.md)\r\n\r\n[Configuration and YAML Files](https://github.com/yotambraun/APDTFlow/blob/main/apdtflow/config/config.yaml)\r\n\r\n---\r\n\r\n## 9. License\r\nAPDTFlow is licensed under the MIT License. For more details, please refer to the [LICENSE](LICENSE) file.\r\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "APDTFlow: A modular forecasting framework for time series data",
    "version": "0.2.1",
    "project_urls": {
        "Documentation": "https://github.com/yotambraun/APDTFlow",
        "Homepage": "https://github.com/yotambraun/APDTFlow",
        "Source": "https://github.com/yotambraun/APDTFlow"
    },
    "split_keywords": [
        "time-series",
        " forecasting",
        " neural-ode",
        " deep-learning",
        " transformer",
        " tcn",
        " timeseries-analysis",
        " time-series-forecasting",
        " neural-differential-equations",
        " probabilistic-forecasting",
        " uncertainty-quantification",
        " multi-scale-decomposition",
        " pytorch",
        " machine-learning"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2b74ce46ddcb4702853d991888a5368ae373ea93b8e578e3ecd1e2869d52bbe0",
                "md5": "a1dd5488ddd74fc383439d9d91b8c90c",
                "sha256": "bb7b9febf15d5a0ffadc4e14fc55fda519f5222c36633ea591886e04bf8490a3"
            },
            "downloads": -1,
            "filename": "apdtflow-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a1dd5488ddd74fc383439d9d91b8c90c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 61708,
            "upload_time": "2025-10-18T11:14:31",
            "upload_time_iso_8601": "2025-10-18T11:14:31.125083Z",
            "url": "https://files.pythonhosted.org/packages/2b/74/ce46ddcb4702853d991888a5368ae373ea93b8e578e3ecd1e2869d52bbe0/apdtflow-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "470703393edc156247cec5d7f7c0f606a9d825b853250a760bd06ea6f2aadb13",
                "md5": "15ef06b97b5c9ddd8a1cf1af80a0875f",
                "sha256": "8e705e4a57ff2e28be9e55c400a6635bc144c7803b25fe57f90308ce77823066"
            },
            "downloads": -1,
            "filename": "apdtflow-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "15ef06b97b5c9ddd8a1cf1af80a0875f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 52800,
            "upload_time": "2025-10-18T11:14:32",
            "upload_time_iso_8601": "2025-10-18T11:14:32.140080Z",
            "url": "https://files.pythonhosted.org/packages/47/07/03393edc156247cec5d7f7c0f606a9d825b853250a760bd06ea6f2aadb13/apdtflow-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-18 11:14:32",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "yotambraun",
    "github_project": "APDTFlow",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "torch",
            "specs": []
        },
        {
            "name": "torchdiffeq",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "matplotlib",
            "specs": []
        },
        {
            "name": "tensorboard",
            "specs": []
        },
        {
            "name": "pyyaml",
            "specs": []
        },
        {
            "name": "scikit-learn",
            "specs": []
        },
        {
            "name": "statsmodels",
            "specs": []
        }
    ],
    "lcname": "apdtflow"
}
        