# Temporal: Transformer-Based Time Series Forecasting
A PyTorch implementation of a transformer-based model for time series forecasting, inspired by modern attention-based approaches.
## Overview
Temporal is a transformer model for time series forecasting built on the self-attention mechanism introduced in "Attention Is All You Need". Unlike language models, it is designed and trained specifically to minimize forecasting error on time series data.
### Key Features
- **Self-Attention Mechanism**: Captures complex temporal dependencies and patterns
- **Encoder-Decoder Architecture**: Multi-layer transformer with residual connections and layer normalization
- **Flexible**: Supports both univariate and multivariate time series
- **Scalable**: Can handle various sequence lengths and forecasting horizons
- **Autoregressive Generation**: Inference mode for multi-step ahead forecasting
## Architecture
The Temporal model consists of:
1. **Input Embedding**: Projects time series data to model dimension
2. **Positional Encoding**: Captures temporal order (sinusoidal or learnable)
3. **Encoder Stack**: Multiple layers of self-attention and feed-forward networks
4. **Decoder Stack**: Multiple layers with self-attention, cross-attention, and feed-forward networks
5. **Output Projection**: Maps the decoder output from `d_model` back to the feature dimension of the forecast window
### Architecture Diagram
```mermaid
graph TD
    A[Input Time Series<br/>batch, lookback, features] --> B[Input Embedding<br/>Linear: features → d_model]
    B --> C[Positional Encoding<br/>Add temporal position info]
    C --> D[Encoder Stack<br/>6 layers]
    D --> E[Encoder Output<br/>batch, lookback, d_model]

    F[Decoder Input<br/>Previous predictions] --> G[Input Embedding<br/>Linear: features → d_model]
    G --> H[Positional Encoding]
    H --> I[Decoder Stack<br/>6 layers]
    E --> I
    I --> J[Decoder Output<br/>batch, horizon, d_model]
    J --> K[Output Projection<br/>Linear: d_model → features]
    K --> L[Forecast<br/>batch, horizon, features]

    style A fill:#e1f5ff
    style L fill:#e1ffe1
    style D fill:#fff4e1
    style I fill:#ffe1f5
```
Each layer includes the following (an encoder layer is sketched after this list):
- Multi-head self-attention
- Residual connections
- Layer normalization
- Feed-forward networks with GELU activation
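
To make the sublayer structure concrete, here is a minimal sketch of one encoder layer built from standard PyTorch modules. It is not the library's internal layer class: the post-norm ordering, the `nn.MultiheadAttention` usage, and the class name are illustrative assumptions.

```python
import torch.nn as nn

class EncoderLayerSketch(nn.Module):
    """Illustrative encoder layer: self-attention and feed-forward sublayers,
    each wrapped in a residual connection followed by layer normalization."""

    def __init__(self, d_model: int, num_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Self-attention sublayer: residual connection + layer norm
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + self.dropout(attn_out))
        # Feed-forward sublayer (GELU activation): residual connection + layer norm
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x
```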
**For more diagrams**, see [DIAGRAMS.md](DIAGRAMS.md), which provides complete visual documentation of:
- Encoder/Decoder architecture
- Multi-head attention mechanism
- Training and inference flows
- Data pipeline
- Component interactions
## Installation
### From PyPI
```bash
pip install temporal-forecasting
```
### With HuggingFace Support
```bash
pip install temporal-forecasting[huggingface]
```
This adds support for:
- Uploading models to HuggingFace Hub
- Downloading models from HuggingFace Hub
- HuggingFace ecosystem integration
### With Data Fetching Support
```bash
pip install temporal-forecasting[data]
```
This adds support for:
- Fetching stock prices from Yahoo Finance
- Fetching cryptocurrency data (Bitcoin, Ethereum, etc.)
- Downloading datasets from Kaggle
- Technical indicators (SMA, RSI, MACD, Bollinger Bands); a standalone indicator sketch follows this list
- Data preprocessing utilities
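
For orientation, the snippet below fetches daily prices with `yfinance` directly and computes an SMA and Bollinger Bands with plain pandas operations. It deliberately does not use the package's own fetching or indicator helpers (see [DATA_SOURCES.md](DATA_SOURCES.md) for those); the ticker, window sizes, and column handling are illustrative.

```python
import yfinance as yf

# Two years of daily Apple prices; yfinance returns a pandas DataFrame
df = yf.download("AAPL", period="2y", interval="1d")
close = df["Close"].squeeze()  # coerce to a 1-D Series regardless of yfinance version

# 20-day simple moving average
sma_20 = close.rolling(window=20).mean()

# Bollinger Bands: SMA +/- 2 rolling standard deviations
rolling_std = close.rolling(window=20).std()
bb_upper = sma_20 + 2 * rolling_std
bb_lower = sma_20 - 2 * rolling_std

print(sma_20.tail())
```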
### From Source
```bash
git clone https://github.com/OptimalMatch/temporal.git
cd temporal
pip install -r requirements.txt
pip install -e .
```
### Requirements
- Python >= 3.8
- PyTorch >= 2.0.0
- NumPy >= 1.20.0
- tqdm >= 4.60.0
- matplotlib >= 3.3.0
### Optional Dependencies
- **HuggingFace**: `transformers>=4.30.0`, `huggingface-hub>=0.16.0`
- **Data Fetching**: `yfinance>=0.2.0`, `pandas>=1.3.0`, `scikit-learn>=1.0.0`, `kagglehub>=0.2.0`
## Quick Start
### Basic Usage
```python
import torch
from temporal import Temporal

# Create model
model = Temporal(
    input_dim=1,            # Univariate time series
    d_model=256,            # Model dimension
    num_encoder_layers=4,   # Number of encoder layers
    num_decoder_layers=4,   # Number of decoder layers
    num_heads=8,            # Attention heads
    d_ff=1024,              # Feed-forward dimension
    forecast_horizon=24,    # Predict 24 steps ahead
    dropout=0.1
)

# Input: (batch_size, sequence_length, input_dim)
x = torch.randn(32, 96, 1)

# Generate forecast
forecast = model.forecast(x)  # (32, 24, 1)
```
### Training Example
```python
from temporal import Temporal
from temporal.trainer import TimeSeriesDataset, TemporalTrainer
from torch.utils.data import DataLoader
import torch

# Prepare your data
train_data = ...  # Shape: (num_samples, num_features)

# Create dataset
dataset = TimeSeriesDataset(
    train_data,
    lookback=96,
    forecast_horizon=24,
    stride=1
)

# Create data loader
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Create model
model = Temporal(
    input_dim=train_data.shape[1],
    d_model=256,
    num_encoder_layers=4,
    num_decoder_layers=4,
    num_heads=8,
    d_ff=1024,
    forecast_horizon=24
)

# Create optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Create trainer
trainer = TemporalTrainer(
    model=model,
    optimizer=optimizer,
    criterion=torch.nn.MSELoss()
)

# Train
history = trainer.fit(
    train_loader=train_loader,
    num_epochs=100,
    early_stopping_patience=10,
    save_path="best_model.pt"  # Automatically saves best model
)
```
### Saving and Loading Models
```python
# Save trained model
torch.save(model.state_dict(), 'temporal_model.pt')

# Load model for inference
model = Temporal(input_dim=1, forecast_horizon=24)
model.load_state_dict(torch.load('temporal_model.pt'))
model.eval()

# Make predictions
forecast = model.forecast(x)
```
For a complete guide to model persistence, see [MODEL_PERSISTENCE.md](MODEL_PERSISTENCE.md).
## Examples
### Univariate Time Series
See `examples/basic_usage.py` for a complete example with synthetic data:
```bash
cd examples
python basic_usage.py
```
This will:
- Generate synthetic time series data
- Train a Temporal model
- Generate forecasts
- Visualize results
### Multivariate Time Series
See `examples/multivariate_example.py` for forecasting multiple correlated features:
```bash
cd examples
python multivariate_example.py
```
### Model Persistence
See `examples/model_persistence_example.py` for saving and loading trained models:
```bash
cd examples
python model_persistence_example.py
```
This demonstrates:
- Training and saving a model with all components
- Loading saved models for inference
- Production-ready model deployment
### HuggingFace Integration
See `examples/huggingface_example.py` for HuggingFace Hub integration:
```bash
cd examples
python huggingface_example.py
```
This demonstrates:
- Creating HuggingFace-compatible models
- Saving in HuggingFace format
- Loading from HuggingFace Hub
- Uploading models to share with the community
For a complete guide, see [HUGGINGFACE_INTEGRATION.md](HUGGINGFACE_INTEGRATION.md).
### Stock Price Forecasting
See `examples/stock_forecasting.py` for real stock data forecasting:
```bash
cd examples
python stock_forecasting.py
```
This demonstrates:
- Fetching stock data from Yahoo Finance
- Training on Apple (AAPL) stock prices
- 5-day price forecasting
- Model evaluation and visualization
### Cryptocurrency Forecasting
See `examples/crypto_forecasting.py` for Bitcoin and crypto forecasting:
```bash
cd examples
python crypto_forecasting.py
```
This demonstrates:
- Fetching Bitcoin data
- Training on cryptocurrency prices
- 7-day price forecasting
- Multi-crypto comparison
For a complete guide to data fetching, see [DATA_SOURCES.md](DATA_SOURCES.md).
## Model Configuration
### Parameters
| Parameter | Description | Default |
|-----------|-------------|---------|
| `input_dim` | Number of input features | 1 |
| `d_model` | Model dimension | 512 |
| `num_encoder_layers` | Number of encoder layers | 6 |
| `num_decoder_layers` | Number of decoder layers | 6 |
| `num_heads` | Number of attention heads | 8 |
| `d_ff` | Feed-forward dimension | 2048 |
| `forecast_horizon` | Number of steps to forecast | 24 |
| `max_seq_len` | Maximum sequence length | 5000 |
| `dropout` | Dropout probability | 0.1 |
| `use_learnable_pe` | Use learnable positional encoding | False |
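
Assuming the constructor defaults match the table above, instantiating the model with no arguments yields the 6-layer, `d_model=512` configuration; a quick parameter count is a useful sanity check:

```python
from temporal import Temporal

# All arguments left at their documented defaults (input_dim=1, d_model=512, 6 + 6 layers, ...)
model = Temporal()

num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.1f}M")
```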
### Recommended Configurations
**Small Model** (Fast training, lower accuracy):
```python
model = Temporal(
    d_model=128,
    num_encoder_layers=2,
    num_decoder_layers=2,
    num_heads=4,
    d_ff=512
)
```
**Medium Model** (Balanced):
```python
model = Temporal(
    d_model=256,
    num_encoder_layers=4,
    num_decoder_layers=4,
    num_heads=8,
    d_ff=1024
)
```
**Large Model** (Best accuracy, slower training):
```python
model = Temporal(
    d_model=512,
    num_encoder_layers=6,
    num_decoder_layers=6,
    num_heads=16,
    d_ff=2048
)
```
## Training Tips
1. **Learning Rate**: Start with 1e-4 and use a scheduler (e.g., ReduceLROnPlateau)
2. **Batch Size**: Use the largest batch size that fits in memory (32-128)
3. **Gradient Clipping**: Use gradient clipping (0.5-1.0) to prevent exploding gradients
4. **Early Stopping**: Monitor validation loss and stop when it plateaus
5. **Data Normalization**: Normalize your data (e.g., StandardScaler) before training; a sketch combining tips 1, 3, and 5 follows this list
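
The sketch below wires tips 1, 3, and 5 into a plain PyTorch loop: standardize the raw series, create a `ReduceLROnPlateau` scheduler, and clip gradients each step. It continues from the Training Example above (`model`, `train_loader`), with `raw_train_data` standing in for your unscaled array; `TemporalTrainer.fit` may already handle some of these steps internally, and the `model(x, y)` teacher-forcing call is an assumption.

```python
import torch
from sklearn.preprocessing import StandardScaler

# Tip 5: fit the scaler on training data only, then transform before building the windows
scaler = StandardScaler()
train_data = scaler.fit_transform(raw_train_data)  # raw_train_data: (num_samples, num_features)

# Tip 1: start at lr=1e-4 and reduce the learning rate when the monitored loss plateaus
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=5)
criterion = torch.nn.MSELoss()

for epoch in range(100):
    model.train()
    epoch_loss = 0.0
    for x, y in train_loader:
        optimizer.zero_grad()
        pred = model(x, y)  # teacher forcing with the target as decoder input (assumed convention)
        loss = criterion(pred, y)
        loss.backward()
        # Tip 3: clip gradients to keep updates bounded
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step(epoch_loss / len(train_loader))  # ideally monitor a validation loss instead
```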
## Architecture Details
### Multi-Head Attention
The model uses scaled dot-product attention:
```
Attention(Q, K, V) = softmax(QK^T / √d_k)V
```
Multiple attention heads allow the model to attend to different aspects of the time series simultaneously.
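
A minimal, self-contained version of the formula above (PyTorch >= 2.0 also ships an optimized equivalent, `torch.nn.functional.scaled_dot_product_attention`):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)     # (batch, heads, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))  # block masked (e.g., future) positions
    weights = torch.softmax(scores, dim=-1)               # attention weights sum to 1 over keys
    return weights @ v                                     # (batch, heads, seq_q, d_k)

q = k = v = torch.randn(2, 8, 96, 32)  # batch=2, heads=8, seq=96, d_k=32
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 8, 96, 32])
```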
### Positional Encoding
Two types of positional encoding are available (the sinusoidal variant is sketched after this list):
1. **Sinusoidal** (default): Fixed sinusoidal functions
2. **Learnable**: Learned embeddings for each position
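
For reference, a standard sinusoidal encoding looks like the sketch below; the library's default follows this general scheme, though implementation details may differ.

```python
import math
import torch

def sinusoidal_encoding(max_len: int, d_model: int) -> torch.Tensor:
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)  # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe                                     # added to the embedded inputs: x + pe[:seq_len]

print(sinusoidal_encoding(max_len=5000, d_model=512).shape)  # torch.Size([5000, 512])
```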
### Autoregressive Generation
During inference, the model generates forecasts autoregressively (sketched after this list):
- Start with the last observed value
- Generate next step prediction
- Use prediction as input for next step
- Repeat for entire forecast horizon
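
Conceptually, the loop looks like the sketch below. This is illustrative pseudocode for the steps above, not the library's `forecast` implementation; the `model(src, tgt)` call and the slicing of its output are assumptions.

```python
import torch

@torch.no_grad()
def autoregressive_forecast(model, x, horizon):
    """Conceptual multi-step generation; model.forecast() performs this internally."""
    model.eval()
    decoder_input = x[:, -1:, :]          # start from the last observed step
    predictions = []
    for _ in range(horizon):
        out = model(x, decoder_input)     # encoder sees the full lookback window
        step = out[:, -1:, :]             # take the newest predicted step
        predictions.append(step)
        decoder_input = torch.cat([decoder_input, step], dim=1)  # feed the prediction back in
    return torch.cat(predictions, dim=1)  # (batch, horizon, features)
```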
## Modern Time Series Transformers
Temporal implements a transformer architecture similar to modern approaches in time series forecasting:
| Feature | Modern Approaches | Temporal |
|---------|---------|----------|
| Architecture | Transformer | Transformer |
| Attention | Multi-head | Multi-head |
| Layers | Encoder-Decoder | Encoder-Decoder |
| Training | Large-scale pre-training | User-provided data |
| Flexibility | Fixed models | Fully customizable |
## Performance
Performance varies by dataset and configuration. Typical metrics on benchmark datasets:
- **MSE**: 0.01-0.1 (normalized data)
- **MAE**: 0.05-0.3 (normalized data)
- **Training Time**: 1-10 minutes per epoch (depending on model size, dataset size, and hardware)
## API Reference
### Temporal
Main model class for time series forecasting.
**Methods**:
- `forward(src, tgt=None, src_mask=None, tgt_mask=None)`: Forward pass
- `forecast(x, horizon=None)`: Generate forecasts
- `generate_causal_mask(size)`: Create a causal attention mask (sketched below)
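
For reference, a causal mask is typically an upper-triangular matrix that blocks attention from a position to any later position. A minimal sketch of the idea follows; whether `generate_causal_mask` uses this boolean convention or an additive `-inf` mask is an implementation detail of the library.

```python
import torch

def causal_mask(size: int) -> torch.Tensor:
    # True above the diagonal: position i must not attend to positions j > i
    return torch.triu(torch.ones(size, size, dtype=torch.bool), diagonal=1)

print(causal_mask(4))
# tensor([[False,  True,  True,  True],
#         [False, False,  True,  True],
#         [False, False, False,  True],
#         [False, False, False, False]])
```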
### TemporalTrainer
Training utilities for Temporal models.
**Methods**:
- `train_epoch(dataloader)`: Train for one epoch
- `validate(dataloader)`: Validate the model
- `fit(train_loader, val_loader, num_epochs, ...)`: Full training loop
- `predict(dataloader)`: Generate predictions
### TimeSeriesDataset
Sliding-window dataset class for time series data (a shape sketch follows the parameter list).
**Parameters**:
- `data`: Time series data array
- `lookback`: Number of historical steps
- `forecast_horizon`: Number of future steps
- `stride`: Stride for sliding window
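
Assuming the usual sliding-window behavior, each item pairs a lookback window with the horizon that immediately follows it; the exact return types and window count shown here are expectations, not guarantees:

```python
import numpy as np
from temporal.trainer import TimeSeriesDataset

data = np.random.randn(1000, 3)  # (num_samples, num_features)
dataset = TimeSeriesDataset(data, lookback=96, forecast_horizon=24, stride=1)

x, y = dataset[0]
print(x.shape, y.shape)  # expected: (96, 3) and (24, 3)

# With stride=1, roughly len(data) - lookback - forecast_horizon + 1 windows
print(len(dataset))      # expected: 881
```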
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Citation
If you use this code in your research, please cite:
```bibtex
@software{temporal2024,
  title = {Temporal: Transformer-Based Time Series Forecasting},
  year = {2024},
  note = {A PyTorch implementation of transformer architecture for time series},
  url = {https://github.com/OptimalMatch/temporal}
}
```
## References
- Vaswani et al., "Attention is All You Need" (2017)
- Modern transformer-based time series forecasting approaches
## Acknowledgments
This implementation is inspired by modern transformer architectures for time series forecasting and the original Transformer paper.