# Temporal: Transformer-Based Time Series Forecasting
A PyTorch implementation of a transformer-based model for time series forecasting, inspired by modern attention-based approaches.
## Overview
Temporal is a transformer model for time series forecasting built on the self-attention mechanism introduced in "Attention Is All You Need". Unlike language models, it is designed and trained specifically to minimize forecasting error on time series data.
### Key Features
- **Self-Attention Mechanism**: Captures complex temporal dependencies and patterns
- **Encoder-Decoder Architecture**: Multi-layer transformer with residual connections and layer normalization
- **Flexible**: Supports both univariate and multivariate time series
- **Scalable**: Can handle various sequence lengths and forecasting horizons
- **Autoregressive Generation**: Inference mode for multi-step ahead forecasting
## Architecture
The Temporal model consists of:
1. **Input Embedding**: Projects time series data to model dimension
2. **Positional Encoding**: Captures temporal order (sinusoidal or learnable)
3. **Encoder Stack**: Multiple layers of self-attention and feed-forward networks
4. **Decoder Stack**: Multiple layers with self-attention, cross-attention, and feed-forward networks
5. **Output Projection**: Maps the decoder output from `d_model` back to the feature dimension of the forecast window
### Architecture Diagram
```mermaid
graph TD
    A[Input Time Series<br/>batch, lookback, features] --> B[Input Embedding<br/>Linear: features → d_model]
    B --> C[Positional Encoding<br/>Add temporal position info]
    C --> D[Encoder Stack<br/>6 layers]
    D --> E[Encoder Output<br/>batch, lookback, d_model]

    F[Decoder Input<br/>Previous predictions] --> G[Input Embedding<br/>Linear: features → d_model]
    G --> H[Positional Encoding]
    H --> I[Decoder Stack<br/>6 layers]
    E --> I
    I --> J[Decoder Output<br/>batch, horizon, d_model]
    J --> K[Output Projection<br/>Linear: d_model → features]
    K --> L[Forecast<br/>batch, horizon, features]

    style A fill:#e1f5ff
    style L fill:#e1ffe1
    style D fill:#fff4e1
    style I fill:#ffe1f5
```
Each layer includes the following (an encoder layer is sketched after this list):
- Multi-head self-attention
- Residual connections
- Layer normalization
- Feed-forward networks with GELU activation
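
To make the sublayer structure concrete, here is a minimal sketch of one encoder layer built from standard PyTorch modules. It is not the library's internal layer class: the post-norm ordering, the `nn.MultiheadAttention` usage, and the class name are illustrative assumptions.

```python
import torch.nn as nn

class EncoderLayerSketch(nn.Module):
    """Illustrative encoder layer: self-attention and feed-forward sublayers,
    each wrapped in a residual connection followed by layer normalization."""

    def __init__(self, d_model: int, num_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, num_heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Self-attention sublayer: residual connection + layer norm
        attn_out, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + self.dropout(attn_out))
        # Feed-forward sublayer (GELU activation): residual connection + layer norm
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x
```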
**For more diagrams**, see [DIAGRAMS.md](DIAGRAMS.md), which provides complete visual documentation of:
- Encoder/Decoder architecture
- Multi-head attention mechanism
- Training and inference flows
- Data pipeline
- Component interactions
## Installation
### From PyPI
```bash
pip install temporal-forecasting
```
### With HuggingFace Support
```bash
pip install temporal-forecasting[huggingface]
```
This adds support for:
- Uploading models to HuggingFace Hub
- Downloading models from HuggingFace Hub
- HuggingFace ecosystem integration
### With Data Fetching Support
```bash
pip install temporal-forecasting[data]
```
This adds support for:
- Fetching stock prices from Yahoo Finance
- Fetching cryptocurrency data (Bitcoin, Ethereum, etc.)
- Downloading datasets from Kaggle
- Technical indicators (SMA, RSI, MACD, Bollinger Bands); a standalone indicator sketch follows this list
- Data preprocessing utilities
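
For orientation, the snippet below fetches daily prices with `yfinance` directly and computes an SMA and Bollinger Bands with plain pandas operations. It deliberately does not use the package's own fetching or indicator helpers (see [DATA_SOURCES.md](DATA_SOURCES.md) for those); the ticker, window sizes, and column handling are illustrative.

```python
import yfinance as yf

# Two years of daily Apple prices; yfinance returns a pandas DataFrame
df = yf.download("AAPL", period="2y", interval="1d")
close = df["Close"].squeeze()  # coerce to a 1-D Series regardless of yfinance version

# 20-day simple moving average
sma_20 = close.rolling(window=20).mean()

# Bollinger Bands: SMA +/- 2 rolling standard deviations
rolling_std = close.rolling(window=20).std()
bb_upper = sma_20 + 2 * rolling_std
bb_lower = sma_20 - 2 * rolling_std

print(sma_20.tail())
```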
### From Source
```bash
git clone https://github.com/OptimalMatch/temporal.git
cd temporal
pip install -r requirements.txt
pip install -e .
```
### Requirements
- Python >= 3.8
- PyTorch >= 2.0.0
- NumPy >= 1.20.0
- tqdm >= 4.60.0
- matplotlib >= 3.3.0
### Optional Dependencies
- **HuggingFace**: `transformers>=4.30.0`, `huggingface-hub>=0.16.0`
- **Data Fetching**: `yfinance>=0.2.0`, `pandas>=1.3.0`, `scikit-learn>=1.0.0`, `kagglehub>=0.2.0`
## Quick Start
### Basic Usage
```python
import torch
from temporal import Temporal

# Create model
model = Temporal(
    input_dim=1,            # Univariate time series
    d_model=256,            # Model dimension
    num_encoder_layers=4,   # Number of encoder layers
    num_decoder_layers=4,   # Number of decoder layers
    num_heads=8,            # Attention heads
    d_ff=1024,              # Feed-forward dimension
    forecast_horizon=24,    # Predict 24 steps ahead
    dropout=0.1
)

# Input: (batch_size, sequence_length, input_dim)
x = torch.randn(32, 96, 1)

# Generate forecast
forecast = model.forecast(x)  # (32, 24, 1)
```
### Training Example
```python
from temporal import Temporal
from temporal.trainer import TimeSeriesDataset, TemporalTrainer
from torch.utils.data import DataLoader
import torch

# Prepare your data
train_data = ...  # Shape: (num_samples, num_features)

# Create dataset
dataset = TimeSeriesDataset(
    train_data,
    lookback=96,
    forecast_horizon=24,
    stride=1
)

# Create data loader
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Create model
model = Temporal(
    input_dim=train_data.shape[1],
    d_model=256,
    num_encoder_layers=4,
    num_decoder_layers=4,
    num_heads=8,
    d_ff=1024,
    forecast_horizon=24
)

# Create optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Create trainer
trainer = TemporalTrainer(
    model=model,
    optimizer=optimizer,
    criterion=torch.nn.MSELoss()
)

# Train
history = trainer.fit(
    train_loader=train_loader,
    num_epochs=100,
    early_stopping_patience=10,
    save_path="best_model.pt"  # Automatically saves best model
)
```
### Saving and Loading Models
```python
# Save trained model
torch.save(model.state_dict(), 'temporal_model.pt')

# Load model for inference
model = Temporal(input_dim=1, forecast_horizon=24)
model.load_state_dict(torch.load('temporal_model.pt'))
model.eval()

# Make predictions
forecast = model.forecast(x)
```
For a complete guide to model persistence, see [MODEL_PERSISTENCE.md](MODEL_PERSISTENCE.md).
## Examples
### Univariate Time Series
See `examples/basic_usage.py` for a complete example with synthetic data:
```bash
cd examples
python basic_usage.py
```
This will:
- Generate synthetic time series data
- Train a Temporal model
- Generate forecasts
- Visualize results
### Multivariate Time Series
See `examples/multivariate_example.py` for forecasting multiple correlated features:
```bash
cd examples
python multivariate_example.py
```
### Model Persistence
See `examples/model_persistence_example.py` for saving and loading trained models:
```bash
cd examples
python model_persistence_example.py
```
This demonstrates:
- Training and saving a model with all components
- Loading saved models for inference
- Production-ready model deployment
### HuggingFace Integration
See `examples/huggingface_example.py` for HuggingFace Hub integration:
```bash
cd examples
python huggingface_example.py
```
This demonstrates:
- Creating HuggingFace-compatible models
- Saving in HuggingFace format
- Loading from HuggingFace Hub
- Uploading models to share with the community
For a complete guide, see [HUGGINGFACE_INTEGRATION.md](HUGGINGFACE_INTEGRATION.md).
### Stock Price Forecasting
See `examples/stock_forecasting.py` for real stock data forecasting:
```bash
cd examples
python stock_forecasting.py
```
This demonstrates:
- Fetching stock data from Yahoo Finance
- Training on Apple (AAPL) stock prices
- 5-day price forecasting
- Model evaluation and visualization
### Cryptocurrency Forecasting
See `examples/crypto_forecasting.py` for Bitcoin and crypto forecasting:
```bash
cd examples
python crypto_forecasting.py
```
This demonstrates:
- Fetching Bitcoin data
- Training on cryptocurrency prices
- 7-day price forecasting
- Multi-crypto comparison
For a complete guide to data fetching, see [DATA_SOURCES.md](DATA_SOURCES.md).
## Model Configuration
### Parameters
| Parameter | Description | Default |
|-----------|-------------|---------|
| `input_dim` | Number of input features | 1 |
| `d_model` | Model dimension | 512 |
| `num_encoder_layers` | Number of encoder layers | 6 |
| `num_decoder_layers` | Number of decoder layers | 6 |
| `num_heads` | Number of attention heads | 8 |
| `d_ff` | Feed-forward dimension | 2048 |
| `forecast_horizon` | Number of steps to forecast | 24 |
| `max_seq_len` | Maximum sequence length | 5000 |
| `dropout` | Dropout probability | 0.1 |
| `use_learnable_pe` | Use learnable positional encoding | False |
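
Assuming the constructor defaults match the table above, instantiating the model with no arguments yields the 6-layer, `d_model=512` configuration; a quick parameter count is a useful sanity check:

```python
from temporal import Temporal

# All arguments left at their documented defaults (input_dim=1, d_model=512, 6 + 6 layers, ...)
model = Temporal()

num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.1f}M")
```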
### Recommended Configurations
**Small Model** (Fast training, lower accuracy):
```python
model = Temporal(
    d_model=128,
    num_encoder_layers=2,
    num_decoder_layers=2,
    num_heads=4,
    d_ff=512
)
```
**Medium Model** (Balanced):
```python
model = Temporal(
    d_model=256,
    num_encoder_layers=4,
    num_decoder_layers=4,
    num_heads=8,
    d_ff=1024
)
```
**Large Model** (Best accuracy, slower training):
```python
model = Temporal(
    d_model=512,
    num_encoder_layers=6,
    num_decoder_layers=6,
    num_heads=16,
    d_ff=2048
)
```
## Training Tips
1. **Learning Rate**: Start with 1e-4 and use a scheduler (e.g., ReduceLROnPlateau)
2. **Batch Size**: Use the largest batch size that fits in memory (32-128)
3. **Gradient Clipping**: Use gradient clipping (0.5-1.0) to prevent exploding gradients
4. **Early Stopping**: Monitor validation loss and stop when it plateaus
5. **Data Normalization**: Normalize your data (e.g., StandardScaler) before training; a sketch combining tips 1, 3, and 5 follows this list
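
The sketch below wires tips 1, 3, and 5 into a plain PyTorch loop: standardize the raw series, create a `ReduceLROnPlateau` scheduler, and clip gradients each step. It continues from the Training Example above (`model`, `train_loader`), with `raw_train_data` standing in for your unscaled array; `TemporalTrainer.fit` may already handle some of these steps internally, and the `model(x, y)` teacher-forcing call is an assumption.

```python
import torch
from sklearn.preprocessing import StandardScaler

# Tip 5: fit the scaler on training data only, then transform before building the windows
scaler = StandardScaler()
train_data = scaler.fit_transform(raw_train_data)  # raw_train_data: (num_samples, num_features)

# Tip 1: start at lr=1e-4 and reduce the learning rate when the monitored loss plateaus
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=5)
criterion = torch.nn.MSELoss()

for epoch in range(100):
    model.train()
    epoch_loss = 0.0
    for x, y in train_loader:
        optimizer.zero_grad()
        pred = model(x, y)  # teacher forcing with the target as decoder input (assumed convention)
        loss = criterion(pred, y)
        loss.backward()
        # Tip 3: clip gradients to keep updates bounded
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step(epoch_loss / len(train_loader))  # ideally monitor a validation loss instead
```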
## Architecture Details
### Multi-Head Attention
The model uses scaled dot-product attention:
```
Attention(Q, K, V) = softmax(QK^T / √d_k)V
```
Multiple attention heads allow the model to attend to different aspects of the time series simultaneously.
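
A minimal, self-contained version of the formula above (PyTorch >= 2.0 also ships an optimized equivalent, `torch.nn.functional.scaled_dot_product_attention`):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)     # (batch, heads, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))  # block masked (e.g., future) positions
    weights = torch.softmax(scores, dim=-1)               # attention weights sum to 1 over keys
    return weights @ v                                     # (batch, heads, seq_q, d_k)

q = k = v = torch.randn(2, 8, 96, 32)  # batch=2, heads=8, seq=96, d_k=32
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([2, 8, 96, 32])
```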
### Positional Encoding
Two types of positional encoding are available (the sinusoidal variant is sketched after this list):
1. **Sinusoidal** (default): Fixed sinusoidal functions
2. **Learnable**: Learned embeddings for each position
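
For reference, a standard sinusoidal encoding looks like the sketch below; the library's default follows this general scheme, though implementation details may differ.

```python
import math
import torch

def sinusoidal_encoding(max_len: int, d_model: int) -> torch.Tensor:
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)  # (max_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe                                     # added to the embedded inputs: x + pe[:seq_len]

print(sinusoidal_encoding(max_len=5000, d_model=512).shape)  # torch.Size([5000, 512])
```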
### Autoregressive Generation
During inference, the model generates forecasts autoregressively (sketched after this list):
- Start with the last observed value
- Generate next step prediction
- Use prediction as input for next step
- Repeat for entire forecast horizon
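
Conceptually, the loop looks like the sketch below. This is illustrative pseudocode for the steps above, not the library's `forecast` implementation; the `model(src, tgt)` call and the slicing of its output are assumptions.

```python
import torch

@torch.no_grad()
def autoregressive_forecast(model, x, horizon):
    """Conceptual multi-step generation; model.forecast() performs this internally."""
    model.eval()
    decoder_input = x[:, -1:, :]          # start from the last observed step
    predictions = []
    for _ in range(horizon):
        out = model(x, decoder_input)     # encoder sees the full lookback window
        step = out[:, -1:, :]             # take the newest predicted step
        predictions.append(step)
        decoder_input = torch.cat([decoder_input, step], dim=1)  # feed the prediction back in
    return torch.cat(predictions, dim=1)  # (batch, horizon, features)
```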
## Modern Time Series Transformers
Temporal implements a transformer architecture similar to modern approaches in time series forecasting:
| Feature | Modern Approaches | Temporal |
|---------|---------|----------|
| Architecture | Transformer | Transformer |
| Attention | Multi-head | Multi-head |
| Layers | Encoder-Decoder | Encoder-Decoder |
| Training | Large-scale pre-training | User-provided data |
| Flexibility | Fixed models | Fully customizable |
## Performance
Performance varies by dataset and configuration. Typical metrics on benchmark datasets:
- **MSE**: 0.01-0.1 (normalized data)
- **MAE**: 0.05-0.3 (normalized data)
- **Training Time**: 1-10 minutes per epoch (depending on model size, dataset size, and hardware)
## API Reference
### Temporal
Main model class for time series forecasting.
**Methods**:
- `forward(src, tgt=None, src_mask=None, tgt_mask=None)`: Forward pass
- `forecast(x, horizon=None)`: Generate forecasts
- `generate_causal_mask(size)`: Create a causal attention mask (sketched below)
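
For reference, a causal mask is typically an upper-triangular matrix that blocks attention from a position to any later position. A minimal sketch of the idea follows; whether `generate_causal_mask` uses this boolean convention or an additive `-inf` mask is an implementation detail of the library.

```python
import torch

def causal_mask(size: int) -> torch.Tensor:
    # True above the diagonal: position i must not attend to positions j > i
    return torch.triu(torch.ones(size, size, dtype=torch.bool), diagonal=1)

print(causal_mask(4))
# tensor([[False,  True,  True,  True],
#         [False, False,  True,  True],
#         [False, False, False,  True],
#         [False, False, False, False]])
```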
### TemporalTrainer
Training utilities for Temporal models.
**Methods**:
- `train_epoch(dataloader)`: Train for one epoch
- `validate(dataloader)`: Validate the model
- `fit(train_loader, val_loader, num_epochs, ...)`: Full training loop
- `predict(dataloader)`: Generate predictions
### TimeSeriesDataset
Sliding-window dataset class for time series data (a shape sketch follows the parameter list).
**Parameters**:
- `data`: Time series data array
- `lookback`: Number of historical steps
- `forecast_horizon`: Number of future steps
- `stride`: Stride for sliding window
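
Assuming the usual sliding-window behavior, each item pairs a lookback window with the horizon that immediately follows it; the exact return types and window count shown here are expectations, not guarantees:

```python
import numpy as np
from temporal.trainer import TimeSeriesDataset

data = np.random.randn(1000, 3)  # (num_samples, num_features)
dataset = TimeSeriesDataset(data, lookback=96, forecast_horizon=24, stride=1)

x, y = dataset[0]
print(x.shape, y.shape)  # expected: (96, 3) and (24, 3)

# With stride=1, roughly len(data) - lookback - forecast_horizon + 1 windows
print(len(dataset))      # expected: 881
```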
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Citation
If you use this code in your research, please cite:
```bibtex
@software{temporal2024,
  title = {Temporal: Transformer-Based Time Series Forecasting},
  year = {2024},
  note = {A PyTorch implementation of transformer architecture for time series},
  url = {https://github.com/OptimalMatch/temporal}
}
```
## References
- Vaswani et al., "Attention is All You Need" (2017)
- Modern transformer-based time series forecasting approaches
## Acknowledgments
This implementation is inspired by modern transformer architectures for time series forecasting and the original Transformer paper.