# Valdo Python Bindings
Python bindings for Valdo, a time series anomaly detection library.
## Installation
### Prerequisites
- Python 3.10+
- Rust toolchain (for building from source)
### Install from Source
```bash
# Clone the repository
git clone https://github.com/shenxiangzhuang/valdo.git
cd valdo/bindings/python
# Install with uv (recommended)
uv sync
# Run the example
uv run python example/valdo_demo.py
# Or build with maturin and install into the active virtual environment
pip install maturin
maturin develop --release
```
## Quick Start
```python
from valdo import Detector, AnomalyStatus
import random
# Create a detector with window size 10
detector = Detector(window_size=10)
# Generate training data with timestamp-value pairs (normal behavior)
training_data = [(i, random.random()) for i in range(10000)]
# Train the detector
detector.train(training_data)
# Detect anomalies in new data points (timestamps continue after the training data)
status = detector.detect(timestamp=10000, value=1.0)
print(f"Status: {status}")  # expected: Normal

status = detector.detect(timestamp=10001, value=10.0)
print(f"Status: {status}")  # expected: Anomaly
```
## API Reference
### `Detector`
The main class for anomaly detection.
#### Constructor
```python
Detector(window_size, quantile=None, level=None, max_excess=None)
```
**Parameters:**
- `window_size` (int): Size of the sliding window for processing (required)
- `quantile` (float, optional): Quantile parameter for SPOT detector (default: 0.0001)
- `level` (float, optional): Level parameter for SPOT detector (default: 0.998)
- `max_excess` (int, optional): Maximum excess for SPOT detector (default: 200)
#### Methods
##### `train(data)`
Train the detector on historical data.
**Parameters:**
- `data` (List[Tuple[int, float]]): List of (timestamp, value) pairs
**Raises:**
- `ValueError`: If training fails
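
Since `train` expects integer timestamps, readings stamped with `datetime` objects need converting first. The sketch below shows one way to do that. It assumes epoch seconds are an acceptable timestamp unit (the unit is not specified here), and `to_training_pairs` is a hypothetical helper, not part of valdo.

```python
from datetime import datetime, timezone


def to_training_pairs(readings):
    """Convert (datetime, value) readings into sorted (int, float) pairs.

    Hypothetical helper, not part of valdo. Assumes epoch seconds are a
    suitable integer timestamp unit.
    """
    pairs = [(int(ts.timestamp()), float(value)) for ts, value in readings]
    # Sort by timestamp so the series is in chronological order
    pairs.sort(key=lambda pair: pair[0])
    return pairs


# Two readings one minute apart, deliberately out of order
readings = [
    (datetime(2025, 1, 1, 0, 1, tzinfo=timezone.utc), 1.1),
    (datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc), 1),
]
pairs = to_training_pairs(readings)
print(pairs)
```

The resulting list has the `List[Tuple[int, float]]` shape described above and could be passed to `detector.train(pairs)`.
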
##### `detect(timestamp, value)`
Detect anomalies in a new data point.
**Parameters:**
- `timestamp` (int): Timestamp of the data point
- `value` (float): Value of the data point
**Returns:**
- `AnomalyStatus`: Either `AnomalyStatus.Normal` or `AnomalyStatus.Anomaly`
**Raises:**
- `ValueError`: If detection fails
### `AnomalyStatus`
Enum representing the detection result.
- `AnomalyStatus.Normal`: The data point is normal
- `AnomalyStatus.Anomaly`: The data point is an anomaly
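
A common pattern is to branch on the returned status while streaming points. The sketch below illustrates that with stand-ins so it runs without valdo installed: the local `AnomalyStatus` enum and `StubDetector` class are hypothetical substitutes, and `collect_anomalies` is an illustrative helper. It assumes valdo's `AnomalyStatus` members can be compared with `==`.

```python
from enum import Enum


class AnomalyStatus(Enum):
    """Stand-in for valdo.AnomalyStatus so the sketch runs without valdo."""
    Normal = 0
    Anomaly = 1


class StubDetector:
    """Hypothetical stand-in exposing the same detect(timestamp, value) call."""

    def __init__(self, limit):
        self.limit = limit

    def detect(self, timestamp, value):
        # Flag any value above a fixed limit; valdo's real logic differs
        if value > self.limit:
            return AnomalyStatus.Anomaly
        return AnomalyStatus.Normal


def collect_anomalies(detector, points):
    """Return the (timestamp, value) pairs the detector flags as anomalous."""
    flagged = []
    for timestamp, value in points:
        if detector.detect(timestamp, value) == AnomalyStatus.Anomaly:
            flagged.append((timestamp, value))
    return flagged


points = [(1000, 1.0), (1001, 5.0), (1002, 1.1)]
flagged = collect_anomalies(StubDetector(limit=3.0), points)
print(flagged)  # [(1001, 5.0)]
```

With valdo installed, the same `collect_anomalies` function should work with a trained `Detector` once the stand-in enum is replaced by `from valdo import AnomalyStatus`.
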
## Examples
### Basic Usage
```python
from valdo import Detector, AnomalyStatus
# Create detector
detector = Detector(window_size=10)
# Train on normal data with timestamps
normal_data = [(i, val) for i, val in enumerate([1.0, 1.1, 0.9, 1.05, 0.95, 1.2, 0.8, 1.15, 0.85, 1.25] * 100)]
detector.train(normal_data)
# Test detection
test_points = [
    (1000, 1.0),  # Normal
    (1001, 5.0),  # Anomaly - significantly higher
    (1002, 1.1),  # Normal
]

for timestamp, value in test_points:
    status = detector.detect(timestamp, value)
    print(f"Point ({timestamp}, {value}): {status}")
```
### Custom Parameters
```python
from valdo import Detector
# Create detector with custom parameters
detector = Detector(
    window_size=5,    # Smaller window for faster response
    quantile=0.001,   # 10x the default of 0.0001
    level=0.99,       # Lower than the default of 0.998
    max_excess=100,   # Half the default of 200
)

# Placeholder series; replace with your own anomaly-free values
your_values = [1.0, 1.1, 0.9, 1.05, 0.95] * 200

# Pair each value with an integer timestamp and train
training_data = [(i, val) for i, val in enumerate(your_values)]
detector.train(training_data)

# Check a new point that follows the training data
status = detector.detect(timestamp=len(your_values), value=1.0)
```
### Sine Wave Example
```python
import math
import random

from valdo import Detector

# Create detector
detector = Detector(window_size=10)

# Generate sine wave training data with timestamps and noise
training_data = [
    (i, math.sin(i * 0.1) + random.gauss(0, 0.1))
    for i in range(1000)
]

# Train detector
detector.train(training_data)

# Test with normal and anomalous values
test_cases = [
    (2000, math.sin(200 * 0.1)),  # Normal sine value
    (2001, 5.0),                  # Clear anomaly
    (2002, math.sin(202 * 0.1)),  # Back to normal
]

for timestamp, value in test_cases:
    status = detector.detect(timestamp, value)
    print(f"Value {value:.3f}: {status}")
```
## How It Works
Valdo uses a cascaded smoothing approach for anomaly detection:

1. **Residual Calculation**: Uses EWMA (Exponentially Weighted Moving Average) to predict the next value and calculate residuals
2. **Fluctuation Estimation**: Uses standard deviation to estimate fluctuations in the residual values
3. **Anomaly Detection**: Uses SPOT (Streaming Peaks-Over-Threshold) to detect anomalies in fluctuation changes

The detector maintains sliding windows for real-time processing and only updates its internal state when normal points are detected, preventing anomalies from corrupting the model.
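
The three stages and the "update only on normal points" rule can be illustrated with a small pure-Python toy. This is a sketch under stated assumptions, not valdo's implementation: the class name `CascadeSketch`, the smoothing factor `alpha`, and the use of a fixed empirical quantile as the threshold are all illustrative choices. The quantile is a deliberate simplification standing in for SPOT, which fits the tail of the distribution instead.

```python
import math
from collections import deque


class CascadeSketch:
    """Toy EWMA -> rolling std -> threshold cascade (not valdo's code)."""

    def __init__(self, window_size=10, alpha=0.3, quantile=0.99):
        self.alpha = alpha            # EWMA smoothing factor (assumed value)
        self.quantile = quantile      # empirical quantile used as threshold
        self.residuals = deque(maxlen=window_size)
        self.ewma = None
        self.threshold = math.inf

    def _fluctuation(self, value):
        """Steps 1-2: residual vs. EWMA forecast, then std of recent residuals."""
        residual = value - self.ewma
        recent = list(self.residuals) + [residual]
        mean = sum(recent) / len(recent)
        std = math.sqrt(sum((r - mean) ** 2 for r in recent) / len(recent))
        return residual, std

    def _update(self, value, residual):
        """Fold an accepted point into the window and the EWMA."""
        self.residuals.append(residual)
        self.ewma = self.alpha * value + (1 - self.alpha) * self.ewma

    def train(self, values):
        self.ewma = values[0]
        scores = []
        for value in values[1:]:
            residual, score = self._fluctuation(value)
            scores.append(score)
            self._update(value, residual)
        # Step 3 stand-in: fixed empirical quantile instead of SPOT's tail fit
        scores.sort()
        index = min(len(scores) - 1, int(self.quantile * len(scores)))
        self.threshold = scores[index]

    def detect(self, value):
        residual, score = self._fluctuation(value)
        is_anomaly = score > self.threshold
        if not is_anomaly:
            # Only normal points update the state, so spikes cannot skew it
            self._update(value, residual)
        return is_anomaly


sketch = CascadeSketch(window_size=10)
sketch.train([1.0 + 0.05 * math.sin(i) for i in range(500)])
ewma_before = sketch.ewma
spike_flagged = sketch.detect(5.0)
print(spike_flagged, sketch.ewma == ewma_before)  # True True
```

The final line shows the protective behavior described above: the spike is flagged, and the EWMA is left untouched because the point was not accepted as normal.
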
## Performance
The heavy lifting is done by the underlying Rust implementation. Indicative figures (hardware and detector configuration are not specified):

- Training on 1M points: ~150 ms
- Detection: <1 ms per point
- Memory-efficient sliding-window approach
- Thread-safe for concurrent use
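
Timings like these depend on hardware, so it can be worth measuring on your own machine. The sketch below is a minimal timing harness built on `time.perf_counter`. Both `time_per_call` and the placeholder workload `fake_detect` are hypothetical names used for illustration; in practice you would pass a trained detector's `detect` method instead.

```python
import time


def time_per_call(func, args_list, repeats=3):
    """Return the best observed mean seconds per call over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for args in args_list:
            func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(args_list))
    return best


def fake_detect(timestamp, value):
    """Placeholder workload; swap in a trained detector's detect method."""
    return value > 3.0


points = [(i, 1.0) for i in range(10_000)]
seconds = time_per_call(fake_detect, points)
print(f"{seconds * 1e6:.2f} microseconds per call")
```

Note that calling a real detector repeatedly changes its internal state, so the repeats measure a detector that keeps evolving, not identical runs.
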
## Error Handling
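Both `train` and `detect` raise `ValueError` on failure, so wrap the calls you want to guard:

```python
from valdo import Detector

# Placeholder training series; replace with your own data
training_data = [(i, 1.0 + 0.1 * (i % 5)) for i in range(1000)]

try:
    detector = Detector(window_size=10)
    detector.train(training_data)
    status = detector.detect(timestamp=1000, value=1.0)
except ValueError as e:
    print(f"Error: {e}")
```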
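
Some problems can be caught before the data reaches the detector. The sketch below is a hypothetical pre-check, `validate_points`, that rejects non-integer timestamps and non-finite values with the same exception type. The conditions valdo itself rejects are not documented here, so these checks are assumptions about what counts as malformed input.

```python
import math


def validate_points(points):
    """Raise ValueError if any (timestamp, value) pair is malformed.

    Hypothetical pre-check, not part of valdo; the conditions valdo itself
    rejects are not documented here.
    """
    for index, (timestamp, value) in enumerate(points):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(timestamp, bool) or not isinstance(timestamp, int):
            raise ValueError(
                f"point {index}: timestamp must be an int, got {timestamp!r}"
            )
        if not math.isfinite(value):
            raise ValueError(
                f"point {index}: value must be finite, got {value!r}"
            )


validate_points([(0, 1.0), (1, 1.1)])  # well-formed data passes silently

try:
    validate_points([(0, 1.0), (1, float("nan"))])
except ValueError as e:
    message = str(e)
    print(f"Error: {message}")
```
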
## Contributing
Contributions are welcome! Please see the main repository for contribution guidelines.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Raw data
{
"_id": null,
"home_page": null,
"name": "valdo",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "time-series, outlier-detection, analysis",
"author": "Mathew Shen <datahonor@gmail.com>",
"author_email": "Mathew Shen <datahonor@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/55/a3/286e4547b1b68360bb25fae76e709ff9064637f885b45ee4d108f827b2ae/valdo-0.0.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A time series anomaly detection library",
"version": "0.0.1",
"project_urls": {
"Source Code": "https://github.com/shenxiangzhuang/valdo"
},
"split_keywords": [
"time-series",
" outlier-detection",
" analysis"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "677dee87ece17d2422ed3b2c435d4a5cffdbc1ab3e08bfc9c02024425a4d9b8d",
"md5": "5466f187d1a9569039904d834ca6e873",
"sha256": "66e5b45d5e7287b5a3ad83e4a9a514a7c99dbd01d2221af5a723f9fc682ca06f"
},
"downloads": -1,
"filename": "valdo-0.0.1-cp310-abi3-manylinux_2_34_x86_64.whl",
"has_sig": false,
"md5_digest": "5466f187d1a9569039904d834ca6e873",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 264416,
"upload_time": "2025-08-18T14:27:47",
"upload_time_iso_8601": "2025-08-18T14:27:47.526028Z",
"url": "https://files.pythonhosted.org/packages/67/7d/ee87ece17d2422ed3b2c435d4a5cffdbc1ab3e08bfc9c02024425a4d9b8d/valdo-0.0.1-cp310-abi3-manylinux_2_34_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "55a3286e4547b1b68360bb25fae76e709ff9064637f885b45ee4d108f827b2ae",
"md5": "68ee9affa30abb3c45ca9315a65737ed",
"sha256": "05780169d0588e3123ab7a09de91fdaebfca334b27649fda64ad2b695b22d170"
},
"downloads": -1,
"filename": "valdo-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "68ee9affa30abb3c45ca9315a65737ed",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 32772,
"upload_time": "2025-08-18T14:27:49",
"upload_time_iso_8601": "2025-08-18T14:27:49.207511Z",
"url": "https://files.pythonhosted.org/packages/55/a3/286e4547b1b68360bb25fae76e709ff9064637f885b45ee4d108f827b2ae/valdo-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-18 14:27:49",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "shenxiangzhuang",
"github_project": "valdo",
"github_not_found": true,
"lcname": "valdo"
}