valdo


- Name: valdo
- Version: 0.0.1
- Summary: A time series anomaly detection library
- Upload time: 2025-08-18 14:27:49
- Author: Mathew Shen <datahonor@gmail.com>
- Requires Python: >=3.10
- License: MIT
- Keywords: time-series, outlier-detection, analysis

# Valdo Python Bindings

Python bindings for Valdo, a time series anomaly detection library.

## Installation

### Prerequisites

- Python 3.10+
- Rust toolchain (for building from source)

### Install from Source

```bash
# Clone the repository
git clone https://github.com/shenxiangzhuang/valdo.git
cd valdo/bindings/python

# Install with uv (recommended)
uv sync

# Run the example
uv run python example/valdo_demo.py

# Or build with maturin and install into the active virtual environment
pip install maturin
maturin develop --release
```

## Quick Start

```python
from valdo import Detector, AnomalyStatus
import random

# Create a detector with window size 10
detector = Detector(window_size=10)

# Generate training data with timestamp-value pairs (normal behavior)
training_data = [(i, random.random()) for i in range(10000)]

# Train the detector
detector.train(training_data)

# Detect anomalies in new data points (timestamps continue after the training range)
status = detector.detect(timestamp=10000, value=1.0)
print(f"Status: {status}")  # Expected: Normal

status = detector.detect(timestamp=10001, value=10.0)
print(f"Status: {status}")  # Expected: Anomaly
```

## API Reference

### `Detector`

The main class for anomaly detection.

#### Constructor

```python
Detector(window_size, quantile=None, level=None, max_excess=None)
```

**Parameters:**
- `window_size` (int): Size of the sliding window for processing (required)
- `quantile` (float, optional): Quantile parameter for SPOT detector (default: 0.0001)
- `level` (float, optional): Level parameter for SPOT detector (default: 0.998)  
- `max_excess` (int, optional): Maximum excess for SPOT detector (default: 200)
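
To give a feel for how these parameters interact, here is a minimal sketch of the threshold formula from the published SPOT algorithm. It is not taken from Valdo's source. The function name `spot_threshold` and its arguments are hypothetical, and the mapping of `quantile` to `q` and of `level` to the quantile that sets the initial threshold `t` is an assumption based on the parameter names.

```python
import math


def spot_threshold(t, sigma, gamma, q, n, n_excess):
    """Anomaly threshold z_q from the SPOT formulation (illustrative only).

    t        -- initial threshold (a high quantile of the data, cf. `level`)
    sigma    -- fitted scale of the generalized Pareto tail
    gamma    -- fitted shape of the generalized Pareto tail
    q        -- target anomaly probability (cf. `quantile`)
    n        -- number of observations seen so far
    n_excess -- number of observations above t (bounded, cf. `max_excess`)
    """
    ratio = q * n / n_excess
    if abs(gamma) < 1e-12:
        # Limit of the general formula as gamma approaches zero.
        return t - sigma * math.log(ratio)
    return t + (sigma / gamma) * (ratio ** (-gamma) - 1)


# A smaller q pushes the threshold further into the tail.
strict = spot_threshold(t=2.0, sigma=1.0, gamma=0.1, q=0.0001, n=1000, n_excess=10)
loose = spot_threshold(t=2.0, sigma=1.0, gamma=0.1, q=0.001, n=1000, n_excess=10)
print(strict > loose)
```

Under this reading, a smaller `quantile` raises the threshold and flags fewer points, while a larger one flags more.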

#### Methods

##### `train(data)`

Train the detector on historical data.

**Parameters:**
- `data` (List[Tuple[int, float]]): List of (timestamp, value) pairs

**Raises:**
- `ValueError`: If training fails
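
If your series is stored with `datetime` objects, it has to be converted to `(int, float)` pairs first. The helper below is a minimal sketch: `to_training_pairs` is a hypothetical name, and using Unix epoch seconds and sorting by time are assumptions, since this README does not state the timestamp unit or ordering requirements.

```python
from datetime import datetime, timezone


def to_training_pairs(samples):
    """Convert (datetime, number) samples into time-sorted (int, float) pairs."""
    # Assumption: timestamps are whole Unix epoch seconds.
    pairs = [(int(ts.timestamp()), float(value)) for ts, value in samples]
    pairs.sort(key=lambda pair: pair[0])
    return pairs


samples = [
    (datetime(2025, 1, 1, 0, 0, 1, tzinfo=timezone.utc), 1.1),
    (datetime(2025, 1, 1, 0, 0, 0, tzinfo=timezone.utc), 1),
]
print(to_training_pairs(samples))
```

The resulting list has the shape `train` expects and can be passed to it directly.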

##### `detect(timestamp, value)`

Detect anomalies in a new data point.

**Parameters:**
- `timestamp` (int): Timestamp of the data point
- `value` (float): Value of the data point

**Returns:**
- `AnomalyStatus`: Either `AnomalyStatus.Normal` or `AnomalyStatus.Anomaly`

**Raises:**
- `ValueError`: If detection fails

### `AnomalyStatus`

Enum representing the detection result.

- `AnomalyStatus.Normal`: The data point is normal
- `AnomalyStatus.Anomaly`: The data point is an anomaly

## Examples

### Basic Usage

```python
from valdo import Detector, AnomalyStatus

# Create detector
detector = Detector(window_size=10)

# Train on normal data with timestamps
normal_data = [(i, val) for i, val in enumerate([1.0, 1.1, 0.9, 1.05, 0.95, 1.2, 0.8, 1.15, 0.85, 1.25] * 100)]
detector.train(normal_data)

# Test detection
test_points = [
    (1000, 1.0),   # Normal
    (1001, 5.0),   # Anomaly - significantly higher
    (1002, 1.1),   # Normal
]

for timestamp, value in test_points:
    status = detector.detect(timestamp, value)
    print(f"Point ({timestamp}, {value}): {status}")
```

### Custom Parameters

```python
from valdo import Detector

# Create detector with custom parameters
detector = Detector(
    window_size=5,      # Smaller window for faster response
    quantile=0.001,     # 10x the default (0.0001); in SPOT a larger value flags more points
    level=0.99,         # Lower tail level than the default (0.998)
    max_excess=100      # Track fewer excesses than the default (200)
)

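# Placeholder values; replace with your own series
your_values = [1.0, 1.1, 0.9, 1.05, 0.95] * 200
training_data = list(enumerate(your_values))
detector.train(training_data)

# Score a new point whose timestamp follows the training range
status = detector.detect(timestamp=len(your_values), value=1.0)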
```

### Sine Wave Example

```python
import math
import random
from valdo import Detector

# Create detector
detector = Detector(window_size=10)

# Generate sine wave training data with timestamps and noise
training_data = [
    (i, math.sin(i * 0.1) + random.gauss(0, 0.1)) 
    for i in range(1000)
]

# Train detector
detector.train(training_data)

# Test with normal and anomalous values, continuing the training timeline
test_cases = [
    (1000, math.sin(1000 * 0.1)),  # Normal sine value
    (1001, 5.0),                   # Clear anomaly
    (1002, math.sin(1002 * 0.1)),  # Back to normal
]

for timestamp, value in test_cases:
    status = detector.detect(timestamp, value)
    print(f"Value {value:.3f}: {status}")
```

## How It Works

Valdo uses a cascaded smoothing approach for anomaly detection:

1. **Residual Calculation**: Uses EWMA (Exponentially Weighted Moving Average) to predict the next value and calculate residuals
2. **Fluctuation Estimation**: Uses standard deviation to estimate fluctuations in the residual values
3. **Anomaly Detection**: Uses SPOT (Streaming Peaks-Over-Threshold) to detect anomalies in fluctuation changes

The detector maintains sliding windows for real-time processing and only updates its internal state when normal points are detected, preventing anomalies from corrupting the model.
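
The three steps above can be sketched in pure Python. This is a toy illustration, not Valdo's implementation: the class name `CascadeSketch`, the smoothing factor `alpha`, and the use of a plain empirical quantile in place of SPOT are all assumptions made to keep the example short.

```python
import math
from collections import deque


class CascadeSketch:
    """Toy EWMA -> rolling std -> threshold cascade; not Valdo's actual code."""

    def __init__(self, window_size=10, alpha=0.3, level=0.99):
        self.window_size = window_size
        self.alpha = alpha  # EWMA smoothing factor (assumed value)
        self.level = level  # empirical quantile standing in for SPOT
        self.residuals = deque(maxlen=window_size)
        self.ewma = None
        self.threshold = None

    def _fluctuation(self, residual):
        # Standard deviation of the residual window as if `residual` were appended.
        window = (list(self.residuals) + [residual])[-self.window_size:]
        mean = sum(window) / len(window)
        return math.sqrt(sum((r - mean) ** 2 for r in window) / len(window))

    def _update(self, value, residual):
        # Commit a point to the internal state.
        self.ewma = self.alpha * value + (1 - self.alpha) * self.ewma
        self.residuals.append(residual)

    def train(self, values):
        self.ewma = values[0]
        fluctuations = []
        for value in values[1:]:
            residual = value - self.ewma  # step 1: prediction error
            fluctuations.append(self._fluctuation(residual))  # step 2
            self._update(value, residual)
        fluctuations.sort()
        index = min(int(self.level * len(fluctuations)), len(fluctuations) - 1)
        self.threshold = fluctuations[index]  # step 3 (quantile, not SPOT)

    def detect(self, value):
        residual = value - self.ewma
        is_anomaly = self._fluctuation(residual) > self.threshold
        if not is_anomaly:
            self._update(value, residual)  # only normal points change the state
        return is_anomaly


sketch = CascadeSketch(window_size=10)
sketch.train([0.0, 1.0] * 500)
print(sketch.detect(50.0), sketch.detect(0.5))
```

Because `detect` skips the update for flagged points, the extreme value does not shift the EWMA or the residual window, so the following in-range point is still judged against the pre-anomaly state.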

## Performance

The detection logic runs in the underlying Rust implementation. The project reports the following, without stating the hardware or data used:

- Training on 1M points: ~150ms
- Real-time detection: <1ms per point
- Memory efficient sliding window approach
- Thread-safe for concurrent use
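
To check such figures on your own hardware, a small timing helper is enough. The sketch below is generic: `time_per_call` is a hypothetical name, and it times any callable, so it runs here against a stand-in function rather than a real detector.

```python
import time


def time_per_call(func, inputs):
    """Return the mean wall-clock seconds per call of func(*args) over inputs."""
    start = time.perf_counter()
    for args in inputs:
        func(*args)
    elapsed = time.perf_counter() - start
    return elapsed / len(inputs)


def fake_detect(timestamp, value):
    # Stand-in for a detector's detect method; replace with detector.detect.
    return value > 3.0


points = [(i, float(i % 5)) for i in range(10_000)]
print(f"{time_per_call(fake_detect, points) * 1e6:.2f} microseconds per point")
```

Passing `detector.detect` together with a list of `(timestamp, value)` tuples measures the per-point detection cost, including the overhead of the Python loop.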

## Error Handling

The bindings raise `ValueError` when training or detection fails:

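```python
from valdo import Detector

# Placeholder data; replace with your own (timestamp, value) pairs
training_data = [(i, 1.0 + 0.1 * (i % 5)) for i in range(1000)]

try:
    detector = Detector(window_size=10)
    detector.train(training_data)
    status = detector.detect(timestamp=1000, value=1.0)
except ValueError as e:
    print(f"Error: {e}")
```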

## Contributing

Contributions are welcome! Please see the main repository for contribution guidelines.

## License

This project is licensed under the MIT License - see the LICENSE file for details.



            
