<p align="center">
<img height="115px" src="https://raw.githubusercontent.com/IFCA-Advanced-Computing/frouros/main/images/logo.png" alt="logo">
</p>
---
<p align="center">
<!-- CI -->
<a href="https://github.com/IFCA-Advanced-Computing/frouros/actions/workflows/ci.yml">
<img src="https://github.com/IFCA-Advanced-Computing/frouros/actions/workflows/ci.yml/badge.svg?style=flat-square" alt="ci"/>
</a>
<!-- Code coverage -->
<a href="https://codecov.io/gh/IFCA-Advanced-Computing/frouros">
<img src="https://codecov.io/gh/IFCA-Advanced-Computing/frouros/graph/badge.svg?token=DLKQSWYTYM" alt="coverage"/>
</a>
<!-- Documentation -->
<a href="https://frouros.readthedocs.io/">
<img src="https://readthedocs.org/projects/frouros/badge/?version=latest" alt="documentation"/>
</a>
<!-- Downloads -->
<a href="https://pepy.tech/project/frouros">
<img src="https://static.pepy.tech/badge/frouros" alt="downloads"/>
</a>
<!-- Platform -->
<a href="https://github.com/IFCA-Advanced-Computing/frouros">
<img src="https://img.shields.io/badge/platform-Linux%20%7C%20macOS%20%7C%20Windows-blue.svg" alt="platform"/>
</a>
<!-- PyPI -->
<a href="https://pypi.org/project/frouros">
<img src="https://img.shields.io/pypi/v/frouros.svg?label=release&color=blue" alt="pypi">
</a>
<!-- Python -->
<a href="https://pypi.org/project/frouros">
<img src="https://img.shields.io/pypi/pyversions/frouros" alt="python">
</a>
<!-- License -->
<a href="https://opensource.org/licenses/BSD-3-Clause">
<img src="https://img.shields.io/badge/license-BSD%203--Clause-blue.svg" alt="bsd_3_license">
</a>
<!-- Journal -->
<a href="https://doi.org/10.1016/j.softx.2024.101733">
<img src="https://img.shields.io/badge/SoftwareX-10.1016%2Fj.softx.2024.101733-blue.svg" alt="SoftwareX">
</a>
</p>
Frouros is a Python library for drift detection in machine learning systems that provides a combination of classical and more recent algorithms for both concept and data drift detection.
<p align="center">
<i>
"Everything changes and nothing stands still"
</i>
</p>
<p align="center">
<i>
"You could not step twice into the same river"
</i>
</p>
<div align="center" style="width: 70%;">
<p align="right">
<i>
Heraclitus of Ephesus (535-475 BCE)
</i>
</p>
</div>
----
## ⚡️ Quickstart
### 🔄 Concept drift
As a quick example, we can take the breast cancer dataset, induce concept drift in it, and show how a concept drift detector such as DDM (Drift Detection Method) responds. The example also shows how concept drift degrades performance in terms of accuracy.
```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from frouros.detectors.concept_drift import DDM, DDMConfig
from frouros.metrics import PrequentialError
np.random.seed(seed=31)
# Load breast cancer dataset
X, y = load_breast_cancer(return_X_y=True)
# Split train (70%) and test (30%)
(
    X_train,
    X_test,
    y_train,
    y_test,
) = train_test_split(X, y, train_size=0.7, random_state=31)
# Define and fit model
pipeline = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("model", LogisticRegression()),
    ]
)
pipeline.fit(X=X_train, y=y_train)
# Detector configuration and instantiation
config = DDMConfig(
    warning_level=2.0,
    drift_level=3.0,
    min_num_instances=25,  # minimum number of instances before checking for concept drift
)
detector = DDM(config=config)
# Metric to compute accuracy
metric = PrequentialError(alpha=1.0) # alpha=1.0 is equivalent to normal accuracy


def stream_test(X_test, y_test, metric, detector):
    """Simulate a data stream over X_test, using y_test as the true labels."""
    drift_flag = False
    for i, (x_i, y_i) in enumerate(zip(X_test, y_test)):
        # Predict one sample at a time, as it would arrive in a stream
        y_pred = pipeline.predict(x_i.reshape(1, -1))
        error = 1 - (y_pred.item() == y_i.item())
        metric_error = metric(error_value=error)
        _ = detector.update(value=error)
        status = detector.status
        # Report only the first time drift is flagged
        if status["drift"] and not drift_flag:
            drift_flag = True
            print(f"Concept drift detected at step {i}. Accuracy: {1 - metric_error:.4f}")
    if not drift_flag:
        print("No concept drift detected")
    print(f"Final accuracy: {1 - metric_error:.4f}\n")


# Simulate data stream (assuming test label available after each prediction)
# No concept drift is expected to occur
stream_test(
    X_test=X_test,
    y_test=y_test,
    metric=metric,
    detector=detector,
)
# >> No concept drift detected
# >> Final accuracy: 0.9766
# IMPORTANT: Induce/simulate concept drift in the last part (20%)
# of y_test by modifying some labels (50% approx). Therefore, changing P(y|X)
drift_size = int(y_test.shape[0] * 0.2)
y_test_drift = y_test[-drift_size:]
modify_idx = np.random.rand(*y_test_drift.shape) <= 0.5
y_test_drift[modify_idx] = (y_test_drift[modify_idx] + 1) % len(np.unique(y_test))
y_test[-drift_size:] = y_test_drift
# Reset detector and metric
detector.reset()
metric.reset()
# Simulate data stream (assuming test label available after each prediction)
# Concept drift is expected to occur because of the label modification
stream_test(
    X_test=X_test,
    y_test=y_test,
    metric=metric,
    detector=detector,
)
# >> Concept drift detected at step 142. Accuracy: 0.9510
# >> Final accuracy: 0.8480
```
More concept drift examples can be found [here](https://frouros.readthedocs.io/en/latest/examples/concept_drift.html).
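
To make the roles of `warning_level`, `drift_level` and `min_num_instances` more concrete, the following is a minimal, dependency-free sketch of the DDM decision rule as described by Gama et al. (2004). It is not Frouros's implementation: the class `SimpleDDM`, the helper `first_drift_index` and the synthetic error streams are hypothetical names and data introduced only for illustration.

```python
import math


class SimpleDDM:
    """Simplified DDM over a stream of 0/1 errors (1 = misclassified).

    Illustrative sketch only; not the Frouros implementation.
    """

    def __init__(self, warning_level=2.0, drift_level=3.0, min_num_instances=25):
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.min_num_instances = min_num_instances
        self.reset()

    def reset(self):
        self.n = 0                # number of samples seen
        self.errors = 0           # number of misclassified samples
        self.p_min = math.inf     # error rate at the best point seen so far
        self.s_min = math.inf     # standard deviation at that point
        self.warning = False
        self.drift = False

    def update(self, error):
        self.n += 1
        self.errors += error
        p = self.errors / self.n               # running error rate
        s = math.sqrt(p * (1 - p) / self.n)    # its binomial standard deviation
        if self.n < self.min_num_instances:
            return                             # too few samples to decide
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s      # remember the best point so far
        self.warning = p + s > self.p_min + self.warning_level * self.s_min
        self.drift = p + s > self.p_min + self.drift_level * self.s_min


def first_drift_index(errors):
    """Return the index of the first sample at which drift is flagged, or None."""
    detector = SimpleDDM()
    for i, error in enumerate(errors):
        detector.update(error)
        if detector.drift:
            return i
    return None


# Synthetic streams (assumed data): 10% error rate, then 50% error rate
stable = [1 if i % 10 == 0 else 0 for i in range(200)]
drifting = [1 if i % 2 == 0 else 0 for i in range(100)]

print(first_drift_index(stable))             # stable stream only
print(first_drift_index(stable + drifting))  # error rate jumps at index 200
```

The detector stays silent while the error rate hovers around its best observed level and flags drift once the rate plus its standard deviation exceeds that level by `drift_level` standard deviations.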
### 📊 Data drift
As a quick example, we can take the iris dataset, induce data drift in one of its features, and show the use of a data drift detector such as the Kolmogorov-Smirnov test.
```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from frouros.detectors.data_drift import KSTest
np.random.seed(seed=31)
# Load iris dataset
X, y = load_iris(return_X_y=True)
# Split train (70%) and test (30%)
(
    X_train,
    X_test,
    y_train,
    y_test,
) = train_test_split(X, y, train_size=0.7, random_state=31)
# Set the feature index to which detector is applied
feature_idx = 0
# IMPORTANT: Induce/simulate data drift in the selected feature of X_test by
# applying some Gaussian noise. Therefore, changing P(X)
X_test[:, feature_idx] += np.random.normal(
    loc=0.0,
    scale=3.0,
    size=X_test.shape[0],
)
# Define and fit model
model = DecisionTreeClassifier(random_state=31)
model.fit(X=X_train, y=y_train)
# Set significance level for hypothesis testing
alpha = 0.001
# Define and fit detector
detector = KSTest()
_ = detector.fit(X=X_train[:, feature_idx])
# Apply detector to the selected feature of X_test
result, _ = detector.compare(X=X_test[:, feature_idx])
# Check if drift is taking place
if result.p_value <= alpha:
    print(f"Data drift detected at feature {feature_idx}")
else:
    print(f"No data drift detected at feature {feature_idx}")
# >> Data drift detected at feature 0
# Therefore, we can reject H0 (both samples come from the same distribution).
```
More data drift examples can be found [here](https://frouros.readthedocs.io/en/latest/examples/data_drift.html).
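
For intuition about what the p-value in the example above measures, here is a minimal, dependency-free sketch of the two-sample Kolmogorov-Smirnov test. It is not Frouros's implementation: the function `ks_two_sample` and the sample data are hypothetical, and the p-value uses the asymptotic Kolmogorov distribution, which is only an approximation for small samples.

```python
import math
from bisect import bisect_right


def ks_two_sample(reference, test):
    """Return (statistic, approximate p-value) for the two-sample KS test."""
    ref = sorted(reference)
    tst = sorted(test)
    n, m = len(ref), len(tst)
    # Statistic D: largest gap between the two empirical CDFs,
    # evaluated at every observed value
    d = max(
        abs(bisect_right(ref, v) / n - bisect_right(tst, v) / m)
        for v in ref + tst
    )
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series below converges slowly here, and the survival
        # function is indistinguishable from 1 in this range
        return d, 1.0
    # Asymptotic Kolmogorov survival function (alternating series)
    p = 2 * sum(
        (-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(max(p, 0.0), 1.0)


# Assumed data: a reference sample, an interleaved sample drawn from the
# same range, and a copy of the reference shifted by half of its range
reference = list(range(100))
same = [i + 0.5 for i in range(100)]
shifted = [i + 50 for i in range(100)]

alpha = 0.001
for name, sample in (("same", same), ("shifted", shifted)):
    statistic, p_value = ks_two_sample(reference, sample)
    verdict = "drift" if p_value <= alpha else "no drift"
    print(f"{name}: D={statistic:.2f}, {verdict}")
```

A small p-value means that a gap of this size between the two empirical distributions would be unlikely if both samples came from the same distribution, which is why the example rejects H0 when `p_value <= alpha`.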
## 🛠 Installation
Frouros can be installed via pip:
```bash
pip install frouros
```
## 🕵🏻♂️️ Drift detection methods
The currently implemented detectors are listed in the following table.
<table style="width: 100%; text-align: center; border-collapse: collapse; border: 1px solid grey;">
<thead>
<tr>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Drift detector</th>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Type</th>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Family</th>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Univariate (U) / Multivariate (M)</th>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Numerical (N) / Categorical (C)</th>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Method</th>
<th style="text-align: center; border: 1px solid grey; padding: 4px;">Reference</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="13" style="text-align: center; border: 1px solid grey; padding: 8px;">Concept drift</td>
<td rowspan="13" style="text-align: center; border: 1px solid grey; padding: 8px;">Streaming</td>
<td rowspan="4" style="text-align: center; border: 1px solid grey; padding: 8px;">Change detection</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">BOCD</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.48550/arXiv.0710.3742">Adams and MacKay (2007)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">CUSUM</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/2333009">Page (1954)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Geometric moving average</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/1266443">Roberts (1959)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Page Hinkley</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/2333009">Page (1954)</a></td>
</tr>
<tr>
<td rowspan="6" style="text-align: center; border: 1px solid grey; padding: 8px;">Statistical process control</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">DDM</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1007/978-3-540-28645-5_29">Gama et al. (2004)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">ECDD-WT</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1016/j.patrec.2011.08.019">Ross et al. (2012)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">EDDM</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://www.researchgate.net/publication/245999704_Early_Drift_Detection_Method">Baena-García et al. (2006)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">HDDM-A</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1109/TKDE.2014.2345382">Frias-Blanco et al. (2014)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">HDDM-W</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1109/TKDE.2014.2345382">Frias-Blanco et al. (2014)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">RDDM</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1016/j.eswa.2017.08.023">Barros et al. (2017)</a></td>
</tr>
<tr>
<td rowspan="3" style="text-align: center; border: 1px solid grey; padding: 8px;">Window based</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">ADWIN</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1137/1.9781611972771.42">Bifet and Gavaldà (2007)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">KSWIN</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1016/j.neucom.2019.11.111">Raab et al. (2020)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">STEPD</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1007/978-3-540-75488-6_27">Nishida and Yamauchi (2007)</a></td>
</tr>
<tr>
<td rowspan="19" style="text-align: center; border: 1px solid grey; padding: 8px;">Data drift</td>
<td rowspan="17" style="text-align: center; border: 1px solid grey; padding: 8px;">Batch</td>
<td rowspan="9" style="text-align: center; border: 1px solid grey; padding: 8px;">Distance based</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Bhattacharyya distance</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://www.jstor.org/stable/25047882">Bhattacharyya (1946)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Earth Mover's distance</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1023/A:1026543900054">Rubner et al. (2000)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Energy distance</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1016/j.jspi.2013.03.018">Székely et al. (2013)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Hellinger distance</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1515/CRLL.1909.136.210">Hellinger (1909)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Histogram intersection normalized complement</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1007/BF00130487">Swain and Ballard (1991)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Jensen-Shannon distance</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1109/18.61115">Lin (1991)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Kullback-Leibler divergence</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1214/aoms/1177729694">Kullback and Leibler (1951)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">M</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Maximum Mean Discrepancy</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://dl.acm.org/doi/10.5555/2188385.2188410">Gretton et al. (2012)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Population Stability Index</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1057/jors.2008.144">Wu and Olson (2010)</a></td>
</tr>
<tr>
<td rowspan="8" style="text-align: center; border: 1px solid grey; padding: 8px;">Statistical test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Anderson-Darling test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/2288805">Scholz and Stephens (1987)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Baumgartner-Weiss-Schindler test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/2533862">Baumgartner et al. (1998)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">C</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Chi-square test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1080/14786440009463897">Pearson (1900)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Cramér-von Mises test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1080/03461238.1928.10416862">Cramér (1928)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Kolmogorov-Smirnov test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/2280095">Massey Jr (1951)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Kuiper's test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1016/S1385-7258(60)50006-0">Kuiper (1960)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Mann-Whitney U test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1214/aoms/1177730491">Mann and Whitney (1947)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Welch's t-test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.2307/2332510">Welch (1947)</a></td>
</tr>
<tr>
<td rowspan="2" style="text-align: center; border: 1px solid grey; padding: 8px;">Streaming</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Distance based</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">M</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Maximum Mean Discrepancy</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://dl.acm.org/doi/10.5555/2188385.2188410">Gretton et al. (2012)</a></td>
</tr>
<tr>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Statistical test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">U</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">N</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;">Incremental Kolmogorov-Smirnov test</td>
<td style="text-align: center; border: 1px solid grey; padding: 8px;"><a href="https://doi.org/10.1145/2939672.2939836">dos Reis et al. (2016)</a></td>
</tr>
</tbody>
</table>
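
As an illustration of the "Distance based" family in the table above, the sketch below computes the Population Stability Index between a reference sample and a new sample. It is not Frouros's implementation: the function `psi`, the equal-width binning over the reference range, the `eps` floor for empty bins and the sample data are all assumptions made for this example.

```python
import math


def psi(reference, test, num_bins=10, eps=1e-6):
    """Population Stability Index between two univariate numerical samples.

    Assumes the reference sample contains at least two distinct values.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / num_bins

    def proportions(values):
        counts = [0] * num_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the first/last bin
            counts[min(max(idx, 0), num_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm below stays finite
        return [max(c / len(values), eps) for c in counts]

    p = proportions(reference)
    q = proportions(test)
    # PSI = sum over bins of (q_i - p_i) * ln(q_i / p_i)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Assumed data: an evenly spread reference and a copy shifted by half its range
reference = [i / 10 for i in range(100)]
shifted = [x + 5 for x in reference]

print(f"PSI(reference, reference) = {psi(reference, reference):.4f}")
print(f"PSI(reference, shifted)   = {psi(reference, shifted):.4f}")
```

Unlike the statistical tests in the table, a distance-based score has no p-value attached, so a threshold has to be chosen by the user; a commonly quoted rule of thumb for PSI treats values above 0.25 as a substantial shift.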
## ❗ What is and what is not Frouros?
Unlike other libraries that, in addition to providing drift detection algorithms, include functionality such as anomaly/outlier detection, adversarial detection, and imbalanced learning, Frouros has and will **ONLY** have one purpose: **drift detection**.
We firmly believe that machine learning libraries and frameworks should not follow the [Jack of all trades, master of none](https://en.wikipedia.org/wiki/Jack_of_all_trades,_master_of_none) principle. Instead, they should focus on a single task and do it well.
## ✅ Who is using Frouros?
Frouros is actively being used by the following projects to implement drift
detection in machine learning pipelines:
* [AI4EOSC](https://ai4eosc.eu).
* [iMagine](https://imagine-ai.eu).
If you want your project listed here, do not hesitate to send us a pull request.
## 👍 Contributing
Check out the [contribution](https://github.com/IFCA/frouros/blob/main/CONTRIBUTING.md) section.
## 💬 Citation
If you want to cite Frouros, you can use the [SoftwareX publication](https://doi.org/10.1016/j.softx.2024.101733).
```bibtex
@article{CESPEDESSISNIEGA2024101733,
title = {Frouros: An open-source Python library for drift detection in machine learning systems},
journal = {SoftwareX},
volume = {26},
pages = {101733},
year = {2024},
issn = {2352-7110},
doi = {https://doi.org/10.1016/j.softx.2024.101733},
url = {https://www.sciencedirect.com/science/article/pii/S2352711024001043},
author = {Jaime {Céspedes Sisniega} and Álvaro {López García}},
keywords = {Machine learning, Drift detection, Concept drift, Data drift, Python},
abstract = {Frouros is an open-source Python library capable of detecting drift in machine learning systems. It provides a combination of classical and more recent algorithms for drift detection, covering both concept and data drift. We have designed it to be compatible with any machine learning framework and easily adaptable to real-world use cases. The library is developed following best development and continuous integration practices to ensure ease of maintenance and extensibility.}
}
```
## 📝 License
Frouros is open-source software licensed under the [BSD-3-Clause license](https://github.com/IFCA/frouros/blob/main/LICENSE).
## 🙏 Acknowledgements
Frouros has received funding from the Agencia Estatal de Investigación, Unidad de Excelencia María de Maeztu, ref. MDM-2017-0765.
(2012)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">EDDM</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://www.researchgate.net/publication/245999704_Early_Drift_Detection_Method\">Baena-Garc\u00eda et al. (2006)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">HDDM-A</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1109/TKDE.2014.2345382\">Frias-Blanco et al. (2014)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">HDDM-W</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1109/TKDE.2014.2345382\">Frias-Blanco et al. (2014)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">RDDM</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1016/j.eswa.2017.08.023\">Barros et al. 
(2017)</a></td>\n </tr>\n <tr>\n <td rowspan=\"3\" style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Window based</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">ADWIN</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1137/1.9781611972771.42\">Bifet and Gavalda (2007)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">KSWIN</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1016/j.neucom.2019.11.111\">Raab et al. (2020)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">STEPD</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1007/978-3-540-75488-6_27\">Nishida and Yamauchi (2007)</a></td>\n </tr>\n <tr>\n <td rowspan=\"19\" style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Data drift</td>\n <td rowspan=\"17\" style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Batch</td>\n <td rowspan=\"9\" style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Distance based</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Bhattacharyya distance</td>\n <td 
style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://www.jstor.org/stable/25047882\">Bhattacharyya (1946)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Earth Mover's distance</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1023/A:1026543900054\">Rubner et al. (2000)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Energy distance</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1016/j.jspi.2013.03.018\">Sz\u00e9kely et al. 
(2013)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Hellinger distance</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1515/CRLL.1909.136.210\">Hellinger (1909)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Histogram intersection normalized complement</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1007/BF00130487\">Swain and Ballard (1991)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Jensen-Shannon distance</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1109/18.61115\">Lin (1991)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Kullback-Leibler divergence</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1214/aoms/1177729694\">Kullback and Leibler (1951)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">M</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid 
grey; padding: 8px;\">Maximum Mean Discrepancy</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://dl.acm.org/doi/10.5555/2188385.2188410\">Gretton et al. (2012)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Population Stability Index</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1057/jors.2008.144\">Wu and Olson (2010)</a></td>\n </tr>\n <tr>\n <td rowspan=\"8\" style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Statistical test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Anderson-Darling test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.2307/2288805\">Scholz and Stephens (1987)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Baumgartner-Weiss-Schindler test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.2307/2533862\">Baumgartner et al. 
(1998)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">C</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Chi-square test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1080/14786440009463897\">Pearson (1900)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Cram\u00e9r-von Mises test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1080/03461238.1928.10416862\">Cram\u00e9r (1928)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Kolmogorov-Smirnov test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.2307/2280095\">Massey Jr (1951)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Kuiper's test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1016/S1385-7258(60)50006-0\">Kuiper (1960)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Mann-Whitney U 
test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1214/aoms/1177730491\">Mann and Whitney (1947)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Welch's t-test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.2307/2332510\">Welch (1947)</a></td>\n </tr>\n <tr>\n <td rowspan=\"2\" style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Streaming</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Distance based</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">M</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Maximum Mean Discrepancy</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://dl.acm.org/doi/10.5555/2188385.2188410\">Gretton et al. (2012)</a></td>\n </tr>\n <tr>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Statistical test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">U</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">N</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\">Incremental Kolmogorov-Smirnov test</td>\n <td style=\"text-align: center; border: 1px solid grey; padding: 8px;\"><a href=\"https://doi.org/10.1145/2939672.2939836\">dos Reis et al. 
(2016)</a></td>\n </tr>\n</tbody>\n</table>\n\n## \u2757 What is and what is not Frouros?\n\nUnlike other libraries that, in addition to providing drift detection algorithms, include other functionality such as anomaly/outlier detection, adversarial detection, and imbalanced learning, Frouros has and will **ONLY** have one purpose: **drift detection**.\n\nWe firmly believe that machine learning libraries and frameworks should not follow the [Jack of all trades, master of none](https://en.wikipedia.org/wiki/Jack_of_all_trades,_master_of_none) principle. Instead, they should focus on a single task and do it well.\n\n## \u2705 Who is using Frouros?\n\nFrouros is actively being used by the following projects to implement drift\ndetection in machine learning pipelines:\n\n * [AI4EOSC](https://ai4eosc.eu).\n * [iMagine](https://imagine-ai.eu).\n\nIf you want your project listed here, do not hesitate to send us a pull request.\n\n## \ud83d\udc4d Contributing\n\nCheck out the [contribution](https://github.com/IFCA/frouros/blob/main/CONTRIBUTING.md) section.\n\n## \ud83d\udcac Citation\n\nIf you want to cite Frouros, you can use the [SoftwareX publication](https://doi.org/10.1016/j.softx.2024.101733).\n\n```bibtex\n@article{CESPEDESSISNIEGA2024101733,\ntitle = {Frouros: An open-source Python library for drift detection in machine learning systems},\njournal = {SoftwareX},\nvolume = {26},\npages = {101733},\nyear = {2024},\nissn = {2352-7110},\ndoi = {https://doi.org/10.1016/j.softx.2024.101733},\nurl = {https://www.sciencedirect.com/science/article/pii/S2352711024001043},\nauthor = {Jaime {C\u00e9spedes Sisniega} and \u00c1lvaro {L\u00f3pez Garc\u00eda}},\nkeywords = {Machine learning, Drift detection, Concept drift, Data drift, Python},\nabstract = {Frouros is an open-source Python library capable of detecting drift in machine learning systems. 
It provides a combination of classical and more recent algorithms for drift detection, covering both concept and data drift. We have designed it to be compatible with any machine learning framework and easily adaptable to real-world use cases. The library is developed following best development and continuous integration practices to ensure ease of maintenance and extensibility.}\n}\n```\n\n## \ud83d\udcdd License\n\nFrouros is open-source software licensed under the [BSD-3-Clause license](https://github.com/IFCA/frouros/blob/main/LICENSE).\n\n## \ud83d\ude4f Acknowledgements\n\nFrouros has received funding from the Agencia Estatal de Investigaci\u00f3n, Unidad de Excelencia Mar\u00eda de Maeztu, ref. MDM-2017-0765.\n",
"bugtrack_url": null,
"license": "BSD-3-Clause",
"summary": "An open-source Python library for drift detection in machine learning systems",
"version": "0.9.0",
"project_urls": {
"Homepage": "https://github.com/IFCA-Advanced-Computing/frouros",
"documentation": "https://frouros.readthedocs.io",
"download": "https://pypi.org/project/frouros/",
"homepage": "https://frouros.readthedocs.io",
"repository": "https://github.com/IFCA-Advanced-Computing/frouros"
},
"split_keywords": [
"drift-detection",
" concept-drift",
" data-drift",
" machine-learning",
" data-science",
" machine-learning-operations",
" machine-learning-systems"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c077022a06fd20e7595a2fd9de4931e7b86fd974f69558b29ed5f3ccca2b8457",
"md5": "3c30d44b6e79e370492ef2bd0e705ef5",
"sha256": "0c88ddeccfe2ac1f105b44efcfc65ba5b879cd8a07e492139c4316677b708980"
},
"downloads": -1,
"filename": "frouros-0.9.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3c30d44b6e79e370492ef2bd0e705ef5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.9",
"size": 126179,
"upload_time": "2024-10-05T12:06:35",
"upload_time_iso_8601": "2024-10-05T12:06:35.221660Z",
"url": "https://files.pythonhosted.org/packages/c0/77/022a06fd20e7595a2fd9de4931e7b86fd974f69558b29ed5f3ccca2b8457/frouros-0.9.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bcc6550d5f85fe3c7cd3b23d328097d2279be5ae09887016c91383492d415b46",
"md5": "41b25d8285d7227de43a2e28139c9a96",
"sha256": "5e45bc332863e9800b9592ead828dfbaf7b403fa17b9c44a5f96a61aac2842fa"
},
"downloads": -1,
"filename": "frouros-0.9.0.tar.gz",
"has_sig": false,
"md5_digest": "41b25d8285d7227de43a2e28139c9a96",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.9",
"size": 79539,
"upload_time": "2024-10-05T12:06:37",
"upload_time_iso_8601": "2024-10-05T12:06:37.661779Z",
"url": "https://files.pythonhosted.org/packages/bc/c6/550d5f85fe3c7cd3b23d328097d2279be5ae09887016c91383492d415b46/frouros-0.9.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-05 12:06:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "IFCA-Advanced-Computing",
"github_project": "frouros",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"tox": true,
"lcname": "frouros"
}