<img src="Anomaly.png" width="100">
# flex-anomalies
flex-anomalies is a Python library dedicated to anomaly detection in machine learning. It offers a wide range of algorithms and techniques, including models based on distance, density, trees, and neural networks such as convolutional and recurrent architectures. The library also provides aggregators, anomaly score processing techniques, and pre-processing techniques for data.
Anomaly detection examines data to find deviations from expected behaviour, with the goal of cleaning datasets and flagging anomalies for further analysis.
### Details
Anomaly detection with <a href=https://github.com/FLEXible-FL/FLEXible/tree/main>FLEXible</a> federated learning: this repository contains implementations of anomaly detection algorithms built on the FLEXible federated learning library. <a href=https://github.com/FLEXible-FL/FLEXible/tree/main>FLEXible</a> is a Python library for running federated learning efficiently and at scale.
The implemented algorithms come from a study of state-of-the-art research on federated learning for network intrusion detection.
This repository also includes:
- An organized folder structure that makes it easy to navigate and understand the project.
- Explanatory notebooks showing practical examples and detailed explanations for the use of the library.
#### Folder structure
- **flexanomalies/pool**: Contains the aggregators and primitives for each model, following the FLEXible structure.
- **flexanomalies/utils**: Contains the source code of the anomaly detection algorithms, anomaly score processing techniques, evaluation metrics, a function to federate a centralized dataset using FLEXible, and data loading.
- **flexanomalies/datasets**: Contains pre-processing techniques for data.
- **notebooks**: Contains explanatory notebooks showing how to use the anomaly detection algorithms on data.
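
To make the ideas of "federating a centralized dataset" and "aggregators" concrete, here is a minimal NumPy sketch. It does not use the FLEXible or flexanomalies API: the function names `federate` and `weighted_average` are hypothetical and chosen only for illustration. The sketch shuffles a dataset, splits it into client shards, and combines per-client parameters with a sample-weighted (FedAvg-style) average.

```python
import numpy as np


def federate(X, n_clients, seed=0):
    """Shuffle the rows of X and split them into n_clients roughly equal shards.

    Hypothetical helper for illustration, not the library's own function.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    return [X[idx] for idx in np.array_split(indices, n_clients)]


def weighted_average(params, sizes):
    """FedAvg-style aggregation: average client parameters weighted by sample count."""
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    return np.tensordot(weights, np.stack(params), axes=1)


# Toy centralized dataset: 10 samples, 2 features.
X = np.arange(20, dtype=float).reshape(10, 2)
shards = federate(X, n_clients=3)

# Each client computes a local "parameter" (here simply its feature means).
local_means = [shard.mean(axis=0) for shard in shards]

# The server aggregates the local parameters, weighting by shard size.
global_mean = weighted_average(local_means, [len(s) for s in shards])
print(global_mean)
```

Weighting by shard size means the aggregated mean matches the mean of the full dataset even when shards differ in size. Real model aggregation applies the same idea to weight tensors instead of feature means.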
#### Explanatory Notebooks
- **AnomalyDetection_Autoencoder_FLEX.ipynb**: A step-by-step example of using the AutoEncoder model for anomaly detection with federated learning on static data.
- **AnomalyDetection_AutoEncoder_FLEX_ts.ipynb**: A step-by-step example of using the AutoEncoder model for anomaly detection with federated learning on time series. Covers the sliding-window structure, data federation, federated training, and model evaluation at the server and client level.
- **AnomalyDetection_PCA_FLEX.ipynb**: Demonstrates PCA_Anomaly for anomaly detection with federated learning on a static dataset.
- **AnomalyDetection_Cluster_FLEX.ipynb**: A step-by-step example of using the ClusterAnomaly model for anomaly detection with federated learning on static data, including evaluation on test sets.
- **AnomalyDetection_IsolationForest_FLEX.ipynb**: An example of using the IsolationForest model with federated learning on a sample static dataset, from data federation and training to model evaluation on a test set.
- **AnomalyDetection_CNNN_LSTM_FLEX_ts.ipynb**: Shows the DeepCNN_LSTM model with federated learning for anomaly detection in time series. Covers the sliding-window structure, data federation, federated training, and model evaluation at the server and client level.
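
The time series notebooks rely on a sliding-window structure. As a rough sketch of that idea, the function below cuts a series into overlapping windows with NumPy. The name `sliding_windows` and its parameters `window_size` and `step` are assumptions made for this example and may not match what the notebooks or the library use.

```python
import numpy as np


def sliding_windows(series, window_size, step=1):
    """Cut a series into windows of shape (n_windows, window_size, n_features).

    Hypothetical helper for illustration, not the library's own function.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        # Treat a 1-D series as a single feature.
        series = series[:, None]
    if window_size > len(series):
        raise ValueError("window_size must not exceed the series length")
    starts = range(0, len(series) - window_size + 1, step)
    return np.stack([series[s:s + window_size] for s in starts])


# A univariate series of 10 points, windows of length 4 moved 2 steps at a time.
windows = sliding_windows(np.arange(10), window_size=4, step=2)
print(windows.shape)  # (4, 4, 1)
```

Each window can then be fed to a model such as an autoencoder or a CNN-LSTM network, and a per-window anomaly score mapped back to the time steps it covers.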
## Features
The following table describes the implemented algorithms and their references:
<table>
<thead>
<tr>
<th>Models</th>
<th>Description</th>
<th>Citation</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan= 1>IsolationForest</td>
<td rowspan=1 align="center">
Tree-based anomaly detection algorithm that isolates anomalies using an ensemble of binary trees.
</td>
<td>
<a href=https://ieeexplore.ieee.org/document/4781136>
Liu, F.T., Ting, K.M. and Zhou, Z.H., 2008, December. Isolation forest. In <i>International Conference on Data Mining</i>, pp. 413-422. IEEE.</a>
</td>
</tr>
<tr>
<td rowspan= 1>PCA_Anomaly</td>
<td rowspan=1 align="center">
Principal component analysis (PCA) for outlier detection. Outlier scores are obtained as the sum of weighted Euclidean distances between each sample and the hyperplane constructed from the selected eigenvectors.
</td>
<td>
<a href=https://www.researchgate.net/publication/228709094_A_Novel_Anomaly_Detection_Scheme_Based_on_Principal_Component_Classifier>
Shyu, M.L., Chen, S.C., Sarinnapakorn, K. and Chang, L., 2003. A novel anomaly detection scheme based on principal component classifier. <i>MIAMI UNIV CORAL GABLES FL DEPT OF ELECTRICAL AND COMPUTER ENGINEERING</i>.</a>
</td>
</tr>
<tr>
<td rowspan= 1>ClusterAnomaly</td>
<td rowspan=1 align="center">
Clustering-based model. Outlier scores are computed solely from each sample's distance to the closest large cluster center; k-means is used as the clustering algorithm.
</td>
<td>
<a href=https://epubs.siam.org/doi/10.1137/1.9781611972832.21>
Chawla, S., & Gionis, A. (2013, May). k-means--: A unified approach to clustering and outlier detection. In Proceedings of the 2013 SIAM international conference on data mining (pp. 189-197).</a>
</td>
</tr>
<tr>
<td rowspan= 1>DeepCNN_LSTM</td>
<td rowspan=1 align="center">
Neural network model for time series and static data that combines convolutional and recurrent layers.
</td>
<td>
<a href=https://arxiv.org/abs/2206.03179>
Aguilera-Martos, I., García-Vico, Á. M., Luengo, J., Damas, S., Melero, F. J., Valle-Alonso, J. J., & Herrera, F. (2022). TSFEDL: A Python Library for Time Series Spatio-Temporal Feature Extraction and Prediction using Deep Learning (with Appendices on Detailed Network Architectures and Experimental Cases of Study). arXiv preprint arXiv:2206.03179.</a>
</td>
</tr>
<tr>
<td rowspan= 1>AutoEncoder</td>
<td rowspan=1 align="center">
Fully connected AutoEncoder for time series and static data. The network learns useful data representations in an unsupervised way and detects anomalies in the data through the reconstruction error.
</td>
<td>
<a href=https://link.springer.com/chapter/10.1007/978-3-319-14142-8_8>
Aggarwal, C.C., 2015. Outlier analysis. In Data mining (pp. 237-263), Ch.3. Springer, Cham.</a>
</td>
</tr>
</tbody>
</table>
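
The PCA_Anomaly row above describes scores built from eigenvector projections. The sketch below shows one common formulation in plain NumPy, in the spirit of the principal component classifier of Shyu et al.: each sample is projected onto the selected eigenvectors, and the squared projections are weighted by the inverse eigenvalues. This is an illustrative assumption, not necessarily the exact computation performed by `PCA_Anomaly`, and the function name `pca_outlier_scores` is hypothetical.

```python
import numpy as np


def pca_outlier_scores(X_train, X_test, n_components):
    """Score test samples by eigenvalue-weighted squared projections.

    Hypothetical helper for illustration, not the library's own function.
    """
    # Standardize with training statistics; guard against constant features.
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0
    Z_train = (X_train - mean) / std

    # Eigen-decomposition of the covariance matrix of the standardized data.
    cov = np.cov(Z_train, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Keep the n_components directions with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Project test samples and weight each squared projection by 1 / eigenvalue.
    projections = ((X_test - mean) / std) @ eigvecs
    return np.sum(projections ** 2 / np.maximum(eigvals, 1e-12), axis=1)


rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))
# One sample near the center of the data and one far away from it.
X_test = np.array([[0.0, 0.0, 0.0], [8.0, -8.0, 8.0]])
scores = pca_outlier_scores(X_train, X_test, n_components=3)
print(scores)
```

The sample far from the training distribution receives the larger score. Thresholding such scores, for example at a chosen percentile of the training scores, turns them into anomaly labels.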
## Installation
FLEX-Anomalies is available on PyPI and can be installed with:
```
pip install flexanomalies
```
Install the necessary dependencies:
```
pip install -r requirements.txt
```
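
After installing, a quick check that the package is importable can save debugging time later. The snippet below is a small sketch using only the standard library. It assumes the import name is `flexanomalies`, as suggested by the folder structure; the helper `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


# Assumed import name, taken from the flexanomalies/ folder in the repository.
print("flexanomalies installed:", is_installed("flexanomalies"))
```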
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Citation
If you use this repository in your research work, please cite the Flexible paper: