# autooc

- Name: autooc
- Version: 0.0.17 (PyPI)
- Upload time: 2024-03-24 11:43:38
- Home page: https://github.com/luisferreira97/AutoOC
- Summary: AutoOC: Automated Machine Learning (AutoML) library for One-Class Learning
- Author: Luís Ferreira
- Requires Python: >=3.6
- Keywords: automl, machine learning, one-class learning, one-class classification, autoencoder, isolation forest, one-class svm
- Requirements: keras, matplotlib, mlflow, pandas, pydot, pygmo, numpy, scikit-learn, tensorflow, tqdm

<!-- PROJECT SHIELDS -->
<!--
*** I'm using markdown "reference style" links for readability.
*** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
*** See the bottom of this document for the declaration of the reference variables
*** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use.
*** https://www.markdownguide.org/basic-syntax/#reference-style-links
-->
<!--[![Downloads](https://static.pepy.tech/personalized-badge/autooc?period=total&units=international_system&left_color=black&right_color=orange&left_text=Downloads)](https://pepy.tech/project/autooc)-->
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]
[![LinkedIn][linkedin-shield]][linkedin-url]

<!-- PROJECT LOGO -->
<br />
<p align="center">
  <a href="https://github.com/luisferreira97/AutoOC">
    <img src="https://raw.githubusercontent.com/luisferreira97/AutoOC/main/images/logo.png" alt="Logo" width="100" height="100">
  </a>

  <h3 align="center">AutoOC (in Beta)</h3>

  <p align="center">
    AutoOC: Automated Machine Learning (AutoML) library focused on One-Class Learning algorithms (Deep AutoEncoders, Variational AutoEncoders, Isolation Forest, Local Outlier Factor and One-Class SVM)
    <br />
    <!--<a href="https://github.com/luisferreira97/AutoOC"><strong>Explore the docs »</strong></a>
    <br />
    <br />-->
    <!--<a href="https://github.com/luisferreira97/AutoOC">View Demo</a>
    ·-->
    <a href="https://github.com/luisferreira97/AutoOC/issues">Report Bug</a>
    ·
    <a href="https://github.com/luisferreira97/AutoOC/issues">Request Feature</a>
  </p>
</p>



<!-- TABLE OF CONTENTS 
<details open="open">
  <summary>Table of Contents</summary>
  <ol>
    <li>
      <a href="#about-the-project">About The Project</a>
      <ul>
        <li><a href="#built-with">Built With</a></li>
      </ul>
    </li>
    <li>
      <a href="#getting-started">Getting Started</a>
      <ul>
        <li><a href="#prerequisites">Prerequisites</a></li>
        <li><a href="#installation">Installation</a></li>
      </ul>
    </li>
    <li><a href="#usage">Usage</a></li>
    <li><a href="#roadmap">Roadmap</a></li>
    <li><a href="#contributing">Contributing</a></li>
    <li><a href="#license">License</a></li>
    <li><a href="#contact">Contact</a></li>
    <li><a href="#acknowledgements">Acknowledgements</a></li>
  </ol>
</details>-->




<!-- GETTING STARTED -->
## Getting Started

This section describes where to obtain the package and how to install it.

### Where to get it

The source code is currently hosted on GitHub at: https://github.com/luisferreira97/AutoOC

Binary installers for the latest released version are available at the Python Package Index (PyPI). The PyPI name of the package is `autooc`.

```sh
pip install autooc
```

<!-- USAGE EXAMPLES -->
## Usage

### 1. Import the package
The first step, after installation, is to import the package. The main class, from which all methods are available, is ```AutoOC```.

```python
from autooc.autooc import AutoOC
```

### 2. Instantiate an AutoOC object
The second step is to instantiate the AutoOC class with information about your dataset and context (e.g., the normal and anomaly classes, whether to run single- or multi-objective optimization, the performance metric, and the algorithm).
You can change the ```algorithm``` parameter to select which algorithms are used during the optimization. The options are:
- "autoencoders": Deep AutoEncoders (from TensorFlow)
- "vae": Variational AutoEncoders (from TensorFlow)
- "iforest": Isolation Forest (from Scikit-Learn)
- "lof": Local Outlier Factor (from Scikit-Learn)
- "svm": One-Class SVM (from Scikit-Learn)
- "nas": the optimization is done using AutoEncoders and VAEs
- "all": the optimization is done using all five algorithms

You can use the ```performance_metric``` parameter to select which efficiency metric is minimized during the optimization. The options are:
- "training_time": Minimizes training time
- "predict_time": Minimizes the time it takes to predict one record
- "num_params": Minimizes the number of parameters (```count_params()``` in Keras); only available when ```algorithm``` is ```autoencoders```, ```vae```, or ```nas```.
- "bic": Minimizes the value of the Bayesian Information Criterion

```python
aoc = AutoOC(anomaly_class = 0,
    normal_class = 1,
    multiobjective=True,
    performance_metric="training_time",
    algorithm = "autoencoders"
)
```

### 3. Load dataset
The third step is to load the dataset. Depending on the type of validation, you need *train data* (only 'normal' instances), *validation data* (either (1) only 'normal' instances or (2) both 'normal' and 'anomaly' instances with the respective labels), and *test data* (both types of instances, with labels). You can use the ```load_example_data()``` function to load the popular ECG dataset.


```python
X_train, X_val, X_test, y_test = aoc.load_example_data()
```
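If you want to use your own data instead, the only structural requirement is the one described above: the training data must contain only 'normal' instances. A minimal sketch of such a split, using hypothetical in-memory lists (not part of AutoOC's API):

```python
# Hypothetical records and labels (1 = normal, 0 = anomaly,
# matching the classes used when instantiating AutoOC above).
records = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.15, 0.25]]
labels = [1, 1, 0, 1]

# Training data: 'normal' instances only, no labels needed.
X_train = [x for x, y in zip(records, labels) if y == 1]

# Test data: both classes, with labels kept for evaluation.
X_test, y_test = records, labels
```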

### 4. Train
The fourth step is to train the model. The ```fit()``` function runs the optimization with the given parameters.

```python
run = aoc.fit(
    X=X_train,
    X_val=X_val,
    pop=3,
    gen=3,
    epochs=100,
    mlflow_tracking_uri="../results",
    mlflow_experiment_name="test_experiment",
    mlflow_run_name="test_run",
    results_path="../results"
)
```

### 5. Predict

The fifth step is to predict the labels of the test data, using the ```predict()``` function. You can change the ```mode``` parameter to select which individuals (models) are used to predict:
- "all": uses all individuals (models) from the last generation
- "best": uses the individual from the last generation that achieved the best predictive metric
- "simplest": uses the individual from the last generation that achieved the best efficiency metric
- "pareto": uses the Pareto-optimal individuals from the last generation (multi-objective only). These are the models that simultaneously achieved the best predictive metric and efficiency metric.

Additionally, you can use the ```threshold``` parameter (only used for AutoEncoders) to set the threshold for the prediction. You can use the following values:
- "default": uses a different threshold value for each individual (model). For each model, the threshold is the associated default value (currently this works similarly to the "mean" option).
- "mean": for each model, the threshold is the mean reconstruction error obtained on the validation data plus one standard deviation.
- "percentile": for each model, the threshold is the 95th percentile of the reconstruction error obtained on the validation data (you can also use the ```percentile``` parameter to change the percentile).
- "max": for each model, the threshold is the maximum reconstruction error obtained on the validation data.
- You can also pass an integer or float value; in this case, the same threshold is used for all models.
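To illustrate the "mean" strategy: for one model, the threshold would be derived from its validation reconstruction errors roughly as follows (a sketch with made-up error values, not AutoOC's internal code):

```python
import statistics

# Hypothetical reconstruction errors of one model on the validation data.
val_errors = [0.02, 0.03, 0.025, 0.04, 0.035]

# "mean" strategy: mean reconstruction error plus one standard deviation.
threshold = statistics.mean(val_errors) + statistics.stdev(val_errors)

# A record is flagged as an anomaly if its reconstruction error
# exceeds the threshold.
def is_anomaly(error):
    return error > threshold
```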


```python
predictions = aoc.predict(X_test,
    mode="all",
    threshold="default")
```

### 6. Evaluate

You can use the predictions to manually calculate the model's performance metrics, but the ```evaluate()``` function is a more convenient way to do it. You can also use the ```mode``` parameter (which works as in the ```predict()``` function) and metrics from the ```sklearn.metrics``` package (currently available are "roc_auc", "accuracy", "precision", "recall", and "f1").

```python
score = aoc.evaluate(X_test,
    y_test,
    mode="all",
    metric="roc_auc",
    threshold="default")
```
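If you do want to compute a metric manually from the predictions, as mentioned above, the calculation is simple. A sketch with hypothetical label lists (in practice these come from ```predict()``` and your test set):

```python
# Hypothetical predicted and true labels (1 = normal, 0 = anomaly);
# in practice these come from predict() and the test set.
y_pred = [1, 1, 0, 1, 0, 1]
y_true = [1, 0, 0, 1, 0, 1]

# Plain accuracy: fraction of matching labels.
accuracy = sum(p == t for p, t in zip(y_pred, y_true)) / len(y_true)
```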

## Usage (Full Example)

```python
from autooc.autooc import AutoOC

aoc = AutoOC(anomaly_class = 0,
    normal_class = 1,
    multiobjective=True,
    performance_metric="training_time",
    algorithm = "autoencoders"
)

X_train, X_val, X_test, y_test = aoc.load_example_data()

run = aoc.fit(
    X=X_train,
    X_val=X_val,
    pop=3,
    gen=3,
    epochs=100,
    mlflow_tracking_uri="../results",
    mlflow_experiment_name="test_experiment",
    mlflow_run_name="test_run",
    results_path="../results"
)

predictions = aoc.predict(X_test,
    mode="all",
    threshold="default")

score = aoc.evaluate(X_test,
    y_test,
    mode="all",
    metric="roc_auc",
    threshold="default")
print(score)
```

## Topic Definition

### Grammatical Evolution (GE)

Grammatical Evolution (GE) is a biologically inspired evolutionary algorithm for generating computer programs. The algorithm was proposed by O’Neill and Ryan in 2001 and has been widely used in both optimization and ML tasks. In GE, a set of programs is represented as strings of characters, known as chromosomes. The chromosomes are encoded using a formal grammar, which defines the syntax and structure of the programs. The grammar is used to parse the chromosomes and generate the corresponding programs, which are then evaluated using a fitness function. The fitness function measures the quality of the programs and is used to guide the evolution process toward better solutions.

One of the main advantages of GE is its ability to generate programs in any language, as long as a suitable grammar is defined. This makes GE a versatile tool for developing custom software solutions for a wide range of applications. In addition to its flexibility and versatility, GE can handle complex optimization problems with a large number of objectives and constraints. It can also handle continuous and discrete optimization problems, as well as problems with mixed variables. GE has been shown to be effective in finding high-quality solutions in a relatively short time, compared to other optimization methods.

A GE execution starts by creating an initial population of solutions (usually randomly), where each solution (usually named individual) corresponds to an array of integers (or genome) that is used to generate the program (or phenotype). In the evolutionary process of GE, each generation consists of two main phases: evolution and evaluation. During the evolution phase, new solutions are generated using operations such as crossovers and mutations. Crossover involves selecting pairs of individuals as parents and swapping their genetic material to produce new individuals, known as children. Mutation, which is applied to the children individuals after crossover, consists of randomly altering their genome to maintain genetic diversity. In the evaluation phase, the population of individuals is evaluated using the fitness function.

GE uses a mapping process to generate programs from a genome encoded using a formal grammar, typically in Backus-Naur Form (BNF) notation. This notation consists of terminals, which represent items that can appear in the language, and non-terminals, which are variables that include one or more terminals.

### Nondominated Sorting Genetic Algorithm II (NSGA-II)

Nondominated Sorting Genetic Algorithm II (NSGA-II) is a multi-objective optimization algorithm that was proposed in 2002. The algorithm is based on the concept of non-dominance, which means that a solution is considered superior to another solution if it is not worse than the other solution in any objective and strictly better in at least one objective. The goal of NSGA-II is to find a set of non-dominated solutions, known as the Pareto front, which represents the trade-off between the different objectives.
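Non-dominance, as defined above, reduces to a simple pairwise check. A sketch for minimized objectives (a hypothetical helper, not NSGA-II itself):

```python
def dominates(a, b):
    """True if solution `a` dominates `b`, assuming all objectives are
    minimized: `a` is no worse on every objective and strictly better
    on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# Hypothetical objective vectors: (prediction error, training time).
assert dominates((0.10, 2.0), (0.20, 3.0))      # better on both objectives
assert not dominates((0.10, 4.0), (0.20, 3.0))  # a trade-off: neither dominates
```

The Pareto front is exactly the set of solutions that no other solution dominates under this check.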

One of the main features of NSGA-II is its ability to handle constraints. The algorithm handles constraints by assigning a penalty value to solutions that violate the constraints. The penalty value is then used as an additional objective, which is minimized during the optimization process. NSGA-II also includes a crowding distance measure, which is used to preserve diversity among the solutions and avoid premature convergence. The algorithm has been widely used in various fields, including engineering, economics, and biology, and has shown promising results in a variety of multi-objective optimization problems.

### One-Class Classification (OCC)

Also known as unary classification, One-Class Classification (OCC) can be viewed as a subclass of unsupervised learning, where the Machine Learning model learns using training examples from a single class only [8, 9]. This type of learning is valuable in diverse real-world scenarios where labeled data is nonexistent or labeling is infeasible or difficult (e.g., requiring a costly and slow manual class assignment), such as fraud detection, cybersecurity, predictive maintenance, or industrial quality assessment.


<!--_For more examples, please refer to the [Documentation](https://example.com)_-->

<!-- CITATION -->
## Citation

To cite this work, please use the following article:

```
@article{FERREIRA2023110496,
  author = {Luís Ferreira and Paulo Cortez},
  title = {AutoOC: Automated multi-objective design of deep autoencoders and one-class classifiers using grammatical evolution},
  journal = {Applied Soft Computing},
  volume = {144},
  pages = {110496},
  year = {2023},
  issn = {1568-4946},
  doi = {10.1016/j.asoc.2023.110496},
  url = {https://www.sciencedirect.com/science/article/pii/S1568494623005148}
}
```

### Built With

* [Python](https://www.python.org)
* [PonyGE2](https://github.com/PonyGE/PonyGE2)
* [TensorFlow](https://www.tensorflow.org/)
* [Scikit-Learn](https://scikit-learn.org/)

<!-- ROADMAP -->
## Roadmap

See the [open issues](https://github.com/luisferreira97/AutoOC/issues) for a list of proposed features (and known issues).

<!-- CONTRIBUTING -->
## Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.

1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

<!-- LICENSE -->
## License

Distributed under the MIT License. See `LICENSE` for more information.

<!-- CONTACT -->
## Contact

Luís Ferreira - [LinkedIn](https://www.linkedin.com/in/luisferreira97/) - luis_ferreira223@hotmail.com

Project Link: [https://github.com/luisferreira97/AutoOC](https://github.com/luisferreira97/AutoOC)



<!-- ACKNOWLEDGEMENTS -->
## Acknowledgements
* [PonyGE2](https://github.com/PonyGE/PonyGE2)





<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/luisferreira97/AutoOC.svg?style=for-the-badge
[contributors-url]: https://github.com/luisferreira97/AutoOC/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/luisferreira97/AutoOC.svg?style=for-the-badge
[forks-url]: https://github.com/luisferreira97/AutoOC/network/members
[stars-shield]: https://img.shields.io/github/stars/luisferreira97/AutoOC.svg?style=for-the-badge
[stars-url]: https://github.com/luisferreira97/AutoOC/stargazers
[issues-shield]: https://img.shields.io/github/issues/luisferreira97/AutoOC.svg?style=for-the-badge
[issues-url]: https://github.com/luisferreira97/AutoOC/issues
[license-shield]: https://img.shields.io/github/license/luisferreira97/AutoOC.svg?style=for-the-badge
[license-url]: https://github.com/luisferreira97/AutoOC/blob/master/LICENSE
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://www.linkedin.com/in/luisferreira97/
[product-screenshot]: images/logo.png

        " isolation forest",
        " one-class svm"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5c892a37e7532275629d6c8207a78b14475d5e19bdf3d050b595c8c83b096c1e",
                "md5": "6765dd768bd79bef08b2daf0c3f2ea85",
                "sha256": "6aec1ef712d1330dfc90ed919af9142c19ff074cceb037106a19eb0cca9f0f26"
            },
            "downloads": -1,
            "filename": "autooc-0.0.17.tar.gz",
            "has_sig": false,
            "md5_digest": "6765dd768bd79bef08b2daf0c3f2ea85",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 130675,
            "upload_time": "2024-03-24T11:43:38",
            "upload_time_iso_8601": "2024-03-24T11:43:38.155488Z",
            "url": "https://files.pythonhosted.org/packages/5c/89/2a37e7532275629d6c8207a78b14475d5e19bdf3d050b595c8c83b096c1e/autooc-0.0.17.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-24 11:43:38",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "luisferreira97",
    "github_project": "AutoOC",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "keras",
            "specs": [
                [
                    "==",
                    "2.6.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.4.1"
                ]
            ]
        },
        {
            "name": "mlflow",
            "specs": [
                [
                    "==",
                    "1.15.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "1.2.4"
                ]
            ]
        },
        {
            "name": "pydot",
            "specs": [
                [
                    "==",
                    "1.4.2"
                ]
            ]
        },
        {
            "name": "pygmo",
            "specs": [
                [
                    "==",
                    "2.16.1"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.19.2"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    "==",
                    "1.0"
                ]
            ]
        },
        {
            "name": "tensorflow",
            "specs": [
                [
                    "==",
                    "2.6.0"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.60.0"
                ]
            ]
        }
    ],
    "lcname": "autooc"
}
        