# "MLRL-Common": Building-Blocks for Multi-Output Rule Learning Algorithms
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyPI version](https://badge.fury.io/py/mlrl-common.svg)](https://badge.fury.io/py/mlrl-common) [![Documentation Status](https://readthedocs.org/projects/mlrl-boomer/badge/?version=latest)](https://mlrl-boomer.readthedocs.io/en/latest/?badge=latest)
**Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)
This software package provides common modules to be used by different types of **multi-output rule learning (MLRL)** algorithms that integrate with the popular [scikit-learn](https://scikit-learn.org) machine learning framework.
The problem domains addressed by this software include the following:
- **Multi-label classification**: The goal of [multi-label classification](https://en.wikipedia.org/wiki/Multi-label_classification) is the automatic assignment of sets of labels to individual data points, for example, the annotation of text documents with topics.
- **Multi-output regression**: Multivariate [regression](https://en.wikipedia.org/wiki/Regression_analysis) problems require predicting more than one numerical output variable.
The library serves as the basis for the implementation of the following rule learning algorithms:
- **BOOMER (Gradient Boosted Multi-Output Rules)**: A state-of-the-art algorithm that uses [gradient boosting](https://en.wikipedia.org/wiki/Gradient_boosting) to learn an ensemble of rules that is built with respect to a given multivariate loss function.
- **Multi-label Separate-and-Conquer (SeCo) Rule Learning Algorithm**: A heuristic rule learning algorithm based on traditional rule learning techniques that are particularly well-suited for learning interpretable models.
## Functionalities
This package follows a unified and modular framework for the implementation of different types of rule learning algorithms. In the following, we provide an overview of the individual modules an instantiation of the framework must implement.
### Rule Induction
A module for rule induction that is responsible for the construction of individual rules. Currently, the following modules of this kind are implemented:
- A module for **greedy rule induction** that conducts a top-down search, where rules are constructed by adding one condition after the other and adjusting the rule's prediction accordingly.
- Rule induction based on a **beam search**, where a top-down search is conducted as described above. However, instead of focusing on the best solution at each step, the algorithm keeps track of a predefined number of promising solutions and picks the best one at the end.
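To make the top-down search more concrete, the following is a minimal sketch in plain Python, not the package's actual implementation. It assumes a single numerical output, conditions of the form `x[feature] <= threshold` or `x[feature] > threshold`, and a made-up quality measure; the names `Condition`, `quality`, and `induce_rule_greedily` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Condition:
    """Hypothetical condition: x[feature] <= threshold, or x[feature] > threshold."""
    feature: int
    threshold: float
    leq: bool

    def covers(self, x: Sequence[float]) -> bool:
        value = x[self.feature]
        return value <= self.threshold if self.leq else value > self.threshold


def quality(targets: List[float]) -> float:
    """Assumed quality measure: squared-error reduction of predicting the mean."""
    return sum(targets) ** 2 / len(targets) if targets else 0.0


def induce_rule_greedily(X, y, max_conditions: int = 3) -> Tuple[List[Condition], float]:
    covered = list(range(len(X)))
    conditions: List[Condition] = []
    best_quality = quality([y[i] for i in covered])
    for _ in range(max_conditions):
        best = None
        for feature in range(len(X[0])):
            values = sorted({X[i][feature] for i in covered})
            # Candidate thresholds lie halfway between adjacent feature values.
            for low, high in zip(values, values[1:]):
                for leq in (True, False):
                    candidate = Condition(feature, (low + high) / 2, leq)
                    subset = [i for i in covered if candidate.covers(X[i])]
                    score = quality([y[i] for i in subset])
                    if score > best_quality:
                        best_quality, best = score, (candidate, subset)
        if best is None:
            break  # no further condition improves the rule
        conditions.append(best[0])
        covered = best[1]
    # The prediction is adjusted to the examples covered by the final rule.
    prediction = sum(y[i] for i in covered) / len(covered)
    return conditions, prediction


X = [[1.0], [2.0], [3.0], [4.0]]
y = [0.0, 0.0, 1.0, 1.0]
conditions, prediction = induce_rule_greedily(X, y)
```

A beam search would differ mainly in the inner loop: instead of keeping only the single `best` refinement per step, it would retain a fixed number of the highest-scoring candidates and refine each of them further.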
All of the above modules support **numerical, ordinal, and nominal features** and can handle **missing feature values**. They can also be combined with methods for **unsupervised feature binning**, where training examples with similar feature values are assigned to bins in order to reduce the training complexity. Moreover, **multi-threading** can be used to speed up training.
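As an illustration of unsupervised feature binning, the sketch below assigns feature values to equal-width bins. Equal-width binning is used here only as an assumed example; the text above does not say which binning methods the package provides, and the function name `equal_width_bins` is hypothetical.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], num_bins: int) -> List[int]:
    """Map each value to the index of an equal-width bin (labels are not used)."""
    low, high = min(values), max(values)
    if high == low:
        return [0] * len(values)  # all values fall into a single bin
    width = (high - low) / num_bins
    # The maximum value would land in bin `num_bins`, so it is clamped to the last bin.
    return [min(int((value - low) / width), num_bins - 1) for value in values]


bins = equal_width_bins([0.0, 1.0, 2.0, 9.0, 10.0], num_bins=2)
```

With binned features, a rule learner only needs to consider one candidate threshold per bin boundary instead of one per distinct feature value.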
### Model Assemblage
A module for the assemblage of a rule model that consists of several rules. Currently, the following strategies can be used for constructing a model:
- **Sequential assemblage of rule models**, where one rule is learned after the other.
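The sequential strategy can be sketched as a simple loop. This is an illustrative toy under stated assumptions, not the package's interface: rules are plain numbers, `learn_rule` is a hypothetical callback that receives the rules learned so far, and `TARGET` is an invented value the toy model tries to approximate.

```python
from typing import Callable, List, Optional


def assemble_sequentially(learn_rule: Callable[[List[float]], Optional[float]],
                          max_rules: int) -> List[float]:
    model: List[float] = []
    while len(model) < max_rules:
        rule = learn_rule(model)  # each new rule sees the rules learned so far
        if rule is None:
            break  # the learner found nothing left to add
        model.append(rule)
    return model


TARGET = 8.0  # hypothetical value the toy model should sum up to


def learn_constant_rule(model: List[float]) -> Optional[float]:
    residual = TARGET - sum(model)
    return residual / 2 if abs(residual) > 1e-9 else None


model = assemble_sequentially(learn_constant_rule, max_rules=3)
```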
### Sampling Methods
A wide variety of sampling methods, including **sampling with and without replacement**, as well as **stratified sampling techniques**, is provided by this package. They can be used to learn new rules on a subset of the available training examples, features, or labels.
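The three families of sampling methods mentioned above can be sketched with Python's standard `random` module. The function names are hypothetical, and the stratified variant assumes a single label per example for simplicity; stratification for multi-label data is more involved.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence


def sample_with_replacement(indices: Sequence[int], size: int, rng: random.Random) -> List[int]:
    return rng.choices(indices, k=size)  # the same index may be drawn repeatedly


def sample_without_replacement(indices: Sequence[int], size: int, rng: random.Random) -> List[int]:
    return rng.sample(indices, size)  # every index is drawn at most once


def stratified_sample(labels: Sequence[Hashable], fraction: float, rng: random.Random) -> List[int]:
    """Draw the same fraction of examples from each label group."""
    strata = defaultdict(list)
    for index, label in enumerate(labels):
        strata[label].append(index)
    chosen: List[int] = []
    for members in strata.values():
        size = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, size))
    return sorted(chosen)


rng = random.Random(0)
indices = list(range(10))
bootstrap = sample_with_replacement(indices, 10, rng)
subset = sample_without_replacement(indices, 5, rng)
labels = ["a"] * 8 + ["b"] * 2
stratified = stratified_sample(labels, 0.5, rng)
```

The same helpers apply regardless of whether the indices refer to training examples, features, or labels.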
### (Output Space) Statistics
So-called output space statistics serve as the basis for assessing the quality of potential rules and determining their predictions. The notion of the statistics heavily depends on the rule learning algorithm at hand. For this reason, no particular implementation is currently included in this package.
### Post-Processing
Post-processing methods can be used to alter the predictions of a rule after it has been learned. Whether this is desirable or not heavily depends on the rule learning algorithm at hand. For this reason, no post-processing methods are currently provided by this package.
### Pruning Methods
Rule pruning techniques can optionally be applied to a rule after its construction to improve its generalization to unseen data and prevent overfitting. The following pruning techniques are currently supported by this package:
- **Incremental reduced error pruning (IREP)** removes overly specific conditions from a rule if this increases predictive performance (measured on a holdout set of the training data).
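The idea behind this kind of pruning can be sketched as follows. The sketch makes several assumptions that are not taken from the package: conditions are `(feature, threshold)` pairs meaning `x[feature] >= threshold`, only trailing conditions are considered for removal, the rule predicts a single binary label, and holdout accuracy serves as the performance measure.

```python
from typing import List, Sequence, Tuple

Condition = Tuple[int, float]  # hypothetical: x[feature] >= threshold


def covers(conditions: Sequence[Condition], x: Sequence[float]) -> bool:
    return all(x[feature] >= threshold for feature, threshold in conditions)


def holdout_accuracy(conditions, X, y) -> float:
    # The rule predicts 1 for covered examples and 0 for all others.
    correct = sum(int(covers(conditions, x)) == label for x, label in zip(X, y))
    return correct / len(y)


def prune_trailing_conditions(conditions, X_holdout, y_holdout) -> List[Condition]:
    best = list(conditions)
    best_score = holdout_accuracy(best, X_holdout, y_holdout)
    # Try ever shorter prefixes of the rule, always keeping at least one condition.
    for size in range(len(conditions) - 1, 0, -1):
        candidate = list(conditions[:size])
        score = holdout_accuracy(candidate, X_holdout, y_holdout)
        if score > best_score:
            best, best_score = candidate, score
    return best


X_holdout = [(1, 1), (1, 0), (1, 0), (0, 1), (0, 0)]
y_holdout = [1, 1, 1, 0, 0]
rule = [(0, 1), (1, 1)]  # the second condition is overly specific
pruned = prune_trailing_conditions(rule, X_holdout, y_holdout)
```

In this toy data set, dropping the second condition raises the holdout accuracy from 0.6 to 1.0, so only the first condition is kept.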
### Stopping Criteria
One or several stopping criteria can be used to decide whether additional rules should be added to a model or not. Currently, the following criteria are provided out-of-the-box:
- A **size-based stopping criterion** that ensures that a certain number of rules is not exceeded.
- A **time-based stopping criterion** that stops training as soon as a predefined runtime has been exceeded.
- **Pre-pruning (a.k.a. early stopping)** aims at terminating the training process as soon as the performance of a model stagnates or declines (measured on a holdout set of the training data).
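The three criteria above could be modelled as interchangeable predicates that are checked after each rule. The following sketch is an assumption about how such criteria might be combined, not the package's API: the `Criterion` signature, the `patience` parameter, and the convention that higher holdout scores are better are all hypothetical.

```python
import time
from typing import Callable, Sequence

# A criterion receives the number of rules, the start time, and the holdout scores so far.
Criterion = Callable[[int, float, Sequence[float]], bool]


def size_criterion(max_rules: int) -> Criterion:
    return lambda num_rules, started, scores: num_rules >= max_rules


def time_criterion(max_seconds: float) -> Criterion:
    return lambda num_rules, started, scores: time.monotonic() - started >= max_seconds


def early_stopping(patience: int) -> Criterion:
    def criterion(num_rules: int, started: float, scores: Sequence[float]) -> bool:
        if len(scores) <= patience:
            return False  # not enough history to judge stagnation
        # Stop if the last `patience` scores did not beat the best earlier score.
        return max(scores[-patience:]) <= max(scores[:-patience])
    return criterion


def should_stop(criteria: Sequence[Criterion], num_rules: int,
                started: float, scores: Sequence[float]) -> bool:
    return any(criterion(num_rules, started, scores) for criterion in criteria)


started = time.monotonic()
criteria = [size_criterion(100), time_criterion(3600.0), early_stopping(patience=2)]
stagnating = should_stop(criteria, 4, started, [0.5, 0.6, 0.6, 0.59])
improving = should_stop(criteria, 4, started, [0.5, 0.6, 0.7, 0.8])
```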
### Post-Optimization
Post-optimization methods can be employed to further improve the predictive performance of a model after it has been assembled. Currently, the following post-optimization techniques can be used:
- **Sequential post-optimization** reconstructs each rule in a model in the context of the other rules.
- **Post-pruning** may remove trailing rules from a model if this increases the model's performance (as measured on a holdout set of the training data).
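Sequential post-optimization can be pictured as revisiting each rule while all other rules stay fixed. The sketch below is a toy illustration only: rules are plain numbers, `relearn_rule` is a hypothetical callback that learns a replacement given the remaining rules, and `TARGET` is an invented value the model's rules should sum up to.

```python
from typing import Callable, List


def optimize_sequentially(model: List[float],
                          relearn_rule: Callable[[List[float]], float],
                          num_passes: int = 1) -> List[float]:
    model = list(model)  # leave the caller's model untouched
    for _ in range(num_passes):
        for i in range(len(model)):
            others = model[:i] + model[i + 1:]
            # Rebuild rule i in the context of all other rules, including
            # those that were already replaced earlier in this pass.
            model[i] = relearn_rule(others)
    return model


TARGET = 10.0  # hypothetical value the toy rules should sum up to


def relearn_constant_rule(others: List[float]) -> float:
    return (TARGET - sum(others)) / 2


optimized = optimize_sequentially([4.0, 2.0, 1.0], relearn_constant_rule)
```

In this toy setting, the rules initially sum to 7.0 and the optimized rules sum to 8.125, which is closer to the target.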
### Prediction Algorithm
A prediction algorithm is needed to derive predictions from the rules in a previously assembled model. As prediction methods heavily depend on the rule learning algorithm and problem domain at hand, no implementation is provided by this package out-of-the-box. However, it defines interfaces for the prediction of **scores, binary predictions, or probability estimates**.
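To illustrate how the three kinds of predictions relate, here is one plausible scheme for an additive rule model. It is an assumption made for illustration, not what the package or any particular algorithm does: each rule is a hypothetical pair of a coverage test and a vector of per-output scores, scores of covered rules are summed, binary predictions threshold the scores at zero, and probabilities apply the logistic function.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical rule representation: (coverage test, one score per output).
Rule = Tuple[Callable[[Sequence[float]], bool], List[float]]


def predict_scores(rules: Sequence[Rule], x: Sequence[float], num_outputs: int) -> List[float]:
    scores = [0.0] * num_outputs
    for covers, head in rules:
        if covers(x):
            for output, value in enumerate(head):
                scores[output] += value  # sum the scores of all covering rules
    return scores


def predict_binary(scores: Sequence[float], threshold: float = 0.0) -> List[bool]:
    return [score > threshold for score in scores]


def predict_probabilities(scores: Sequence[float]) -> List[float]:
    return [1.0 / (1.0 + math.exp(-score)) for score in scores]


rules: List[Rule] = [
    (lambda x: x[0] > 0, [1.0, -1.0]),
    (lambda x: x[1] > 0, [0.5, 0.5]),
]
scores = predict_scores(rules, [1.0, 1.0], num_outputs=2)
binary = predict_binary(scores)
probabilities = predict_probabilities(scores)
```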
## License
This project is open source software licensed under the terms of the [MIT license](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available [here](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html).
Raw data
{
"_id": null,
"home_page": "https://github.com/mrapp-ke/MLRL-Boomer",
"name": "mlrl-common",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "machine learning, scikit-learn, multi-label classification, rule learning",
"author": "Michael Rapp",
"author_email": "michael.rapp.ml@gmail.com",
"download_url": "https://github.com/mrapp-ke/MLRL-Boomer/releases",
"platform": "Linux",
"description": "# \"MLRL-Common\": Building-Blocks for Multi-Output Rule Learning Algorithms\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![PyPI version](https://badge.fury.io/py/mlrl-common.svg)](https://badge.fury.io/py/mlrl-common) [![Documentation Status](https://readthedocs.org/projects/mlrl-boomer/badge/?version=latest)](https://mlrl-boomer.readthedocs.io/en/latest/?badge=latest)\n\n**Important links:** [Documentation](https://mlrl-boomer.readthedocs.io/en/latest/) | [Issue Tracker](https://github.com/mrapp-ke/MLRL-Boomer/issues) | [Changelog](https://mlrl-boomer.readthedocs.io/en/latest/misc/CHANGELOG.html) | [Contributors](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html) | [Code of Conduct](https://mlrl-boomer.readthedocs.io/en/latest/misc/CODE_OF_CONDUCT.html) | [License](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html)\n\nThis software package provides common modules to be used by different types of **multi-output rule learning (MLRL)** algorithms that integrate with the popular [scikit-learn](https://scikit-learn.org) machine learning framework.\n\nThe problem domains addressed by this software include the following:\n\n- **Multi-label classification**: The goal of [multi-label classification](https://en.wikipedia.org/wiki/Multi-label_classification) is the automatic assignment of sets of labels to individual data points, for example, the annotation of text documents with topics.\n- **Multi-output regression**: Multivariate [regression](https://en.wikipedia.org/wiki/Regression_analysis) problems require to predict for more than a single numerical output variable.\n\nThe library serves as the basis for the implementation of the following rule learning algorithms:\n\n- **BOOMER (Gradient Boosted Multi-Output Rules)**: A state-of-the art algorithm that uses [gradient boosting](https://en.wikipedia.org/wiki/Gradient_boosting) to learn an ensemble of rules 
that is built with respect to a given multivariate loss function.\n- **Multi-label Separate-and-Conquer (SeCo) Rule Learning Algorithm**: A heuristic rule learning algorithm based on traditional rule learning techniques that are particularly well-suited for learning interpretable models.\n\n## Functionalities\n\nThis package follows a unified and modular framework for the implementation of different types of rule learning algorithms. In the following, we provide an overview of the individual modules an instantiation of the framework must implement.\n\n### Rule Induction\n\nA module for rule induction that is responsible for the construction of individual rules. Currently, the following modules of this kind are implemented:\n\n- A module for **greedy rule induction** that conducts a top-down search, where rules are constructed by adding one condition after the other and adjusting its prediction accordingly.\n- Rule induction based on a **beam search**, where a top-down search is conducted as described above. However, instead of focusing on the best solution at each step, the algorithm keeps track of a predefined number of promising solutions and picks the best one at the end.\n\nAll of the above modules support **numerical, ordinal, and nominal features** and can handle **missing feature values**. They can also be combined with methods for **unsupervised feature binning**, where training examples with similar features values are assigned to bins in order to reduce the training complexity. Moreover, **multi-threading** can be used to speed up training.\n\n### Model Assemblage\n\nA module for the assemblage of a rule model that consists of several rules. 
Currently, the following strategies can be used for constructing a model:\n\n- **Sequential assemblage of rule models**, where one rule is learned after the other.\n\n### Sampling Methods\n\nA wide variety of sampling methods, including **sampling with and without replacement**, as well as **stratified sampling techniques**, is provided by this package. They can be used to learn new rules on a subset of the available training examples, features, or labels.\n\n### (Output Space) Statistics\n\nSo-called output space statistics serve as the basis for assessing the quality of potential rules and determining their predictions. The notion of the statistics heavily depend on the rule learning algorithm at hand. For this reason, no particular implementation is currently included in this package.\n\n### Post-Processing\n\nPost-processing methods can be used to alter the predictions of a rule after it has been learned. Whether this is desirable or not heavily depends on the rule learning algorithm at hand. For this reason, no post-processing methods are currently provided by this package.\n\n### Pruning Methods\n\nRule pruning techniques can optionally be applied to a rule after its construction to improve its generalization to unseen data and prevent overfitting. The following pruning techniques are currently supported by this package:\n\n- **Incremental reduced error pruning (IREP)** removes overly specific conditions from a rule if this results in an increase of predictive performance (measured on a holdout set of the training data).\n\n### Stopping Criteria\n\nOne or several stopping criteria can be used to decide whether additional rules should be added to a model or not. Currently, the following criteria are provided out-of-the-box:\n\n- A **size-based stopping criterion** that ensures that a certain number of rules is not exceeded.\n- A **time-based stopping criterion** that stops training as soon as a predefined runtime was exceeded.\n- **Pre-pruning (a.k.a. 
early stopping)** aims at terminating the training process as soon as the performance of a model stagnates or declines (measured on a holdout set of the training data).\n\n### Post-Optimization\n\nPost-optimization methods can be employed to further improve the predictive performance of a model after it has been assembled. Currently, the following post-optimization techniques can be used:\n\n- **Sequential post-optimization** reconstructs each rule in a model in the context of the other rules.\n\n- **Post-pruning** may remove trailing rules from a model in this increases the models performance (as measured on a holdout set of the training data).\n\n### Prediction algorithm\n\nA prediction algorithm is needed to derive predictions from the rules in a previously assembled model. As prediction methods heavily depend on the rule learning algorithm and problem domain at hand, no implementation is provided by this package out-of-the-box. However, it defines interfaces for the prediction of **scores, binary predictions, or probability estimates.**\n\n## License\n\nThis project is open source software licensed under the terms of the [MIT license](https://mlrl-boomer.readthedocs.io/en/latest/misc/LICENSE.html). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience. A frequently updated list of contributors is available [here](https://mlrl-boomer.readthedocs.io/en/latest/misc/CONTRIBUTORS.html).\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Provides common modules to be used by different types of multi-label rule learning algorithms",
"version": "0.11.1",
"project_urls": {
"Documentation": "https://mlrl-boomer.readthedocs.io/en/latest",
"Download": "https://github.com/mrapp-ke/MLRL-Boomer/releases",
"Homepage": "https://github.com/mrapp-ke/MLRL-Boomer",
"Issue Tracker": "https://github.com/mrapp-ke/MLRL-Boomer/issues"
},
"split_keywords": [
"machine learning",
" scikit-learn",
" multi-label classification",
" rule learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "13073efe1b9a299d8cb95858621b678d87587ef39858430cac1235967b61734a",
"md5": "77c32fd99fe607efc127170039a83cc8",
"sha256": "4b83f9f17ea56bffbf680759a11212d75f07e66435337290a2f16b65d84c696d"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp310-cp310-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "77c32fd99fe607efc127170039a83cc8",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 1930641,
"upload_time": "2024-09-24T21:23:57",
"upload_time_iso_8601": "2024-09-24T21:23:57.593157Z",
"url": "https://files.pythonhosted.org/packages/13/07/3efe1b9a299d8cb95858621b678d87587ef39858430cac1235967b61734a/mlrl_common-0.11.1-cp310-cp310-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d23b45cad10f8f88e7706100a5b494f9c7672bfb09a07bdf454380d7e20a08ec",
"md5": "cc362af62d6a874408233178182f752e",
"sha256": "beb85fba524e75246cea661548552c6ad201b404ac0e6411dd9d9c79feed9145"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "cc362af62d6a874408233178182f752e",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 2867507,
"upload_time": "2024-09-24T21:26:45",
"upload_time_iso_8601": "2024-09-24T21:26:45.309516Z",
"url": "https://files.pythonhosted.org/packages/d2/3b/45cad10f8f88e7706100a5b494f9c7672bfb09a07bdf454380d7e20a08ec/mlrl_common-0.11.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "a22e9458be867c7d0eab79ceac0f919f77fcb86a6b6905cad4f4f7da115d73ec",
"md5": "442c401e4b4838369a414c52945baa18",
"sha256": "9e06ca94dfb58249efc64aa284f147189b134e6c6c931114d072f852021cfe2d"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp310-cp310-win_amd64.whl",
"has_sig": false,
"md5_digest": "442c401e4b4838369a414c52945baa18",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 1449213,
"upload_time": "2024-09-24T21:31:58",
"upload_time_iso_8601": "2024-09-24T21:31:58.153889Z",
"url": "https://files.pythonhosted.org/packages/a2/2e/9458be867c7d0eab79ceac0f919f77fcb86a6b6905cad4f4f7da115d73ec/mlrl_common-0.11.1-cp310-cp310-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5759535c961639347e1e683e628300089664ad33fe686949107c41d095e870f8",
"md5": "e6c89747d8732c6fac5d337408a6476b",
"sha256": "1309d3943b5bab7eaec368d4185948c428b3cfca76f152cbb535484efd5732c1"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp311-cp311-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "e6c89747d8732c6fac5d337408a6476b",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.10",
"size": 1907675,
"upload_time": "2024-09-24T21:23:59",
"upload_time_iso_8601": "2024-09-24T21:23:59.368033Z",
"url": "https://files.pythonhosted.org/packages/57/59/535c961639347e1e683e628300089664ad33fe686949107c41d095e870f8/mlrl_common-0.11.1-cp311-cp311-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "07c7280d99a1f2d0e09240f47a6a837d664efccfd21b40b17a2a47271aa5f202",
"md5": "fbd4e038c2c5313350d6541b6a8c73a1",
"sha256": "4900c5632cee6e8557ab1193e55de385cf91a5fd08d2fe62ae7540ebef821777"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "fbd4e038c2c5313350d6541b6a8c73a1",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.10",
"size": 2781285,
"upload_time": "2024-09-24T21:26:47",
"upload_time_iso_8601": "2024-09-24T21:26:47.421697Z",
"url": "https://files.pythonhosted.org/packages/07/c7/280d99a1f2d0e09240f47a6a837d664efccfd21b40b17a2a47271aa5f202/mlrl_common-0.11.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "892032efbc9ae24e54672406b4ba131d873b5a075fb90cf36f60cef97e75e5fd",
"md5": "39ce29e5ef4347bcddfce4594851df7b",
"sha256": "89704be855e6eb1bf348690d54c319ddd765871329c6670089bdbbf4e7d861c1"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp311-cp311-win_amd64.whl",
"has_sig": false,
"md5_digest": "39ce29e5ef4347bcddfce4594851df7b",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.10",
"size": 1432339,
"upload_time": "2024-09-24T21:32:00",
"upload_time_iso_8601": "2024-09-24T21:32:00.038294Z",
"url": "https://files.pythonhosted.org/packages/89/20/32efbc9ae24e54672406b4ba131d873b5a075fb90cf36f60cef97e75e5fd/mlrl_common-0.11.1-cp311-cp311-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6b601ae1449e58d5f02f78094bedc91ba7bb8f288dc6d67d8c898369c9156b24",
"md5": "ba37667f435274f7fbd1c02d44b0add8",
"sha256": "2504cb3284fecd14ef1b88e0826850287fb3988c55ebedcf770864b662909fa3"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp312-cp312-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "ba37667f435274f7fbd1c02d44b0add8",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.10",
"size": 1937671,
"upload_time": "2024-09-24T21:24:01",
"upload_time_iso_8601": "2024-09-24T21:24:01.190057Z",
"url": "https://files.pythonhosted.org/packages/6b/60/1ae1449e58d5f02f78094bedc91ba7bb8f288dc6d67d8c898369c9156b24/mlrl_common-0.11.1-cp312-cp312-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ffaf3a07e74472b5c12cbec8475290ac3cbcaa2c0636e1ff8ea922c6867c69b1",
"md5": "1045f60d0071275c58e332f9a21e7189",
"sha256": "7a24eb6262b72d3c172e11cecf96cb4b74282f2c684a75240288306e92f08338"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "1045f60d0071275c58e332f9a21e7189",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.10",
"size": 2785971,
"upload_time": "2024-09-24T21:26:49",
"upload_time_iso_8601": "2024-09-24T21:26:49.248445Z",
"url": "https://files.pythonhosted.org/packages/ff/af/3a07e74472b5c12cbec8475290ac3cbcaa2c0636e1ff8ea922c6867c69b1/mlrl_common-0.11.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5d6b1a61b7bd5a4c892ac1fecd881bbb9a006142df2b1fff700dec90797a7261",
"md5": "642ff3e1985d8af572f8d0e41ee80a80",
"sha256": "c7979dad77222ae5ff238f2369279cd235ec7139fab2b732089c9d245dc22537"
},
"downloads": -1,
"filename": "mlrl_common-0.11.1-cp312-cp312-win_amd64.whl",
"has_sig": false,
"md5_digest": "642ff3e1985d8af572f8d0e41ee80a80",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.10",
"size": 1444142,
"upload_time": "2024-09-24T21:32:01",
"upload_time_iso_8601": "2024-09-24T21:32:01.773461Z",
"url": "https://files.pythonhosted.org/packages/5d/6b/1a61b7bd5a4c892ac1fecd881bbb9a006142df2b1fff700dec90797a7261/mlrl_common-0.11.1-cp312-cp312-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-24 21:23:57",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "mrapp-ke",
"github_project": "MLRL-Boomer",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "mlrl-common"
}