example-wise-f1-maximizer


Name: example-wise-f1-maximizer
Version: 0.1.5
home_page: https://github.com/mrapp-ke/ExampleWiseF1Maximizer
Summary: A scikit-learn meta-estimator for multi-label classification that aims to maximize the example-wise F1 measure
upload_time: 2023-06-16 11:15:57
author: Michael Rapp
requires_python: >=3.7
license: MIT
keywords: machine learning, scikit-learn, multi-label classification

# Example-wise F1 Maximizer

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://badge.fury.io/py/example-wise-f1-maximizer.svg)](https://badge.fury.io/py/example-wise-f1-maximizer)

**Important links:** [Issue Tracker](https://github.com/mrapp-ke/ExampleWiseF1Maximizer/issues) | [Changelog](CHANGELOG.md) | [Code of Conduct](CODE_OF_CONDUCT.md)

This software package provides an implementation of a meta-learning algorithm for multi-label classification that aims to maximize the example-wise F1-measure. It integrates with the popular [scikit-learn](https://scikit-learn.org) machine learning framework and can also be used with frameworks for multi-label classification like [scikit-multilearn](http://scikit.ml).

The goal of [multi-label classification](https://en.wikipedia.org/wiki/Multi-label_classification) is the automatic assignment of sets of labels to individual data points, for example, the annotation of text documents with topics. The example-wise [F1-measure](https://en.wikipedia.org/wiki/F-score) is a particularly relevant evaluation measure for this kind of prediction, as it requires a classifier to achieve a good balance between labels predicted as relevant or irrelevant for an example, i.e., it must be neither too conservative nor too aggressive when it comes to predicting labels as relevant.
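To make the measure concrete, the following minimal sketch computes the example-wise F1-measure with NumPy. The function name `example_wise_f1` is hypothetical and not part of this package. The sketch assumes the convention that an example scores 1 if both the true and the predicted label sets are empty; other libraries may handle this case differently.

```python
import numpy as np


def example_wise_f1(y_true, y_pred):
    """Average the F1 score computed separately for each example (row)."""
    y_true = np.asarray(y_true) != 0
    y_pred = np.asarray(y_pred) != 0
    # Number of labels that are both relevant and predicted as relevant
    true_positives = np.logical_and(y_true, y_pred).sum(axis=1)
    # Number of relevant labels plus number of predicted labels
    denominator = y_true.sum(axis=1) + y_pred.sum(axis=1)
    # Assumed convention: the score is 1 if both label sets are empty
    scores = np.where(denominator > 0,
                      2.0 * true_positives / np.maximum(denominator, 1),
                      1.0)
    return scores.mean()


y_true = [[1, 1, 0], [0, 1, 0]]
conservative = [[1, 0, 0], [0, 1, 0]]  # misses one relevant label
aggressive = [[1, 1, 1], [1, 1, 1]]  # predicts every label as relevant
f1_conservative = example_wise_f1(y_true, conservative)
f1_aggressive = example_wise_f1(y_true, aggressive)
```

Both predictions are penalized: the conservative one for the missed label, the aggressive one for the superfluous labels.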

## Methodology

The algorithm implemented by this project transforms an original multi-label problem with `n` labels into a series of `n * n + 1` binary classification problems. A probabilistic base estimator is then fit to each of these independent sub-problems as described in the following [paper](http://proceedings.mlr.press/v119/zhang20w/zhang20w.pdf):

*Mingyuan Zhang, Harish G. Ramaswamy, and Shivani Agarwal. Convex Calibrated Surrogates for the Multi-Label F-Measure. In: Proceedings of the International Conference on Machine Learning (ICML), 2020.*
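The transformation into `n * n + 1` binary problems can be illustrated as follows. This is a sketch based on the description in this README, not the package's internal code; the function name `build_binary_targets` is hypothetical. It assumes that the sub-problem for a pair `(i, k)` asks whether the `i`-th label is relevant and the example has exactly `k` relevant labels, and that the extra sub-problem asks whether the example has no relevant labels at all.

```python
import numpy as np


def build_binary_targets(y):
    """Derive n * n + 1 binary target vectors from a label matrix.

    Returns an array of shape (num_examples, n, n), where entry [:, i, k - 1]
    is the target of the sub-problem for label i and k relevant labels, and a
    vector of shape (num_examples,) for the null-vector sub-problem.
    """
    y = (np.asarray(y) != 0).astype(int)
    num_examples, n = y.shape
    sizes = y.sum(axis=1)  # number of relevant labels per example
    null_target = (sizes == 0).astype(int)
    targets = np.zeros((num_examples, n, n), dtype=int)
    for k in range(1, n + 1):
        # 1 if label i is relevant and exactly k labels are relevant
        targets[:, :, k - 1] = y * (sizes == k)[:, None]
    return targets, null_target


y = [[1, 0],
     [1, 1],
     [0, 0]]
targets, null_target = build_binary_targets(y)
num_problems = targets.shape[1] * targets.shape[2] + 1
```

For `n = 2` labels this yields `2 * 2 + 1 = 5` binary problems, each of which could be handed to a separate copy of the base estimator.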
    
The probabilities predicted by the individual base estimators for unseen examples constitute an `n x n` probability matrix `p`, as well as an additional probability `p_0`. Whereas `p_0` corresponds to the prior probability of the null vector, i.e., a label vector that does not contain any relevant labels, each probability `p_ik` at the `i`-th row and `k`-th column of `p` corresponds to the conditional probability of a label vector with `k` relevant labels, where the `i`-th label is relevant. In order to identify the label vector that maximizes the F1-measure in expectation, these probabilities are used as inputs to the "General F-Measure maximizer" (GFM), as proposed in the following [paper](https://proceedings.neurips.cc/paper/2011/file/71ad16ad2c4d81f348082ff6c4b20768-Paper.pdf):

*Krzysztof Dembczyński, Willem Waegeman, Weiwei Cheng, and Eyke Hüllermeier. An Exact Algorithm for F-Measure Maximization. In: Advances in Neural Information Processing Systems, 2011.*

**Please note that this implementation has not been written by any of the authors shown above.**
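The GFM step can be sketched in a few lines of NumPy. This is an illustrative reading of the algorithm for a single example, not the code shipped with this package; the function name `gfm_predict` is hypothetical. It assumes that `p[i, k - 1]` holds the probability that the `i`-th label is relevant and exactly `k` labels are relevant, and that predicting the null vector has an expected F1 of `p_0`.

```python
import numpy as np


def gfm_predict(p, p_0):
    """Return the label vector with maximal expected F1 and that expectation."""
    p = np.asarray(p, dtype=float)
    n = p.shape[0]
    sizes = np.arange(1, n + 1)
    # w[s - 1, k - 1] = 1 / (s + k) for s true and k predicted relevant labels
    w = 1.0 / (sizes[:, None] + sizes[None, :])
    # delta[i, k - 1] = sum over s of p[i, s - 1] / (s + k)
    delta = p @ w
    # Start with the null vector, whose expected F1 is assumed to be p_0
    best_pred = np.zeros(n, dtype=int)
    best_score = p_0
    for k in range(1, n + 1):
        # The best prediction with exactly k labels takes the top-k entries
        top_k = np.argsort(-delta[:, k - 1], kind="stable")[:k]
        score = 2.0 * delta[top_k, k - 1].sum()
        if score > best_score:
            best_score = score
            best_pred = np.zeros(n, dtype=int)
            best_pred[top_k] = 1
    return best_pred, best_score


# Toy distribution over label vectors: P([1, 0]) = 0.6, P([1, 1]) = 0.3,
# P([0, 0]) = 0.1
p = [[0.6, 0.3],
     [0.0, 0.3]]
pred, expected_f1 = gfm_predict(p, p_0=0.1)
```

In this toy case, predicting only the first label has an expected F1 of `0.6 * 1 + 0.3 * 2/3 = 0.8`, which beats predicting both labels (`0.7`) or none (`0.1`).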

## Documentation

### Installation

The software package is available at [PyPI](https://pypi.org/project/example-wise-f1-maximizer/) and can easily be installed via pip using the following command:

```
pip install example-wise-f1-maximizer
```

### Usage

To use the classifier in your own Python code, you need to import the class `ExampleWiseF1Maximizer`. It can be instantiated and used as shown below:

```python
from example_wise_f1_maximizer import ExampleWiseF1Maximizer
from sklearn.linear_model import LogisticRegression

clf = ExampleWiseF1Maximizer(estimator=LogisticRegression())
x = [[  1,  2,  3],  # Two training examples with three features
     [ 11, 12, 13]]
y = [[1, 0],  # Ground truth labels of each training example
     [0, 1]]
clf.fit(x, y)
pred = clf.predict(x)
```

The `fit` method accepts two inputs, `x` and `y`:

* A two-dimensional feature matrix `x`, where each row corresponds to a training example and each column corresponds to a particular feature.
* A two-dimensional binary label matrix `y`, where each row corresponds to a training example and each column corresponds to a label. A non-zero element indicates that the respective label is relevant to an example. Elements that are equal to zero denote irrelevant labels.

Both `x` and `y` are expected to be [numpy arrays](https://numpy.org/doc/stable/reference/generated/numpy.array.html) or equivalent [array-like](https://scikit-learn.org/stable/glossary.html#term-array-like) data types. In particular, the use of [scipy sparse matrices](https://docs.scipy.org/doc/scipy/reference/sparse.html) is supported.
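As a small illustration of these input types, the snippet below builds the toy data from the usage example both as NumPy arrays and as scipy sparse matrices. The choice of the CSR format is an assumption made for this sketch; the README does not state which sparse formats are preferred.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Dense inputs: two training examples with three features and two labels
x_dense = np.array([[1, 2, 3],
                    [11, 12, 13]], dtype=float)
y_dense = np.array([[1, 0],
                    [0, 1]])

# Sparse equivalents (CSR is an assumed choice of format)
x_sparse = csr_matrix(x_dense)
y_sparse = csr_matrix(y_dense)
```

Either pair could then be passed to `fit` in place of the nested lists used in the usage example.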

In the previous example, logistic regression as implemented by the class `LogisticRegression` from the scikit-learn framework is used as a base estimator. Alternatively, you can use any probabilistic estimator for binary classification that is compatible with the scikit-learn framework and implements the `predict_proba` function.
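Whether a given scikit-learn estimator offers `predict_proba` can be checked before using it as a base estimator. The helper `is_probabilistic` below is hypothetical and not part of this package. Wrapping an estimator without `predict_proba` in scikit-learn's `CalibratedClassifierCV` is one possible workaround; whether it works well in combination with this package has not been verified here.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC


def is_probabilistic(estimator):
    """Check whether an estimator exposes a callable predict_proba."""
    return callable(getattr(estimator, "predict_proba", None))


candidates = {
    "LogisticRegression": LogisticRegression(),
    "LinearSVC": LinearSVC(),  # provides no probability estimates
    "CalibratedClassifierCV(LinearSVC)": CalibratedClassifierCV(LinearSVC()),
}
support = {name: is_probabilistic(est) for name, est in candidates.items()}
```

Note that calibration relies on internal cross-validation, so it needs enough training examples per class in each binary sub-problem.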

## License

This project is open source software licensed under the terms of the [MIT license](LICENSE.md). We welcome contributions to the project to enhance its functionality and make it more accessible to a broader audience.

All contributions to the project and discussions on the [issue tracker](https://github.com/mrapp-ke/ExampleWiseF1Maximizer/issues) are expected to follow the [code of conduct](CODE_OF_CONDUCT.md).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/mrapp-ke/ExampleWiseF1Maximizer",
    "name": "example-wise-f1-maximizer",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "machine learning,scikit-learn,multi-label classification",
    "author": "Michael Rapp",
    "author_email": "michael.rapp.ml@gmail.com",
    "download_url": "https://github.com/mrapp-ke/ExampleWiseF1Maximizer/releases",
    "platform": "any",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A scikit-learn meta-estimator for multi-label classification that aims to maximize the example-wise F1 measure",
    "version": "0.1.5",
    "project_urls": {
        "Download": "https://github.com/mrapp-ke/ExampleWiseF1Maximizer/releases",
        "Homepage": "https://github.com/mrapp-ke/ExampleWiseF1Maximizer",
        "Issue Tracker": "https://github.com/mrapp-ke/ExampleWiseF1Maximizer/issues"
    },
    "split_keywords": [
        "machine learning",
        "scikit-learn",
        "multi-label classification"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cb5a81a986125c9ed4c4d77aeded7928a8716f2ed7450da57422303e48880897",
                "md5": "0ae13395ee7e889006f9614b4ff4ceda",
                "sha256": "5268c66962e4162d94335dd693a142c913092bd4da6207c563d2ce6fc7c3f869"
            },
            "downloads": -1,
            "filename": "example_wise_f1_maximizer-0.1.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0ae13395ee7e889006f9614b4ff4ceda",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 7894,
            "upload_time": "2023-06-16T11:15:57",
            "upload_time_iso_8601": "2023-06-16T11:15:57.375186Z",
            "url": "https://files.pythonhosted.org/packages/cb/5a/81a986125c9ed4c4d77aeded7928a8716f2ed7450da57422303e48880897/example_wise_f1_maximizer-0.1.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-16 11:15:57",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mrapp-ke",
    "github_project": "ExampleWiseF1Maximizer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "example-wise-f1-maximizer"
}
        