xcolumns


- Name: xcolumns
- Version: 0.0.2
- Summary: A small library for Consistent Optimization of Label-wise Utilities in Multi-label classificatioN
- Upload time: 2024-04-06 23:34:22
- Requires Python: >=3.8
- License: MIT License
- Keywords: machine learning, multi-label classification, performance metrics, label-wise utilities, classification metrics, macro-measures, optimization

[![PyPI version](https://badge.fury.io/py/xcolumns.svg)](https://badge.fury.io/py/xcolumns)
[![Documentation Status](https://readthedocs.org/projects/xcolumns/badge/?version=latest)](https://xcolumns.readthedocs.io/en/latest/?badge=latest)
[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://pre-commit.com/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)


<p align="center">
  <img src="https://raw.githubusercontent.com/mwydmuch/xCOLUMNs/master/docs/_static/xCOLUMNs_logo.png" width="500px"/>
</p>

# x **Consistent Optimization of Label-wise Utilities in Multi-label classificatioN** s

xCOLUMNs is a small Python library that implements methods for optimizing a general family of
metrics that can be defined on multi-label classification matrices.
These include, but are not limited to, label-wise metrics.
The library provides efficient implementations of these optimization methods that scale to extreme multi-label classification (XMLC), that is, problems with a very large number of labels and instances.

All the methods operate on conditional probability estimates of the labels, which are the output of the multi-label classification models.
Based on these estimates, the methods aim to find the optimal prediction for a given test set or to find the optimal population classifier as a plug-in rule on top of the conditional probability estimator.
This makes the library flexible: it can be used with any multi-label classification model that provides conditional probability estimates.
The library directly supports numpy arrays, PyTorch tensors, and sparse CSR matrices from scipy as input/output data types.

For more details, please see our short usage guide, the documentation, and/or the papers that describe the methods implemented in the library.


## Quick start

### Installation

The library can be installed using pip:
```sh
pip install xcolumns
```
It should work on all major platforms (Linux, macOS, Windows) and with Python 3.8+.


### Usage

We provide a short usage guide for the library in the [short_usage_guide.ipynb](https://github.com/mwydmuch/xCOLUMNs/blob/master/short_usage_guide.ipynb) notebook.
You can also check the documentation for more details.


## Methods, usage, and how to cite

The library implements the following methods:

### Instance-wise weighted prediction

The library implements a set of methods for instance-wise weighted prediction, which include optimal prediction strategies for metrics such as:
- Precision at k
- Propensity-scored precision at k
- Macro-averaged recall at k
- Macro-averaged balanced accuracy at k
- and others ...
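
To illustrate the idea behind instance-wise weighted prediction, here is a minimal NumPy sketch. It is not the xCOLUMNs API: the function name `predict_weighted_top_k` and the parameters `a` and `b` are hypothetical, and the label priors are made-up numbers. The sketch assumes each label `j` gets a score `a[j] * eta[i, j] + b[j]` and that the `k` highest-scoring labels are predicted for every instance. With `a = 1` and `b = 0` this reduces to plain top-k selection, which maximizes expected precision at k; weighting by inverse label priors is one way to favor rare labels, as in macro-averaged recall at k.

```python
import numpy as np


def predict_weighted_top_k(eta, k, a=None, b=None):
    """Hypothetical helper: pick the k labels with the highest a * eta + b per instance.

    eta: (n_instances, n_labels) array of conditional probability estimates.
    a, b: optional per-label weights and offsets (broadcast over instances).
    """
    scores = np.asarray(eta, dtype=float)
    if a is not None:
        scores = scores * a
    if b is not None:
        scores = scores + b
    # Indices of the k largest scores in each row (order within the k is arbitrary).
    top = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    pred = np.zeros(scores.shape, dtype=np.int8)
    np.put_along_axis(pred, top, 1, axis=1)
    return pred


eta = np.array([
    [0.9, 0.5, 0.1, 0.05],
    [0.2, 0.6, 0.3, 0.25],
])

# Unweighted top-k: the optimal strategy for precision at k.
pred_precision = predict_weighted_top_k(eta, k=2)

# Inverse-prior weights (made-up priors) push the prediction towards rare labels.
priors = np.array([0.5, 0.4, 0.05, 0.02])
pred_macro_recall = predict_weighted_top_k(eta, k=2, a=1.0 / priors)
```

In this toy example the unweighted rule picks the most probable labels, while the inverse-prior weights shift both instances towards the two rarest labels.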

### Optimization of prediction for a given test set using Block Coordinate Ascent/Descent (BCA/BCD)

The method aims to optimize the prediction for a given test set using the block coordinate ascent/descent algorithm.
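
The following is a rough sketch of block coordinate ascent for one concrete case, macro-averaged F1 at k. It is an illustration under stated assumptions, not the library's implementation: all function names are hypothetical, the objective is approximated by evaluating F1 on expected confusion-matrix entries computed from the probability estimates `eta`, and a small constant `EPS` is added to avoid division by zero. Each block is the prediction of one instance; with all other instances fixed, the best update is to pick the `k` labels with the largest gain in the objective.

```python
import numpy as np

EPS = 1e-9  # assumption: small constant that keeps the F1 denominator positive


def f1_per_label(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn + EPS)


def expected_counts(pred, eta):
    """Expected true positives, false positives, and false negatives per label."""
    tp = (pred * eta).sum(axis=0)
    fp = (pred * (1 - eta)).sum(axis=0)
    fn = ((1 - pred) * eta).sum(axis=0)
    return tp, fp, fn


def macro_f1(pred, eta):
    return f1_per_label(*expected_counts(pred, eta)).mean()


def top_k(scores, k):
    out = np.zeros_like(scores)
    out[np.argpartition(-scores, k - 1)[:k]] = 1.0
    return out


def bca_macro_f1_at_k(eta, k, n_epochs=3, seed=0):
    rng = np.random.default_rng(seed)
    pred = np.stack([top_k(row, k) for row in eta])  # start from plain top-k
    tp, fp, fn = expected_counts(pred, eta)
    for _ in range(n_epochs):
        for i in rng.permutation(len(eta)):
            e, p = eta[i], pred[i]
            # Remove the contribution of instance i from the running counts.
            tp -= p * e
            fp -= p * (1 - e)
            fn -= (1 - p) * e
            # Per-label change in F1 when predicting the label versus not predicting it.
            gain = f1_per_label(tp + e, fp + (1 - e), fn) - f1_per_label(tp, fp, fn + e)
            p = top_k(gain, k)
            pred[i] = p
            # Add the updated contribution back.
            tp += p * e
            fp += p * (1 - e)
            fn += (1 - p) * e
    return pred


# Synthetic probability estimates, skewed towards small values.
eta = np.random.default_rng(0).random((200, 20)) ** 3
init = np.stack([top_k(row, 3) for row in eta])
pred = bca_macro_f1_at_k(eta, k=3)
```

Because each update is optimal for its block while the rest is held fixed, the approximate objective never decreases, so `macro_f1(pred, eta)` should be at least as high as `macro_f1(init, eta)` up to floating-point error.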

The method was first introduced and described in the paper:
> [Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)

### Finding optimal population classifier via Frank-Wolfe (FW)

The method was first introduced and described in the paper:
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)


## Repository structure

The repository is organized as follows:
- `docs/` - Sphinx documentation (work in progress)
- `experiments/` - code for reproducing the experiments from the papers; see the README.md file in that directory for details
- `xcolumns/` - Python package with the library
- `tests/` - tests for the library (coverage is still limited, but these tests should ensure that the main components of the library work as expected)


## Development and contributing

The library was created as a part of our research projects.
We are happy to share it with the community and we hope that someone will find it useful.
If you have any questions or suggestions, or if you find a bug, please open an issue.
We are also happy to accept contributions in the form of pull requests.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "xcolumns",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "machine learning, multi-label classification, performance metrics, label-wise utilities, classification metrics, macro-measures, optimization",
    "author": null,
    "author_email": "\"Marek Wydmuch, Erik Schultheis, Wojciech Kot\u0142owski, Rohit Babbar, Krzysztof Dembczy\u0144ski\" <mwydmuch@cs.put.poznan.pl>",
    "download_url": "https://files.pythonhosted.org/packages/cf/01/69c473153bbcd3aaae375b8e57b0035afe3b17e1ba65462faff5377a2304/xcolumns-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A small library for Consistent Optimization of Label-wise Utilities in Multi-label clasifficatioN",
    "version": "0.0.2",
    "project_urls": {
        "Bug Report": "https://github.com/mwydmuch/xcolumns/issues",
        "Documentation": "https://github.com/mwydmuch/xcolumns",
        "Homepage": "https://github.com/mwydmuch/xcolumns",
        "Repository": "https://github.com/mwydmuch/xcolumns"
    },
    "split_keywords": [
        "machine learning",
        " multi-label classification",
        " performance metrics",
        " label-wise utilities",
        " classification metrics",
        " macro-measures",
        " optimization"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cf0169c473153bbcd3aaae375b8e57b0035afe3b17e1ba65462faff5377a2304",
                "md5": "dc181187b68eabe81d1abf8acdb5dd58",
                "sha256": "52985ba3ac946a044043f69aa83068e7ae212871b5eaaae56124c35d2b7e9281"
            },
            "downloads": -1,
            "filename": "xcolumns-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "dc181187b68eabe81d1abf8acdb5dd58",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 31560,
            "upload_time": "2024-04-06T23:34:22",
            "upload_time_iso_8601": "2024-04-06T23:34:22.688806Z",
            "url": "https://files.pythonhosted.org/packages/cf/01/69c473153bbcd3aaae375b8e57b0035afe3b17e1ba65462faff5377a2304/xcolumns-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-06 23:34:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mwydmuch",
    "github_project": "xcolumns",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "xcolumns"
}