rbo

- Name: rbo
- Version: 0.1.3
- Home page: https://github.com/changyaochen/rbo
- Summary: Simple library to calculate Rank-biased Overlap between two lists
- Upload time: 2023-01-31 02:59:25
- Author: Changyao Chen
- Requires Python: >=3.7,<4.0
- License: MIT
- Keywords: rbo

# Rank-biased Overlap (RBO)
[![CircleCI](https://circleci.com/gh/changyaochen/rbo/tree/master.svg?style=svg)](https://circleci.com/gh/changyaochen/rbo/tree/master)
[![PyPI version](https://badge.fury.io/py/rbo.svg)](https://badge.fury.io/py/rbo)

This project contains a Python implementation of Rank-Biased Overlap (RBO) from: Webber, William, Alistair Moffat, and Justin Zobel. "A similarity measure for indefinite rankings." ACM Transactions on Information Systems (TOIS) 28.4 (2010): 20 ([Download][paper]).

- [Rank-biased Overlap (RBO)](#rank-biased-overlap-rbo)
  - [Introduction](#introduction)
  - [Usage](#usage)
    - [Installation using pip](#installation-using-pip)
    - [Computing RBO](#computing-rbo)
    - [Computing extrapolated RBO](#computing-extrapolated-rbo)
- [Development](#development)

## Introduction

> For a more general introduction, please refer to this blog [post](https://changyaochen.github.io/Comparing-two-ranked-lists/).

RBO compares two ranked lists and returns a value between zero and one that quantifies their similarity.
An RBO of zero indicates that the lists are completely different, and an RBO of one means that they are identical. The terms 'different' and 'identical' require a little more clarification.

Given two ranked lists:

    A = ["a", "b", "c", "d", "e"]
    B = ["e", "d", "c", "b", "a"]

We can see that both of them rank the same 5 items ("a", "b", "c", "d" and "e"), but in completely opposite order. In this case the similarity between `A` and `B` should be larger than 0 (as they contain the same items, i.e., they are conjoint), but smaller than 1 (as the order of the items is different). If there is a third ranked list

    C = ["f", "g", "h", "i", "j"]

which ranks 5 totally different items, then the similarity between `A` and `C` should be 0. The measure needs to handle such a non-conjoint case as well.
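To make these expectations concrete, here is a minimal sketch that measures the agreement between two lists at each depth and then takes the plain average. The helper names `agreement_by_depth` and `average_overlap` are hypothetical and are not part of the `rbo` package. The unweighted average is only the simplest way to combine the per-depth agreements, and the sketch assumes that no item is repeated within a list.

```python
def agreement_by_depth(first, second):
    """For each depth d, return the fraction of items shared by the two top-d prefixes."""
    depth = min(len(first), len(second))
    seen_first, seen_second = set(), set()
    agreements = []
    for d in range(1, depth + 1):
        seen_first.add(first[d - 1])
        seen_second.add(second[d - 1])
        # Size of the overlap between the two prefixes, normalised by the depth.
        agreements.append(len(seen_first & seen_second) / d)
    return agreements


def average_overlap(first, second):
    """Unweighted mean of the per-depth agreements (0.0 for empty input)."""
    agreements = agreement_by_depth(first, second)
    return sum(agreements) / len(agreements) if agreements else 0.0


A = ["a", "b", "c", "d", "e"]
B = ["e", "d", "c", "b", "a"]
C = ["f", "g", "h", "i", "j"]

print(agreement_by_depth(A, B))  # agreement rises from 0 at depth 1 to 1 at depth 5
print(average_overlap(A, B))     # 5/12, strictly between 0 and 1
print(average_overlap(A, C))     # 0.0, no shared items
print(average_overlap(A, A))     # 1.0, identical lists
```

Because the reversed list `B` only starts sharing items with `A` at depth 3, the early depths pull the score down even though both lists contain the same items.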

The RBO measure can also handle ranked lists of different lengths, with proper extrapolation. For example, the RBO between the list `A` and the list

    D = ["a", "b", "c", "d", "e", "f", "g"]

will be 1.


## Usage

### Installation using pip

To install the `rbo` package into the current Python environment with pip:

    pip install rbo


### Computing RBO

The `RankingSimilarity` class implements the different flavours of RBO, with clear references to the corresponding equations in the paper.
The example below computes the similarity of two ranked lists `S` and `T`:

```python
In [1]: import rbo

In [2]: S = [1, 2, 3]

In [3]: T = [1, 3, 2]

In [4]: rbo.RankingSimilarity(S, T).rbo()
Out[4]: 0.8333333333333334
```
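The printed value can be checked by hand. It is consistent with a plain average of the agreement at each depth, which assumes that the default call weights all depths equally (an assumption here, since the text above does not state it). Writing $A_d$ for the fraction of items shared by the top-$d$ prefixes of `S` and `T`:

$$
A_1 = \frac{|\{1\} \cap \{1\}|}{1} = 1, \qquad
A_2 = \frac{|\{1, 2\} \cap \{1, 3\}|}{2} = \frac{1}{2}, \qquad
A_3 = \frac{|\{1, 2, 3\} \cap \{1, 3, 2\}|}{3} = 1
$$

$$
\frac{A_1 + A_2 + A_3}{3} = \frac{1 + \tfrac{1}{2} + 1}{3} = \frac{5}{6} \approx 0.8333
$$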

Accepted data types are Python lists and NumPy arrays.
A pandas Series can be used by passing its underlying NumPy array, as shown below. This restriction is necessary because using `[]` on a pandas Series queries the index, which might not number items contiguously, or might even be non-numeric.

```python
In [1]: import pandas as pd

In [2]: import rbo

In [3]: S = [1, 2, 3]

In [4]: U = pd.Series([1, 3, 2])

In [5]: rbo.RankingSimilarity(S, U.values).rbo()
Out[5]: 0.8333333333333334
```

### Computing extrapolated RBO
There is an extension of the vanilla RBO, in which we extrapolate from the visible lists by assuming that the degree of agreement seen up to depth $k$ continues indefinitely.

This extrapolated version is implemented as the `RankingSimilarity.rbo_ext()` method.
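As an illustration of the idea, here is a minimal from-scratch sketch of the extrapolated formula for lists of possibly different lengths, following the definition in the Webber et al. paper. It is not the library's code: the name `rbo_ext_sketch` is hypothetical, the persistence parameter `p` (with an arbitrary value of 0.9 here) comes from the paper rather than from the text above, and the sketch assumes that no item is repeated within a list.

```python
def rbo_ext_sketch(first, second, p=0.9):
    """Extrapolated RBO for two ranked lists without repeated items (illustrative only)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    # Name the lists by length: s is the shorter length, l the longer one.
    short, long_ = sorted((list(first), list(second)), key=len)
    s, l = len(short), len(long_)
    if s == 0:
        raise ValueError("both lists must be non-empty")

    # overlaps[d - 1] is the number of items shared by the two prefixes of depth d.
    # Beyond depth s, the shorter list contributes all of its items.
    seen_short, seen_long = set(), set()
    overlaps = []
    for d in range(l):
        if d < s:
            seen_short.add(short[d])
        seen_long.add(long_[d])
        overlaps.append(len(seen_short & seen_long))
    x_s, x_l = overlaps[s - 1], overlaps[l - 1]

    # Agreement actually observed at depths 1..l.
    observed = sum(overlaps[d - 1] / d * p**d for d in range(1, l + 1))
    # Agreement assumed for the unseen part of the shorter list at depths s+1..l.
    assumed = sum(x_s * (d - s) / (s * d) * p**d for d in range(s + 1, l + 1))
    # Agreement at the final depth, extended to all deeper ranks.
    tail = ((x_l - x_s) / l + x_s / s) * p**l
    return (1 - p) / p * (observed + assumed) + tail


A = ["a", "b", "c", "d", "e"]
B = ["e", "d", "c", "b", "a"]
C = ["f", "g", "h", "i", "j"]
D = ["a", "b", "c", "d", "e", "f", "g"]

print(rbo_ext_sketch(A, D))  # 1 up to floating-point rounding: A is a prefix of D
print(rbo_ext_sketch(A, C))  # 0.0: no shared items
print(rbo_ext_sketch(A, B))  # strictly between 0 and 1
```

With the lists from the Introduction, the sketch reproduces the statement that the RBO between `A` and `D` is 1: the two extra items in `D` are matched against the assumed continuation of `A`, so they do not lower the score.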


# Development

Refer to the Makefile for supplementary development tasks, e.g., running unit tests or checking for proper packaging.
Please let [me][contact] know if you run into any issues.

[contact]: mailto:changyao.chen@gmail.com
[paper]: http://w.codalism.com/research/papers/wmz10_tois.pdf

            
