sortedness


| Field | Value |
|---|---|
| Name | sortedness |
| Version | 2.231027.1 |
| Summary | Measures of projection quality |
| Upload time | 2023-10-28 00:16:57 |
| Author | davips |
| Requires Python | >=3.10,<3.13 |
| License | GPLv3 |
| Requirements | No requirements were recorded. |

![test](https://github.com/sortedness/sortedness/workflows/test/badge.svg)
[![codecov](https://codecov.io/gh/sortedness/sortedness/branch/main/graph/badge.svg)](https://codecov.io/gh/sortedness/sortedness)
<a href="https://pypi.org/project/sortedness">
<img src="https://img.shields.io/github/v/release/sortedness/sortedness?display_name=tag&sort=semver&color=blue" alt="github">
</a>
![Python version](https://img.shields.io/badge/python-3.10+-blue.svg)
[![license: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)

<!-- [![arXiv](https://img.shields.io/badge/arXiv-2109.06028-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2109.06028) --->
[![API documentation](https://img.shields.io/badge/doc-API%20%28auto%29-a0a0a0.svg)](https://sortedness.github.io/sortedness)
[![DOI](https://zenodo.org/badge/513273889.svg)](https://zenodo.org/badge/latestdoi/513273889)
[![Downloads](https://static.pepy.tech/badge/sortedness)](https://pepy.tech/project/sortedness)
![PyPI - Downloads](https://img.shields.io/pypi/dm/sortedness)


# sortedness

`sortedness` is the level of agreement between two points regarding how they rank all remaining points in a dataset.
This is valid even for points from different spaces, which enables measuring the quality of data transformation processes, most often dimensionality reduction.
It is less sensitive to irrelevant distortions, and returns values in a more meaningful interval, than Kruskal stress formula I.
<br>This [Python library](https://pypi.org/project/sortedness) / [code](https://github.com/sortedness/sortedness) provides a reference implementation of the functions presented in [this paper](https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=Nonparametric+Dimensionality+Reduction+Quality+Assessment+based+on+Sortedness+of+Unrestricted+Neighborhood&btnG=) (see the Reference section below for the citation).
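
To make the definition concrete, here is a minimal sketch in plain Python. It is not the library's implementation: `kendall_tau` and `local_rank_agreement` are hypothetical helper names, and an unweighted Kendall tau (without tie correction) stands in for whatever rank correlation and weighting the library actually applies. For each point, the sketch compares the order of the distances to all other points in the two spaces.

```python
from itertools import combinations
from math import dist


def kendall_tau(a, b):
    """Unweighted Kendall tau-a between two equally long sequences (assumption: stand-in measure)."""
    pairs = list(combinations(range(len(a)), 2))
    score = 0
    for i, j in pairs:
        product = (a[i] - a[j]) * (b[i] - b[j])
        score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / len(pairs)


def local_rank_agreement(X, X_):
    """One value per point: agreement of the distance orderings seen from that point in X and in X_."""
    result = []
    for i in range(len(X)):
        d = [dist(X[i], X[j]) for j in range(len(X)) if j != i]
        d_ = [dist(X_[i], X_[j]) for j in range(len(X_)) if j != i]
        result.append(kendall_tau(d, d_))
    return result


original = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
scaled = [(2 * x, 2 * y) for x, y in original]  # uniform scaling keeps every distance order
print(local_rank_agreement(original, scaled))  # → [1.0, 1.0, 1.0, 1.0]
```

Uniform scaling preserves every distance ordering, so each point scores 1.0. Because distances are only computed within each space, `X` and `X_` may have different numbers of dimensions.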

## Overview
Local variants return a value for each provided point. The global variant returns a single value for all points.
Any local variant can be used as a global measure by taking the mean value.

Local variants: `sortedness(X, X_)`, `pwsortedness(X, X_)`, `rsortedness(X, X_)`.

Global variant: `global_pwsortedness(X, X_)`.
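
The step from local values to a single global score can be sketched as follows. The per-point scores here are made-up placeholders standing in for the output of a local variant, and `summarize` is a hypothetical helper, not part of the library.

```python
from statistics import fmean


def summarize(local_values):
    """Return (min, mean, max) of per-point scores; the mean serves as a global score."""
    values = list(local_values)
    return min(values), fmean(values), max(values)


# Hypothetical per-point scores, standing in for the output of a local variant.
local_values = [0.5, 0.75, 1.0, 0.25]
lowest, global_score, highest = summarize(local_values)
print(lowest, global_score, highest)  # → 0.25 0.625 1.0
```

Reporting the minimum and maximum next to the mean shows how unevenly the quality is spread across points, which a single global number hides.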

## Python installation
### from package through pip
```bash
# Set up a virtualenv. 
python3 -m venv venv
source venv/bin/activate

# Install from PyPI
pip install -U sortedness
```

### from source
```bash
git clone https://github.com/sortedness/sortedness
cd sortedness
poetry install
```
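
After either installation route, a quick sanity check along these lines may help. It is only a sketch: `python_is_supported` and `installed_version` are hypothetical helper names, and the version bounds are taken from the package metadata (`>=3.10,<3.13`).

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter satisfies the declared range >=3.10,<3.13."""
    return (3, 10) <= tuple(version_info[:2]) < (3, 13)


def installed_version(package="sortedness"):
    """Return the installed version string, or None when the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("Python supported:", python_is_supported())
print("sortedness version:", installed_version())
```

If the second line prints `None`, the package is not visible from the active environment, which usually means the virtualenv was not activated.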


## Examples

**Sortedness**
<details>
<p>

```python3

import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA

from sortedness import sortedness

# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)

# Print `min`, `mean`, and `max` values.
s = sortedness(original, original)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
```

```python3

s = sortedness(original, projected2)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
```

```python3

s = sortedness(original, projected1)
print(min(s), sum(s) / len(s), max(s))
"""
0.3934632246658146 0.7565797804350681 0.944810120533741
"""
```

```python3

s = sortedness(original, projectedrnd)
print(min(s), sum(s) / len(s), max(s))
"""
-0.6483054795666044 -0.09539895194976367 0.3970195075915949
"""
```

```python3

# Single point fast calculation.
s = sortedness(original, projectedrnd, 2)
print(s)
"""
0.23107954749077175
"""
```


</p>
</details>

**Pairwise sortedness**
<details>
<p>

```python3

import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA

from sortedness import pwsortedness

# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)

# Print `min`, `mean`, and `max` values.
s = pwsortedness(original, original)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
```

```python3

s = pwsortedness(original, projected2)
print(min(s), sum(s) / len(s), max(s))
"""
1.0 1.0 1.0
"""
```

```python3

s = pwsortedness(original, projected1)
print(min(s), sum(s) / len(s), max(s))
"""
0.649315577592 0.7534291438324999 0.834601601062
"""
```

```python3

s = pwsortedness(original, projectedrnd)
print(min(s), sum(s) / len(s), max(s))
"""
-0.168611098044 -0.07988253899783333 0.14442446342
"""
```

```python3

# Single point fast calculation.
s = pwsortedness(original, projectedrnd, 2)
print(s)
"""
0.036119718802
"""
```


</p>
</details>

**Global pairwise sortedness**
<details>
<p>

```python3

import numpy as np
from numpy.random import permutation
from sklearn.decomposition import PCA

from sortedness import global_pwsortedness

# Some synthetic data.
mean = (1, 2)
cov = np.eye(2)
rng = np.random.default_rng(seed=0)
original = rng.multivariate_normal(mean, cov, size=12)
projected2 = PCA(n_components=2).fit_transform(original)
projected1 = PCA(n_components=1).fit_transform(original)
np.random.seed(0)
projectedrnd = permutation(original)

# Print measurement result and p-value.
s = global_pwsortedness(original, original)
print(list(s))
"""
[1.0, 3.6741408919675163e-93]
"""
```

```python3

s = global_pwsortedness(original, projected2)
print(list(s))
"""
[1.0, 3.6741408919675163e-93]
"""
```

```python3

s = global_pwsortedness(original, projected1)
print(list(s))
"""
[0.7715617715617715, 5.240847664048334e-20]
"""
```

```python3

s = global_pwsortedness(original, projectedrnd)
print(list(s))
"""
[-0.06107226107226107, 0.46847188611226276]
"""
```


</p>
</details>
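
As a rough illustration of what a global pairwise measure computes, the sketch below correlates the full list of pairwise distances in one space with the corresponding list in the other. This is an assumption about the idea, not the library's code: `pairwise_distances`, `kendall_tau`, and `global_rank_agreement` are hypothetical names, the tau is unweighted and has no tie correction, and the p-value shown in the examples above is omitted.

```python
from itertools import combinations
from math import dist


def pairwise_distances(X):
    """Distances between all unordered pairs of points, in a fixed pair order."""
    return [dist(a, b) for a, b in combinations(X, 2)]


def kendall_tau(a, b):
    """Unweighted Kendall tau-a between two equally long sequences (assumption: stand-in measure)."""
    pairs = list(combinations(range(len(a)), 2))
    score = 0
    for i, j in pairs:
        product = (a[i] - a[j]) * (b[i] - b[j])
        score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant, 0 tie
    return score / len(pairs)


def global_rank_agreement(X, X_):
    """Single value: agreement between the orderings of all pairwise distances in X and X_."""
    return kendall_tau(pairwise_distances(X), pairwise_distances(X_))


original = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
on_x_axis = [(x,) for x, _ in original]  # drop the second coordinate

print(global_rank_agreement(original, original))  # → 1.0
print(global_rank_agreement(original, on_x_axis))  # lower: the projection distorts some distance orders
```

Unlike the local variants, this yields one number for the whole dataset rather than one per point.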


**Copyright (c) 2023. Davi Pereira dos Santos and Tacito Neves**


## TODO
Future work will address handling large datasets: approximating the sortedness value and adopting a size-insensitive weighting scheme.

## Reference
Please use the following reference to cite this work:
```bibtex
@inproceedings {10.2312:eurova.20231093,
booktitle = {EuroVis Workshop on Visual Analytics (EuroVA)},
editor = {Angelini, Marco and El-Assady, Mennatallah},
title = {{Nonparametric Dimensionality Reduction Quality Assessment based on Sortedness of Unrestricted Neighborhood}},
author = {Pereira-Santos, Davi and Neves, Tácito Trindade Araújo Tiburtino and Carvalho, André C. P. L. F. de and Paulovich, Fernando V.},
year = {2023},
publisher = {The Eurographics Association},
ISSN = {2664-4487},
ISBN = {978-3-03868-222-6},
DOI = {10.2312/eurova.20231093}
}
```

## Grants
This work was supported by Wellcome Leap 1kD Program;
São Paulo Research Foundation (FAPESP) - grant 2020/09835-1;
Canadian Institute for Health Research (CIHR) Canadian Research Chairs (CRC) stipend [award number 1024586];
Canadian Foundation for Innovation (CFI) John R. Evans Leaders Fund (JELF) [grant number 38835];
Dalhousie Medical Research Fund (DMRF) COVID-19 Research Grant [grant number 603082];
and the Canadian Institute for Health Research (CIHR) Project Grant [award number 177968].

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "sortedness",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10,<3.13",
    "maintainer_email": "",
    "keywords": "",
    "author": "davips",
    "author_email": "dpsabc@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/5b/d5/c983cf3bb735576c93ec9ce3191ae50462418cab943a3fd01b03a12c04a6/sortedness-2.231027.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPLv3",
    "summary": "Measures of projection quality",
    "version": "2.231027.1",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "379d66c59ef58946b12196fd98614f6915e7c8a179efa7771f8a92d837c281f4",
                "md5": "4a965a56abd0b4688904fdc188d3b3e6",
                "sha256": "6135718c18e8cc590f5b5b076383df7b0da8ff78d70061becb94d32024f99dc1"
            },
            "downloads": -1,
            "filename": "sortedness-2.231027.1-cp310-cp310-manylinux_2_35_x86_64.whl",
            "has_sig": false,
            "md5_digest": "4a965a56abd0b4688904fdc188d3b3e6",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": ">=3.10,<3.13",
            "size": 745692,
            "upload_time": "2023-10-28T00:16:54",
            "upload_time_iso_8601": "2023-10-28T00:16:54.410802Z",
            "url": "https://files.pythonhosted.org/packages/37/9d/66c59ef58946b12196fd98614f6915e7c8a179efa7771f8a92d837c281f4/sortedness-2.231027.1-cp310-cp310-manylinux_2_35_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5bd5c983cf3bb735576c93ec9ce3191ae50462418cab943a3fd01b03a12c04a6",
                "md5": "f5c7fd0980223df874f3a1c4c8978aec",
                "sha256": "c002a995ae4c70dbad84b2bed98d758da31e13aaa02b07011b46321d60ff3372"
            },
            "downloads": -1,
            "filename": "sortedness-2.231027.1.tar.gz",
            "has_sig": false,
            "md5_digest": "f5c7fd0980223df874f3a1c4c8978aec",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10,<3.13",
            "size": 732969,
            "upload_time": "2023-10-28T00:16:57",
            "upload_time_iso_8601": "2023-10-28T00:16:57.079052Z",
            "url": "https://files.pythonhosted.org/packages/5b/d5/c983cf3bb735576c93ec9ce3191ae50462418cab943a3fd01b03a12c04a6/sortedness-2.231027.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-28 00:16:57",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "sortedness"
}
        