diffusion-curvature


| **Field**       | **Value**                                           |
|-----------------|-----------------------------------------------------|
| Name            | diffusion-curvature                                 |
| Version         | 0.0.3                                               |
| Home page       | https://github.com/professorwug/diffusion_curvature |
| Summary         | Fast, pointwise graph curvature                     |
| Upload time     | 2023-08-07 23:48:09                                 |
| Author          | Kincaid                                             |
| Requires Python | >=3.7                                               |
| License         | Apache Software License 2.0                         |
| Keywords        | nbdev, jupyter, notebook, python                    |
| Requirements    | No requirements were recorded.                      |

# Diffusion Curvature

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

> [!INFO] This code is currently in *early beta*. Some features,
> particularly those relating to dimension estimation and the
> construction of comparison spaces, are experimental and will likely
> change. Please report any issues you encounter on the GitHub Issues
> page.

Diffusion curvature is a pointwise extension of Ollivier-Ricci
curvature, designed specifically for the often messy world of pointcloud
data. Its advantages include:

1.  Unaffected by density fluctuations in data: it inherits the
    diffusion operator’s denoising properties.
2.  Fast, and scalable to millions of points: it depends only on matrix
    powering - no optimal transport required.
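
To make "depends only on matrix powering" concrete, here is a minimal NumPy
sketch. It is not the package's implementation: it builds a Gaussian kernel on
a small pointcloud, row-normalizes it into a Markov diffusion matrix, and
raises that matrix to a power. The helper name `diffusion_matrix` and the
bandwidth `sigma` are assumptions made purely for illustration.

``` python
import numpy as np

def diffusion_matrix(X, sigma=0.5):
    """Row-stochastic diffusion matrix from a Gaussian kernel (illustrative only)."""
    # Pairwise squared Euclidean distances between all points.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dists / (2 * sigma**2))
    # Normalize each row so that it is a probability distribution.
    return K / K.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
P = diffusion_matrix(X)
# t-step diffusion: a single matrix power, with no optimal transport solver.
P_t = np.linalg.matrix_power(P, 8)
```

Each row of `P_t` is the distribution of a random walker after 8 steps,
started at the corresponding point.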

## Install

<!-- To install with conda (or better yet, mamba),
```sh
conda install diffusion-curvature -c riddlelabs
``` -->

To install with pip (or better yet, poetry),

``` sh
pip install diffusion-curvature
```

or

``` sh
poetry add diffusion-curvature
```

Conda releases are pending.

## Usage

To compute diffusion curvature, first create a `graphtools` graph with
your data. Graphtools offers extensive support for different kernel
types (if creating from a pointcloud), and can also work with graphs in
the `PyGSP` format. We recommend using `anisotropy=1`, and verifying that
the supplied `knn` value encompasses a reasonable portion of the graph.

``` python
from diffusion_curvature.datasets import torus
import graphtools
X_torus, torus_gaussian_curvature = torus(n=5000)
G_torus = graphtools.Graph(X_torus, anisotropy=1, knn=30)
```
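
The `anisotropy=1` recommendation corresponds to the density normalization
from diffusion maps, in which each kernel entry is divided by a power of the
degrees of its two endpoints. The sketch below is a plain NumPy illustration
under that assumption. `anisotropic_kernel` is a hypothetical helper, not a
graphtools function, and graphtools' internals may differ in detail.

``` python
import numpy as np

def anisotropic_kernel(K, alpha=1.0):
    """Divide K[i, j] by (d_i * d_j) ** alpha, where d holds the row sums of K."""
    d = K.sum(axis=1)
    return K / np.outer(d, d) ** alpha

# Two clusters of very different density along a line.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.1, 150), rng.normal(2, 0.1, 15)])
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.5)

K_aniso = anisotropic_kernel(K, alpha=1.0)
P = K_aniso / K_aniso.sum(axis=1, keepdims=True)  # diffusion matrix
```

With `alpha=0` the kernel is unchanged; with `alpha=1` the degree, which acts
as a density estimate, is divided out before row normalization.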

Graphtools offers many additional options. For large graphs, you can
speed up the powering of the diffusion matrix with landmarking: simply
pass, for example, `n_landmark=1000` when creating the graphtools graph.
If you enable landmarking, `diffusion-curvature` will automatically use it.

Next, instantiate a
[`DiffusionCurvature`](https://professorwug.github.io/diffusion_curvature/core%20(graphtools).html#diffusioncurvature)
operator.

``` python
from diffusion_curvature.graphtools import DiffusionCurvature
DC = DiffusionCurvature(t=12)
```

------------------------------------------------------------------------

<a
href="https://github.com/professorwug/diffusion_curvature/blob/main/diffusion_curvature/graphtools.py"
target="_blank" style="float:right; font-size:smaller">source</a>

### DiffusionCurvature

>      DiffusionCurvature (t:int, distance_type='PHATE', dimest=None,
>                          use_entropy:bool=False, **kwargs)


|               | **Type** | **Default** | **Details**                                                                                                       |
|---------------|----------|-------------|-------------------------------------------------------------------------------------------------------------------|
| t             | int      |             | Number of diffusion steps to use when measuring curvature. TODO: Heuristics                                       |
| distance_type | str      | PHATE       |                                                                                                                   |
| dimest        | NoneType | None        | Dimensionality estimator to use. If None, defaults to KNN with default params                                     |
| use_entropy   | bool     | False       | If true, uses KL Divergence instead of Wasserstein Distances. Faster, seems empirically as good, but less proven. |
| kwargs        |          |             |                                                                                                                   |

Finally, pass your graph through it. The
[`DiffusionCurvature`](https://professorwug.github.io/diffusion_curvature/core%20(graphtools).html#diffusioncurvature)
operator stores everything it computes (the powered diffusion
matrix, the estimated manifold distances, and the curvatures) as
attributes of your graph. The curvatures are available as the graph's
`ks` attribute, e.g. `G_torus.ks` below.

``` python
G_torus = DC.curvature(G_torus, dimension=2) # note: this is the intrinsic dimension of the data
```

``` python
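# plot_3d is a 3D scatter-plot helper that this README never imports;
# import or define it before running this cell.
plot_3d(X_torus, G_torus.ks, colorbar=True, title="Diffusion Curvature on the torus")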
```

![](index_files/figure-commonmark/cell-6-output-1.png)

## Using on a predefined graph

If you have an adjacency matrix but no pointcloud, diffusion curvature
may still be useful. The current caveat is that our intrinsic dimension
estimation doesn’t yet support graphs, so if you want a signed curvature
value you’ll have to compute and provide the dimension yourself.

If you’re only comparing relative magnitudes of curvature, you can skip
this step.

For predefined graphs, we use our own
[`ManifoldGraph`](https://professorwug.github.io/diffusion_curvature/core%20(manifoldgraph).html#manifoldgraph)
class. You can create one straight from an adjacency matrix:

``` python
from diffusion_curvature.manifold_graph import ManifoldGraph, diffusion_curvature, diffusion_entropy_curvature, entropy_of_diffusion, wasserstein_spread_of_diffusion, power_diffusion_matrix, phate_distances, flattened_facsimile_of_graph
```

``` python
# pretend we've computed the adjacency matrix elsewhere in the code
A = G_torus.K.toarray()
# initialize the manifold graph; input your computed dimension along with the adjacency matrix
G_pure = ManifoldGraph(A, dimension=2)
```
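
For intuition about what is being diffused on a predefined graph, the sketch
below turns an adjacency matrix into a random-walk matrix by row
normalization. It is an illustration in plain NumPy built on assumptions: the
`random_walk_matrix` helper and its `self_loop` weight are hypothetical, and
nothing here reproduces how `ManifoldGraph` builds its operator.

``` python
import numpy as np

def random_walk_matrix(A, self_loop=1.0):
    """Row-normalize an adjacency matrix (plus self-loops) into a Markov matrix."""
    A = np.asarray(A, dtype=float) + self_loop * np.eye(len(A))
    return A / A.sum(axis=1, keepdims=True)

# Adjacency matrix of a ring with 10 nodes: each node touches its two neighbors.
n = 10
A = np.zeros((n, n))
idx = np.arange(n)
A[idx, (idx + 1) % n] = 1
A[idx, (idx - 1) % n] = 1

P = random_walk_matrix(A)
P_3 = np.linalg.matrix_power(P, 3)  # where a walker can be after 3 steps
```

After three steps a walker started at node 0 can reach node 3 but not node 5,
so `P_3[0, 5]` is exactly zero: the powered matrix encodes how far diffusion
has spread.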

``` python
G_pure = diffusion_curvature(G_pure, t=12)
plot_3d(X_torus, G_pure.ks, title = "Diffusion Curvature on Graph")
```

![](index_files/figure-commonmark/cell-9-output-1.png)

Alternatively, to compute just the *relative magnitudes* of the pointwise
curvatures (without signs), we can directly use either the
[`wasserstein_spread_of_diffusion`](https://professorwug.github.io/diffusion_curvature/core%20(manifoldgraph).html#wasserstein_spread_of_diffusion)
function (which computes the $W_1$ distance from a Dirac to its t-step
diffusion), or the
[`entropy_of_diffusion`](https://professorwug.github.io/diffusion_curvature/core%20(manifoldgraph).html#entropy_of_diffusion)
function (which computes the entropy of each t-step diffusion). The
latter is useful when the manifold’s geodesic distances are hard to
estimate: it corresponds to replacing the Wasserstein distance with the
KL divergence.

``` python
# for the wasserstein version, we need manifold distances
G_pure = power_diffusion_matrix(G_pure)
G_pure = phate_distances(G_pure)
ks_wasserstein = wasserstein_spread_of_diffusion(G_pure)
```
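
The $W_1$ distance from a Dirac mass to any distribution has a closed form:
all mass must travel from the single source point, so the cost is simply the
expected distance under the target distribution. A minimal NumPy sketch of
that identity follows. `wasserstein_spread` and its arguments are hypothetical
names, and the package's `wasserstein_spread_of_diffusion` may differ in
normalization or in how the distances are estimated.

``` python
import numpy as np

def wasserstein_spread(P_t, D):
    """W1 distance from the Dirac at node i to row i of P_t, for every i.

    P_t: (n, n) row-stochastic t-step diffusion matrix.
    D:   (n, n) matrix of pairwise distances.
    """
    return np.sum(P_t * D, axis=1)

# Path graph on 5 nodes with hop-count distances and a lazy random walk.
n = 5
A = np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
P = A / A.sum(axis=1, keepdims=True)
D = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :]).astype(float)

spread = wasserstein_spread(np.linalg.matrix_power(P, 2), D)
```

A smaller spread means the diffusion stays more concentrated around its
starting point.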

``` python
# for the entropic version, we need only power the diffusion operator
G_pure = power_diffusion_matrix(G_pure, t=12)
ks_entropy = entropy_of_diffusion(G_pure)
```
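
As a rough picture of the entropic variant, the sketch below computes the
Shannon entropy of each row of a diffusion matrix in plain NumPy. The name
`diffusion_entropy` and the `eps` clipping are assumptions for illustration;
the package's `entropy_of_diffusion` may use a different convention. The link
to the KL divergence is the standard identity that, for a distribution on $n$
states, the KL divergence to the uniform distribution equals $\log n$ minus
the entropy.

``` python
import numpy as np

def diffusion_entropy(P_t, eps=1e-12):
    """Shannon entropy (in nats) of each row of a row-stochastic matrix."""
    P_safe = np.clip(P_t, eps, 1.0)  # avoid log(0); zero entries contribute 0
    return -np.sum(P_t * np.log(P_safe), axis=1)

# A Dirac row has zero entropy; a uniform row over n states has entropy log(n).
P_t = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
])
H = diffusion_entropy(P_t)
```

No distance matrix is needed here, which is why this variant only requires
powering the diffusion operator.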

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/professorwug/diffusion_curvature",
    "name": "diffusion-curvature",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "nbdev jupyter notebook python",
    "author": "Kincaid",
    "author_email": "dev@riddle.press",
    "download_url": "https://files.pythonhosted.org/packages/60/0a/1ed404400df5d2cc932f84c981278d8369fcb5f0a48719935bd4c9a9cc5e/diffusion_curvature-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache Software License 2.0",
    "summary": "Fast, pointwise graph curvature",
    "version": "0.0.3",
    "project_urls": {
        "Homepage": "https://github.com/professorwug/diffusion_curvature"
    },
    "split_keywords": [
        "nbdev",
        "jupyter",
        "notebook",
        "python"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "24800022745e59ef1eb76f6c6fa8737235bfdd401b7c73ddae4e075379cfc884",
                "md5": "3f7ffbe941f1376442e56d3f5e2b2f48",
                "sha256": "954f31acf1e5cc2fd75d48a004c6abe440710ce31fd6d3ea8e1cac1d5ccc4a13"
            },
            "downloads": -1,
            "filename": "diffusion_curvature-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3f7ffbe941f1376442e56d3f5e2b2f48",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 17366,
            "upload_time": "2023-08-07T23:48:07",
            "upload_time_iso_8601": "2023-08-07T23:48:07.915620Z",
            "url": "https://files.pythonhosted.org/packages/24/80/0022745e59ef1eb76f6c6fa8737235bfdd401b7c73ddae4e075379cfc884/diffusion_curvature-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "600a1ed404400df5d2cc932f84c981278d8369fcb5f0a48719935bd4c9a9cc5e",
                "md5": "f0672e9f2fa771789ebc4315b9975cab",
                "sha256": "3806fad5d10b608b47fb2f29b235de87ef694554253e7393934a8bac75f955e9"
            },
            "downloads": -1,
            "filename": "diffusion_curvature-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "f0672e9f2fa771789ebc4315b9975cab",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 17248,
            "upload_time": "2023-08-07T23:48:09",
            "upload_time_iso_8601": "2023-08-07T23:48:09.555463Z",
            "url": "https://files.pythonhosted.org/packages/60/0a/1ed404400df5d2cc932f84c981278d8369fcb5f0a48719935bd4c9a9cc5e/diffusion_curvature-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-08-07 23:48:09",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "professorwug",
    "github_project": "diffusion_curvature",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "diffusion-curvature"
}
        