- **Name**: torchdr
- **Version**: 0.2
- **Summary**: Torch Dimensionality Reduction Library
- **Upload time**: 2025-02-07 11:29:33
- **Author**: TorchDR contributors
- **License**: BSD (3-Clause)
- **Keywords**: dimensionality reduction, machine learning, data analysis, pytorch, scikit-learn, gpu

# Torch Dimensionality Reduction

<p align="center">
  <img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/torchdr_logo.png" width="800" alt="torchdr logo">
</p>

[![Documentation](https://img.shields.io/badge/Documentation-blue.svg)](https://torchdr.github.io/)
[![Benchmark](https://img.shields.io/badge/Benchmarks-blue.svg)](https://github.com/TorchDR/TorchDR/tree/main/benchmarks)
[![Version](https://img.shields.io/github/v/release/TorchDR/TorchDR.svg?color=blue)](https://github.com/TorchDR/TorchDR/releases)
[![License](https://img.shields.io/badge/License-BSD_3--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/release/python-3100/)
[![Pytorch](https://img.shields.io/badge/PyTorch-ee4c2c?logo=pytorch&logoColor=white)](https://pytorch.org/get-started/locally/)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Test Status](https://github.com/torchdr/torchdr/actions/workflows/testing.yml/badge.svg)]()
[![CircleCI](https://dl.circleci.com/status-badge/img/gh/TorchDR/TorchDR/tree/main.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/TorchDR/TorchDR/tree/main)
[![codecov](https://codecov.io/gh/torchdr/torchdr/branch/main/graph/badge.svg)](https://codecov.io/gh/torchdr/torchdr)

TorchDR is an open-source **dimensionality reduction (DR)** library using PyTorch. Its goal is to provide **fast GPU-compatible** implementations of DR algorithms, as well as to accelerate the development of new DR methods by providing a **common simplified framework**.

DR aims to construct a **low-dimensional representation (or embedding)** of an input dataset that best preserves its **geometry encoded via a pairwise affinity matrix**. To this end, DR methods **optimize the embedding** such that its **associated pairwise affinity matrix matches the input affinity**. TorchDR provides a general framework for solving problems of this form. Defining a DR algorithm only requires choosing or implementing an *Affinity* object for both the input and the embedding, together with an objective function.
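
The affinity-matching formulation above can be sketched in plain PyTorch. This is a minimal illustration under stated assumptions, not TorchDR's implementation: the Gaussian input affinity, the Student-t embedding affinity, the cross-entropy objective, the toy data, and every function name (`pairwise_sq_dists`, `input_affinity`, `embedding_log_affinity`) are choices made for this example only.

```python
import torch

torch.manual_seed(0)


def pairwise_sq_dists(a):
    """Squared Euclidean distances between all rows of `a`."""
    diff = a[:, None, :] - a[None, :, :]
    return (diff ** 2).sum(dim=-1)


def input_affinity(x, sigma=1.0):
    """Row-normalized Gaussian affinity (assumed choice for the input)."""
    d2 = pairwise_sq_dists(x)
    d2.fill_diagonal_(float("inf"))  # a point is not its own neighbor
    return torch.softmax(-d2 / (2 * sigma**2), dim=1)


def embedding_log_affinity(z, off_diag):
    """Log of a row-normalized Student-t affinity (assumed choice)."""
    logits = -torch.log1p(pairwise_sq_dists(z))
    logits = logits.masked_fill(~off_diag, float("-inf"))
    return torch.log_softmax(logits, dim=1)


# Toy input: two well-separated blobs in 10 dimensions.
x = torch.cat([torch.randn(20, 10), torch.randn(20, 10) + 5.0])
n = x.shape[0]
off_diag = ~torch.eye(n, dtype=torch.bool)

p = input_affinity(x, sigma=3.0)  # input affinity, fixed during optimization
z = (1e-2 * torch.randn(n, 2)).requires_grad_()  # embedding to optimize
optimizer = torch.optim.Adam([z], lr=0.1)

losses = []
for _ in range(200):
    optimizer.zero_grad()
    log_q = embedding_log_affinity(z, off_diag)
    # Cross-entropy between input and embedding affinities (off-diagonal only).
    loss = -(p[off_diag] * log_q[off_diag]).sum() / n
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Swapping `input_affinity`, `embedding_log_affinity`, or the loss for another choice yields a different DR method within the same loop, which is the modularity the framework is built around.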


## Benefits of TorchDR

- **Speed**: supports **GPU acceleration**, leverages **sparsity** and **sampling** strategies with **contrastive learning** techniques.
- **Modularity**: all of it is written in **Python** in a **highly modular** way, making it easy to create or transform components.
- **Memory efficiency**: relies on **sparsity** and/or **symbolic tensors** to **avoid memory overflows**.
- **Compatibility**: implemented methods are fully **compatible** with the scikit-learn API and the PyTorch ecosystem.


## Getting Started

`TorchDR` offers a **user-friendly API similar to scikit-learn** where dimensionality reduction modules can be called with the `fit_transform` method. It seamlessly accepts both NumPy arrays and PyTorch tensors as input, ensuring that the output matches the type and backend of the input.

```python
from sklearn.datasets import fetch_openml
from torchdr import PCA, TSNE

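# Load MNIST as a float32 NumPy array (as_frame=False avoids a pandas DataFrame).
x = fetch_openml("mnist_784", as_frame=False).data.astype("float32")

# Reduce to 50 dimensions with PCA, then embed with t-SNE.
x_ = PCA(n_components=50).fit_transform(x)
z = TSNE(perplexity=30).fit_transform(x_)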
```

`TorchDR` is fully **GPU compatible**, enabling **significant speed-ups** when a GPU is available. To run computations on the GPU, simply set `device="cuda"` as shown in the example below:

```python
z_gpu = TSNE(perplexity=30, device="cuda").fit_transform(x_)
```
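
When the same script has to run on machines with and without a GPU, the device string can be chosen at runtime. The helper below is a hypothetical convenience, not part of TorchDR, and it assumes that `device` also accepts `"cpu"` (the example above only shows `"cuda"`).

```python
import torch


def pick_device():
    """Return "cuda" when a CUDA GPU is visible to PyTorch, else "cpu"."""
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")

# Assumed usage, reusing `x_` from the example above:
# z = TSNE(perplexity=30, device=device).fit_transform(x_)
```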


## Backends

The `backend` keyword specifies which tool to use for handling kNN computations and memory-efficient symbolic computations.

- To perform symbolic tensor computations on the GPU without storing full pairwise matrices in memory, you can leverage the [KeOps Library](https://www.kernel-operations.io/keops/index.html). This library can also compute kNN graphs. To enable KeOps, set `backend="keops"`.
- Alternatively, you can use `backend="faiss"` to rely on [Faiss](https://github.com/facebookresearch/faiss) for fast kNN computations.
- Finally, setting `backend=None` will use raw PyTorch for all computations.
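
One way to choose among the three options above is to check which optional packages are installed. This is a sketch under assumptions: `choose_backend` is a hypothetical helper, the preference order (KeOps, then Faiss, then raw PyTorch) is arbitrary, and `pykeops` and `faiss` are taken to be the import names of the two libraries.

```python
from importlib.util import find_spec


def choose_backend(is_installed=lambda name: find_spec(name) is not None):
    """Pick a value for the `backend` keyword from the installed packages.

    Preference order (an assumption of this sketch): KeOps, then Faiss,
    then raw PyTorch (`None`).
    """
    if is_installed("pykeops"):
        return "keops"
    if is_installed("faiss"):
        return "faiss"
    return None


backend = choose_backend()
print(f"backend={backend!r}")

# Assumed usage: TSNE(perplexity=30, backend=backend).fit_transform(x_)
```

The `is_installed` argument exists only so the selection logic can be exercised without installing either library.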


## Benchmarks

In the UMAP benchmark below, `TorchDR` on GPU runs roughly 8 to 10 times faster and uses about 4 times less memory than the classic CPU implementation. [See the code](https://github.com/TorchDR/TorchDR/blob/main/benchmarks/benchmark_umap.py). Stay tuned for additional benchmarks.

| Dataset         | Samples   | Method            | Runtime (sec) | Memory (MB) |
|-----------------|-----------|-------------------|---------------|-------------|
| Macosko         | 44,808    | Classic UMAP (CPU)| 61.3          | 410.9       |
|                 |           | TorchDR UMAP (GPU)| **7.7**       | **100.4**   |
| 10x Mouse Zheng | 1,306,127 | Classic UMAP (CPU)| 1910.4        | 11278.1     |
|                 |           | TorchDR UMAP (GPU)| **184.4**     | **2699.7**  |
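
The speed-up and memory ratios implied by the table can be computed directly from its figures. The snippet only restates the numbers above; the variable names are illustrative.

```python
# (dataset, CPU runtime s, GPU runtime s, CPU memory MB, GPU memory MB)
rows = [
    ("Macosko", 61.3, 7.7, 410.9, 100.4),
    ("10x Mouse Zheng", 1910.4, 184.4, 11278.1, 2699.7),
]

ratios = {}
for name, cpu_s, gpu_s, cpu_mb, gpu_mb in rows:
    speedup = cpu_s / gpu_s
    memory_ratio = cpu_mb / gpu_mb
    ratios[name] = (round(speedup, 1), round(memory_ratio, 1))
    print(f"{name}: {speedup:.1f}x faster, {memory_ratio:.1f}x less memory")

# Macosko: 8.0x faster, 4.1x less memory
# 10x Mouse Zheng: 10.4x faster, 4.2x less memory
```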


## Examples

See the [examples](https://github.com/TorchDR/TorchDR/tree/main/examples/) folder for all examples.


**MNIST.** ([Code](https://github.com/TorchDR/TorchDR/tree/main/examples/images/panorama_readme.py))
A comparison of various neighbor embedding methods on the MNIST digits dataset.

<p align="center">
  <img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/mnist_readme.png" width="800" alt="various neighbor embedding methods on MNIST">
</p>


**Single-cell genomics.** ([Code](https://github.com/TorchDR/TorchDR/tree/main/examples/single_cell/single_cell_readme.py))
Visualizing cells using `LargeVis` from `TorchDR`.

<p align="center">
  <img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/single_cell.gif" width="700" alt="single cell embeddings">
</p>


**CIFAR100.** ([Code](https://github.com/TorchDR/TorchDR/tree/main/examples/images/cifar100.py))
Visualizing the CIFAR100 dataset using DINO features and TSNE.

<p align="center">
  <img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/cifar100_tsne.png" width="1024" alt="TSNE on CIFAR100 DINO features">
</p>


## Implemented Features (to date)

### Affinities

`TorchDR` features a **wide range of affinities** that can be used as building blocks for DR algorithms. It includes:

- Usual affinities: [`ScalarProductAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.ScalarProductAffinity.html), [`GaussianAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.GaussianAffinity.html), [`StudentAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.StudentAffinity.html).
- Affinities based on k-NN normalizations: [`SelfTuningAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.SelfTuningAffinity.html), [`MAGICAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.MAGICAffinity.html).
- Doubly stochastic affinities: [`SinkhornAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.SinkhornAffinity.html), [`DoublyStochasticQuadraticAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.DoublyStochasticQuadraticAffinity.html).
- Adaptive affinities with entropy control: [`EntropicAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.EntropicAffinity.html), [`SymmetricEntropicAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.SymmetricEntropicAffinity.html).
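
To make one of these families concrete, the sketch below builds a doubly stochastic affinity from a Gaussian kernel with a damped symmetric Sinkhorn iteration in plain PyTorch. It illustrates the general idea only and is not the code behind `SinkhornAffinity`; the function names, the bandwidth, and the iteration count are assumptions.

```python
import torch

torch.manual_seed(0)


def gaussian_kernel(x, sigma=1.0):
    """Symmetric Gaussian kernel exp(-||xi - xj||^2 / (2 sigma^2))."""
    diff = x[:, None, :] - x[None, :, :]
    d2 = (diff ** 2).sum(dim=-1)
    return torch.exp(-d2 / (2 * sigma**2))


def symmetric_sinkhorn(k, n_iter=100):
    """Find u > 0 such that diag(u) @ k @ diag(u) is doubly stochastic."""
    u = torch.ones(k.shape[0], dtype=k.dtype)
    for _ in range(n_iter):
        # Damped fixed-point update; at convergence u * (k @ u) == 1.
        u = torch.sqrt(u / (k @ u))
    return u[:, None] * k * u[None, :]


x = torch.randn(50, 3, dtype=torch.float64)
p = symmetric_sinkhorn(gaussian_kernel(x, sigma=2.0))

# Row and column sums are each close to 1.
print(p.sum(dim=0)[:3], p.sum(dim=1)[:3])
```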


### Dimensionality Reduction Algorithms

**Spectral.** `TorchDR` provides **spectral embeddings** calculated via eigenvalue decomposition of the affinities or their Laplacian: [`PCA`](https://torchdr.github.io/dev/gen_modules/torchdr.PCA.html), [`KernelPCA`](https://torchdr.github.io/dev/gen_modules/torchdr.KernelPCA.html), [`IncrementalPCA`](https://torchdr.github.io/dev/gen_modules/torchdr.IncrementalPCA.html).

**Neighbor Embedding.** `TorchDR` includes various **neighbor embedding methods**: [`SNE`](https://torchdr.github.io/dev/gen_modules/torchdr.SNE.html), [`TSNE`](https://torchdr.github.io/dev/gen_modules/torchdr.TSNE.html), [`TSNEkhorn`](https://torchdr.github.io/dev/gen_modules/torchdr.TSNEkhorn.html), [`UMAP`](https://torchdr.github.io/dev/gen_modules/torchdr.UMAP.html), [`LargeVis`](https://torchdr.github.io/dev/gen_modules/torchdr.LargeVis.html), [`InfoTSNE`](https://torchdr.github.io/dev/gen_modules/torchdr.InfoTSNE.html).


### Evaluation Metric

`TorchDR` provides efficient GPU-compatible evaluation metrics: [`silhouette_score`](https://torchdr.github.io/dev/gen_modules/torchdr.silhouette_score.html).
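
For reference, the silhouette coefficient compares each sample's mean distance to its own cluster (`a`) with its smallest mean distance to another cluster (`b`), and averages `(b - a) / max(a, b)` over samples. The plain PyTorch sketch below shows that computation; it is not TorchDR's `silhouette_score`, the function name `silhouette` is hypothetical, and it assumes Euclidean distances and at least two samples per cluster.

```python
import torch


def silhouette(x, labels):
    """Mean silhouette coefficient with Euclidean distances.

    Assumes every cluster contains at least two samples.
    """
    d = torch.cdist(x, x)  # (n, n) pairwise distances
    clusters, inverse = labels.unique(return_inverse=True)
    k = clusters.numel()
    # member[i, c] is True when sample i belongs to cluster c.
    member = inverse[:, None] == torch.arange(k, device=labels.device)
    counts = member.sum(dim=0).to(x.dtype)  # (k,) cluster sizes
    dist_sums = d @ member.to(x.dtype)  # (n, k) summed distances per cluster

    # a: mean distance to the other samples of the sample's own cluster.
    own_sum = dist_sums.gather(1, inverse[:, None]).squeeze(1)
    a = own_sum / (counts[inverse] - 1)
    # b: smallest mean distance to any other cluster.
    b = (dist_sums / counts).masked_fill(member, float("inf")).min(dim=1).values
    return ((b - a) / torch.maximum(a, b)).mean()


x = torch.tensor([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
labels = torch.tensor([0, 0, 1, 1])
score = silhouette(x, labels)
print(f"silhouette: {score.item():.3f}")
```

Values close to 1 indicate compact, well-separated clusters, as in this toy example with two distant pairs of points.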

## Installation

You can install the toolbox through PyPI with:

```bash
pip install torchdr
```

To get the latest version, you can install it from the source code as follows:

```bash
pip install git+https://github.com/torchdr/torchdr
```

## Finding Help

If you have any questions or suggestions, feel free to open an issue on the [issue tracker](https://github.com/torchdr/torchdr/issues) or contact [Hugues Van Assel](https://huguesva.github.io/) directly.

            
