# Torch Dimensionality Reduction
<p align="center">
<img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/torchdr_logo.png" width="800" alt="torchdr logo">
</p>
[Documentation](https://torchdr.github.io/) | [Benchmark](https://github.com/TorchDR/TorchDR/tree/main/benchmarks) | [Version](https://github.com/TorchDR/TorchDR/releases) | [License](https://opensource.org/licenses/BSD-3-Clause) | [Python 3.10+](https://www.python.org/downloads/release/python-3100/) | [PyTorch](https://pytorch.org/get-started/locally/) | [Ruff](https://github.com/astral-sh/ruff) | [CircleCI](https://dl.circleci.com/status-badge/redirect/gh/TorchDR/TorchDR/tree/main) | [codecov](https://codecov.io/gh/torchdr/torchdr)
TorchDR is an open-source **dimensionality reduction (DR)** library using PyTorch. Its goal is to provide **fast GPU-compatible** implementations of DR algorithms, as well as to accelerate the development of new DR methods by providing a **common simplified framework**.
DR aims to construct a **low-dimensional representation (or embedding)** of an input dataset that best preserves its **geometry, encoded via a pairwise affinity matrix**. To this end, DR methods **optimize the embedding** so that its **associated pairwise affinity matrix matches the input affinity**. TorchDR provides a general framework for solving problems of this form. Defining a DR algorithm only requires choosing or implementing an *Affinity* object for both the input and the embedding, together with an objective function.
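To make the affinity-matching idea concrete, here is a minimal, dependency-free sketch. It is not TorchDR's implementation: the function names, the toy data, and the choice of a row-normalized Gaussian input affinity, a row-normalized Student embedding affinity, and a KL-divergence objective (in the spirit of SNE-type methods) are all assumptions made for illustration.

```python
import math

def sq_dists(points):
    """Pairwise squared Euclidean distances as a list of lists."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]

def row_normalize(kernel):
    """Zero the diagonal and rescale every row so that it sums to 1."""
    out = []
    for i, row in enumerate(kernel):
        row = [0.0 if i == j else v for j, v in enumerate(row)]
        total = sum(row)
        out.append([v / total for v in row])
    return out

def gaussian_affinity(points, sigma=1.0):
    """Input affinity: Gaussian kernel on squared distances."""
    return row_normalize(
        [[math.exp(-v / (2 * sigma**2)) for v in row] for row in sq_dists(points)]
    )

def student_affinity(points):
    """Embedding affinity: heavy-tailed Student kernel."""
    return row_normalize([[1.0 / (1.0 + v) for v in row] for row in sq_dists(points)])

def kl_objective(p, q, eps=1e-12):
    """Sum of P_ij * log(P_ij / Q_ij) over all pairs with P_ij > 0."""
    return sum(
        pij * math.log((pij + eps) / (qij + eps))
        for p_row, q_row in zip(p, q)
        for pij, qij in zip(p_row, q_row)
        if pij > 0
    )

# Toy input: two well-separated pairs of points in 3D
x = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [5.0, 5.0, 5.0], [5.1, 5.0, 5.0]]
z_good = [[0.0], [0.1], [5.0], [5.1]]  # 1D embedding that keeps the pairs together
z_bad = [[0.0], [5.0], [0.1], [5.1]]   # 1D embedding that mixes the pairs

p = gaussian_affinity(x)
loss_good = kl_objective(p, student_affinity(z_good))
loss_bad = kl_objective(p, student_affinity(z_bad))
print(loss_good < loss_bad)
```

The embedding that preserves the neighborhood structure of the input gets the lower objective value. A DR method in this framework would optimize the embedding coordinates to minimize such an objective.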
## Benefits of TorchDR
- **Speed**: supports **GPU acceleration**, leverages **sparsity** and **sampling** strategies with **contrastive learning** techniques.
- **Modularity**: all of it is written in **Python** in a **highly modular** way, making it easy to create or transform components.
- **Memory efficiency**: relies on **sparsity** and/or **symbolic tensors** to **avoid memory overflows**.
- **Compatibility**: implemented methods are fully **compatible** with the scikit-learn API and the PyTorch ecosystem.
## Getting Started
`TorchDR` offers a **user-friendly API similar to scikit-learn** where dimensionality reduction modules can be called with the `fit_transform` method. It seamlessly accepts both NumPy arrays and PyTorch tensors as input, ensuring that the output matches the type and backend of the input.
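```python
from sklearn.datasets import fetch_openml
from torchdr import PCA, TSNE

# Load MNIST as a NumPy array (as_frame=False avoids getting a pandas DataFrame)
x = fetch_openml("mnist_784", as_frame=False).data.astype("float32")

# Reduce to 50 dimensions with PCA, then embed with t-SNE
x_ = PCA(n_components=50).fit_transform(x)
z = TSNE(perplexity=30).fit_transform(x_)
```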
`TorchDR` is fully **GPU compatible**, enabling **significant speed-ups** when a GPU is available. To run computations on the GPU, simply set `device="cuda"` as shown in the example below:
```python
z_gpu = TSNE(perplexity=30, device="cuda").fit_transform(x_)
```
## Backends
The `backend` keyword specifies which tool to use for handling kNN computations and memory-efficient symbolic computations.
- To perform symbolic tensor computations on the GPU without memory limitations, you can leverage the [KeOps Library](https://www.kernel-operations.io/keops/index.html). This library also allows computing kNN graphs. To enable KeOps, set `backend="keops"`.
- Alternatively, you can use `backend="faiss"` to rely on [Faiss](https://github.com/facebookresearch/faiss) for fast kNN computations.
- Finally, setting `backend=None` will use raw PyTorch for all computations.
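As an illustration of how the `backend` keyword might be chosen at runtime, the sketch below checks which optional packages are importable and returns one of the three values listed above. The helper `pick_backend` is hypothetical and not part of TorchDR, and the preference order (KeOps, then Faiss, then raw PyTorch) is an arbitrary assumption. Being importable also does not guarantee that a backend is correctly configured for your GPU.

```python
from importlib.util import find_spec

def pick_backend(is_installed=lambda name: find_spec(name) is not None):
    """Return a value for the `backend` keyword based on what is importable.

    Hypothetical helper, not part of TorchDR.
    """
    if is_installed("pykeops"):  # Python package that ships KeOps
        return "keops"
    if is_installed("faiss"):    # Python module exposed by Faiss
        return "faiss"
    return None                  # fall back to raw PyTorch

backend = pick_backend()
print(backend)
# Then, for example: TSNE(perplexity=30, backend=backend).fit_transform(x_)
```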
## Benchmarks
In the UMAP benchmark below, `TorchDR` on GPU runs roughly 8 to 10 times faster and uses about 4 times less memory than the classic CPU implementation. [See the code](https://github.com/TorchDR/TorchDR/blob/main/benchmarks/benchmark_umap.py). Stay tuned for additional benchmarks.
| Dataset | Samples | Method | Runtime (sec) | Memory (MB) |
|-----------------|-----------|-------------------|---------------|-------------|
| Macosko | 44,808 | Classic UMAP (CPU)| 61.3 | 410.9 |
| | | TorchDR UMAP (GPU)| **7.7** | **100.4** |
| 10x Mouse Zheng | 1,306,127 | Classic UMAP (CPU)| 1910.4 | 11278.1 |
| | | TorchDR UMAP (GPU)| **184.4** | **2699.7** |
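
The improvement factors can be read directly off the table. The short snippet below only redoes that arithmetic on the reported numbers. It does not rerun the benchmark.

```python
# (dataset, CPU runtime s, GPU runtime s, CPU memory MB, GPU memory MB), copied from the table
rows = [
    ("Macosko", 61.3, 7.7, 410.9, 100.4),
    ("10x Mouse Zheng", 1910.4, 184.4, 11278.1, 2699.7),
]

# Ratio of CPU to GPU figures for runtime and memory
ratios = {
    name: (cpu_t / gpu_t, cpu_m / gpu_m)
    for name, cpu_t, gpu_t, cpu_m, gpu_m in rows
}

for name, (speedup, mem_ratio) in ratios.items():
    print(f"{name}: {speedup:.1f}x faster, {mem_ratio:.1f}x less memory")
# Macosko: 8.0x faster, 4.1x less memory
# 10x Mouse Zheng: 10.4x faster, 4.2x less memory
```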
## Examples
See the [examples](https://github.com/TorchDR/TorchDR/tree/main/examples/) folder for all examples.
**MNIST.** ([Code](https://github.com/TorchDR/TorchDR/tree/main/examples/images/panorama_readme.py))
A comparison of various neighbor embedding methods on the MNIST digits dataset.
<p align="center">
<img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/mnist_readme.png" width="800" alt="various neighbor embedding methods on MNIST">
</p>
**Single-cell genomics.** ([Code](https://github.com/TorchDR/TorchDR/tree/main/examples/single_cell/single_cell_readme.py))
Visualizing cells using `LargeVis` from `TorchDR`.
<p align="center">
<img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/single_cell.gif" width="700" alt="single cell embeddings">
</p>
**CIFAR100.** ([Code](https://github.com/TorchDR/TorchDR/tree/main/examples/images/cifar100.py))
Visualizing the CIFAR100 dataset using DINO features and TSNE.
<p align="center">
<img src="https://github.com/torchdr/torchdr/raw/main/docs/source/figures/cifar100_tsne.png" width="1024" alt="TSNE on CIFAR100 DINO features">
</p>
## Implemented Features (to date)
### Affinities
`TorchDR` features a **wide range of affinities** that can be used as building blocks for DR algorithms. It includes:
- Usual affinities: [`ScalarProductAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.ScalarProductAffinity.html), [`GaussianAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.GaussianAffinity.html), [`StudentAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.StudentAffinity.html).
- Affinities based on k-NN normalizations: [`SelfTuningAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.SelfTuningAffinity.html), [`MAGICAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.MAGICAffinity.html).
- Doubly stochastic affinities: [`SinkhornAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.SinkhornAffinity.html), [`DoublyStochasticQuadraticAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.DoublyStochasticQuadraticAffinity.html).
- Adaptive affinities with entropy control: [`EntropicAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.EntropicAffinity.html), [`SymmetricEntropicAffinity`](https://torchdr.github.io/dev/gen_modules/torchdr.SymmetricEntropicAffinity.html).
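
To illustrate what "entropy control" means for the last family, here is a minimal pure-Python sketch of the standard perplexity calibration used in SNE-type methods: for one point, a bisection search finds the inverse bandwidth `beta` such that the row of affinities has Shannon entropy `log(perplexity)`. The function name `entropic_row`, its arguments, and the toy distances are assumptions for illustration. TorchDR's `EntropicAffinity` is batched on tensors and its implementation may differ.

```python
import math

def entropic_row(sq_dists, perplexity, tol=1e-6, max_iter=100):
    """Calibrate one affinity row to a target perplexity.

    sq_dists: squared distances from one point to the other points.
    Returns (p, beta) with p proportional to exp(-beta * sq_dists)
    and exp(entropy(p)) approximately equal to perplexity.
    """
    target = math.log(perplexity)
    lo, hi = 0.0, None       # bracket for beta; no upper bound known yet
    beta = 1.0
    d_min = min(sq_dists)    # shift distances for numerical stability
    for _ in range(max_iter):
        w = [math.exp(-beta * (d - d_min)) for d in sq_dists]
        total = sum(w)
        p = [v / total for v in w]
        entropy = -sum(v * math.log(v) for v in p if v > 0)
        if abs(entropy - target) < tol:
            break
        if entropy > target:  # row too flat: increase beta
            lo = beta
            beta = beta * 2 if hi is None else (beta + hi) / 2
        else:                 # row too peaked: decrease beta
            hi = beta
            beta = (lo + beta) / 2
    return p, beta

# Toy example: six neighbors, target perplexity of 3 (must be below the neighbor count)
row, beta = entropic_row([0.5, 1.0, 2.0, 4.0, 8.0, 16.0], perplexity=3.0)
row_entropy = -sum(v * math.log(v) for v in row if v > 0)
print(math.exp(row_entropy))
```

The search works because the entropy of the row decreases monotonically as `beta` grows, so each point gets its own bandwidth adapted to the local density.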
### Dimensionality Reduction Algorithms
**Spectral.** `TorchDR` provides **spectral embeddings** calculated via eigenvalue decomposition of the affinities or their Laplacian: [`PCA`](https://torchdr.github.io/dev/gen_modules/torchdr.PCA.html), [`KernelPCA`](https://torchdr.github.io/dev/gen_modules/torchdr.KernelPCA.html), [`IncrementalPCA`](https://torchdr.github.io/dev/gen_modules/torchdr.IncrementalPCA.html).
**Neighbor Embedding.** `TorchDR` includes various **neighbor embedding methods**: [`SNE`](https://torchdr.github.io/dev/gen_modules/torchdr.SNE.html), [`TSNE`](https://torchdr.github.io/dev/gen_modules/torchdr.TSNE.html), [`TSNEkhorn`](https://torchdr.github.io/dev/gen_modules/torchdr.TSNEkhorn.html), [`UMAP`](https://torchdr.github.io/dev/gen_modules/torchdr.UMAP.html), [`LargeVis`](https://torchdr.github.io/dev/gen_modules/torchdr.LargeVis.html), [`InfoTSNE`](https://torchdr.github.io/dev/gen_modules/torchdr.InfoTSNE.html).
### Evaluation Metric
`TorchDR` provides efficient GPU-compatible evaluation metrics: [`silhouette_score`](https://torchdr.github.io/dev/gen_modules/torchdr.silhouette_score.html).
## Installation
You can install the toolbox through PyPI with:
```bash
pip install torchdr
```
To get the latest version, you can install it from the source code as follows:
```bash
pip install git+https://github.com/torchdr/torchdr
```
## Finding Help
If you have any questions or suggestions, feel free to open an issue on the [issue tracker](https://github.com/torchdr/torchdr/issues) or contact [Hugues Van Assel](https://huguesva.github.io/) directly.
Raw data
{
"_id": null,
"home_page": null,
"name": "torchdr",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "dimensionality reduction, machine learning, data analysis, pytorch, scikit-learn, GPU",
"author": "TorchDR contributors",
"author_email": "Hugues Van Assel <vanasselhugues@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/2d/7b/8ece2c425139eee707c641546b0a636ecd9cad5387542e6cb9c8ccf2235d/torchdr-0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD (3-Clause)",
"summary": "Torch Dimensionality Reduction Library",
"version": "0.2",
"project_urls": {
"documentation": "https://torchdr.github.io/",
"homepage": "https://torchdr.github.io/",
"repository": "https://github.com/TorchDR/TorchDR"
},
"split_keywords": [
"dimensionality reduction",
" machine learning",
" data analysis",
" pytorch",
" scikit-learn",
" gpu"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5b6df9575491c1345ece7c1c36d630205273fcce723bf078ace0cd379fc9eff5",
"md5": "83679700e29d1de0f8c8381095f414cc",
"sha256": "3ca6f7513699411e1ebf846720c4f9eb34d67bc5351bd363929157ac7eaa1f9a"
},
"downloads": -1,
"filename": "torchdr-0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "83679700e29d1de0f8c8381095f414cc",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 84231,
"upload_time": "2025-02-07T11:29:30",
"upload_time_iso_8601": "2025-02-07T11:29:30.656317Z",
"url": "https://files.pythonhosted.org/packages/5b/6d/f9575491c1345ece7c1c36d630205273fcce723bf078ace0cd379fc9eff5/torchdr-0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "2d7b8ece2c425139eee707c641546b0a636ecd9cad5387542e6cb9c8ccf2235d",
"md5": "74a94ba1be641c1830960075f7bf0896",
"sha256": "ed3435b0eb46e90658b2f236bbdad1eab747eb3b0c17319709528b135da23a9e"
},
"downloads": -1,
"filename": "torchdr-0.2.tar.gz",
"has_sig": false,
"md5_digest": "74a94ba1be641c1830960075f7bf0896",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3736092,
"upload_time": "2025-02-07T11:29:33",
"upload_time_iso_8601": "2025-02-07T11:29:33.020011Z",
"url": "https://files.pythonhosted.org/packages/2d/7b/8ece2c425139eee707c641546b0a636ecd9cad5387542e6cb9c8ccf2235d/torchdr-0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-07 11:29:33",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "TorchDR",
"github_project": "TorchDR",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"circle": true,
"lcname": "torchdr"
}