pacmap-gpu

Name	pacmap-gpu
Version	1.0.2
home_page	https://github.com/AntonioRoye/PaCMAP-GPU
Summary	GPU-Accelerated Pairwise Controlled Manifold Approximation for large-scale dimensionality reduction
upload_time	2025-07-22 03:06:04
maintainer	None
docs_url	None
author	Antonio Roye-Azar
requires_python	>=3.10
license	Apache-2.0
keywords	dimensionality-reduction manifold-learning gpu cuda machine-learning

# PaCMAP-GPU: GPU-Accelerated Pairwise Controlled Manifold Approximation

[![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![CUDA](https://img.shields.io/badge/CUDA-12.0%2B-green)](https://developer.nvidia.com/cuda-toolkit)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0%2B-red)](https://pytorch.org/)
[![RAPIDS](https://img.shields.io/badge/RAPIDS-24.0%2B-purple)](https://rapids.ai/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

**PaCMAP-GPU** is a high-performance GPU implementation of Pairwise Controlled Manifold Approximation and Projection (PaCMAP), providing dramatic speedups for large-scale dimensionality reduction and manifold learning tasks.

## 🚀 Key Features

- **GPU Acceleration**: Native CUDA implementation using PyTorch and CUDA graphs for maximum performance
- **Massive Speedups**: Roughly 2-50x faster than CPU implementations, growing with dataset size; very small datasets (about 1K samples) may see little or no gain
- **Memory Efficient**: Optimized GPU memory usage with automatic cleanup
- **Drop-in Replacement**: scikit-learn-style `fit_transform` API (`transform` for new data is not implemented)
- **Three-Phase Training**: Standard PaCMAP phases plus LocalMAP variant
- **Large Dataset Support**: Handle datasets with millions of points efficiently
- **CUDA Graph Optimization**: Minimal kernel launch overhead
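
The three training phases can be sketched as a weight schedule. The constants below follow the schedule described in the PaCMAP paper cited under Citation: the mid-near weight decays from 1000 to 3 over the first 100 iterations, the next 100 iterations emphasize neighbors, and the remaining iterations switch mid-near pairs off. Whether PaCMAP-GPU uses exactly these constants, and how its LocalMAP variant modifies the last phase, is an assumption that this README does not confirm. `phase_weights` is a hypothetical helper, not part of the package.

```python
def phase_weights(itr, w_mn_init=1000.0, phase1_iters=100, phase2_iters=100):
    """Return (w_MN, w_neighbors, w_FP) for a 0-based iteration index.

    Assumed schedule (from the PaCMAP paper, not read from PaCMAP-GPU's code).
    """
    if itr < phase1_iters:
        # Phase 1: mid-near weight decays linearly from w_mn_init to 3.
        t = itr / phase1_iters
        return ((1.0 - t) * w_mn_init + t * 3.0, 2.0, 1.0)
    if itr < phase1_iters + phase2_iters:
        # Phase 2: small mid-near weight, stronger neighbor attraction.
        return (3.0, 3.0, 1.0)
    # Phase 3: mid-near pairs are ignored; local structure is refined.
    return (0.0, 1.0, 1.0)


for itr in (0, 50, 150, 300):
    print(itr, phase_weights(itr))
```

With the default `num_iters=450`, this split gives 100 + 100 + 250 iterations across the three phases.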

## 📋 Requirements

### Hardware and CUDA Requirements
- NVIDIA GPU with Compute Capability 6.0+ (Pascal architecture or newer)
- 4GB+ GPU memory (8GB+ recommended for large datasets)
- CUDA 12.0 or higher
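
The compute capability and memory thresholds above can be checked programmatically. The following is a minimal sketch that uses PyTorch's `torch.cuda.get_device_properties` when PyTorch is installed; `meets_requirements` and `check_current_gpu` are hypothetical helpers, not part of PaCMAP-GPU, and reading "4GB" as 4 × 10^9 bytes is an assumption.

```python
import importlib.util

MIN_CAPABILITY = (6, 0)        # Pascal or newer
MIN_MEMORY_BYTES = 4 * 10**9   # "4GB+" read as decimal gigabytes (assumption)


def meets_requirements(capability, total_memory_bytes):
    """True if a (major, minor) capability and memory size meet the minimums."""
    return (tuple(capability) >= MIN_CAPABILITY
            and total_memory_bytes >= MIN_MEMORY_BYTES)


def check_current_gpu():
    """Check GPU 0; returns None if PyTorch or a CUDA device is unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(0)
    return meets_requirements((props.major, props.minor), props.total_memory)


print(meets_requirements((8, 6), 8 * 10**9))
print(check_current_gpu())
```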

### Software Dependencies
Core dependencies (automatically installed):
- PyTorch >= 2.0.0
- NumPy >= 1.21.0, < 2.0.0
- SciPy >= 1.7.0
- scikit-learn >= 1.0.0
- pandas >= 1.3.0
- matplotlib >= 3.5.0

GPU acceleration (required):
- cupy-cuda12x >= 12.0.0
- cuml-cu12 >= 24.0.0
- cudf-cu12 >= 24.0.0
- faiss-gpu >= 1.7.0

**Important: CUDA-enabled PyTorch Required**

> By default, `pip install pacmap-gpu` will install the CPU-only version of PyTorch.  
> **To use GPU acceleration, you must manually install the CUDA-enabled PyTorch wheel** that matches your system and CUDA version.  
> See [PyTorch Get Started](https://pytorch.org/get-started/locally/) for installation instructions.

**Example:**
```bash
# Example for CUDA 12.1
pip install torch==2.2.2+cu121 torchvision==0.17.2+cu121 torchaudio==2.2.2+cu121 \
  --index-url https://download.pytorch.org/whl/cu121
```
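
To tell which PyTorch build is installed, a short check along these lines may help. It relies on `torch.version.cuda`, which is `None` for CPU-only builds; `describe_torch_build` is a hypothetical helper name used only for illustration.

```python
import importlib.util


def describe_torch_build():
    """Return a one-line description of the installed PyTorch build."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch
    cuda_version = torch.version.cuda  # None on CPU-only builds
    if cuda_version is None:
        return f"PyTorch {torch.__version__}: CPU-only build"
    if not torch.cuda.is_available():
        return (f"PyTorch {torch.__version__}: built for CUDA {cuda_version}, "
                "but no usable GPU or driver was detected")
    return (f"PyTorch {torch.__version__}: CUDA {cuda_version}, "
            f"GPU 0 is {torch.cuda.get_device_name(0)}")


print(describe_torch_build())
```

If this reports a CPU-only build, reinstall PyTorch from the CUDA wheel index as in the example above.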

## 🔧 Installation

### Option 1: Install from PyPI (Recommended)

```bash
# Install PaCMAP-GPU and its declared dependencies (PyTorch may be the CPU-only build)
pip install pacmap-gpu

# Verify installation
pacmap-gpu-check
```

**Note:** PaCMAP-GPU requires CUDA-capable hardware and GPU libraries. The RAPIDS and CUDA library dependencies are installed automatically, but the CUDA-enabled PyTorch build must be installed separately, as described under Requirements.

### Option 2: Install from Source

```bash
# Clone the repository
git clone https://github.com/AntonioRoye/PaCMAP-GPU.git
cd PaCMAP-GPU

# Create conda environment with RAPIDS and all dependencies (recommended for development)
# Note: conda package names drop the pip wheel suffixes (cuml, not cuml-cu12)
conda create -n pacmap-gpu -c rapidsai -c conda-forge -c pytorch -c nvidia \
    cuml cudf cupy faiss-gpu \
    python=3.10 pytorch pytorch-cuda=12.1

# Activate environment
conda activate pacmap-gpu

# Install package in development mode
pip install -e .
```

### Alternative: Using requirements file

```bash
# Clone repository
git clone https://github.com/AntonioRoye/PaCMAP-GPU.git
cd PaCMAP-GPU

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install CUDA 12 dependencies
pip install -r requirements-cuda12.txt

# Install package
pip install -e .
```

### Verify Installation

```bash
# Check installation
pacmap-gpu-check

# Or in Python
python -c "from pacmap_gpu import PaCMAPGPU, check_gpu_availability; print('GPU Available:', check_gpu_availability())"
```
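
If `pacmap-gpu-check` is not on your `PATH`, a rough equivalent for the GPU packages listed under Software Dependencies can be sketched with the standard library. This only checks that each module can be located, not that it works with your GPU; the import names (`cupy`, `cuml`, `cudf`, `faiss`) are the usual ones for those wheels, and `missing_gpu_packages` is a hypothetical helper.

```python
import importlib.util

# Import name -> distribution name as listed in this README.
GPU_MODULES = {
    "torch": "PyTorch",
    "cupy": "cupy-cuda12x",
    "cuml": "cuml-cu12",
    "cudf": "cudf-cu12",
    "faiss": "faiss-gpu",
}


def missing_gpu_packages():
    """Return the distribution names whose modules cannot be located."""
    return [pkg for mod, pkg in GPU_MODULES.items()
            if importlib.util.find_spec(mod) is None]


missing = missing_gpu_packages()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All GPU packages were found")
```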

## 🎯 Quick Start

### Basic Usage

```python
import numpy as np
from pacmap_gpu import PaCMAPGPU
from sklearn.datasets import make_swiss_roll

# Generate sample data
X, color = make_swiss_roll(n_samples=2000, noise=0.1, random_state=42)

# Create PaCMAP-GPU instance
reducer = PaCMAPGPU(
    n_components=2,
    n_neighbors=10,
    num_iters=450,
    random_state=42,
    verbose=True
)

# Fit and transform the data
X_embedded = reducer.fit_transform(X)
print(f"Embedded shape: {X_embedded.shape}")
```

### Advanced Configuration

```python
# High-performance configuration for large datasets
# (X is the array created in Basic Usage)
from pacmap_gpu import PaCMAPGPU

reducer = PaCMAPGPU(
    n_components=2,
    n_neighbors=15,
    MN_ratio=0.5,
    FP_ratio=2.0,
    num_iters=450,
    batch_size=8192,
    init_method="pca",
    optimizer="adam",
    lr=0.01,
    low_dist_thres=1.0,  # LocalMAP parameter
    device="cuda",
    verbose=True,
    random_state=42
)

X_embedded = reducer.fit_transform(X)
```
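
To see what `n_neighbors`, `MN_ratio`, and `FP_ratio` imply for the amount of work, the sketch below computes pair counts the way the reference PaCMAP implementation does (mid-near and further pairs per point are `n_neighbors` times the respective ratio, rounded). That PaCMAP-GPU follows the same convention is an assumption, and `pair_counts` is a hypothetical helper.

```python
def pair_counts(n_samples, n_neighbors=10, MN_ratio=0.5, FP_ratio=2.0):
    """Estimate how many pairs of each kind are optimized (assumed convention)."""
    n_mn = int(round(n_neighbors * MN_ratio))   # mid-near pairs per point
    n_fp = int(round(n_neighbors * FP_ratio))   # further pairs per point
    return {
        "neighbor": n_samples * n_neighbors,
        "mid_near": n_samples * n_mn,
        "further": n_samples * n_fp,
        "total": n_samples * (n_neighbors + n_mn + n_fp),
    }


# 2000 points with the default ratios: 35 pairs per point, 70,000 in total.
print(pair_counts(2000))
```

Under this convention, raising `n_neighbors` scales all three pair sets, which is why it affects both runtime and memory.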

### Visualization Example

```python
import matplotlib.pyplot as plt
from pacmap_gpu import PaCMAPGPU
from sklearn.datasets import load_digits

# Load digits dataset
digits = load_digits()
X, y = digits.data, digits.target

# Apply PaCMAP-GPU
reducer = PaCMAPGPU(n_components=2, verbose=True, random_state=42)
X_embedded = reducer.fit_transform(X)

# Plot results
plt.figure(figsize=(10, 8))
scatter = plt.scatter(X_embedded[:, 0], X_embedded[:, 1], 
                     c=y, cmap='tab10', alpha=0.7)
plt.colorbar(scatter)
plt.title('Digits Dataset - PaCMAP-GPU Embedding')
plt.xlabel('Component 1')
plt.ylabel('Component 2')
plt.show()
```

## 📊 Performance Benchmarks

### Speedup Comparison

Speedup over CPU implementations varies with dataset size and GPU hardware. The table below lists typical benchmark configurations and the speedup to expect from each:

| Dataset Size | Features | Use Case | Expected Speedup |
|-------------|----------|----------|-------|
| 1K samples  | 50       | Quick testing | May show minimal speedup due to GPU overhead |
| 5K samples  | 100      | Standard evaluation | ~2-5x speedup typical |  
| 10K samples | 200      | Medium datasets | ~5-15x speedup expected |
| 50K samples | 500      | Large-scale analysis | ~10-50x speedup possible |

*Actual performance depends on GPU model, CUDA version, and data characteristics*

**Run your own benchmarks:**
```bash
pytest tests/test_benchmarks.py -v -m benchmark
```

### Memory Efficiency

- **PaCMAP-GPU**: ~2-4GB GPU memory for 50K samples (with CUDA graph optimizations)
- **CPU implementations**: ~4-8GB RAM for 50K samples (varies by dimensionality)
- **Automatic Cleanup**: Memory pools and automatic cleanup reduce the risk of out-of-memory (OOM) errors; see Troubleshooting if they still occur

## 🔍 API Reference

### PaCMAPGPU Class

```python
PaCMAPGPU(
    n_components=2,          # Output dimensions
    n_neighbors=10,          # Neighbors for local structure
    MN_ratio=0.5,           # Mid-near pairs ratio
    FP_ratio=2.0,           # Further pairs ratio
    num_iters=450,          # Total optimization iterations
    batch_size=8192,        # Batch size for stochastic optimization
    init_method="pca",      # Initialization method
    optimizer="adam",       # Optimizer type
    lr=0.01,               # Learning rate
    device="cuda",         # CUDA device
    low_dist_thres=1.0,    # LocalMAP distance threshold
    random_state=None,     # Random seed
    verbose=False          # Print progress
)
```

#### Key Methods

- `fit_transform(X)`: Fit and return embedding
- `transform(X)`: Transform new data (not implemented)

#### Attributes

- `embedding_`: Final embedding coordinates
- `is_fitted`: Whether model has been fitted
- `n_samples`: Number of training samples

### Utility Functions

```python
from pacmap_gpu import (
    check_gpu_availability,
    clear_gpu_memory,
    get_gpu_info,
    print_installation_report,
    verify_installation,
)

# Check GPU status
gpu_available = check_gpu_availability()
gpu_info = get_gpu_info()

# Memory management
clear_gpu_memory()

# Installation verification
print_installation_report()
```

## ⚡ Performance Tips

### Dataset Size Guidelines
- **Small datasets (< 1K)**: GPU overhead may outweigh any speedup, but a GPU is still required
- **Medium datasets (1K-10K)**: Significant speedup over CPU implementations
- **Large datasets (10K+)**: Largest GPU performance advantage

### Memory Optimization
```python
import numpy as np
from pacmap_gpu import clear_gpu_memory

# Clear memory between runs
clear_gpu_memory()

# Use efficient data types (X is your input array)
X = X.astype(np.float32)  # float32 halves memory use compared with float64
```
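
The saving from `float32` is simple arithmetic: each element takes 4 bytes instead of 8. The sketch below works this out for the largest configuration in the benchmark table (50K samples, 500 features). It covers the input array only, not pair indices or optimizer state, and `array_megabytes` is a hypothetical helper.

```python
BYTES_PER_ELEMENT = {"float32": 4, "float64": 8}


def array_megabytes(n_samples, n_features, dtype="float32"):
    """Size of a dense n_samples x n_features array in megabytes (10**6 bytes)."""
    return n_samples * n_features * BYTES_PER_ELEMENT[dtype] / 1e6


for dtype in ("float64", "float32"):
    print(dtype, array_megabytes(50_000, 500, dtype), "MB")
```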

### Hardware Optimization
- Use modern GPUs with Tensor Cores (RTX series, V100, A100)
- Ensure sufficient GPU memory (8GB+ for datasets > 50K samples)
- Use CUDA 12.0+ for best performance

## 🧪 Running Tests and Benchmarks

```bash
# Install development dependencies
pip install -e ".[dev]"

# Run basic tests
pytest tests/ -v

# Run GPU tests (requires GPU)
pytest tests/ -v -m gpu

# Run benchmarks
pytest tests/test_benchmarks.py -v -m benchmark

# Run without slow tests
pytest tests/ -v -m "not slow"
```

### Example Benchmark Output

```bash
pytest tests/test_benchmarks.py::TestPerformanceBenchmarks::test_gpu_vs_cpu_performance -v -s

=== MEDIUM Dataset Benchmark ===
Dataset shape: (2000, 50)
GPU time: 1.8432s
CPU time: 14.2156s
Speedup: 7.71x
```
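
The speedup figure is the CPU time divided by the GPU time. For timing your own runs outside pytest, a small helper like the following may be enough. It is a sketch, not the project's benchmark code: `time_call` and `speedup` are hypothetical names, and the optional `sync` callback is where you would pass something like `torch.cuda.synchronize` so that queued GPU work is included in the measurement.

```python
import time


def time_call(fn, *args, repeats=3, sync=None):
    """Best-of-`repeats` wall-clock time of fn(*args), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # wait for asynchronous GPU work before stopping the clock
        best = min(best, time.perf_counter() - start)
    return best


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run was."""
    return cpu_seconds / gpu_seconds


# Reproduces the figure from the example output above.
print(f"Speedup: {speedup(14.2156, 1.8432):.2f}x")
```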

## 🐛 Troubleshooting

### Common Issues

1. **CUDA Not Found**
   ```bash
   # Check CUDA installation
   nvidia-smi
   python -c "import torch; print(torch.cuda.is_available())"
   ```

2. **Out of Memory Error**
   ```python
   # Reduce the batch size and the number of neighbors to lower memory use
   reducer = PaCMAPGPU(batch_size=2048, n_neighbors=5)
   ```

3. **Import Errors**
   ```bash
   # Verify installation
   pacmap-gpu-check
   
   # Reinstall if needed
   pip install --force-reinstall pacmap-gpu
   ```
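
For out-of-memory errors, the batch size can also be lowered automatically. The sketch below halves `batch_size` until a fit succeeds. It uses a stand-in class so that it runs without a GPU; `fit_with_backoff` and `_FakeReducer` are hypothetical, and which exception PaCMAP-GPU actually raises on GPU memory exhaustion is an assumption.

```python
def fit_with_backoff(make_reducer, X, batch_size=8192, min_batch_size=512,
                     oom_errors=(MemoryError,)):
    """Retry fit_transform with a halved batch size after each OOM error.

    make_reducer(batch_size) must return an object with fit_transform(X).
    Returns (embedding, batch_size_that_worked).
    """
    while True:
        try:
            return make_reducer(batch_size).fit_transform(X), batch_size
        except oom_errors:
            if batch_size // 2 < min_batch_size:
                raise  # give up: even the smallest allowed batch failed
            batch_size //= 2


class _FakeReducer:
    """Stand-in for PaCMAPGPU that simulates OOM above batch_size 2048."""

    def __init__(self, batch_size):
        self.batch_size = batch_size

    def fit_transform(self, X):
        if self.batch_size > 2048:
            raise MemoryError("simulated out-of-memory")
        return [[0.0, 0.0] for _ in X]


embedding, used = fit_with_backoff(_FakeReducer, [[1.0], [2.0], [3.0]])
print("Succeeded with batch_size =", used)
```

With the real package this would be called as `fit_with_backoff(lambda bs: PaCMAPGPU(batch_size=bs), X, oom_errors=(torch.cuda.OutOfMemoryError,))`, assuming that is the exception raised, ideally with `clear_gpu_memory()` called between attempts.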

### Performance Issues

- **Slow on small datasets**: GPU overhead dominates on small inputs; there is no CPU fallback, so a GPU is still required
- **GPU underutilized**: Increase the dataset size or `batch_size`
- **Memory warnings**: Monitor GPU memory (for example with `nvidia-smi`) and call `clear_gpu_memory()` between runs

## 🤝 Contributing

We welcome contributions! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make changes and add tests
4. Run the test suite (`pytest tests/`)
5. Commit changes (`git commit -m 'Add amazing feature'`)
6. Push to branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

### Development Setup

```bash
# Clone repository
git clone https://github.com/AntonioRoye/PaCMAP-GPU.git
cd PaCMAP-GPU

# Set up conda environment with all dependencies (recommended)
# Note: conda package names drop the pip wheel suffixes (cuml, not cuml-cu12)
conda create -n pacmap-gpu-dev -c rapidsai -c conda-forge -c pytorch -c nvidia \
    cuml cudf cupy faiss-gpu \
    python=3.10 pytorch pytorch-cuda=12.1
conda activate pacmap-gpu-dev

# Install package and development tools
pip install -e ".[dev]"

# Or alternatively, install from requirements file + dev extras
# pip install -r requirements-cuda12.txt
# pip install -e ".[dev]"

# Run tests
pytest tests/

# Run GPU-specific tests (requires CUDA)
pytest tests/ -m gpu

# Format code
black source/

# Type checking
mypy source/
```

## 📚 Citation

If you use PaCMAP-GPU in your research, please cite:

```bibtex
@article{JMLR:v22:20-1061,
  author  = {Yingfan Wang and Haiyang Huang and Cynthia Rudin and Yaron Shaposhnik},
  title   = {Understanding How Dimension Reduction Tools Work: An Empirical Approach to Deciphering t-SNE, UMAP, TriMap, and PaCMAP for Data Visualization},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {201},
  pages   = {1-73},
  url     = {http://jmlr.org/papers/v22/20-1061.html}
}

@software{pacmap_gpu2025,
  title={PaCMAP-GPU: GPU-Accelerated Pairwise Controlled Manifold Approximation},
  author={Antonio Roye-Azar},
  year={2025},
  url={https://github.com/AntonioRoye/PaCMAP-GPU}
}
```

## 📄 License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## 🔗 Links

- **Original PaCMAP**: [https://github.com/YingfanWang/PaCMAP](https://github.com/YingfanWang/PaCMAP)
- **PyTorch**: [https://pytorch.org/](https://pytorch.org/)
- **RAPIDS cuML**: [https://docs.rapids.ai/api/cuml/stable/](https://docs.rapids.ai/api/cuml/stable/)
- **Issues**: [https://github.com/AntonioRoye/PaCMAP-GPU/issues](https://github.com/AntonioRoye/PaCMAP-GPU/issues)

## 🙏 Acknowledgments

- Original PaCMAP authors for the algorithm and reference implementation
- RAPIDS team for GPU acceleration libraries
- PyTorch team for the deep learning framework
- NVIDIA for CUDA and GPU computing platform

---

**Made with ❤️ for the GPU computing community**
