# libcuml-cu12

- Name: libcuml-cu12
- Version: 25.2.1
- Summary: cuML - RAPIDS ML Algorithms (C++)
- Author: NVIDIA Corporation
- License: Apache 2.0
- Requires Python: >=3.10
- Homepage: https://github.com/rapidsai/cuml
- Upload time: 2025-03-03 23:09:04
- Requirements: none recorded
# <div align="left"><img src="img/rapids_logo.png" width="90px"/>&nbsp;cuML - GPU Machine Learning Algorithms</div>

cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions sharing compatible APIs with other [RAPIDS](https://rapids.ai/) projects.

cuML enables data scientists, researchers, and software engineers to run
traditional tabular ML tasks on GPUs without going into the details of CUDA
programming. In most cases, cuML's Python API matches the API from
[scikit-learn](https://scikit-learn.org).
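Because the APIs match, familiar scikit-learn code typically ports with little more than an import change. As a CPU-side sketch (using scikit-learn itself; a cuML version would swap `sklearn.linear_model` for `cuml.linear_model`):

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # cuML analogue: from cuml.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0]], dtype=np.float32)
y = np.array([2.0, 4.0, 6.0], dtype=np.float32)

# Identical construct/fit/predict pattern in both libraries
model = LinearRegression()
model.fit(X, y)
preds = model.predict(np.array([[4.0]], dtype=np.float32))  # ~8.0 for this exact linear fit
```

The same estimator names, hyperparameters, and fitted attributes (e.g. `coef_`, `intercept_`) carry over in most cases.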

For large datasets, these GPU-based implementations can complete 10-50x faster
than their CPU equivalents. For details on performance, see the [cuML Benchmarks
Notebook](https://github.com/rapidsai/cuml/tree/branch-25.02/notebooks/tools).

As an example, the following Python snippet loads input and computes DBSCAN clusters, all on GPU, using cuDF:
```python
import cudf
from cuml.cluster import DBSCAN

# Create and populate a GPU DataFrame
gdf_float = cudf.DataFrame()
gdf_float['0'] = [1.0, 2.0, 5.0]
gdf_float['1'] = [4.0, 2.0, 1.0]
gdf_float['2'] = [4.0, 2.0, 1.0]

# Setup and fit clusters
dbscan_float = DBSCAN(eps=1.0, min_samples=1)
dbscan_float.fit(gdf_float)

print(dbscan_float.labels_)
```

Output:
```
0    0
1    1
2    2
dtype: int32
```
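A quick sanity check of this result: with `eps=1.0` and `min_samples=1`, every point is a core point, and since the closest pair of rows above is 3.0 apart (greater than `eps`), each point lands in its own cluster. Because the APIs match, the same labels can be reproduced on CPU with scikit-learn's `DBSCAN`:

```python
import numpy as np
from sklearn.cluster import DBSCAN  # cuML analogue: from cuml.cluster import DBSCAN

# Same three points as the cuDF example above (rows of the DataFrame)
X = np.array([[1.0, 4.0, 4.0],
              [2.0, 2.0, 2.0],
              [5.0, 1.0, 1.0]])

labels = DBSCAN(eps=1.0, min_samples=1).fit(X).labels_
print(labels)  # [0 1 2] -- every point forms its own cluster
```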

cuML also features multi-GPU and multi-node-multi-GPU operation, using [Dask](https://www.dask.org), for a
growing list of algorithms. The following Python snippet reads input from a CSV file and performs
a NearestNeighbors query across a cluster of Dask workers, using multiple GPUs on a single node:


Initialize a `LocalCUDACluster` configured with [UCX](https://github.com/rapidsai/ucx-py) for fast transport of CUDA arrays:
```python
# Initialize UCX for high-speed transport of CUDA arrays
from dask_cuda import LocalCUDACluster

# Create a Dask single-node CUDA cluster w/ one worker per device
cluster = LocalCUDACluster(protocol="ucx",
                           enable_tcp_over_ucx=True,
                           enable_nvlink=True,
                           enable_infiniband=False)
```

Load data and perform `k-Nearest Neighbors` search. `cuml.dask` estimators also support `Dask.Array` as input:
```python
from dask.distributed import Client
client = Client(cluster)

# Read CSV file in parallel across workers
import dask_cudf
df = dask_cudf.read_csv("/path/to/csv")

# Fit a NearestNeighbors model and query it
from cuml.dask.neighbors import NearestNeighbors
nn = NearestNeighbors(n_neighbors=10, client=client)
nn.fit(df)
neighbors = nn.kneighbors(df)
```
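`kneighbors` returns the distances to, and indices of, each row's nearest neighbors. The distributed estimator mirrors scikit-learn's single-node `NearestNeighbors`, so the shape of the result can be previewed on CPU (scikit-learn stand-in, assuming a small illustrative dataset):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors  # cuML MNMG analogue: from cuml.dask.neighbors import NearestNeighbors

X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])

nn = NearestNeighbors(n_neighbors=2).fit(X)
distances, indices = nn.kneighbors(X)
# Each row lists its 2 nearest neighbors; when querying the training
# data, the first neighbor is the point itself (distance 0)
print(indices[0])    # [0 1]
print(distances[0])  # [0. 1.]
```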

For additional examples, browse our complete [API
documentation](https://docs.rapids.ai/api/cuml/stable/), or check out our
example [walkthrough
notebooks](https://github.com/rapidsai/cuml/tree/branch-25.02/notebooks). Finally, you
can find complete end-to-end examples in the [notebooks-contrib
repo](https://github.com/rapidsai/notebooks-contrib).


### Supported Algorithms
| Category | Algorithm | Notes |
| --- | --- | --- |
| **Clustering** |  Density-Based Spatial Clustering of Applications with Noise (DBSCAN) | Multi-node multi-GPU via Dask |
|  | Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN)  | |
|  | K-Means | Multi-node multi-GPU via Dask |
|  | Single-Linkage Agglomerative Clustering | |
| **Dimensionality Reduction** | Principal Components Analysis (PCA) | Multi-node multi-GPU via Dask|
| | Incremental PCA | |
| | Truncated Singular Value Decomposition (tSVD) | Multi-node multi-GPU via Dask |
| | Uniform Manifold Approximation and Projection (UMAP) | Multi-node multi-GPU Inference via Dask |
| | Random Projection | |
| | t-Distributed Stochastic Neighbor Embedding (TSNE) | |
| **Linear Models for Regression or Classification** | Linear Regression (OLS) | Multi-node multi-GPU via Dask |
| | Linear Regression with Lasso or Ridge Regularization | Multi-node multi-GPU via Dask |
| | ElasticNet Regression | |
| | LARS Regression | (experimental) |
| | Logistic Regression | Multi-node multi-GPU via Dask-GLM [demo](https://github.com/daxiongshu/rapids-demos) |
| | Naive Bayes | Multi-node multi-GPU via Dask |
| | Stochastic Gradient Descent (SGD), Coordinate Descent (CD), and Quasi-Newton (QN) (including L-BFGS and OWL-QN) solvers for linear models  | |
| **Nonlinear Models for Regression or Classification** | Random Forest (RF) Classification | Experimental multi-node multi-GPU via Dask |
| | Random Forest (RF) Regression | Experimental multi-node multi-GPU via Dask |
| | Inference for decision tree-based models | Forest Inference Library (FIL) |
|  | K-Nearest Neighbors (KNN) Classification | Multi-node multi-GPU via Dask+[UCX](https://github.com/rapidsai/ucx-py), uses [Faiss](https://github.com/facebookresearch/faiss) for Nearest Neighbors Query. |
|  | K-Nearest Neighbors (KNN) Regression | Multi-node multi-GPU via Dask+[UCX](https://github.com/rapidsai/ucx-py), uses [Faiss](https://github.com/facebookresearch/faiss) for Nearest Neighbors Query. |
|  | Support Vector Machine Classifier (SVC) | |
|  | Epsilon-Support Vector Regression (SVR) | |
| **Preprocessing** | Standardization (mean removal and variance scaling), normalization, categorical feature encoding, discretization, imputation of missing values, polynomial feature generation; custom transformers and non-linear transformations coming soon | Based on Scikit-Learn preprocessing |
| **Time Series** | Holt-Winters Exponential Smoothing | |
|  | Auto-regressive Integrated Moving Average (ARIMA) | Supports seasonality (SARIMA) |
| **Model Explanation** | SHAP Kernel Explainer | [Based on SHAP](https://shap.readthedocs.io/en/latest/) |
| | SHAP Permutation Explainer | [Based on SHAP](https://shap.readthedocs.io/en/latest/) |
| **Execution device interoperability** | | Run estimators interchangeably from host/cpu or device/gpu with minimal code change [demo](https://docs.rapids.ai/api/cuml/stable/execution_device_interoperability.html) |
| **Other**                                             | K-Nearest Neighbors (KNN) Search                                                                                                          | Multi-node multi-GPU via Dask+[UCX](https://github.com/rapidsai/ucx-py), uses [Faiss](https://github.com/facebookresearch/faiss) for Nearest Neighbors Query. |
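The preprocessing entries above follow scikit-learn's transformer API (`fit`/`transform`). As a CPU sketch of the pattern cuML mirrors, using scikit-learn's `StandardScaler` (the cuML counterpart lives under `cuml.preprocessing`):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler  # cuML analogue: from cuml.preprocessing import StandardScaler

X = np.array([[1.0], [2.0], [3.0]])

scaler = StandardScaler()           # mean removal and variance scaling
X_scaled = scaler.fit_transform(X)  # column now has mean 0 and unit variance
print(X_scaled.ravel())
```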

---

## Installation

See [the RAPIDS Release Selector](https://docs.rapids.ai/install#selector) for
the command line to install either nightly or official release cuML packages
via Conda or Docker.

## Build/Install from Source
See the build [guide](BUILD.md).

## Contributing

Please see our [guide for contributing to cuML](CONTRIBUTING.md).

## References

The RAPIDS team has a number of blogs with deeper technical dives and examples. [You can find them here on Medium.](https://medium.com/rapids-ai/tagged/machine-learning)

For additional details on the technologies behind cuML, as well as a broader overview of the Python Machine Learning landscape, see [_Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence_ (2020)](https://arxiv.org/abs/2002.04803) by Sebastian Raschka, Joshua Patterson, and Corey Nolet.

Please consider citing this paper when using cuML in a project. You can use the following BibTeX citation:

```bibtex
@article{raschka2020machine,
  title={Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence},
  author={Raschka, Sebastian and Patterson, Joshua and Nolet, Corey},
  journal={arXiv preprint arXiv:2002.04803},
  year={2020}
}
```

## Contact

Find out more details on the [RAPIDS site](https://rapids.ai/community.html).

## <div align="left"><img src="img/rapids_logo.png" width="265px"/></div> Open GPU Data Science

The RAPIDS suite of open source software libraries aims to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, while exposing that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

<p align="center"><img src="img/rapids_arrow.png" width="80%"/></p>

            
