pdc-dp-means


Name: pdc-dp-means
Version: 0.0.8
Upload time: 2024-07-20 20:33:55
Author: Or Dinari
License: BSD3
Keywords: dp-means, clustering
Requirements: No requirements were recorded.
# Parallel Delayed Cluster DP-Means

[Paper](https://openreview.net/pdf?id=rnzVBD8jqlq)

### Introduction
The PDC-DP-Means package provides a highly optimized implementation of the DP-Means algorithm, a new parallel algorithm called Parallel Delayed Cluster DP-Means (PDC-DP-Means), and a MiniBatch implementation for additional speed. Together, these enable scalable and efficient cluster analysis when the number of clusters is unknown.

In addition to major speed improvements, PDC-DP-Means supports an optional online mode for real-time data processing. Its scikit-learn-like interface is designed for easy integration into existing data workflows. PDC-DP-Means outperforms other nonparametric clustering methods in both efficiency and scalability.

See the paper for more details.
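
As a rough illustration of the idea named in the title, the sketch below shows one possible reading of delayed cluster creation in plain Python. Every point is assigned to its nearest center (a step that is independent per point, so it could run in parallel), and at most one new cluster is opened per iteration, seeded by the point farthest from its nearest center when that squared distance exceeds the threshold. The function names, the initialization at the global mean, and the stopping rule are assumptions made for this example. It is not the package's implementation, so consult the paper for the actual algorithm.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def delayed_dp_means(points, delta, max_iter=100):
    """Hypothetical sketch: DP-Means with at most one new cluster per iteration."""
    centers = [centroid(points)]  # assumption: start from a single global cluster
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point is handled independently of the others.
        new_labels, dists = [], []
        for p in points:
            d = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=d.__getitem__)
            new_labels.append(k)
            dists.append(d[k])

        # Delayed creation: only the farthest point may open a new cluster.
        far = max(range(len(points)), key=dists.__getitem__)
        created = dists[far] > delta
        if created:
            new_labels[far] = len(centers)

        # Update step: drop empty clusters, relabel, and recompute the means.
        used = sorted(set(new_labels))
        remap = {old: new for new, old in enumerate(used)}
        new_labels = [remap[k] for k in new_labels]
        centers = [
            centroid([p for p, k in zip(points, new_labels) if k == j])
            for j in range(len(used))
        ]

        # Stop once no cluster was created and no assignment changed.
        if not created and new_labels == labels:
            break
        labels = new_labels
    return centers, labels


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
centers, labels = delayed_dp_means(points, delta=4.0)
print(centers)  # [(0.5, 0.5), (10.0, 10.0)]
print(labels)   # [0, 0, 0, 0, 1]
```

With a very large `delta` (for example `delta=1000.0` on the same points), no distance exceeds the threshold and all points stay in a single cluster.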


### Installation
`pip install pdc-dp-means`

### Quick Start

    from sklearn.datasets import make_blobs
    from pdc_dp_means import DPMeans

    # Generate sample data
    X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

    # Apply DPMeans clustering; delta corresponds to lambda in the paper
    dpmeans = DPMeans(n_clusters=1, n_init=10, delta=10)
    dpmeans.fit(X)

    # Predict the cluster for each data point
    y_dpmeans = dpmeans.predict(X)

    # Plotting clusters and centroids
    import matplotlib.pyplot as plt

    plt.scatter(X[:, 0], X[:, 1], c=y_dpmeans, s=50, cmap='viridis')
    centers = dpmeans.cluster_centers_
    plt.scatter(centers[:, 0], centers[:, 1], c='black', s=200, alpha=0.5)
    plt.show()

Note that the `\lambda` parameter from the paper is called `delta` in the code, because `lambda` is a reserved word in Python.
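
To make the role of this threshold concrete, here is a minimal plain-Python sketch of the assignment rule from the classic DP-Means formulation: a point joins its nearest cluster unless it is too far away, in which case it starts a new cluster. The helper `assign_or_create` is hypothetical and not part of the package, and comparing `delta` against the squared Euclidean distance is an assumption taken from the original DP-Means formulation rather than something this README states.

```python
def assign_or_create(point, centers, delta):
    """Return the cluster index for `point`, appending a new center if needed.

    Hypothetical helper for illustration; `centers` is modified in place.
    """
    # Squared Euclidean distance from the point to every existing center.
    dists = [sum((x - c) ** 2 for x, c in zip(point, center)) for center in centers]
    nearest = min(range(len(centers)), key=dists.__getitem__)
    if dists[nearest] > delta:
        # Too far from every center: the point becomes a new cluster center.
        centers.append(tuple(point))
        return len(centers) - 1
    return nearest


centers = [(0.0, 0.0)]
print(assign_or_create((1.0, 1.0), centers, delta=10))  # 0 (squared distance 2 <= 10)
print(assign_or_create((5.0, 5.0), centers, delta=10))  # 1 (squared distance 50 > 10)
print(len(centers))                                     # 2
```

Under this rule, a larger `delta` makes new clusters harder to create and therefore tends to yield fewer clusters, while a smaller `delta` yields more.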

### Usage
Please refer to the documentation: https://pdc-dp-means.readthedocs.io/en/latest/

### Paper Code
Please refer to https://github.com/BGU-CS-VIL/pdc-dp-means/tree/main/paper_code for the code used in the paper.

### Citing this work
If you use this code for your work, please cite the following:

```
@inproceedings{dinari2022revisiting,
  title={Revisiting {DP}-Means: Fast Scalable Algorithms via Parallelism and Delayed Cluster Creation},
  author={Dinari, Or and Freifeld, Oren},
  booktitle={The 38th Conference on Uncertainty in Artificial Intelligence},
  year={2022}
}
```
### License
Our code is licensed under the BSD-3-Clause license.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "pdc-dp-means",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "dp-means clustering",
    "author": "Or Dinari",
    "author_email": "dinari.or@gmail.com",
    "download_url": null,
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD3",
    "summary": null,
    "version": "0.0.8",
    "project_urls": {
        "Documentation": "https://pdc-dp-means.readthedocs.io/en/latest/",
        "Source": "https://github.com/BGU-CS-VIL/pdc-dp-means",
        "Tracker": "https://github.com/BGU-CS-VIL/pdc-dp-means"
    },
    "split_keywords": [
        "dp-means",
        "clustering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "80df22fa8a25802fffb2b5ed9d57e6e2f55d9a60b95f29bf4ce49a8b916e3770",
                "md5": "3d47df2a6890b0a8fe68a9e58e644494",
                "sha256": "83b3069a0fa078a90d9db48272705880272092476de508b30a149fe85a07af5b"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp310-cp310-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "3d47df2a6890b0a8fe68a9e58e644494",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": null,
            "size": 2561225,
            "upload_time": "2024-07-20T20:33:55",
            "upload_time_iso_8601": "2024-07-20T20:33:55.686222Z",
            "url": "https://files.pythonhosted.org/packages/80/df/22fa8a25802fffb2b5ed9d57e6e2f55d9a60b95f29bf4ce49a8b916e3770/pdc_dp_means-0.0.8-cp310-cp310-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dc932e9a14ab03f298bc544633851e0d962b57d1d1e394db415426fad52f40fb",
                "md5": "de03312cef756e041f78e72bf5635aea",
                "sha256": "8baeeb74efe8abca3d70bd7e5f1c6e4d9633c3ceea8eaf125b1c67d2ac887bf2"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "de03312cef756e041f78e72bf5635aea",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": null,
            "size": 3079251,
            "upload_time": "2024-07-20T20:33:57",
            "upload_time_iso_8601": "2024-07-20T20:33:57.409929Z",
            "url": "https://files.pythonhosted.org/packages/dc/93/2e9a14ab03f298bc544633851e0d962b57d1d1e394db415426fad52f40fb/pdc_dp_means-0.0.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1bd8cb8e04a7730d9d39d7044907972d23614eb38e415deda321fb20c0afa6a6",
                "md5": "1130d0ea2d32a02d132b84a75852365a",
                "sha256": "24b60975074a301b5f1a82373007372a7bd4b0d47b409194693ce01d86f29e1e"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp310-cp310-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "1130d0ea2d32a02d132b84a75852365a",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": null,
            "size": 2561299,
            "upload_time": "2024-07-20T20:33:59",
            "upload_time_iso_8601": "2024-07-20T20:33:59.066295Z",
            "url": "https://files.pythonhosted.org/packages/1b/d8/cb8e04a7730d9d39d7044907972d23614eb38e415deda321fb20c0afa6a6/pdc_dp_means-0.0.8-cp310-cp310-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0a531836fbaa5e1fe8f4b635750c4ae6fe99adb5f150e074143d07938bb73477",
                "md5": "b19d9510af655cacae1c4ad7d528b196",
                "sha256": "39ff67a3b65bc66688cdeee2b67cb3158e890fe8fc8bb7a04cf03948c1bb0eb0"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp311-cp311-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "b19d9510af655cacae1c4ad7d528b196",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": null,
            "size": 2561215,
            "upload_time": "2024-07-20T20:34:00",
            "upload_time_iso_8601": "2024-07-20T20:34:00.689610Z",
            "url": "https://files.pythonhosted.org/packages/0a/53/1836fbaa5e1fe8f4b635750c4ae6fe99adb5f150e074143d07938bb73477/pdc_dp_means-0.0.8-cp311-cp311-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7c1d8b748ac8a7ed1606b09af54aea1d395643c2c524bfd84859385d2e8be311",
                "md5": "2d029d04c00212bc4419b3b83f0fe24e",
                "sha256": "a691bbb15f59100a010358a9bc55b766d95736ad1ca53888b87bf969b0cf854e"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "2d029d04c00212bc4419b3b83f0fe24e",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": null,
            "size": 3127374,
            "upload_time": "2024-07-20T20:34:02",
            "upload_time_iso_8601": "2024-07-20T20:34:02.315791Z",
            "url": "https://files.pythonhosted.org/packages/7c/1d/8b748ac8a7ed1606b09af54aea1d395643c2c524bfd84859385d2e8be311/pdc_dp_means-0.0.8-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c4d495057618524e451e682f022175676811929dd6db34ba255b128d91ecc984",
                "md5": "e4f92fac50e2c9d164bbf37eb27d1914",
                "sha256": "126b197dce8e96932ec7e55b21baf9a8196ff949aaa3d51b8ab7cf7247ffc629"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp311-cp311-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "e4f92fac50e2c9d164bbf37eb27d1914",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": null,
            "size": 2562032,
            "upload_time": "2024-07-20T20:34:03",
            "upload_time_iso_8601": "2024-07-20T20:34:03.990419Z",
            "url": "https://files.pythonhosted.org/packages/c4/d4/95057618524e451e682f022175676811929dd6db34ba255b128d91ecc984/pdc_dp_means-0.0.8-cp311-cp311-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "014e5a3d1f5b14ea36186bcf4e7be4b4b7d8313bae9c10dba5bdaa7836858999",
                "md5": "050d096e6a1430b783c23881b2843d7b",
                "sha256": "ba99c143a4f1eb1b0b81fe69b3d04a7b6109d0daa61aee4dd0f8bb47ddb0bdbb"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp312-cp312-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "050d096e6a1430b783c23881b2843d7b",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": null,
            "size": 2562294,
            "upload_time": "2024-07-20T20:34:05",
            "upload_time_iso_8601": "2024-07-20T20:34:05.484892Z",
            "url": "https://files.pythonhosted.org/packages/01/4e/5a3d1f5b14ea36186bcf4e7be4b4b7d8313bae9c10dba5bdaa7836858999/pdc_dp_means-0.0.8-cp312-cp312-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ac11a9d68aaa0cb6f436b873959151631ec7aa78973d41c2afe2c5d1fb7d42c7",
                "md5": "644066510f3611e2541e8f73e1a2df60",
                "sha256": "9282b74e086c1cf966888e0f98462abd96e1be4d89265b669a1bfa962a7e04fd"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "644066510f3611e2541e8f73e1a2df60",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": null,
            "size": 3109233,
            "upload_time": "2024-07-20T20:34:08",
            "upload_time_iso_8601": "2024-07-20T20:34:08.873347Z",
            "url": "https://files.pythonhosted.org/packages/ac/11/a9d68aaa0cb6f436b873959151631ec7aa78973d41c2afe2c5d1fb7d42c7/pdc_dp_means-0.0.8-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1f6c5f3b87f647cd9934d8fa0b4138ff5d7c896b69c8d84450ab24dce472dd92",
                "md5": "bd0d817be7b358c93c0d36ae77b2d085",
                "sha256": "c86709f23ac22497b83884727f028930e3c9edcef7c4515f55dec41c4aa3fed6"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp312-cp312-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "bd0d817be7b358c93c0d36ae77b2d085",
            "packagetype": "bdist_wheel",
            "python_version": "cp312",
            "requires_python": null,
            "size": 2563415,
            "upload_time": "2024-07-20T20:34:10",
            "upload_time_iso_8601": "2024-07-20T20:34:10.581445Z",
            "url": "https://files.pythonhosted.org/packages/1f/6c/5f3b87f647cd9934d8fa0b4138ff5d7c896b69c8d84450ab24dce472dd92/pdc_dp_means-0.0.8-cp312-cp312-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fdaf4d789b8a833fa48d2cc80c8c18c86316726b6600bc0ba9b52d2a32082372",
                "md5": "5b0adb9670266b938b6d27c6863cc803",
                "sha256": "c4c1d95d445b194ed22df0e3704a6c37e1c102f2292796fdebb1acfc6c30405e"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp39-cp39-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "5b0adb9670266b938b6d27c6863cc803",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": null,
            "size": 2561772,
            "upload_time": "2024-07-20T20:34:12",
            "upload_time_iso_8601": "2024-07-20T20:34:12.083295Z",
            "url": "https://files.pythonhosted.org/packages/fd/af/4d789b8a833fa48d2cc80c8c18c86316726b6600bc0ba9b52d2a32082372/pdc_dp_means-0.0.8-cp39-cp39-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5b42178bca8b3e850e4361b7515e0376b51eb46bcc99dae69fc8f555d77d3ec7",
                "md5": "8070bd5ee62da933482a24f2c223f330",
                "sha256": "baa7a6e0f87f665cb9178d77a22458ab246e95486878d00cfe8aaa2ec44b08a0"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "8070bd5ee62da933482a24f2c223f330",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": null,
            "size": 3081138,
            "upload_time": "2024-07-20T20:34:13",
            "upload_time_iso_8601": "2024-07-20T20:34:13.339029Z",
            "url": "https://files.pythonhosted.org/packages/5b/42/178bca8b3e850e4361b7515e0376b51eb46bcc99dae69fc8f555d77d3ec7/pdc_dp_means-0.0.8-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8428a58832ef63413a058d711dc3b09df08bb5b313a729b2b44b8ad14b2ec40e",
                "md5": "21c5f0be0086959ee274c331680095a3",
                "sha256": "255b112f408aa04281ad26b7aa503324960196d5d6e9b90ddd110a5e5a463dc2"
            },
            "downloads": -1,
            "filename": "pdc_dp_means-0.0.8-cp39-cp39-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "21c5f0be0086959ee274c331680095a3",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": null,
            "size": 2561843,
            "upload_time": "2024-07-20T20:34:14",
            "upload_time_iso_8601": "2024-07-20T20:34:14.826110Z",
            "url": "https://files.pythonhosted.org/packages/84/28/a58832ef63413a058d711dc3b09df08bb5b313a729b2b44b8ad14b2ec40e/pdc_dp_means-0.0.8-cp39-cp39-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-20 20:33:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "BGU-CS-VIL",
    "github_project": "pdc-dp-means",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "pdc-dp-means"
}
        