kmeans1d


Namekmeans1d JSON
Version 0.4.0 PyPI version JSON
download
home_pagehttps://github.com/dstein64/kmeans1d
SummaryA Python package for optimal 1D k-means clustering
upload_time2024-07-03 17:45:19
maintainerNone
docs_urlNone
authorDaniel Steinberg
requires_python>=3.6
licenseMIT
keywords k-means machine learning optimization
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            [![Build Status](https://github.com/dstein64/kmeans1d/workflows/build/badge.svg)](https://github.com/dstein64/kmeans1d/actions)

kmeans1d
========

A Python library with an implementation of *k*-means clustering on 1D data, based on the algorithm
from Xiaolin (1991), as presented by Gronlund et al. (2017, Section 2.2).

Globally optimal *k*-means clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a
popular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial
time algorithms. The algorithm implemented here is an *O(kn + n log n)* dynamic programming algorithm
for finding the globally optimal *k* clusters for *n* 1D data points.

The code is written in C++, and wrapped with Python.

Requirements
------------

*kmeans1d* supports Python 3.x.

Installation
------------

[kmeans1d](https://pypi.python.org/pypi/kmeans1d) is available on PyPI, the Python Package Index.

```sh
$ pip3 install kmeans1d
```

Example Usage
-------------

```python
import kmeans1d

x = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]
k = 4

clusters, centroids = kmeans1d.cluster(x, k)

print(clusters)   # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]
print(centroids)  # [-50.0, 4.1, 94.0, 200.5]
```

Tests
-----

Tests are in [tests/](https://github.com/dstein64/kmeans1d/blob/master/tests).

```sh
# Run tests
$ python3 -m unittest discover tests -v
```

Development
-----------

The underlying C++ code can be built in-place, outside the context of `pip`. This requires Python
development tools for building Python modules (e.g., the `python3-dev` package on Ubuntu). `gcc`,
`clang`, and `MSVC` have been tested.

```
$ python3 setup.py build_ext --inplace
```

The [packages](https://github.com/dstein64/kmeans1d/blob/master/.github/workflows/packages.yml)
GitHub action can be manually triggered (`Actions` > `packages` > `Run workflow`) to build wheels
and a source distribution.

License
-------

The code in this repository has an [MIT License](https://en.wikipedia.org/wiki/MIT_License).

See [LICENSE](https://github.com/dstein64/kmeans1d/blob/master/LICENSE).

References
----------

[1] Wu, Xiaolin. "Optimal Quantization by Matrix Searching." Journal of Algorithms 12, no. 4
(December 1, 1991): 663

[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider,
and Mingzhou Song. "Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D."
ArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/dstein64/kmeans1d",
    "name": "kmeans1d",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": "k-means, machine learning, optimization",
    "author": "Daniel Steinberg",
    "author_email": "ds@dannyadam.com",
    "download_url": "https://files.pythonhosted.org/packages/18/5b/0ba1f71363e62fb2bd5370ec9b26360c482bd9ad06082b5dcfb03e148928/kmeans1d-0.4.0.tar.gz",
    "platform": null,
    "description": "[![Build Status](https://github.com/dstein64/kmeans1d/workflows/build/badge.svg)](https://github.com/dstein64/kmeans1d/actions)\n\nkmeans1d\n========\n\nA Python library with an implementation of *k*-means clustering on 1D data, based on the algorithm\nfrom Xiaolin (1991), as presented by Gronlund et al. (2017, Section 2.2).\n\nGlobally optimal *k*-means clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a\npopular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial\ntime algorithms. The algorithm implemented here is an *O(kn + n log n)* dynamic programming algorithm\nfor finding the globally optimal *k* clusters for *n* 1D data points.\n\nThe code is written in C++, and wrapped with Python.\n\nRequirements\n------------\n\n*kmeans1d* supports Python 3.x.\n\nInstallation\n------------\n\n[kmeans1d](https://pypi.python.org/pypi/kmeans1d) is available on PyPI, the Python Package Index.\n\n```sh\n$ pip3 install kmeans1d\n```\n\nExample Usage\n-------------\n\n```python\nimport kmeans1d\n\nx = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]\nk = 4\n\nclusters, centroids = kmeans1d.cluster(x, k)\n\nprint(clusters)   # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]\nprint(centroids)  # [-50.0, 4.1, 94.0, 200.5]\n```\n\nTests\n-----\n\nTests are in [tests/](https://github.com/dstein64/kmeans1d/blob/master/tests).\n\n```sh\n# Run tests\n$ python3 -m unittest discover tests -v\n```\n\nDevelopment\n-----------\n\nThe underlying C++ code can be built in-place, outside the context of `pip`. This requires Python\ndevelopment tools for building Python modules (e.g., the `python3-dev` package on Ubuntu). `gcc`,\n`clang`, and `MSVC` have been tested.\n\n```\n$ python3 setup.py build_ext --inplace\n```\n\nThe [packages](https://github.com/dstein64/kmeans1d/blob/master/.github/workflows/packages.yml)\nGitHub action can be manually triggered (`Actions` > `packages` > `Run workflow`) to build wheels\nand a source distribution.\n\nLicense\n-------\n\nThe code in this repository has an [MIT License](https://en.wikipedia.org/wiki/MIT_License).\n\nSee [LICENSE](https://github.com/dstein64/kmeans1d/blob/master/LICENSE).\n\nReferences\n----------\n\n[1] Wu, Xiaolin. \"Optimal Quantization by Matrix Searching.\" Journal of Algorithms 12, no. 4\n(December 1, 1991): 663\n\n[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider,\nand Mingzhou Song. \"Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D.\"\nArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Python package for optimal 1D k-means clustering",
    "version": "0.4.0",
    "project_urls": {
        "Homepage": "https://github.com/dstein64/kmeans1d"
    },
    "split_keywords": [
        "k-means",
        " machine learning",
        " optimization"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "191957056e4566611f5da2166c52a9f88185c177a224d0ecab1dd632cffa1c88",
                "md5": "27cc446ada593465911564900b00f7cf",
                "sha256": "9a052bca5bbacbd70a7f4846896439ebed29e96a2b04115c1fec480555cad70d"
            },
            "downloads": -1,
            "filename": "kmeans1d-0.4.0-cp32-abi3-macosx_11_0_universal2.whl",
            "has_sig": false,
            "md5_digest": "27cc446ada593465911564900b00f7cf",
            "packagetype": "bdist_wheel",
            "python_version": "cp32",
            "requires_python": ">=3.6",
            "size": 28571,
            "upload_time": "2024-07-03T17:45:15",
            "upload_time_iso_8601": "2024-07-03T17:45:15.561866Z",
            "url": "https://files.pythonhosted.org/packages/19/19/57056e4566611f5da2166c52a9f88185c177a224d0ecab1dd632cffa1c88/kmeans1d-0.4.0-cp32-abi3-macosx_11_0_universal2.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ba0c63c31443d391f5452aa21e7d5e10106de06cd6ea42bf0a0689b01900139f",
                "md5": "2ad01bfbbc32032aebc10b6c03f7501d",
                "sha256": "ad8017fdb82746c718215695b4290e9bd6ded0305740e2fe2b1ede697a14e947"
            },
            "downloads": -1,
            "filename": "kmeans1d-0.4.0-cp32-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "2ad01bfbbc32032aebc10b6c03f7501d",
            "packagetype": "bdist_wheel",
            "python_version": "cp32",
            "requires_python": ">=3.6",
            "size": 122693,
            "upload_time": "2024-07-03T17:45:17",
            "upload_time_iso_8601": "2024-07-03T17:45:17.428420Z",
            "url": "https://files.pythonhosted.org/packages/ba/0c/63c31443d391f5452aa21e7d5e10106de06cd6ea42bf0a0689b01900139f/kmeans1d-0.4.0-cp32-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d55adb6e1f27d31db470e53a6b70009216891050a7396a70584c9600a5b98bd4",
                "md5": "a327d8ece92b60bb366631897a8346d4",
                "sha256": "5943d68f6514309023ace23d8ad3f43703871e87a021e8024cdf4db79ed2f2e1"
            },
            "downloads": -1,
            "filename": "kmeans1d-0.4.0-cp32-abi3-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "a327d8ece92b60bb366631897a8346d4",
            "packagetype": "bdist_wheel",
            "python_version": "cp32",
            "requires_python": ">=3.6",
            "size": 18133,
            "upload_time": "2024-07-03T17:45:18",
            "upload_time_iso_8601": "2024-07-03T17:45:18.989084Z",
            "url": "https://files.pythonhosted.org/packages/d5/5a/db6e1f27d31db470e53a6b70009216891050a7396a70584c9600a5b98bd4/kmeans1d-0.4.0-cp32-abi3-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "185b0ba1f71363e62fb2bd5370ec9b26360c482bd9ad06082b5dcfb03e148928",
                "md5": "2b7eb7a6a44a2717e4717a07bd2cb6c6",
                "sha256": "80f66178d4e66f75e076c4877b52f9b4d0eb2d3b4294c056016a1384f75fe5de"
            },
            "downloads": -1,
            "filename": "kmeans1d-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2b7eb7a6a44a2717e4717a07bd2cb6c6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 7217,
            "upload_time": "2024-07-03T17:45:19",
            "upload_time_iso_8601": "2024-07-03T17:45:19.936062Z",
            "url": "https://files.pythonhosted.org/packages/18/5b/0ba1f71363e62fb2bd5370ec9b26360c482bd9ad06082b5dcfb03e148928/kmeans1d-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-03 17:45:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dstein64",
    "github_project": "kmeans1d",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "kmeans1d"
}
        
Elapsed time: 0.43341s