myclustering

Name	myclustering
Version	0.1.0
home_page	https://github.com/Natali-Hovhannisyan/DS233_Python_Package
Summary	Python package for clustering
upload_time	2023-05-16 15:16:28
maintainer
docs_url	None
author	Natali Hovhannisyan
requires_python	>=3.6
license	MIT license
keywords	myclustering
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI
coveralls test coverage	No coveralls.

# myclustering package


## Description and Features

The myclustering package is a Python library that implements clustering algorithms, including K-means. It also includes utilities for visualizing clustering results and for reducing dimensionality with PCA. The package aims to simplify clustering and to provide tools for analyzing and interpreting the results.

Key features of the myclustering package include:

- K-means clustering algorithm
- Silhouette score calculation and elbow method visualization
- PCA for dimensionality reduction
- Visualization of clustering results using scatter plots

## Installation

To install the myclustering package, you can use pip:

```bash
pip install myclustering
```

## Usage Examples

Here are some examples of how to use the myclustering package:

### K-means Clustering

```python
import numpy as np
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
from myclustering.kmeans.kmeans import KMeans

X, y = make_blobs(centers=3, n_samples=500, n_features=2, shuffle=True, random_state=40)
print(X.shape)

clusters = len(np.unique(y))
print(clusters)
k = KMeans(K=clusters, max_iters=150, plot_steps=False)
y_pred = k.fit(X)

k.plot()

```
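For readers who want to see what a K-means fit does internally, here is a minimal NumPy sketch of the standard Lloyd iteration: assign each sample to its nearest centroid, then move each centroid to the mean of its samples. It is not the package's own implementation; `kmeans_sketch`, `X_demo`, and the seed values are illustrative names chosen for this example.

```python
import numpy as np

def kmeans_sketch(X, n_clusters, max_iters=150, seed=0):
    """Plain Lloyd iteration; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Start from n_clusters distinct samples picked at random
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid for every sample
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned samples
        # (an empty cluster keeps its previous centroid)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving
        centroids = new_centroids
    return labels, centroids

# Three synthetic groups of 50 two-dimensional points each
rng = np.random.default_rng(40)
X_demo = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2)) for c in (0.0, 5.0, 10.0)])
labels, centroids = kmeans_sketch(X_demo, n_clusters=3)
print(labels.shape, centroids.shape)
```

Because the starting centroids are random, a single run can settle in a poor local optimum; rerunning with several seeds and keeping the best result is a common remedy.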
You can identify the best number of clusters for K-means by looking at the silhouette score and the elbow method visualization.

```python
from myclustering.kmeans.silhouette import silhouette_score
from myclustering.kmeans.elbow import elbow_method

# X and k (the fitted KMeans instance) come from the previous example
silhouette_score(X, k)
elbow_method(X)
```
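To make the two criteria concrete, the sketch below computes them with scikit-learn rather than with myclustering; scikit-learn is swapped in only because its API is well documented, and the myclustering functions above may behave differently. The names `X_demo`, `inertias`, `silhouettes`, and `best_k` are illustrative. For each candidate number of clusters it records the inertia (the within-cluster sum of squares that an elbow plot shows) and the mean silhouette score.

```python
from sklearn.cluster import KMeans as SkKMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score as sk_silhouette_score

X_demo, _ = make_blobs(centers=3, n_samples=500, n_features=2, random_state=40)

inertias = {}
silhouettes = {}
for n_clusters in range(2, 7):
    model = SkKMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_demo)
    # Inertia: sum of squared distances of samples to their closest centroid
    inertias[n_clusters] = model.inertia_
    # Silhouette: mean over samples of (b - a) / max(a, b), in [-1, 1]
    silhouettes[n_clusters] = sk_silhouette_score(X_demo, model.labels_)

best_k = max(silhouettes, key=silhouettes.get)
for n_clusters in inertias:
    print(n_clusters, round(inertias[n_clusters], 1), round(silhouettes[n_clusters], 3))
print("Highest silhouette score at k =", best_k)
```

Inertia keeps shrinking as clusters are added, so the elbow method looks for the point where the decrease flattens out, while the silhouette score is simply maximized.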
### PCA for dimensionality reduction

```python
from sklearn import datasets
import matplotlib.pyplot as plt
import numpy as np
from myclustering.pca.pca import PCA


data = datasets.load_iris()
X = data.data
y = data.target

# Project the data onto the 2 primary principal components
pca = PCA(2)
pca.fit(X)
X_projected = pca.transform(X)

print('Shape of X:', X.shape)
print('Shape of transformed X:', X_projected.shape)

x1 = X_projected[:, 0]
x2 = X_projected[:, 1]

plt.scatter(x1, x2,
        c=y, edgecolor='none', alpha=0.8,
        cmap=plt.get_cmap('viridis', 3))  # plt.cm.get_cmap is deprecated in recent Matplotlib

plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.colorbar()
plt.show()
```
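As a rough illustration of what a PCA projection computes, the following NumPy sketch centers the data, takes the eigenvectors of the covariance matrix, and projects onto the directions with the largest eigenvalues. It is an assumption about the general technique, not a copy of the package's `PCA` class; `pca_sketch` and `X_demo` are names invented for this example.

```python
import numpy as np

def pca_sketch(X, n_components):
    """Project X onto its top principal components; returns (projection, variances)."""
    # Center each feature so the covariance is computed around the mean
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]
    return X_centered @ components, eigenvalues[order]

rng = np.random.default_rng(0)
# Four features with deliberately different spreads
X_demo = rng.normal(size=(150, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
X_projected, variances = pca_sketch(X_demo, 2)
print(X_projected.shape)
print(variances)
```

Each returned eigenvalue equals the sample variance of the data along the corresponding component, which is why the components are ranked by eigenvalue.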
### DBSCAN visualization with the help of PCA

You can also visualize the results of the DBSCAN algorithm and identify the outliers in your data.

```python
from myclustering.dbscan.visualization import plot_dbscan_pca
plot_dbscan_pca(X, epsilon=0.3, min_points=5)
```
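To show how DBSCAN separates outliers from clusters, here is a small sketch that uses scikit-learn's `DBSCAN` instead of the package's own code, so the parameter names (`eps`, `min_samples`) differ from `epsilon` and `min_points` above. The data set `X_demo` is synthetic and made up for this example: two dense groups plus two isolated points.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
dense_a = rng.normal(loc=(0.0, 0.0), scale=0.05, size=(50, 2))
dense_b = rng.normal(loc=(2.0, 2.0), scale=0.05, size=(50, 2))
far_points = np.array([[10.0, 10.0], [-10.0, 5.0]])  # isolated samples
X_demo = np.vstack([dense_a, dense_b, far_points])

# A sample is a core point if at least min_samples samples lie within eps of it;
# samples that are neither core points nor within eps of one get the label -1
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X_demo)
outlier_mask = labels == -1
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters:", n_clusters)
print("outliers:", int(outlier_mask.sum()))
```

Unlike K-means, DBSCAN does not need the number of clusters in advance, but its result depends strongly on the neighborhood radius and the minimum point count.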
## Customer segmentation

A common application of clustering is customer segmentation, where customers are grouped into distinct segments based on their behavior, preferences, or characteristics.

Here's an example of how the myclustering package can be used for customer segmentation:

```python
import pandas as pd
from myclustering.kmeans.kmeans import KMeans
from myclustering.pca.pca import PCA

# Load customer data
data = pd.read_csv('customer_data.csv')

# Preprocess the data (e.g., remove missing values, scale features)

# Apply PCA for dimensionality reduction
pca = PCA(n_components=2)
pca.fit(data)
X_pca = pca.transform(data)

# Apply K-means clustering to the PCA-projected data
kmeans = KMeans(K=3, max_iters=150, plot_steps=False)
kmeans.fit(X_pca)

# Analyze the clustering results
# (e.g., visualize clusters, identify key features for each cluster)
kmeans.plot()

# Interpret and use the customer segments for targeted marketing, personalized recommendations, etc.
```
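The preprocessing step is left as a comment in the example above. One possible version, sketched with pandas under the assumption that the customer table mixes numeric and non-numeric columns, keeps the numeric columns, drops rows with missing values, and standardizes each feature. The column names and values below are hypothetical stand-ins for `customer_data.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical customer table standing in for customer_data.csv
data = pd.DataFrame({
    "annual_spend": [1200.0, 300.0, None, 4500.0, 800.0],
    "visits_per_month": [4, 1, 2, 12, 3],
    "segment_note": ["a", "b", "c", "d", "e"],  # non-numeric, dropped below
})

# Keep numeric features only and drop rows that contain missing values
numeric = data.select_dtypes(include="number").dropna()
# Standardize to zero mean and unit variance so no feature dominates the distances
scaled = (numeric - numeric.mean()) / numeric.std(ddof=0)
X_clean = scaled.to_numpy()
print(X_clean.shape)
```

Passing a NumPy array such as `X_clean` to PCA and K-means, instead of the raw DataFrame, also avoids surprises from label-based indexing.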
## Contributing

Contributions to the MyClustering package are welcome! If you find any issues, have suggestions for improvements, or would like to add new features, feel free to open an issue or submit a pull request on the GitHub repository.

## License

The myclustering package is licensed under the MIT License. See the [MIT License](https://opensource.org/license/mit/) for more information.


## Credits

This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [audreyr/cookiecutter-pypackage](https://github.com/audreyr/cookiecutter-pypackage) project template.

Raw data

{
    "_id": null,
    "home_page": "https://github.com/Natali-Hovhannisyan/DS233_Python_Package",
    "name": "myclustering",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "myclustering",
    "author": "Natali Hovhannisyan",
    "author_email": "natalihovhannisyan00@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/38/21/187c70595197da900e0693dc67d749c9b6e37cab4421f6b06fbbe0b46b33/myclustering-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT license",
    "summary": "Python package for clustering",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/Natali-Hovhannisyan/DS233_Python_Package"
    },
    "split_keywords": [
        "myclustering"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "979e443cc059c7375498285896fe6c79f4f8ab12b237aa9cd62495f5e960b0ec",
                "md5": "0ca5a1f8ef00fad35203960d4e078c0e",
                "sha256": "8cd8a4a1a59c10fae79638b24c1ca2c6974cfa07c8b666f83db4b3435261f97b"
            },
            "downloads": -1,
            "filename": "myclustering-0.1.0-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0ca5a1f8ef00fad35203960d4e078c0e",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": ">=3.6",
            "size": 13292,
            "upload_time": "2023-05-16T15:16:26",
            "upload_time_iso_8601": "2023-05-16T15:16:26.996329Z",
            "url": "https://files.pythonhosted.org/packages/97/9e/443cc059c7375498285896fe6c79f4f8ab12b237aa9cd62495f5e960b0ec/myclustering-0.1.0-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3821187c70595197da900e0693dc67d749c9b6e37cab4421f6b06fbbe0b46b33",
                "md5": "bc675441b1cf474936ec205c3cf428a0",
                "sha256": "d6f13469fdc23bc3bc75ed387470bcdcb4cf706b2e5c2b93bf4b2f6d6eb9f497"
            },
            "downloads": -1,
            "filename": "myclustering-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bc675441b1cf474936ec205c3cf428a0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 19537,
            "upload_time": "2023-05-16T15:16:28",
            "upload_time_iso_8601": "2023-05-16T15:16:28.964552Z",
            "url": "https://files.pythonhosted.org/packages/38/21/187c70595197da900e0693dc67d749c9b6e37cab4421f6b06fbbe0b46b33/myclustering-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-16 15:16:28",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Natali-Hovhannisyan",
    "github_project": "DS233_Python_Package",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "myclustering"
}
