.. image:: https://img.shields.io/pypi/v/druhg.svg
   :target: https://pypi.python.org/pypi/druhg/
   :alt: PyPI Version
.. image:: https://img.shields.io/pypi/l/druhg.svg
   :target: https://github.com/artamono1/druhg/blob/master/LICENSE
   :alt: License

=====
DRUHG
=====
| DRUHG - Dialectical Reflection Universal Hierarchical Grouping (друг).
| Performs clustering based on densities and builds a minimum spanning tree.
| **Does not require parameters.** *(The only parameter is the metric.)*
| The user can filter the size of the clusters with ``limit1`` and ``limit2``.
| To get the genuine result and genuine outliers, set ``limit1`` to 1 and ``limit2`` to the sample size.
| The ``fix_outliers`` parameter assigns outliers to their closest clusters along the edges of the minimum spanning tree.
-------------
Basic Concept
-------------
| There are some optional tuning parameters, but the algorithm itself requires none and is universal.
| It works by applying **the universal society rule: treat others how you want to be treated**.
| The core of the algorithm is to rank each subject's closest subjective similarities and amalgamate them accordingly.
| The ``max_ranking`` parameter controls the balance between precision and speed; beyond some value, the precision and the result no longer change.
| The ``algorithm`` parameter can be set to ``'slow'`` to further improve precision.
| The relationship between two objects defines two local densities and distorts the distance between them.
| That **dialectical distance** is the reflection: one object adjusts its density to fit its counterpart.
| This allows all of the relationships to be arranged into a minimum spanning tree.
| Mutual closeness is preferred.
| At the start, unconnected objects amalgamate into the Universal, and these contradictions define which amalgamation is a cluster.
| An amalgamation has to be reflected in the other to emerge as a cluster. The more sizeable the adversary, the more probable the change.
| Once formed, a big cluster resists outliers. This makes it a great algorithm for **outlier detection**.
| *A cluster is a set of mutually close reflections.*
| The philosophy of dialectical materialism was used to arrive at this universal solution.
| You can read more about it in this work (in Russian):
| https://druhg.readthedocs.io/en/latest/dialectic_of_data.html
| which covers:
| - the triad Quality-Quantity-Measure (distance-rank-memberships)
| - the triad Singular-Particular-Universal (subject-cluster-dataset)
| - and more
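
The ideas of ranking a subject's neighbours and preferring mutual closeness can be illustrated with a toy sketch. This is not DRUHG's dialectical distance or its implementation: the helpers ``neighbor_ranks`` and ``mutual_rank`` are hypothetical names, and taking the larger of the two ranks is an assumption made only to show why a one-sided attachment is weaker than a mutual one.

.. code:: python

   def neighbor_ranks(points):
       """For each point i, map every other point j to its rank among i's neighbours (1 = closest)."""
       n = len(points)
       ranks = []
       for i in range(n):
           # Sort the other points by squared Euclidean distance to point i.
           order = sorted(
               (j for j in range(n) if j != i),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(points[i], points[j])),
           )
           ranks.append({j: r for r, j in enumerate(order, start=1)})
       return ranks


   def mutual_rank(ranks, i, j):
       """Symmetric rank (illustrative assumption): the worse of the two one-sided ranks."""
       return max(ranks[i][j], ranks[j][i])


   # Three points close together and one far away.
   points = [(0.0, 0.0), (0.1, 0.0), (0.25, 0.0), (1.0, 0.0)]
   ranks = neighbor_ranks(points)

   print(ranks[3][2])               # 1: the far point sees point 2 as its closest neighbour
   print(ranks[2][3])               # 3: point 2 ranks the far point last
   print(mutual_rank(ranks, 2, 3))  # 3: the attachment is one-sided
   print(mutual_rank(ranks, 0, 1))  # 1: points 0 and 1 are mutually closest

In this toy setting the far point is attracted to the group, but the group does not return the favour, which loosely mirrors the statement that a formed cluster resists outliers.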
----------------
How to use DRUHG
----------------
.. code:: python

   import sklearn.datasets as datasets
   import druhg

   iris = datasets.load_iris()
   XX = iris['data']

   # Fit the clusterer and read the label assigned to each point.
   clusterer = druhg.DRUHG(max_ranking=50)
   labels = clusterer.fit(XX).labels_

This builds the tree and labels the points. You can then adjust the clusters by relabeling.
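.. code:: python

   from sklearn.metrics import adjusted_rand_score

   # Relabel using the clusterer fitted above, then compare with the true classes.
   labels = clusterer.relabel(limit1=1, limit2=len(XX)/2, fix_outliers=1)
   ari = adjusted_rand_score(iris['target'], labels)
   print('iris ari', ari)

This relabels the clusters by restricting their size.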
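
To give a feel for what restricting cluster size means, here is a minimal sketch on a flat list of labels. It is only an illustration under stated assumptions: ``filter_by_size`` is a hypothetical helper, the bounds are treated as inclusive, and ``-1`` is assumed to mark outliers. DRUHG's own ``relabel`` works on the tree it has built, so its results can differ.

.. code:: python

   from collections import Counter


   def filter_by_size(labels, limit1, limit2, outlier_label=-1):
       """Keep clusters whose size is within [limit1, limit2]; mark the rest as outliers."""
       # Count members per cluster, ignoring points already marked as outliers.
       sizes = Counter(label for label in labels if label != outlier_label)
       return [
           label
           if label != outlier_label and limit1 <= sizes[label] <= limit2
           else outlier_label
           for label in labels
       ]


   labels = [0, 0, 0, 0, 1, 1, 2, -1]

   # Only the cluster of size 2 fits between the limits.
   print(filter_by_size(labels, 2, 3))            # [-1, -1, -1, -1, 1, 1, -1, -1]

   # With the widest limits nothing is filtered out.
   print(filter_by_size(labels, 1, len(labels)))  # [0, 0, 0, 0, 1, 1, 2, -1]
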
.. code:: python

   from druhg import DRUHG
   import matplotlib.pyplot as plt
   import numpy as np
   import pandas as pd

   # Load the tab-separated chameleon dataset as a NumPy array.
   XX = pd.read_csv('chameleon.csv', sep='\t', header=None)
   XX = np.array(XX)
   clusterer = DRUHG(max_ranking=200)
   clusterer.fit(XX)

   # Plot the minimum spanning tree built during fitting.
   plt.figure(figsize=(30, 16))
   clusterer.minimum_spanning_tree_.plot(node_size=200)

This draws the minimum spanning tree with DRUHG edges.

.. image:: https://raw.githubusercontent.com/artamono1/druhg/master/docs/source/pics/chameleon.jpg
   :width: 300px
   :align: center
   :height: 200px
   :alt: chameleon

-----------
Performance
-----------
| It can be slow on highly structured data.
| The ``max_ranking`` parameter can be decreased for better performance.
----------
Installing
----------
Install from PyPI, assuming you have an up-to-date pip:

.. code:: bash

   pip install druhg

-----------------
Running the Tests
-----------------
The package tests can be run after installation using the command:

.. code:: bash

   pytest -s druhg

or

.. code:: bash

   python -m pytest -s druhg

The tests may fail :-D

--------------
Python Version
--------------
The druhg library supports both Python 2 and Python 3.
------------
Contributing
------------
We welcome contributions in any form! Assistance with documentation, particularly expanding tutorials,
is always welcome. To contribute, please `fork the project <https://github.com/artamono1/druhg/issues#fork-destination-box>`_,
make your changes, and submit a pull request. We will do our best to work through any issues with
you and get your code merged into the main branch.
---------
Licensing
---------
The druhg package is 3-clause BSD licensed.