:Name: druhg
:Version: 1.7.2
:Home page: https://github.com/artamono1/druhg
:Summary: Universal clustering based on dialectical materialism
:Upload time: 2025-02-16 14:07:40
:Maintainer: Pavel Artamonov
:License: BSD
:Keywords: cluster, clustering, density, dialectics
:Requirements: cython, numpy, scipy, scikit-learn

.. image:: https://img.shields.io/pypi/v/druhg.svg
    :target: https://pypi.python.org/pypi/druhg/
    :alt: PyPI Version
.. image:: https://img.shields.io/pypi/l/druhg.svg
    :target: https://github.com/artamono1/druhg/blob/master/LICENSE
    :alt: License

=====
DRUHG
=====

| DRUHG - Dialectical Reflection Universal Hierarchical Grouping (друг).
| Performs clustering based on densities and builds a minimum spanning tree.
| **Does not require parameters.** *(The only parameter is the space metric, e.g. Euclidean.)*
| The user can filter clusters by size with ``size_range``; for the genuine result and genuine outliers, set it to ``[1, 1]``.
| The ``fix_outliers`` parameter assigns outliers to their closest clusters via the edges of the minimum spanning tree.

-------------
Basic Concept
-------------

| There are some optional tuning parameters but the actual algorithm requires none and is universal.
| It works by applying **the universal society rule: treat others how you want to be treated**.
| The core of the algorithm is to rank the subject's closest subjective similarities and amalgamate them accordingly.
| The ``max_ranking`` parameter controls the balance between precision and speed; beyond some value, the precision and the result no longer change.
| Todo: the ``algorithm`` parameter can be set to 'slow' to further enhance the precision.
|
|
| The **dialectical distance** reflects the opposite density.
| Max( r/R d(r); d(R) ), where r and R are ranks from A to B and from B to A.
| This orders outliers last and equal densities first.
| It's a great **replacement for DBSCAN** and for **global outlier detection**.
|
| Those ordered connections become trees. Two trees reflect off each other in their totality and can transform into a cluster.
| D N₂ K₁/(K₁+K₂) sum 1/dᵢ > N₁ - 1, where N is the size of a tree and K is the number of clusters in a tree.
| This allows newly formed clusters to resist reshaping.
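
The mutual ranks ``r`` and ``R`` used by the dialectical distance can be illustrated with a small standalone sketch. It does not use druhg and is not taken from its source: the helper names ``rank_of`` and ``dialectical_distance`` are hypothetical, counting the nearest neighbour as rank 1 is an assumption, and ``d(r)`` and ``d(R)`` are left as plain inputs because the text above does not spell out how they are measured.

.. code:: python

    import math

    def rank_of(points, src, dst):
        """Rank of points[dst] among the neighbours of points[src] (nearest = 1, an assumption)."""
        others = [i for i in range(len(points)) if i != src]
        others.sort(key=lambda i: math.dist(points[src], points[i]))
        return others.index(dst) + 1

    def dialectical_distance(r, R, d_r, d_R):
        """Literal transcription of Max( r/R d(r); d(R) ); d_r and d_R are supplied by the caller."""
        return max(r / R * d_r, d_R)

    # Three points close together and one isolated point.
    points = [(0.0,), (1.0,), (1.5,), (10.0,)]
    a, b = 3, 2  # a is the isolated point, b belongs to the dense group

    r = rank_of(points, a, b)  # b is the nearest neighbour of a
    R = rank_of(points, b, a)  # a is the farthest neighbour of b
    print("ranks:", r, R)      # ranks: 1 3

    # Feeding the same pairwise distance in for d(r) and d(R) is an assumption
    # made only to get a number out of the formula.
    d = math.dist(points[a], points[b])
    print(dialectical_distance(r, R, d, d))
    print(dialectical_distance(R, r, d, d))

The asymmetry of the two ranks is the point of the example: the isolated point sees the dense group as close, while the dense group sees the isolated point as far.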


----------------
How to use DRUHG
----------------
.. code:: python

             import sklearn.datasets as datasets
             import druhg

             iris = datasets.load_iris()
             XX = iris['data']

             clusterer = druhg.DRUHG(max_ranking=50)
             labels = clusterer.fit(XX).labels_

It will build the tree and label the points. Now you can manipulate clusters by relabeling.

.. code:: python

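             from sklearn.metrics import adjusted_rand_score

             # relabel the fitted clusterer, keeping clusters of 1 up to half the dataset
             labels = clusterer.relabel(size_range=[1, len(XX)/2], fix_outliers=1)
             ari = adjusted_rand_score(iris['target'], labels)
             print('iris ari', ari)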

It will relabel the clusters, restricting their size.
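
To make the idea of a size restriction concrete, here is a minimal standalone sketch that filters a flat list of labels by cluster size. It is not druhg's implementation, which works on its tree: the function ``filter_by_size`` is hypothetical, and marking outliers with ``-1`` is an assumption borrowed from scikit-learn's convention.

.. code:: python

    from collections import Counter

    def filter_by_size(labels, min_size, max_size, outlier=-1):
        """Keep clusters whose size lies in [min_size, max_size]; mark the rest as outliers."""
        sizes = Counter(label for label in labels if label != outlier)
        return [
            label if label != outlier and min_size <= sizes[label] <= max_size else outlier
            for label in labels
        ]

    # Cluster 0 has 3 points, cluster 1 has 2, cluster 2 has 1; the last point is already an outlier.
    labels = [0, 0, 0, 1, 1, 2, -1]
    print(filter_by_size(labels, 2, 3))  # [0, 0, 0, 1, 1, -1, -1]

Here the single-point cluster falls below the minimum size and is turned into an outlier, while the other clusters are left untouched.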

.. code:: python

            clusterer.plot(labels)

It will draw the minimum spanning tree with druhg-edges.

.. code:: python

            clusterer.plot()

It will provide interactive sliders for exploration.

.. image:: https://raw.githubusercontent.com/artamono1/druhg/master/docs/source/pics/chameleon-sliders.png
    :width: 300px
    :align: center
    :height: 200px
    :alt: chameleon-sliders

-----------
Performance
-----------
| It can be slow on highly structured data.
| The ``max_ranking`` parameter can be decreased for better performance.

.. image:: https://raw.githubusercontent.com/artamono1/druhg/master/docs/source/pics/comparison_ver.png
    :width: 300px
    :align: center
    :height: 200px
    :alt: comparison

----------
Installing
----------

Install from PyPI, presuming you have an up-to-date pip:

.. code:: bash

    pip install druhg


-----------------
Running the Tests
-----------------

The package tests can be run after installation using the command:

.. code:: bash

    pytest -k "test_name"


The tests may fail :-D

--------------
Python Version
--------------

The druhg library supports Python 3.


------------
Contributing
------------

We welcome contributions in any form! Assistance with documentation, particularly expanding tutorials,
is always welcome. To contribute, please `fork the project <https://github.com/artamono1/druhg/issues#fork-destination-box>`_,
make your changes, and submit a pull request. We will do our best to work through any issues with
you and get your code merged into the main branch.

---------
Licensing
---------

The druhg package is 3-clause BSD licensed.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/artamono1/druhg",
    "name": "druhg",
    "maintainer": "Pavel Artamonov",
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": "druhg.p@gmail.com",
    "keywords": "cluster clustering density dialectics",
    "author": null,
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/0a/40/7982d3bae3ca5c5040e59aae4db67029b25b99e7ce9f4ee9441a50582557/druhg-1.7.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD",
    "summary": "Universal clustering based on dialectical materialism",
    "version": "1.7.2",
    "project_urls": {
        "Homepage": "https://github.com/artamono1/druhg"
    },
    "split_keywords": [
        "cluster",
        "clustering",
        "density",
        "dialectics"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0a407982d3bae3ca5c5040e59aae4db67029b25b99e7ce9f4ee9441a50582557",
                "md5": "626f724052b11254c9cb75a890c7e52a",
                "sha256": "55004050b296465b3c40e50311defc75754635d5b08ddd22fc07145e13c6a23e"
            },
            "downloads": -1,
            "filename": "druhg-1.7.2.tar.gz",
            "has_sig": false,
            "md5_digest": "626f724052b11254c9cb75a890c7e52a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 35639,
            "upload_time": "2025-02-16T14:07:40",
            "upload_time_iso_8601": "2025-02-16T14:07:40.141064Z",
            "url": "https://files.pythonhosted.org/packages/0a/40/7982d3bae3ca5c5040e59aae4db67029b25b99e7ce9f4ee9441a50582557/druhg-1.7.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-16 14:07:40",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "artamono1",
    "github_project": "druhg",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "cython",
            "specs": [
                [
                    ">=",
                    "0.27"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.24"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.10"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.3"
                ]
            ]
        }
    ],
    "lcname": "druhg"
}
        