hnet


* Name: hnet
* Version: 1.2.3
* Home page: https://erdogant.github.io/hnet
* Summary: Graphical Hypergeometric Networks
* Upload time: 2023-10-18 20:08:26
* Author: Erdogan Taskesen
* Requires Python: >=3

[![Python](https://img.shields.io/pypi/pyversions/hnet)](https://img.shields.io/pypi/pyversions/hnet)
[![PyPI Version](https://img.shields.io/pypi/v/hnet)](https://pypi.org/project/hnet/)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/erdogant/hnet/blob/master/LICENSE)
[![Github Forks](https://img.shields.io/github/forks/erdogant/hnet.svg)](https://github.com/erdogant/hnet/network)
[![GitHub Open Issues](https://img.shields.io/github/issues/erdogant/hnet.svg)](https://github.com/erdogant/hnet/issues)
[![Project Status](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![Downloads](https://pepy.tech/badge/hnet/month)](https://pepy.tech/project/hnet/)
[![Downloads](https://pepy.tech/badge/hnet)](https://pepy.tech/project/hnet)
[![Sphinx](https://img.shields.io/badge/Sphinx-Docs-Green)](https://erdogant.github.io/hnet/)
[![arXiv](https://img.shields.io/badge/arXiv-Docs-Green)](https://arxiv.org/abs/2005.04679)
[![Substack](https://img.shields.io/badge/Substack-Blog-green)](https://erdogant.substack.com/p/advanced-network-analysis-to-explore)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/erdogant/hnet/blob/master/notebooks/hnet.ipynb)
<!-- [![DOI](https://zenodo.org/badge/226647104.svg)](https://zenodo.org/badge/latestdoi/226647104) -->

<p align="left">
  <a href="https://erdogant.substack.com/p/advanced-network-analysis-to-explore">
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/blog_link.jpg" width="250" />
  </a>
</p>


# HNet - Association rule-based networks using graphical Hypergeometric Networks.

**Star this repo if you like it! ⭐️**


## Dashboard HNet!

[**HNet Dashboard**](https://erdogant.github.io/hnet/pages/html/Documentation.html#online-web-interface)





## Summary

HNet stands for graphical Hypergeometric Networks, which is a method where associations across variables are tested for significance by statistical inference.
The aim is to determine a network with significant associations that can shed light on the complex relationships across variables.
Input datasets can range from generic dataframes to nested data structures with lists, missing values and enumerations.
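To make the idea of testing an association for significance concrete, here is a minimal sketch of a one-sided hypergeometric test on the overlap between two labels. It uses only the Python standard library; the helper `hypergeom_sf` and the toy data are assumptions made for illustration and are not part of the HNet API.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n samples from N, of which K carry the label."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

# Hypothetical toy data: 10 samples with two binary labels.
survived = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
female = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

N = len(survived)                                   # population size
K = sum(survived)                                   # samples labelled "survived"
n = sum(female)                                     # samples labelled "female"
k = sum(s and f for s, f in zip(survived, female))  # samples carrying both labels

p_value = hypergeom_sf(k, N, K, n)
print(f"overlap={k}, P(X >= {k}) = {p_value:.4f}")
```

A small P-value indicates that the two labels co-occur more often than expected by chance. With only 10 samples, the overlap of 3 in this toy example is not significant.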

Real-world data often contain measurements with both continuous and discrete values.
Despite the availability of many libraries, data sets with mixed data types require intensive pre-processing steps,
and it remains a challenge to describe the relationships between variables.
The data understanding phase is crucial to the data-mining process. However, without making any assumptions about the data,
the search space is super-exponential in the number of variables. A thorough data understanding phase is therefore not common practice.

**Methods**

We propose graphical hypergeometric networks (``HNet``), a method to test associations across variables for significance using statistical inference. The aim is to determine a network using only the significant associations in order to shed light on the complex relationships across variables. HNet processes raw unstructured data sets and outputs a network that consists of (partially) directed or undirected edges between the nodes (i.e., variables). To evaluate the accuracy of HNet, we used well-known data sets and generated data sets with known ground truth. In addition, the performance of HNet was compared to Bayesian association learning.
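The steps described above (split variables into labels, test every label pair, keep only the significant ones as edges) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not HNet's implementation: the toy table, the helpers `hypergeom_sf` and `holm`, the choice of Holm correction, and the threshold `alpha = 0.05` are all assumptions, and the sketch only produces undirected edges.

```python
from itertools import combinations
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for a hypergeometric draw of n from N with K successes."""
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / comb(N, n)

def holm(pvalues):
    """Holm step-down adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted, running_max = [0.0] * m, 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvalues[i]))
        adjusted[i] = running_max
    return adjusted

# Hypothetical toy table: 12 samples, three categorical variables.
rows = [{"sex": s, "survived": v, "class": c}
        for s, v in [("female", "yes"), ("male", "no")]
        for c in ["1", "2", "3"] * 2]

# Map every (variable, label) node to the set of samples that carry it.
members = {}
for idx, row in enumerate(rows):
    for variable, label in row.items():
        members.setdefault((variable, label), set()).add(idx)

# Test every pair of labels that belong to different variables.
N = len(rows)
pairs = [(a, b) for a, b in combinations(members, 2) if a[0] != b[0]]
pvalues = [hypergeom_sf(len(members[a] & members[b]), N, len(members[a]), len(members[b]))
           for a, b in pairs]

# Correct for multiple testing and keep only the significant edges.
alpha = 0.05
adjusted = holm(pvalues)
edges = [("_".join(a), "_".join(b), p) for (a, b), p in zip(pairs, adjusted) if p <= alpha]

for source, target, p in edges:
    print(f"{source} -- {target} (adjusted P = {p:.4f})")
```

In this toy table, sex and survival are perfectly linked while class is balanced across both, so only the two sex/survival label pairs survive the correction.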

**Results**

We demonstrate that HNet showed high accuracy and performance in the detection of node links. For the Alarm data set, HNet achieved an average MCC score of 0.33 ± 0.0002 (*P*<1x10<sup>-6</sup>), whereas Bayesian association learning resulted in an average MCC score of 0.52 ± 0.006 (*P*<1x10<sup>-11</sup>), and randomly assigning edges resulted in an MCC score of 0.004 ± 0.0003 (*P*=0.49).
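For readers unfamiliar with the metric, the Matthews correlation coefficient (MCC) compares detected edges with ground-truth edges over all possible node pairs. The sketch below shows the calculation on a hypothetical four-node network; the node names and edge sets are made up for illustration, and edges are treated as undirected for simplicity.

```python
from math import sqrt

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical ground-truth and detected edge sets over four nodes.
nodes = ["A", "B", "C", "D"]
true_edges = {("A", "B"), ("B", "C"), ("C", "D")}
detected_edges = {("A", "B"), ("B", "C"), ("A", "D")}

# Every unordered node pair is either an edge or a non-edge.
all_pairs = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
tp = sum(p in true_edges and p in detected_edges for p in all_pairs)
fp = sum(p not in true_edges and p in detected_edges for p in all_pairs)
fn = sum(p in true_edges and p not in detected_edges for p in all_pairs)
tn = sum(p not in true_edges and p not in detected_edges for p in all_pairs)

score = mcc(tp, tn, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} MCC={score:.3f}")
```

An MCC of 1 means perfect edge recovery, 0 corresponds to random assignment, and -1 to complete disagreement.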

**Conclusions**

HNet processes raw unstructured data sets, allows analysis of mixed data types, scales easily with the number of variables, and allows detailed examination of the detected associations.

**Documentation**

* API Documentation: https://erdogant.github.io/hnet/
* Article: https://arxiv.org/abs/2005.04679

## Method overview

<p align="left">
  <a href="https://erdogant.github.io/hnet/pages/html/index.html">
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/fig1.png" width="600" />
  </a>
</p>

## Installation
* Install hnet from PyPI (recommended).

```bash
pip install -U hnet
```
## Examples

- Simple example for the Titanic data set

```python
# Import the hnet class
from hnet import hnet
# Initialize hnet with default settings
hn = hnet()
# Load example dataset
df = hn.import_example('titanic')
# Print to screen
print(df)
```

	#      PassengerId  Survived  Pclass  ...     Fare Cabin  Embarked
	# 0              1         0       3  ...   7.2500   NaN         S
	# 1              2         1       1  ...  71.2833   C85         C
	# 2              3         1       3  ...   7.9250   NaN         S
	# 3              4         1       1  ...  53.1000  C123         S
	# 4              5         0       3  ...   8.0500   NaN         S
	# ..           ...       ...     ...  ...      ...   ...       ...
	# 886          887         0       2  ...  13.0000   NaN         S
	# 887          888         1       1  ...  30.0000   B42         S
	# 888          889         0       3  ...  23.4500   NaN         S
	# 889          890         1       1  ...  30.0000  C148         C
	# 890          891         0       3  ...   7.7500   NaN         Q



##### <a href="https://erdogant.github.io/docs/d3graph/titanic_example/index.html">Play with the interactive Titanic results.</a> 
<link rel="import" href="https://erdogant.github.io/docs/d3graph/titanic_example/index.html">


##### [Example: Association learning on the Titanic dataset](https://erdogant.github.io/hnet/pages/html/Examples.html#titanic-dataset)

<p align="left">
  <a href="https://erdogant.github.io/hnet/pages/html/Examples.html#titanic-dataset">
     <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/fig4.png" width="900" />
  </a>
</p>



##### [Example: Summarize results](https://erdogant.github.io/hnet/pages/html/Use%20Cases.html#summarize-results)

Networks can become giant hairballs and heatmaps can become unreadable. In that case you may want to see the general associations between categories instead of the associations between individual labels.
With the summarize functionality, the label-level results are aggregated to the category level.
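As an illustration of what summarizing towards categories means, the sketch below collapses label-level associations into category-level scores. The aggregation rule (summing -log10(P) per category pair), the helper `category`, and the P-values are assumptions for this example; HNet's own summarize step may aggregate differently.

```python
from collections import defaultdict
from math import log10

# Hypothetical label-level adjusted P-values: (source label, target label) -> P
label_pvalues = {
    ("Sex_female", "Survived_1"): 1e-6,
    ("Sex_male", "Survived_0"): 1e-5,
    ("Pclass_1", "Survived_1"): 1e-3,
    ("Pclass_3", "Survived_0"): 1e-2,
}

def category(label):
    """Strip the label suffix: 'Sex_female' -> 'Sex'."""
    return label.rsplit("_", 1)[0]

# Sum the -log10(P) scores of all label pairs that fall in the same category pair.
summary = defaultdict(float)
for (source, target), p in label_pvalues.items():
    summary[(category(source), category(target))] += -log10(p)

for (source, target), score in summary.items():
    print(f"{source} -> {target}: summed -log10(P) = {score:.1f}")
```

Four label-level edges reduce to two category-level edges here, which is what keeps larger networks and heatmaps readable.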

<p align="left">
  <a href="https://erdogant.github.io/hnet/pages/html/Use%20Cases.html#summarize-results">
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/titanic_summarize_static_heatmap.png" width="300" />
  </a>
  <a href="https://erdogant.github.io/docs/d3heatmap/d3heatmap.html">
     <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/titanic_summarize_dynamic_heatmap.png" width="400" />
  </a>
</p>

<p align="left">
  <a href="https://erdogant.github.io/hnet/pages/html/Examples.html#titanic-dataset">
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/titanic_summarize_static_graph.png" width="400" />
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/titanic_summarize_dynamic_graph.png" width="400" />
  </a>
</p>






##### [Example: Feature importance](https://erdogant.github.io/hnet/pages/html/Use%20Cases.html#feature-importance)

<p align="left">
  <a href="https://erdogant.github.io/hnet/pages/html/Use%20Cases.html#feature-importance">
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/feat_imp_1.png" width="600" />
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/feat_imp_2.png" width="600" />
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/other/feat_imp_3.png" width="600" />
  </a>
</p>



#### Performance

<p align="left">
  <a href="https://erdogant.github.io/hnet/pages/html/index.html">
  <img src="https://github.com/erdogant/hnet/blob/master/docs/figs/fig3.png" width="600" />
  </a>
</p>



<hr>

### Contribute
* All kinds of contributions are welcome!

### Citation
Please cite ``HNet`` in your publications if it is useful for your research. Citation information is available in the right-hand column of the GitHub repository.

* [arXiv](https://arxiv.org/abs/2005.04679)
* [Article in pdf](https://arxiv.org/pdf/2005.04679)
* [Sphinx](https://erdogant.github.io/hnet)
* [Github](https://github.com/erdogant/hnet)

### Maintainer
* Erdogan Taskesen, github: [erdogant](https://github.com/erdogant)
* Contributions are welcome.
* If you wish to buy me a <a href="https://erdogant.github.io/donate/?currency=USD&amount=5">Coffee</a> for this work, it is very appreciated :)

            

Raw data

            {
    "_id": null,
    "home_page": "https://erdogant.github.io/hnet",
    "name": "hnet",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3",
    "maintainer_email": "",
    "keywords": "",
    "author": "Erdogan Taskesen",
    "author_email": "erdogant@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/1a/d1/819f6b0117b9a555e823843a15cb63cd0c9c853ed5cda33e7937d8a2a839/hnet-1.2.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Graphical Hypergeometric Networks",
    "version": "1.2.3",
    "project_urls": {
        "Download": "https://github.com/erdogant/hnet/archive/1.2.3.tar.gz",
        "Homepage": "https://erdogant.github.io/hnet"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c835ee7366b7778c872b0d25f582051221ecbde9d20ee3914e8376bd48879d97",
                "md5": "dd5dcc80102135fe8c842bcc2bfdde1b",
                "sha256": "8c19cc21dc52affa5d909642922fcb94af7e5adcfd338abaa54d57b8e641019d"
            },
            "downloads": -1,
            "filename": "hnet-1.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dd5dcc80102135fe8c842bcc2bfdde1b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3",
            "size": 48891,
            "upload_time": "2023-10-18T20:08:24",
            "upload_time_iso_8601": "2023-10-18T20:08:24.836892Z",
            "url": "https://files.pythonhosted.org/packages/c8/35/ee7366b7778c872b0d25f582051221ecbde9d20ee3914e8376bd48879d97/hnet-1.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1ad1819f6b0117b9a555e823843a15cb63cd0c9c853ed5cda33e7937d8a2a839",
                "md5": "c8ecd0e3182751e95d6c0bc47a085e67",
                "sha256": "aab8d59a126dd2d394f28b324bdbfaa3ec08a665afaca63dfc5f94f1d4902e19"
            },
            "downloads": -1,
            "filename": "hnet-1.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "c8ecd0e3182751e95d6c0bc47a085e67",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3",
            "size": 48482,
            "upload_time": "2023-10-18T20:08:26",
            "upload_time_iso_8601": "2023-10-18T20:08:26.575450Z",
            "url": "https://files.pythonhosted.org/packages/1a/d1/819f6b0117b9a555e823843a15cb63cd0c9c853ed5cda33e7937d8a2a839/hnet-1.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-18 20:08:26",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "erdogant",
    "github_project": "hnet",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "hnet"
}
        