<!-- badges: start -->
[![Build Status](https://travis-ci.com/wwwjk366/gower.svg?branch=master)](https://travis-ci.com/wwwjk366/gower)
[![PyPI version](https://badge.fury.io/py/gower.svg)](https://pypi.org/project/gower/)
[![Downloads](https://pepy.tech/badge/gower/month)](https://pepy.tech/project/gower/month)
<!-- badges: end -->
# Introduction
Gower's distance calculation in Python. Gower distance is a measure of the distance between two entities whose attributes are a mix of categorical and numerical values. [Gower (1971) A general coefficient of similarity and some of its properties. Biometrics 27 857–874.](https://www.jstor.org/stable/2528823?seq=1)
More details and examples can be found on my personal website: <https://www.thinkdatascience.com/post/2019-12-16-introducing-python-package-gower/>
Core functions were written by [Marcelo Beckmann](https://sourceforge.net/projects/gower-distance-4python/files/).
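As a rough illustration of the definition in the introduction, Gower distance is commonly written as an average of per-feature dissimilarities. The notation below ($p$ features, range $R_k$, per-feature term $d_k$) is introduced only for this sketch, and it assumes equal feature weights and no missing values:

```math
d(i, j) = \frac{1}{p} \sum_{k=1}^{p} d_k(i, j),
\qquad
d_k(i, j) =
\begin{cases}
\dfrac{|x_{ik} - x_{jk}|}{R_k} & \text{if feature } k \text{ is numerical} \\
0 & \text{if feature } k \text{ is categorical and } x_{ik} = x_{jk} \\
1 & \text{if feature } k \text{ is categorical and } x_{ik} \neq x_{jk}
\end{cases}
```

Here $R_k$ is the difference between the largest and smallest observed value of feature $k$. Each term lies between 0 and 1, so the distance does too.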
# Examples
## Installation
```
pip install gower
```
## Generate some data
```python
import numpy as np
import pandas as pd
import gower

# Mixed numerical and categorical columns; the last row is entirely missing.
Xd = pd.DataFrame({
    'age': [21, 21, 19, 30, 21, 21, 19, 30, None],
    'gender': ['M', 'M', 'N', 'M', 'F', 'F', 'F', 'F', None],
    'civil_status': ['MARRIED', 'SINGLE', 'SINGLE', 'SINGLE', 'MARRIED', 'SINGLE', 'WIDOW', 'DIVORCED', None],
    'salary': [3000.0, 1200.0, 32000.0, 1800.0, 2900.0, 1100.0, 10000.0, 1500.0, None],
    'has_children': [1, 0, 1, 1, 1, 0, 0, 1, None],
    'available_credit': [2200, 100, 22000, 1100, 2000, 100, 6000, 2200, None],
})
Yd = Xd.iloc[1:3, :]  # rows 1 and 2 as a second, smaller data set
X = np.asarray(Xd)
Y = np.asarray(Yd)
```
## Find the distance matrix
```python
gower.gower_matrix(X)
```
    array([[0.        , 0.3590238 , 0.6707398 , 0.31787416, 0.16872811,
            0.52622986, 0.59697855, 0.47778758,        nan],
           [0.3590238 , 0.        , 0.6964303 , 0.3138769 , 0.523629  ,
            0.16720603, 0.45600235, 0.6539635 ,        nan],
           [0.6707398 , 0.6964303 , 0.        , 0.6552807 , 0.6728013 ,
            0.6969697 , 0.740428  , 0.8151941 ,        nan],
           [0.31787416, 0.3138769 , 0.6552807 , 0.        , 0.4824794 ,
            0.48108295, 0.74818605, 0.34332284,        nan],
           [0.16872811, 0.523629  , 0.6728013 , 0.4824794 , 0.        ,
            0.35750175, 0.43237334, 0.3121036 ,        nan],
           [0.52622986, 0.16720603, 0.6969697 , 0.48108295, 0.35750175,
            0.        , 0.2898751 , 0.4878362 ,        nan],
           [0.59697855, 0.45600235, 0.740428  , 0.74818605, 0.43237334,
            0.2898751 , 0.        , 0.57476616,        nan],
           [0.47778758, 0.6539635 , 0.8151941 , 0.34332284, 0.3121036 ,
            0.4878362 , 0.57476616, 0.        ,        nan],
           [       nan,        nan,        nan,        nan,        nan,
                   nan,        nan,        nan,        nan]], dtype=float32)
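To make the numbers in the matrix less opaque, the entry for records 0 and 1 can be reproduced by hand. The following is a minimal sketch in plain Python, not the package's implementation: the helper `gower_pair` and the `ranges` dictionary are hypothetical names, equal feature weights are assumed, and `has_children` is assumed to be treated as a numerical column.

```python
# Records 0 and 1 from the example data.
row0 = {"age": 21, "gender": "M", "civil_status": "MARRIED",
        "salary": 3000.0, "has_children": 1, "available_credit": 2200}
row1 = {"age": 21, "gender": "M", "civil_status": "SINGLE",
        "salary": 1200.0, "has_children": 0, "available_credit": 100}

# Range (max - min) of each numerical column over the eight complete rows.
# Assumption: has_children is handled as a numerical 0/1 column.
ranges = {"age": 30 - 19,
          "salary": 32000.0 - 1100.0,
          "has_children": 1 - 0,
          "available_credit": 22000 - 100}


def gower_pair(a, b, ranges):
    """Average the per-feature dissimilarities of two records (hypothetical helper)."""
    total = 0.0
    for name in a:
        if name in ranges:
            # Numerical feature: absolute difference scaled by the column range.
            total += abs(a[name] - b[name]) / ranges[name]
        else:
            # Categorical feature: 0 if the values match, 1 otherwise.
            total += 0.0 if a[name] == b[name] else 1.0
    return total / len(a)


d01 = gower_pair(row0, row1, ranges)
print(round(d01, 6))  # 0.359024
```

The result agrees with the value 0.3590238 shown for records 0 and 1 in the matrix, up to float32 rounding.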
## Find Top n results
```python
gower.gower_topn(Xd.iloc[0:2,:], Xd.iloc[:,], n = 5)
```
    {'index': array([4, 3, 1, 7, 5]),
     'values': array([0.16872811, 0.31787416, 0.3590238 , 0.47778758, 0.52622986],
           dtype=float32)}
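The top-n result can be read as the smallest entries in the query record's row of the distance matrix. The sketch below illustrates that idea in plain Python using the first row of the matrix printed earlier. It is not the package's code: `top_n` is a hypothetical helper, and skipping the query record itself and any missing distances is an assumption chosen to match the output shown.

```python
import math

# Distances from record 0 to every record, copied from the matrix output.
row = [0.0, 0.3590238, 0.6707398, 0.31787416, 0.16872811,
       0.52622986, 0.59697855, 0.47778758, math.nan]


def top_n(distances, query_index, n=5):
    """Return (index, distance) pairs for the n closest records (hypothetical helper)."""
    candidates = [(d, i) for i, d in enumerate(distances)
                  if i != query_index and not math.isnan(d)]
    candidates.sort()  # ascending by distance
    return [(i, d) for d, i in candidates[:n]]


closest = top_n(row, query_index=0, n=5)
print([i for i, _ in closest])  # [4, 3, 1, 7, 5]
```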
Raw data
{
"_id": null,
"home_page": "https://github.com/wwwjk366/gower",
"name": "gower",
"maintainer": "",
"docs_url": null,
"requires_python": ">=2.7",
"maintainer_email": "",
"keywords": "gower,distance,matrix",
"author": "Dominic D",
"author_email": "Michael Yan <author@example.com>",
"download_url": "https://files.pythonhosted.org/packages/7c/b8/f02ffa72009105e981b21fe957895107d1b3c81dece43167d28d8acfdfb0/gower-0.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python implementation of Gowers distance, pairwise between records in two data sets",
"version": "0.1.2",
"split_keywords": [
"gower",
"distance",
"matrix"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "d7319f211797296951c89c0b4985d67b",
"sha256": "cb46e18243e1d88d2fa0a23d20afb71e5469f25db4ee6236db40f897dfea9e6f"
},
"downloads": -1,
"filename": "gower-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d7319f211797296951c89c0b4985d67b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=2.7",
"size": 5168,
"upload_time": "2022-11-13T20:23:18",
"upload_time_iso_8601": "2022-11-13T20:23:18.387727Z",
"url": "https://files.pythonhosted.org/packages/99/23/88b526457ea992e0a47147a886db3d749d07347c8d3a303f6076deee7299/gower-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "1d33bdd101ad7196dbadad0fc09de08c",
"sha256": "34ddb5158f0e8bfba093dca06b9f887bda244998d10af2a3ad8c74a6efa1b5f6"
},
"downloads": -1,
"filename": "gower-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "1d33bdd101ad7196dbadad0fc09de08c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=2.7",
"size": 5637,
"upload_time": "2022-11-13T20:23:20",
"upload_time_iso_8601": "2022-11-13T20:23:20.493752Z",
"url": "https://files.pythonhosted.org/packages/7c/b8/f02ffa72009105e981b21fe957895107d1b3c81dece43167d28d8acfdfb0/gower-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-11-13 20:23:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "wwwjk366",
"github_project": "gower",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "numpy",
"specs": []
},
{
"name": "scipy",
"specs": []
},
{
"name": "pandas",
"specs": []
}
],
"lcname": "gower"
}