# coar
**coar** is an implementation of association rule clustering based on user-defined thresholds.
## Installation
Use the package manager [pip](https://pip.pypa.io/en/stable/) to install **coar**. The package requires Python 3.11 or later.
```bash
pip install coar
```
## Usage
The usage example works with association rules mined by [CleverMiner](https://www.cleverminer.org/), using a modified version of the [CleverMiner quickstart example](https://www.cleverminer.org/docs-page.html#section-3). You need to install **cleverminer** first.
```bash
pip install cleverminer
```
Mining association rules using **cleverminer**:
```python
# imports
import json
import pandas as pd
from cleverminer import cleverminer

# getting the source file
df = pd.read_csv(
    'https://www.cleverminer.org/hotel.zip',
    encoding='cp1250',
    sep='\t'
)

# selecting the columns
df = df[['VTypeOfVisit', 'GState', 'GCity']]

# mining association rules
clm = cleverminer(
    df=df, proc='4ftMiner',
    quantifiers={'conf': 0.6, 'Base': 50},
    ante={
        'attributes': [
            {'name': 'GState', 'type': 'subset', 'minlen': 1, 'maxlen': 1},
            {'name': 'GCity', 'type': 'subset', 'minlen': 1, 'maxlen': 1},
        ], 'minlen': 1, 'maxlen': 2, 'type': 'con'},
    succ={
        'attributes': [
            {'name': 'VTypeOfVisit', 'type': 'subset', 'minlen': 1, 'maxlen': 1}
        ], 'minlen': 1, 'maxlen': 1, 'type': 'con'},
)

# saving rules to file
with open('rules.json', 'w') as save_file:
    save_file.write(json.dumps(clm.rulelist))
```
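The clustering step reads only a few fields from each saved rule: the `cedents_str` strings and the `rel_base` and `conf` parameters. As a rough sketch of how one rule is flattened into a record, the snippet below uses a hand-written rule; the attribute values and numbers are invented for illustration, and `rule_to_record` is a hypothetical helper, not part of **coar** or **cleverminer**.

```python
# Hypothetical rule in the shape the clustering step reads.
# The categories and numbers are made up, not real CleverMiner output.
sample_rule = {
    'cedents_str': {
        'ante': 'GState(VA) & GCity(Richmond)',
        'succ': 'VTypeOfVisit(Business)',
    },
    'params': {'rel_base': 0.05, 'conf': 0.7},
}


def rule_to_record(rule):
    """Flatten one rule dict into a record (illustrative helper)."""
    return {
        # cedent strings join their attributes with ' & '
        'antecedent': set(rule['cedents_str']['ante'].split(' & ')),
        'succedent': set(rule['cedents_str']['succ'].split(' & ')),
        'support': rule['params']['rel_base'],
        'confidence': rule['params']['conf'],
    }


record = rule_to_record(sample_rule)
print(record)
```

Storing the cedents as sets makes the order of attributes inside a rule irrelevant when two rules are compared.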
Clustering rules using **coar**:
```python
# imports
import json
import pandas as pd
from coar.cluster import agglomerative_clustering, cluster_representative

# loading rules (the context manager closes the file afterwards)
with open('rules.json') as rule_file:
    rule_list = json.load(rule_file)

# creating dataframe: one row per rule, cedents stored as sets of attributes
df = pd.DataFrame.from_records([{
    'antecedent': set(rule['cedents_str']['ante'].split(' & ')),
    'succedent': set(rule['cedents_str']['succ'].split(' & ')),
    'support': rule['params']['rel_base'],
    'confidence': rule['params']['conf']
} for rule in rule_list])

# clustering
clustering = agglomerative_clustering(
    df,
    abs_ante_attr_diff_threshold=1,
    abs_succ_attr_diff_threshold=0,
    abs_supp_diff_threshold=1,
    abs_conf_diff_threshold=1,
)

# getting cluster representatives
clusters_repr = cluster_representative(clustering)
```
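To give an intuition for what absolute difference thresholds could mean, here is a minimal sketch of a pairwise check. It is an assumption made for illustration only and is not **coar**'s actual implementation: the function `within_thresholds`, its parameter names, and the use of the symmetric set difference to count differing attributes are all hypothetical.

```python
def within_thresholds(rule_a, rule_b, *, ante_attr_diff=1, succ_attr_diff=0,
                      supp_diff=1.0, conf_diff=1.0):
    """Hypothetical check: are two rules close enough on every criterion?

    NOT coar's algorithm; only illustrates threshold-based comparison.
    """
    # number of attributes present in exactly one of the two cedents
    ante_delta = len(rule_a['antecedent'] ^ rule_b['antecedent'])
    succ_delta = len(rule_a['succedent'] ^ rule_b['succedent'])
    return (
        ante_delta <= ante_attr_diff
        and succ_delta <= succ_attr_diff
        and abs(rule_a['support'] - rule_b['support']) <= supp_diff
        and abs(rule_a['confidence'] - rule_b['confidence']) <= conf_diff
    )


# Invented example rules
a = {'antecedent': {'GState(VA)', 'GCity(Richmond)'},
     'succedent': {'VTypeOfVisit(Business)'},
     'support': 0.05, 'confidence': 0.70}
b = {'antecedent': {'GState(VA)'},
     'succedent': {'VTypeOfVisit(Business)'},
     'support': 0.08, 'confidence': 0.65}
c = {'antecedent': {'GState(VA)'},
     'succedent': {'VTypeOfVisit(Leisure)'},
     'support': 0.05, 'confidence': 0.70}

print(within_thresholds(a, b))  # antecedents differ by one attribute
print(within_thresholds(a, c))  # succedents differ, threshold is 0
```

Under this reading, a threshold of `0` on the succedent only lets rules with identical succedents be grouped, while a threshold of `1` on support or confidence places no practical limit on values that lie between 0 and 1.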
## Contributing
If you find a bug 🐛, please open a [bug report](https://github.com/jmichalovcik/coar/issues/new?assignees=jmichalovcik&labels=bug).
If you have an idea for an improvement, new feature or enhancement 🚀, please open a [feature request](https://github.com/jmichalovcik/coar/issues/new?assignees=jmichalovcik&labels=enhancement).
## License
[MIT](https://github.com/jmichalovcik/coar/blob/master/LICENSE)