<p align="center">
<img alt="logo" width="300" src="https://raw.githubusercontent.com/firefly-cpp/NiaARM/main/.github/images/logo.png">
</p>
<h1 align="center">
NiaARM
</h1>
<h2 align="center">
A minimalistic framework for Numerical Association Rule Mining
</h2>
<p align="center">
<a href="https://pypi.python.org/pypi/niaarm">
<img alt="PyPI Version" src="https://img.shields.io/pypi/v/niaarm.svg" />
</a>
<img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/niaarm.svg" />
<img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/niaarm.svg" />
<a href="https://src.fedoraproject.org/rpms/python-niaarm">
<img alt="Fedora package" src="https://img.shields.io/fedora/v/python3-niaarm?color=blue&label=Fedora%20Linux&logo=fedora" />
</a>
<a href="https://aur.archlinux.org/packages/python-niaarm">
<img alt="AUR package" src="https://img.shields.io/aur/version/python-niaarm?color=blue&label=Arch%20Linux&logo=arch-linux" />
</a>
<a href="https://repology.org/project/python:niaarm/versions">
<img alt="Packaging status" src="https://repology.org/badge/tiny-repos/python:niaarm.svg" />
</a>
<a href="https://pepy.tech/project/niaarm">
<img alt="Downloads" src="https://pepy.tech/badge/niaarm" />
</a>
<a href="https://github.com/firefly-cpp/NiaARM/blob/main/LICENSE">
<img alt="GitHub license" src="https://img.shields.io/github/license/firefly-cpp/niaarm.svg" />
</a>
<img alt="NiaARM" src="https://github.com/firefly-cpp/niaarm/actions/workflows/test.yml/badge.svg" />
<img alt="Documentation status" src="https://readthedocs.org/projects/niaarm/badge/?version=latest" />
</p>
<p align="center">
<img alt="GitHub commit activity" src="https://img.shields.io/github/commit-activity/w/firefly-cpp/niaarm.svg" />
<a href="http://isitmaintained.com/project/firefly-cpp/niaarm">
<img alt="Percentage of issues still open" src="http://isitmaintained.com/badge/open/firefly-cpp/niaarm.svg">
</a>
<a href="http://isitmaintained.com/project/firefly-cpp/niaarm" title="Average time to resolve an issue">
<img alt="Average time to resolve an issue" src="http://isitmaintained.com/badge/resolution/firefly-cpp/niaarm.svg" />
</a>
<a href="#-contributors">
<img alt="All Contributors" src="https://img.shields.io/badge/all_contributors-1-orange.svg" />
</a>
</p>
<p align="center">
<a href="https://doi.org/10.21105/joss.04448">
<img alt="DOI" src="https://joss.theoj.org/papers/10.21105/joss.04448/status.svg" />
</a>
</p>
<p align="center">
<a href="#-detailed-insights">🔍 Detailed insights</a> •
<a href="#-installation">📦 Installation</a> •
<a href="#-usage">🚀 Usage</a> •
<a href="#-cite-us">📄 Cite us</a> •
<a href="#-references">📚 References</a> •
<a href="#-see-also">📖 See also</a> •
<a href="#-license">🔑 License</a> •
<a href="#-contributors">🫂 Contributors</a>
</p>
NiaARM is a framework for Association Rule Mining based on nature-inspired algorithms for optimization. 🌿 The framework is written fully in Python and runs on all platforms. NiaARM lets users automatically preprocess the data in a transaction database, search for association rules, and present the rules found in a readable form. 📊 In addition to categorical attributes, the framework supports integer and real-valued attributes. Mining association rules is defined as an optimization problem and solved using nature-inspired algorithms from the related framework [NiaPy](https://github.com/NiaOrg/NiaPy). 🔗
* **Documentation:** https://niaarm.readthedocs.io/en/latest
* **Tested OS:** Windows, Ubuntu, Fedora, Alpine, Arch, macOS. **However, that does not mean it does not work on others**
## 🔍 Detailed insights
The current version includes (but is not limited to) the following functions:
- loading datasets in CSV format 📁
- preprocessing of data 🧹
- searching for association rules 🔎
- providing output of mined association rules 📋
- generating statistics about mined association rules 📊
- visualization of association rules 📈
- association rule text mining (experimental) 📄
## 📦 Installation
### pip
To install `NiaARM` with pip, use:
```sh
pip install niaarm
```
To install `NiaARM` on Alpine Linux, enable the Community repository and use:
```sh
$ apk add py3-niaarm
```
To install `NiaARM` on Arch Linux, use an [AUR helper](https://wiki.archlinux.org/title/AUR_helpers):
```sh
$ yay -Syyu python-niaarm
```
To install `NiaARM` on Fedora, use:
```sh
$ dnf install python3-niaarm
```
To install `NiaARM` on NixOS, use:
```sh
nix-env -iA nixos.python311Packages.niaarm
```
## 🚀 Usage
### Loading data
In NiaARM, data loading is done via the `Dataset` class. There are two options for loading data:
#### Option 1: From a pandas DataFrame (recommended)
```python
import pandas as pd
from niaarm import Dataset
df = pd.read_csv('datasets/Abalone.csv')
# preprocess data...
data = Dataset(df)
print(data) # printing the dataset will generate a feature report
```
#### Option 2: Directly from a CSV file
```python
from niaarm import Dataset
data = Dataset('datasets/Abalone.csv')
print(data)
```
### Preprocessing
#### Data Squashing
Optionally, a preprocessing technique called data squashing [6] can be applied. It significantly reduces the number of transactions while yielding results similar to those obtained on the original dataset.
```python
from niaarm import Dataset, squash
dataset = Dataset('datasets/Abalone.csv')
squashed = squash(dataset, threshold=0.9, similarity='euclidean')
print(squashed)
```
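To give an intuition for what squashing does, here is a minimal sketch. It is not NiaARM's implementation: it greedily groups numeric rows whose similarity to a group's first row meets the threshold and replaces each group with its column-wise mean. The similarity formula `1 / (1 + distance)` and the names `euclidean_similarity` and `squash_sketch` are assumptions made purely for illustration.

```python
import math


def euclidean_similarity(a, b):
    """Map Euclidean distance to a similarity in (0, 1]; 1 means identical rows."""
    return 1.0 / (1.0 + math.dist(a, b))


def squash_sketch(transactions, threshold):
    """Greedily group similar rows and replace each group with its mean row."""
    groups = []  # each group is a list of rows; its first row acts as the representative
    for row in transactions:
        for group in groups:
            if euclidean_similarity(group[0], row) >= threshold:
                group.append(row)
                break
        else:
            # no existing group was similar enough, so start a new one
            groups.append([row])
    # column-wise mean of every group
    return [[sum(col) / len(group) for col in zip(*group)] for group in groups]


rows = [[1.0, 2.0], [1.0, 2.1], [5.0, 5.0], [5.05, 5.0]]
squashed_rows = squash_sketch(rows, threshold=0.9)
print(len(rows), '->', len(squashed_rows))
```

A higher threshold merges fewer rows; a lower one compresses the dataset more aggressively at the cost of detail.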
### Mining association rules
#### The easy way (recommended)
Association rule mining can be easily performed using the `get_rules` function:
```python
from niaarm import Dataset, get_rules
from niapy.algorithms.basic import DifferentialEvolution
data = Dataset("datasets/Abalone.csv")
algo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)
metrics = ('support', 'confidence')
rules, run_time = get_rules(data, algo, metrics, max_iters=30, logging=True)
print(rules) # Prints basic stats about the mined rules
print(f'Run Time: {run_time}')
rules.to_csv('output.csv')
```
#### The hard way
The above example can also be implemented using a lower-level interface,
with the `NiaARM` class directly:
```python
from niaarm import NiaARM, Dataset
from niapy.algorithms.basic import DifferentialEvolution
from niapy.task import Task, OptimizationType
data = Dataset("datasets/Abalone.csv")
# Create a problem
# dimension represents the dimension of the problem;
# features represent the list of features, while transactions depicts the list of transactions
# metrics is a sequence of metrics to be taken into account when computing the fitness;
# you can also pass in a dict of the shape {'metric_name': <weight of metric in range [0, 1]>};
# when passing a sequence, the weights default to 1.
problem = NiaARM(data.dimension, data.features, data.transactions, metrics=('support', 'confidence'), logging=True)
# build niapy task
task = Task(problem=problem, max_iters=30, optimization_type=OptimizationType.MAXIMIZATION)
# use Differential Evolution (DE) algorithm from the NiaPy library
# see full list of available algorithms: https://github.com/NiaOrg/NiaPy/blob/master/Algorithms.md
algo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)
# run algorithm
best = algo.run(task=task)
# sort rules
problem.rules.sort()
# export all rules to csv
problem.rules.to_csv('output.csv')
```
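The comments in the example note that `metrics` may be a sequence (all weights default to 1) or a dict mapping metric names to weights. The sketch below shows one way such weights could combine metric values into a single fitness score. The weighted-average normalization and the function name `weighted_fitness` are assumptions for illustration, not a description of NiaARM's internals.

```python
def weighted_fitness(metric_values, metrics):
    """Combine metric values into one score using a weighted average.

    metrics is either a sequence of metric names (every weight is 1)
    or a dict mapping metric names to weights in [0, 1].
    """
    if not isinstance(metrics, dict):
        metrics = {name: 1.0 for name in metrics}
    total_weight = sum(metrics.values())
    if total_weight == 0:
        return 0.0  # avoid division by zero when all weights are 0
    weighted_sum = sum(weight * metric_values[name] for name, weight in metrics.items())
    return weighted_sum / total_weight


# hypothetical metric values of a single rule
rule_metrics = {'support': 0.2, 'confidence': 0.8}
equal = weighted_fitness(rule_metrics, ('support', 'confidence'))
favor_confidence = weighted_fitness(rule_metrics, {'support': 0.25, 'confidence': 0.75})
print(equal, favor_confidence)
```

With equal weights both metrics contribute the same; shifting weight towards confidence raises the score of rules that are reliable but rare.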
#### Interest measures
The framework implements several popular interest measures, which can be used to compute the fitness function value of rules
and for assessing the quality of the mined rules. A full list of the implemented interest measures along with their descriptions
and equations can be found [here](interest_measures.md).
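
As a quick illustration of three common measures, the following sketch computes support, confidence, and lift by hand for a single rule on a toy dataset. The transactions, the attribute names, and the rule `length in [0.40, 0.60] => sex = M` are made up for this example; it uses the standard textbook definitions and does not call NiaARM.

```python
# toy transaction database (hypothetical values)
transactions = [
    {'length': 0.45, 'sex': 'M'},
    {'length': 0.35, 'sex': 'F'},
    {'length': 0.53, 'sex': 'M'},
    {'length': 0.44, 'sex': 'M'},
    {'length': 0.33, 'sex': 'I'},
]


def antecedent(t):
    # numerical attribute: the value must fall inside an interval
    return 0.40 <= t['length'] <= 0.60


def consequent(t):
    # categorical attribute: the value must match exactly
    return t['sex'] == 'M'


n = len(transactions)
n_antecedent = sum(antecedent(t) for t in transactions)
n_consequent = sum(consequent(t) for t in transactions)
n_both = sum(antecedent(t) and consequent(t) for t in transactions)

support = n_both / n                # share of transactions covered by the whole rule
confidence = n_both / n_antecedent  # how often the consequent holds when the antecedent does
lift = support / ((n_antecedent / n) * (n_consequent / n))  # > 1 suggests positive association
print(f'support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}')
```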
### Visualization
The framework currently supports the following visualization methods:
- hill slopes (presented in [4]),
- scatter plot,
- grouped matrix plot.
More visualization methods are planned for future releases.
#### Hill Slopes
```python
from matplotlib import pyplot as plt
from niaarm import Dataset, get_rules
from niaarm.visualize import hill_slopes
dataset = Dataset('datasets/Abalone.csv')
metrics = ('support', 'confidence')
rules, _ = get_rules(dataset, 'DifferentialEvolution', metrics, max_evals=1000, seed=1234)
some_rule = rules[150]
hill_slopes(some_rule, dataset.transactions)
plt.show()
```
<p>
<img alt="Hill slopes visualization" src="https://raw.githubusercontent.com/firefly-cpp/NiaARM/main/.github/images/hill_slopes.png">
</p>
#### Scatter Plot
```python
from examples.visualization_examples.prepare_datasets import get_weather_data
from niaarm import Dataset, get_rules
from niaarm.visualize import scatter_plot
# Get prepared data
arm_df = get_weather_data()
# Prepare Dataset
dataset = Dataset(path_or_df=arm_df, delimiter=",")
# Get rules
metrics = ("support", "confidence")
rules, run_time = get_rules(dataset, "DifferentialEvolution", metrics, max_evals=500)
# Add lift to metrics
metrics = (*metrics, "lift")
# Visualize scatter plot
fig = scatter_plot(rules=rules, metrics=metrics, interactive=False)
fig.show()
```
<p>
<img alt="Scatter plot visualization" src=".github/images/scatter_plot.png">
</p>
#### Grouped Matrix Plot
```python
from examples.visualization_examples.prepare_datasets import get_football_player_data
from niaarm import Dataset, get_rules
from niaarm.visualize import grouped_matrix_plot
# Get prepared data
arm_df = get_football_player_data()
# Prepare Dataset
dataset = Dataset(path_or_df=arm_df, delimiter=",")
# Get rules
metrics = ("support", "confidence")
rules, run_time = get_rules(dataset, "DifferentialEvolution", metrics, max_evals=500)
# Add lift to metrics
metrics = (*metrics, "lift")
# Visualize grouped matrix plot
fig = grouped_matrix_plot(rules=rules, metrics=metrics, k=5, interactive=False)
fig.show()
```
<p>
<img alt="Grouped matrix plot visualization" src=".github/images/grouped_matrix_plot.png">
</p>
### Text Mining (Experimental)
An experimental implementation of association rule text mining using nature-inspired algorithms, based on ideas from [5],
is also provided. The `niaarm.text` module contains the `Corpus` and `Document` classes for loading and preprocessing corpora,
a `TextRule` class representing a text rule, and the `NiaARTM` class, which implements association rule text mining
as a continuous optimization problem. The `get_text_rules` function, the text mining counterpart of `get_rules`, is
available in the `niaarm.mine` module.
```python
import pandas as pd
from niaarm.text import Corpus
from niaarm.mine import get_text_rules
from niapy.algorithms.basic import ParticleSwarmOptimization
df = pd.read_json('datasets/text/artm_test_dataset.json', orient='records')
documents = df['text'].tolist()
corpus = Corpus.from_list(documents)
algorithm = ParticleSwarmOptimization(population_size=200, seed=123)
metrics = ('support', 'confidence', 'aws')
rules, time = get_text_rules(corpus, max_terms=5, algorithm=algorithm, metrics=metrics, max_evals=10000, logging=True)
print(rules)
print(f'Run time: {time:.2f}s')
rules.to_csv('output.csv')
```
**Note:** You may need to download stopwords and the punkt tokenizer from nltk by running `import nltk; nltk.download('stopwords'); nltk.download('punkt')`.
For a full list of examples see the [examples folder](https://github.com/firefly-cpp/NiaARM/tree/main/examples)
in the GitHub repository.
### Command line interface
We provide a simple command line interface that lets you
mine association rules on any input dataset, write them to a CSV file, and/or perform
a simple statistical analysis on them. For more details see the [documentation](https://niaarm.readthedocs.io/en/latest/cli.html).
```shell
niaarm -h
```
```
usage: niaarm [-h] [-v] [-c CONFIG] [-i INPUT_FILE] [-o OUTPUT_FILE] [--squashing-similarity {euclidean,cosine}] [--squashing-threshold SQUASHING_THRESHOLD] [-a ALGORITHM] [-s SEED] [--max-evals MAX_EVALS] [--max-iters MAX_ITERS]
[--metrics METRICS [METRICS ...]] [--weights WEIGHTS [WEIGHTS ...]] [--log] [--stats]
Perform ARM, output mined rules as csv, get mined rules' statistics
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-c CONFIG, --config CONFIG
Path to a TOML config file
-i INPUT_FILE, --input-file INPUT_FILE
Input file containing a csv dataset
-o OUTPUT_FILE, --output-file OUTPUT_FILE
Output file for mined rules
--squashing-similarity {euclidean,cosine}
Similarity measure to use for squashing
--squashing-threshold SQUASHING_THRESHOLD
Threshold to use for squashing
-a ALGORITHM, --algorithm ALGORITHM
Algorithm to use (niapy class name, e.g. DifferentialEvolution)
-s SEED, --seed SEED Seed for the algorithm's random number generator
--max-evals MAX_EVALS
Maximum number of fitness function evaluations
--max-iters MAX_ITERS
Maximum number of iterations
--metrics METRICS [METRICS ...]
Metrics to use in the fitness function.
--weights WEIGHTS [WEIGHTS ...]
Weights in range [0, 1] corresponding to --metrics
--log Enable logging of fitness improvements
--stats Display stats about mined rules
```
Note: The CLI script can also be run as a Python module (`python -m niaarm ...`).
## 📄 Cite us
Stupan, Ž., & Fister Jr., I. (2022). [NiaARM: A minimalistic framework for Numerical Association Rule Mining](https://www.theoj.org/joss-papers/joss.04448/10.21105.joss.04448.pdf). Journal of Open Source Software, 7(77), 4448.
## 📚 References
Ideas are based on the following research papers:
[1] I. Fister Jr., A. Iglesias, A. Gálvez, J. Del Ser, E. Osaba, I. Fister. [Differential evolution for association rule mining using categorical and numerical attributes](http://www.iztok-jr-fister.eu/static/publications/231.pdf). In: Intelligent data engineering and automated learning - IDEAL 2018, pp. 79-88, 2018.
[2] I. Fister Jr., V. Podgorelec, I. Fister. [Improved Nature-Inspired Algorithms for Numeric Association Rule Mining](https://iztok-jr-fister.eu/static/publications/324.pdf). In: Vasant P., Zelinka I., Weber GW. (eds) Intelligent Computing and Optimization. ICO 2020. Advances in Intelligent Systems and Computing, vol 1324. Springer, Cham.
[3] I. Fister Jr., I. Fister [A brief overview of swarm intelligence-based algorithms for numerical association rule mining](https://arxiv.org/abs/2010.15524). arXiv preprint arXiv:2010.15524 (2020).
[4] Fister, I. et al. (2020). [Visualization of Numerical Association Rules by Hill Slopes](http://www.iztok-jr-fister.eu/static/publications/280.pdf).
In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020.
IDEAL 2020. Lecture Notes in Computer Science, vol 12489. Springer, Cham. https://doi.org/10.1007/978-3-030-62362-3_10
[5] I. Fister, S. Deb, I. Fister, [Population-based metaheuristics for Association Rule Text Mining](http://www.iztok-jr-fister.eu/static/publications/260.pdf),
In: Proceedings of the 2020 4th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence,
New York, NY, USA, Mar. 2020, pp. 19–23. doi: [10.1145/3396474.3396493](https://dl.acm.org/doi/10.1145/3396474.3396493).
[6] I. Fister, I. Fister Jr., D. Novak and D. Verber, [Data squashing as preprocessing in association rule mining](https://iztok-jr-fister.eu/static/publications/300.pdf), 2022 IEEE Symposium Series on Computational Intelligence (SSCI), Singapore, Singapore, 2022, pp. 1720-1725, doi: [10.1109/SSCI51031.2022.10022240](https://doi.org/10.1109/SSCI51031.2022.10022240).
## 📖 See also
[1] [NiaARM.jl: Numerical Association Rule Mining in Julia](https://github.com/firefly-cpp/NiaARM.jl)
[2] [arm-preprocessing: Implementation of several preprocessing techniques for Association Rule Mining (ARM)](https://github.com/firefly-cpp/arm-preprocessing)
## 🔑 License
This package is distributed under the MIT License. This license can be found online at <http://www.opensource.org/licenses/MIT>.
## Disclaimer
This framework is provided as-is, and there are no guarantees that it fits your purposes or that it is bug-free. Use it at your own risk!
## 🫂 Contributors
Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- prettier-ignore-start -->
<!-- markdownlint-disable -->
<table>
<tbody>
<tr>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/zStupan"><img src="https://avatars.githubusercontent.com/u/48752988?v=4?s=100" width="100px;" alt="zStupan"/><br /><sub><b>zStupan</b></sub></a><br /><a href="https://github.com/firefly-cpp/NiaARM/commits?author=zStupan" title="Code">💻</a> <a href="https://github.com/firefly-cpp/NiaARM/issues?q=author%3AzStupan" title="Bug reports">🐛</a> <a href="https://github.com/firefly-cpp/NiaARM/commits?author=zStupan" title="Documentation">📖</a> <a href="#content-zStupan" title="Content">🖋</a> <a href="#ideas-zStupan" title="Ideas, Planning, & Feedback">🤔</a> <a href="#example-zStupan" title="Examples">💡</a></td>
<td align="center" valign="top" width="14.28%"><a href="http://www.iztok.xyz"><img src="https://avatars.githubusercontent.com/u/1633361?v=4?s=100" width="100px;" alt="Iztok Fister Jr."/><br /><sub><b>Iztok Fister Jr.</b></sub></a><br /><a href="https://github.com/firefly-cpp/NiaARM/commits?author=firefly-cpp" title="Code">💻</a> <a href="https://github.com/firefly-cpp/NiaARM/issues?q=author%3Afirefly-cpp" title="Bug reports">🐛</a> <a href="#mentoring-firefly-cpp" title="Mentoring">🧑‍🏫</a> <a href="#maintenance-firefly-cpp" title="Maintenance">🚧</a> <a href="#ideas-firefly-cpp" title="Ideas, Planning, & Feedback">🤔</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://erkankarabulut.github.io"><img src="https://avatars.githubusercontent.com/u/15374776?v=4?s=100" width="100px;" alt="Erkan Karabulut"/><br /><sub><b>Erkan Karabulut</b></sub></a><br /><a href="https://github.com/firefly-cpp/NiaARM/commits?author=erkankarabulut" title="Code">💻</a> <a href="https://github.com/firefly-cpp/NiaARM/issues?q=author%3Aerkankarabulut" title="Bug reports">🐛</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/lahovniktadej"><img src="https://avatars.githubusercontent.com/u/57890734?v=4?s=100" width="100px;" alt="Tadej Lahovnik"/><br /><sub><b>Tadej Lahovnik</b></sub></a><br /><a href="https://github.com/firefly-cpp/NiaARM/commits?author=lahovniktadej" title="Documentation">📖</a></td>
<td align="center" valign="top" width="14.28%"><a href="https://github.com/musicinmybrain"><img src="https://avatars.githubusercontent.com/u/6898909?v=4?s=100" width="100px;" alt="Ben Beasley"/><br /><sub><b>Ben Beasley</b></sub></a><br /><a href="https://github.com/firefly-cpp/NiaARM/commits?author=musicinmybrain" title="Documentation">📖</a></td>
<td align="center" valign="top" width="14.28%"><a href="http://www.dusanfister.com"><img src="https://avatars.githubusercontent.com/u/3198785?v=4?s=100" width="100px;" alt="Dusan Fister"/><br /><sub><b>Dusan Fister</b></sub></a><br /><a href="#design-rhododendrom" title="Design">🎨</a></td>
</tr>
</tbody>
</table>
<!-- markdownlint-restore -->
<!-- prettier-ignore-end -->
<!-- ALL-CONTRIBUTORS-LIST:END -->
This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!
Raw data
{
"_id": null,
"home_page": "https://github.com/firefly-cpp/NiaARM",
"name": "niaarm",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.14,>=3.9",
"maintainer_email": null,
"keywords": "association rule mining, data science, numerical association rule mining, preprocessing, visualization",
"author": "\u017diga Stupan",
"author_email": "ziga.stupan1@student.um.si",
"download_url": "https://files.pythonhosted.org/packages/e5/bb/85b94f13ef6d16850aebef13128c45d3c4f446052d914e455b923c807956/niaarm-0.4.0.tar.gz",
"platform": null,
"description": "<p align=\"center\">\n <img alt=\"logo\" width=\"300\" src=\"https://raw.githubusercontent.com/firefly-cpp/NiaARM/main/.github/images/logo.png\">\n</p>\n\n<h1 align=\"center\">\n NiaARM\n</h1>\n\n<h2 align=\"center\">\n A minimalistic framework for Numerical Association Rule Mining\n</h2>\n\n<p align=\"center\">\n <a href=\"https://pypi.python.org/pypi/niaarm\">\n <img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/niaarm.svg\" />\n </a>\n <img alt=\"PyPI - Python Version\" src=\"https://img.shields.io/pypi/pyversions/niaarm.svg\" />\n <img alt=\"PyPI - Downloads\" src=\"https://img.shields.io/pypi/dm/niaarm.svg\" />\n <a href=\"https://src.fedoraproject.org/rpms/python-niaarm\">\n <img alt=\"Fedora package\" src=\"https://img.shields.io/fedora/v/python3-niaarm?color=blue&label=Fedora%20Linux&logo=fedora\" />\n </a>\n <a href=\"https://aur.archlinux.org/packages/python-niaarm\">\n <img alt=\"AUR package\" src=\"https://img.shields.io/aur/version/python-niaarm?color=blue&label=Arch%20Linux&logo=arch-linux\" />\n </a>\n <a href=\"https://repology.org/project/python:niaarm/versions\">\n <img alt=\"Packaging status\" src=\"https://repology.org/badge/tiny-repos/python:niaarm.svg\" />\n </a>\n <a href=\"https://pepy.tech/project/niaarm\">\n <img alt=\"Downloads\" src=\"https://pepy.tech/badge/niaarm\" />\n </a>\n <a href=\"https://github.com/firefly-cpp/NiaARM/blob/main/LICENSE\">\n <img alt=\"GitHub license\" src=\"https://img.shields.io/github/license/firefly-cpp/niaarm.svg\" />\n </a>\n <img alt=\"NiaARM\" src=\"https://github.com/firefly-cpp/niaarm/actions/workflows/test.yml/badge.svg\" />\n <img alt=\"Documentation status\" src=\"https://readthedocs.org/projects/niaarm/badge/?version=latest\" />\n</p>\n\n<p align=\"center\">\n <img alt=\"GitHub commit activity\" src=\"https://img.shields.io/github/commit-activity/w/firefly-cpp/niaarm.svg\" />\n <a href=\"http://isitmaintained.com/project/firefly-cpp/niaarml\">\n <img alt=\"Percentage of 
issues still open\" src=\"http://isitmaintained.com/badge/open/firefly-cpp/niaarm.svg\">\n </a>\n <a href='http://isitmaintained.com/project/firefly-cpp/niaarm \"Average time to resolve an issue'>\n <img alt=\"Average time to resolve an issue\" src=\"http://isitmaintained.com/badge/resolution/firefly-cpp/niaarm.svg\" />\n </a>\n <a href=\"#-contributors\">\n <img alt=\"All Contributors\" src=\"https://img.shields.io/badge/all_contributors-1-orange.svg\" />\n </a>\n</p>\n\n<p align=\"center\">\n <a href=\"https://doi.org/10.21105/joss.04448\">\n <img alt=\"DOI\" src=\"https://joss.theoj.org/papers/10.21105/joss.04448/status.svg\" />\n </a>\n</p>\n\n<p align=\"center\">\n <a href=\"#-detailed-insights\">\ud83d\udd0d Detailed insights</a> \u2022\n <a href=\"#-installation\">\ud83d\udce6 Installation</a> \u2022\n <a href=\"#-usage\">\ud83d\ude80 Usage</a> \u2022\n <a href=\"#-cite-us\">\ud83d\udcc4 Cite us</a> \u2022\n <a href=\"#-references\">\ud83d\udcda References</a> \u2022\n <a href=\"#-see-also\">\ud83d\udcd6 See also</a> \u2022\n <a href=\"#-license\">\ud83d\udd11 License</a> \u2022\n <a href=\"#-contributors\">\ud83e\udec2 Contributors</a>\n</p>\n\nNiaARM is a framework for Association Rule Mining based on nature-inspired algorithms for optimization. \ud83c\udf3f The framework is written fully in Python and runs on all platforms. NiaARM allows users to preprocess the data in a transaction database automatically, to search for association rules and provide a pretty output of the rules found. \ud83d\udcca This framework also supports integral and real-valued types of attributes besides the categorical ones. Mining the association rules is defined as an optimization problem, and solved using the nature-inspired algorithms that come from the related framework called [NiaPy](https://github.com/NiaOrg/NiaPy). \ud83d\udd17\n\n* **Documentation:** https://niaarm.readthedocs.io/en/latest\n* **Tested OS:** Windows, Ubuntu, Fedora, Alpine, Arch, macOS. 
**However, that does not mean it does not work on others**\n\n## \ud83d\udd0d Detailed insights\nThe current version includes (but is not limited to) the following functions:\n\n- loading datasets in CSV format \ud83d\udcc1\n- preprocessing of data \ud83e\uddf9\n- searching for association rules \ud83d\udd0e\n- providing output of mined association rules \ud83d\udccb\n- generating statistics about mined association rules \ud83d\udcca\n- visualization of association rules \ud83d\udcc8\n- association rule text mining (experimental) \ud83d\udcc4\n\n## \ud83d\udce6 Installation\n\n### pip\n\nTo install `NiaARM` with pip, use:\n\n```sh\npip install niaarm\n```\n\nTo install `NiaARM` on Alpine Linux, enable Community repository and use:\n\n```sh\n$ apk add py3-niaarm\n```\n\nTo install `NiaARM` on Arch Linux, use an [AUR helper](https://wiki.archlinux.org/title/AUR_helpers):\n\n```sh\n$ yay -Syyu python-niaarm\n```\n\nTo install `NiaARM` on Fedora, use:\n\n```sh\n$ dnf install python3-niaarm\n```\n\nTo install `NiaARM` on NixOS, use:\n\n```sh\nnix-env -iA nixos.python311Packages.niaarm\n```\n\n## \ud83d\ude80 Usage\n\n### Loading data\n\nIn NiaARM, data loading is done via the `Dataset` class. There are two options for loading data:\n\n#### Option 1: From a pandas DataFrame (recommended)\n\n```python\nimport pandas as pd\nfrom niaarm import Dataset\n\n\ndf = pd.read_csv('datasets/Abalone.csv')\n# preprocess data...\ndata = Dataset(df)\nprint(data) # printing the dataset will generate a feature report\n```\n\n#### Option 2: Directly from a CSV file\n\n```python\nfrom niaarm import Dataset\n\n\ndata = Dataset('datasets/Abalone.csv')\nprint(data)\n```\n\n### Preprocessing\n\n#### Data Squashing\n\nOptionally, a preprocessing technique, called data squashing [5], can be applied. 
This will significantly reduce the number of transactions, while providing similar results to the original dataset.\n\n```python\nfrom niaarm import Dataset, squash\n\ndataset = Dataset('datasets/Abalone.csv')\nsquashed = squash(dataset, threshold=0.9, similarity='euclidean')\nprint(squashed)\n```\n\n### Mining association rules\n\n#### The easy way (recommended)\n\nAssociation rule mining can be easily performed using the `get_rules` function:\n\n```python\nfrom niaarm import Dataset, get_rules\nfrom niapy.algorithms.basic import DifferentialEvolution\n\ndata = Dataset(\"datasets/Abalone.csv\")\n\nalgo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)\nmetrics = ('support', 'confidence')\n\nrules, run_time = get_rules(data, algo, metrics, max_iters=30, logging=True)\n\nprint(rules) # Prints basic stats about the mined rules\nprint(f'Run Time: {run_time}')\nrules.to_csv('output.csv')\n```\n\n#### The hard way\n\nThe above example can be also be implemented using a more low level interface,\nwith the `NiaARM` class directly:\n\n```python\nfrom niaarm import NiaARM, Dataset\nfrom niapy.algorithms.basic import DifferentialEvolution\nfrom niapy.task import Task, OptimizationType\n\n\ndata = Dataset(\"datasets/Abalone.csv\")\n\n# Create a problem\n# dimension represents the dimension of the problem;\n# features represent the list of features, while transactions depicts the list of transactions\n# metrics is a sequence of metrics to be taken into account when computing the fitness;\n# you can also pass in a dict of the shape {'metric_name': <weight of metric in range [0, 1]>};\n# when passing a sequence, the weights default to 1.\nproblem = NiaARM(data.dimension, data.features, data.transactions, metrics=('support', 'confidence'), logging=True)\n\n# build niapy task\ntask = Task(problem=problem, max_iters=30, optimization_type=OptimizationType.MAXIMIZATION)\n\n# use Differential Evolution (DE) algorithm from the NiaPy 
library\n# see full list of available algorithms: https://github.com/NiaOrg/NiaPy/blob/master/Algorithms.md\nalgo = DifferentialEvolution(population_size=50, differential_weight=0.5, crossover_probability=0.9)\n\n# run algorithm\nbest = algo.run(task=task)\n\n# sort rules\nproblem.rules.sort()\n\n# export all rules to csv\nproblem.rules.to_csv('output.csv')\n```\n\n#### Interest measures\n\nThe framework implements several popular interest measures, which can be used to compute the fitness function value of rules\nand for assessing the quality of the mined rules. A full list of the implemented interest measures along with their descriptions\nand equations can be found [here](interest_measures.md).\n\n### Visualization\n\nThe framework currently supports:\n\n- hill slopes (presented in [4]),\n- scatter plot and\n- grouped matrix plot visualization methods.\n\nMore visualization methods are planned to be implemented in future releases.\n\n#### Hill Slopes\n\n```python\nfrom matplotlib import pyplot as plt\nfrom niaarm import Dataset, get_rules\nfrom niaarm.visualize import hill_slopes\n\ndataset = Dataset('datasets/Abalone.csv')\nmetrics = ('support', 'confidence')\nrules, _ = get_rules(dataset, 'DifferentialEvolution', metrics, max_evals=1000, seed=1234)\nsome_rule = rules[150]\nhill_slopes(some_rule, dataset.transactions)\nplt.show()\n```\n\n<p>\n <img alt=\"logo\" src=\"https://raw.githubusercontent.com/firefly-cpp/NiaARM/main/.github/images/hill_slopes.png\">\n</p>\n\n#### Scatter Plot\n\n```python\nfrom examples.visualization_examples.prepare_datasets import get_weather_data\nfrom niaarm import Dataset, get_rules\nfrom niaarm.visualize import scatter_plot\n\n# Get prepared data\narm_df = get_weather_data()\n\n# Prepare Dataset\ndataset = Dataset(path_or_df=arm_df,delimiter=\",\")\n\n# Get rules\nmetrics = (\"support\", \"confidence\")\nrules, run_time = get_rules(dataset, \"DifferentialEvolution\", metrics, max_evals=500)\n\n# Add lift to metrics\nmetrics = 
list(metrics)\nmetrics.append(\"lift\")\nmetrics = tuple(metrics)\n\n# Visualize scatter plot\nfig = scatter_plot(rules=rules, metrics=metrics, interactive=False)\nfig.show()\n```\n\n<p>\n <img alt=\"logo\" src=\".github/images/scatter_plot.png\">\n</p>\n\n#### Grouped Matrix Plot\n\n```python\nfrom examples.visualization_examples.prepare_datasets import get_football_player_data\nfrom niaarm import Dataset, get_rules\nfrom niaarm.visualize import grouped_matrix_plot\n\n# Get prepared data\narm_df = get_football_player_data()\n\n# Prepare Dataset\ndataset = Dataset(path_or_df=arm_df, delimiter=\",\")\n\n# Get rules\nmetrics = (\"support\", \"confidence\")\nrules, run_time = get_rules(dataset, \"DifferentialEvolution\", metrics, max_evals=500)\n\n# Add lift to metrics\nmetrics = list(metrics)\nmetrics.append(\"lift\")\nmetrics = tuple(metrics)\n\n# Visualize grouped matrix plot\nfig = grouped_matrix_plot(rules=rules, metrics=metrics, k=5, interactive=False)\nfig.show()\n```\n\n<p>\n <img alt=\"logo\" src=\".github/images/grouped_matrix_plot.png\">\n</p>\n\n### Text Mining (Experimental)\n\nAn experimental implementation of association rule text mining using nature-inspired algorithms, based on ideas from [5]\nis also provided. The `niaarm.text` module contains the `Corpus` and `Document` classes for loading and preprocessing corpora,\na `TextRule` class, representing a text rule, and the `NiaARTM` class, implementing association rule text mining\nas a continuous optimization problem. 
The `get_text_rules` function, equivalent to `get_rules`, but for text mining, was also\nadded to the `niaarm.mine` module.\n\n```python\nimport pandas as pd\nfrom niaarm.text import Corpus\nfrom niaarm.mine import get_text_rules\nfrom niapy.algorithms.basic import ParticleSwarmOptimization\n\ndf = pd.read_json('datasets/text/artm_test_dataset.json', orient='records')\ndocuments = df['text'].tolist()\ncorpus = Corpus.from_list(documents)\n\nalgorithm = ParticleSwarmOptimization(population_size=200, seed=123)\nmetrics = ('support', 'confidence', 'aws')\nrules, time = get_text_rules(corpus, max_terms=5, algorithm=algorithm, metrics=metrics, max_evals=10000, logging=True)\n\nprint(rules)\nprint(f'Run time: {time:.2f}s')\nrules.to_csv('output.csv')\n```\n\n**Note:** You may need to download stopwords and the punkt tokenizer from nltk by running `import nltk; nltk.download('stopwords'); nltk.download('punkt')`.\n\nFor a full list of examples see the [examples folder](https://github.com/firefly-cpp/NiaARM/tree/main/examples)\nin the GitHub repository.\n\n### Command line interface\n\nWe provide a simple command line interface, which allows you to easily\nmine association rules on any input dataset, output them to a csv file and/or perform\na simple statistical analysis on them. 
For more details see the [documentation](https://niaarm.readthedocs.io/en/latest/cli.html).\n\n```shell\nniaarm -h\n```\n\n```\nusage: niaarm [-h] [-v] [-c CONFIG] [-i INPUT_FILE] [-o OUTPUT_FILE] [--squashing-similarity {euclidean,cosine}] [--squashing-threshold SQUASHING_THRESHOLD] [-a ALGORITHM] [-s SEED] [--max-evals MAX_EVALS] [--max-iters MAX_ITERS]\n [--metrics METRICS [METRICS ...]] [--weights WEIGHTS [WEIGHTS ...]] [--log] [--stats]\n\nPerform ARM, output mined rules as csv, get mined rules' statistics\n\noptions:\n -h, --help show this help message and exit\n -v, --version show program's version number and exit\n -c CONFIG, --config CONFIG\n Path to a TOML config file\n -i INPUT_FILE, --input-file INPUT_FILE\n Input file containing a csv dataset\n -o OUTPUT_FILE, --output-file OUTPUT_FILE\n Output file for mined rules\n --squashing-similarity {euclidean,cosine}\n Similarity measure to use for squashing\n --squashing-threshold SQUASHING_THRESHOLD\n Threshold to use for squashing\n -a ALGORITHM, --algorithm ALGORITHM\n Algorithm to use (niapy class name, e.g. DifferentialEvolution)\n -s SEED, --seed SEED Seed for the algorithm's random number generator\n --max-evals MAX_EVALS\n Maximum number of fitness function evaluations\n --max-iters MAX_ITERS\n Maximum number of iterations\n --metrics METRICS [METRICS ...]\n Metrics to use in the fitness function.\n --weights WEIGHTS [WEIGHTS ...]\n Weights in range [0, 1] corresponding to --metrics\n --log Enable logging of fitness improvements\n --stats Display stats about mined rules\n```\nNote: The CLI script can also run as a python module (`python -m niaarm ...`)\n\n## \ud83d\udcc4 Cite us\n\nStupan, \u017d., & Fister Jr., I. (2022). [NiaARM: A minimalistic framework for Numerical Association Rule Mining](https://www.theoj.org/joss-papers/joss.04448/10.21105.joss.04448.pdf). Journal of Open Source Software, 7(77), 4448.\n\n## \ud83d\udcda References\n\nIdeas are based on the following research papers:\n\n[1] I. 
Fister Jr., A. Iglesias, A. G\u00e1lvez, J. Del Ser, E. Osaba, I. Fister. [Differential evolution for association rule mining using categorical and numerical attributes](http://www.iztok-jr-fister.eu/static/publications/231.pdf). In: Intelligent data engineering and automated learning - IDEAL 2018, pp. 79-88, 2018.\n\n[2] I. Fister Jr., V. Podgorelec, I. Fister. [Improved Nature-Inspired Algorithms for Numeric Association Rule Mining](https://iztok-jr-fister.eu/static/publications/324.pdf). In: Vasant P., Zelinka I., Weber GW. (eds) Intelligent Computing and Optimization. ICO 2020. Advances in Intelligent Systems and Computing, vol 1324. Springer, Cham.\n\n[3] I. Fister Jr., I. Fister. [A brief overview of swarm intelligence-based algorithms for numerical association rule mining](https://arxiv.org/abs/2010.15524). arXiv preprint arXiv:2010.15524 (2020).\n\n[4] Fister, I. et al. (2020). [Visualization of Numerical Association Rules by Hill Slopes](http://www.iztok-jr-fister.eu/static/publications/280.pdf).\n In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning \u2013 IDEAL 2020.\n Lecture Notes in Computer Science, vol 12489. Springer, Cham. https://doi.org/10.1007/978-3-030-62362-3_10\n\n[5] I. Fister, S. Deb, I. Fister, [Population-based metaheuristics for Association Rule Text Mining](http://www.iztok-jr-fister.eu/static/publications/260.pdf),\n In: Proceedings of the 2020 4th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence,\n New York, NY, USA, Mar. 2020, pp. 19\u201323. doi: [10.1145/3396474.3396493](https://dl.acm.org/doi/10.1145/3396474.3396493).\n\n[6] I. Fister, I. Fister Jr., D. Novak and D. Verber, [Data squashing as preprocessing in association rule mining](https://iztok-jr-fister.eu/static/publications/300.pdf), 2022 IEEE Symposium Series on Computational Intelligence (SSCI), Singapore, Singapore, 2022, pp. 
1720-1725, doi: [10.1109/SSCI51031.2022.10022240](https://doi.org/10.1109/SSCI51031.2022.10022240).\n\n## \ud83d\udcd6 See also\n\n[1] [NiaARM.jl: Numerical Association Rule Mining in Julia](https://github.com/firefly-cpp/NiaARM.jl)\n\n[2] [arm-preprocessing: Implementation of several preprocessing techniques for Association Rule Mining (ARM)](https://github.com/firefly-cpp/arm-preprocessing)\n\n## \ud83d\udd11 License\n\nThis package is distributed under the MIT License. This license can be found online at <http://www.opensource.org/licenses/MIT>.\n\n## Disclaimer\n\nThis framework is provided as-is, and there are no guarantees that it fits your purposes or that it is bug-free. Use it at your own risk!\n\n## \ud83e\udec2 Contributors\n\nThanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):\n\n<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->\n<!-- prettier-ignore-start -->\n<!-- markdownlint-disable -->\n<table>\n <tbody>\n <tr>\n <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://github.com/zStupan\"><img src=\"https://avatars.githubusercontent.com/u/48752988?v=4?s=100\" width=\"100px;\" alt=\"zStupan\"/><br /><sub><b>zStupan</b></sub></a><br /><a href=\"https://github.com/firefly-cpp/NiaARM/commits?author=zStupan\" title=\"Code\">\ud83d\udcbb</a> <a href=\"https://github.com/firefly-cpp/NiaARM/issues?q=author%3AzStupan\" title=\"Bug reports\">\ud83d\udc1b</a> <a href=\"https://github.com/firefly-cpp/NiaARM/commits?author=zStupan\" title=\"Documentation\">\ud83d\udcd6</a> <a href=\"#content-zStupan\" title=\"Content\">\ud83d\udd8b</a> <a href=\"#ideas-zStupan\" title=\"Ideas, Planning, & Feedback\">\ud83e\udd14</a> <a href=\"#example-zStupan\" title=\"Examples\">\ud83d\udca1</a></td>\n <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"http://www.iztok.xyz\"><img src=\"https://avatars.githubusercontent.com/u/1633361?v=4?s=100\" width=\"100px;\" alt=\"Iztok 
Fister Jr.\"/><br /><sub><b>Iztok Fister Jr.</b></sub></a><br /><a href=\"https://github.com/firefly-cpp/NiaARM/commits?author=firefly-cpp\" title=\"Code\">\ud83d\udcbb</a> <a href=\"https://github.com/firefly-cpp/NiaARM/issues?q=author%3Afirefly-cpp\" title=\"Bug reports\">\ud83d\udc1b</a> <a href=\"#mentoring-firefly-cpp\" title=\"Mentoring\">\ud83e\uddd1\u200d\ud83c\udfeb</a> <a href=\"#maintenance-firefly-cpp\" title=\"Maintenance\">\ud83d\udea7</a> <a href=\"#ideas-firefly-cpp\" title=\"Ideas, Planning, & Feedback\">\ud83e\udd14</a></td>\n <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://erkankarabulut.github.io\"><img src=\"https://avatars.githubusercontent.com/u/15374776?v=4?s=100\" width=\"100px;\" alt=\"Erkan Karabulut\"/><br /><sub><b>Erkan Karabulut</b></sub></a><br /><a href=\"https://github.com/firefly-cpp/NiaARM/commits?author=erkankarabulut\" title=\"Code\">\ud83d\udcbb</a> <a href=\"https://github.com/firefly-cpp/NiaARM/issues?q=author%3Aerkankarabulut\" title=\"Bug reports\">\ud83d\udc1b</a></td>\n <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://github.com/lahovniktadej\"><img src=\"https://avatars.githubusercontent.com/u/57890734?v=4?s=100\" width=\"100px;\" alt=\"Tadej Lahovnik\"/><br /><sub><b>Tadej Lahovnik</b></sub></a><br /><a href=\"https://github.com/firefly-cpp/NiaARM/commits?author=lahovniktadej\" title=\"Documentation\">\ud83d\udcd6</a></td>\n <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"https://github.com/musicinmybrain\"><img src=\"https://avatars.githubusercontent.com/u/6898909?v=4?s=100\" width=\"100px;\" alt=\"Ben Beasley\"/><br /><sub><b>Ben Beasley</b></sub></a><br /><a href=\"https://github.com/firefly-cpp/NiaARM/commits?author=musicinmybrain\" title=\"Documentation\">\ud83d\udcd6</a></td>\n <td align=\"center\" valign=\"top\" width=\"14.28%\"><a href=\"http://www.dusanfister.com\"><img src=\"https://avatars.githubusercontent.com/u/3198785?v=4?s=100\" width=\"100px;\" 
alt=\"Dusan Fister\"/><br /><sub><b>Dusan Fister</b></sub></a><br /><a href=\"#design-rhododendrom\" title=\"Design\">\ud83c\udfa8</a></td>\n </tr>\n </tbody>\n</table>\n\n<!-- markdownlint-restore -->\n<!-- prettier-ignore-end -->\n\n<!-- ALL-CONTRIBUTORS-LIST:END -->\n\nThis project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A minimalistic framework for numerical association rule mining",
"version": "0.4.0",
"project_urls": {
"Documentation": "https://niaarm.readthedocs.io/en/latest/",
"Homepage": "https://github.com/firefly-cpp/NiaARM",
"Repository": "https://github.com/firefly-cpp/NiaARM"
},
"split_keywords": [
"association rule mining",
" data science",
" numerical association rule mining",
" preprocessing",
" visualization"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "765610580c458603cacd241ff0ee89ca8d6b5d58e7a9895da7fb8e9866288cb9",
"md5": "2269419ee308c447c329a8fc630314b5",
"sha256": "1fcc76c49fbb8e3516a151ec84d246906af41f0d3afeb9f79d20c262fcb7facf"
},
"downloads": -1,
"filename": "niaarm-0.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2269419ee308c447c329a8fc630314b5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.14,>=3.9",
"size": 33492,
"upload_time": "2025-02-05T09:14:34",
"upload_time_iso_8601": "2025-02-05T09:14:34.989744Z",
"url": "https://files.pythonhosted.org/packages/76/56/10580c458603cacd241ff0ee89ca8d6b5d58e7a9895da7fb8e9866288cb9/niaarm-0.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e5bb85b94f13ef6d16850aebef13128c45d3c4f446052d914e455b923c807956",
"md5": "5e218589b60bdbe4d2cde31253495dcb",
"sha256": "7fe3d9442e7318466eee39c2158438b35403ffeec0c086ff926ed51da0b9a6db"
},
"downloads": -1,
"filename": "niaarm-0.4.0.tar.gz",
"has_sig": false,
"md5_digest": "5e218589b60bdbe4d2cde31253495dcb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.14,>=3.9",
"size": 38391,
"upload_time": "2025-02-05T09:14:36",
"upload_time_iso_8601": "2025-02-05T09:14:36.564266Z",
"url": "https://files.pythonhosted.org/packages/e5/bb/85b94f13ef6d16850aebef13128c45d3c4f446052d914e455b923c807956/niaarm-0.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-05 09:14:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "firefly-cpp",
"github_project": "NiaARM",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "niaarm"
}