ruleminer

Name	ruleminer
Version	0.3.0
home_page	None
Summary	Python package to mine association rules in datasets
upload_time	2025-03-21 11:38:56
maintainer	None
docs_url	None
author	Willem Jan Willemse
requires_python	<3.13,>=3.9
license	MIT/X
keywords	association rules pandas
requirements	No requirements were recorded.

# ruleminer

[![Documentation](https://readthedocs.org/projects/ruleminer/badge)](https://ruleminer.readthedocs.io/en/latest/)
[![image](https://img.shields.io/pypi/v/ruleminer.svg)](https://pypi.python.org/pypi/ruleminer)
[![image](https://img.shields.io/pypi/pyversions/ruleminer.svg)](https://pypi.python.org/pypi/ruleminer)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

Python package to discover association rules in Pandas DataFrames. 

This package contains the code accompanying the paper [Discovering and ranking validation rules in supervisory data](https://github.com/wjwillemse/ruleminer/tree/main/docs/paper.pdf).

-   Free software: MIT/X license
-   Documentation: <https://ruleminer.readthedocs.io/en/latest>

## Features

Here is what the package does:

* Generate human-readable validation rules from a Pandas DataFrame, using rule templates that contain regular expressions

  - available functions: min, max, abs, quantile, sum, substr, split, count, sumif and countif
  - parameters for metric filters and rule precision (including XBRL tolerances)

* Evaluate rules and calculate association rule metrics

  - available metrics: abs support, abs exceptions, confidence, support, added value, casual confidence, casual support, conviction, lift and rule power factor
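
To make a few of these metrics concrete, the sketch below computes them with plain pandas for one if-then rule. It uses the textbook definitions (support as the share of rows satisfying both sides, confidence as that count divided by the rows satisfying the antecedent, lift as confidence divided by the share of rows satisfying the consequent). The toy data and the helper `rule_metrics` are hypothetical, and the package's own definitions may differ in detail.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the examples in this README.
df = pd.DataFrame(
    {
        "Type": ["life_insurer", "life_insurer", "non-life_insurer",
                 "non-life_insurer", "life_insurer"],
        "TP-life": [100.0, 250.0, 0.0, 0.0, 0.0],
    }
)


def rule_metrics(antecedent: pd.Series, consequent: pd.Series) -> dict:
    """Textbook metrics for 'if antecedent then consequent' (assumed definitions)."""
    n = len(antecedent)
    n_ant = int(antecedent.sum())                   # rows where the 'if' part holds
    n_con = int(consequent.sum())                   # rows where the 'then' part holds
    n_both = int((antecedent & consequent).sum())   # rows where both hold
    confidence = n_both / n_ant if n_ant else float("nan")
    lift = confidence / (n_con / n) if n_con else float("nan")
    return {
        "abs support": n_both,
        "abs exceptions": n_ant - n_both,
        "confidence": confidence,
        "support": n_both / n,
        "lift": lift,
    }


# Rule: if ({"Type"} == "life_insurer") then ({"TP-life"} > 0)
metrics = rule_metrics(df["Type"] == "life_insurer", df["TP-life"] > 0)
print(metrics)
```

In this toy dataset the rule holds for two of the three life insurers, so it has two supporting rows and one exception.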

Examples of rule templates containing regexes that can be used to generate validation rules:

  - *if ({"Type"} == ".*") then ({".*"} > 0)*

  - *if ({".*"} > 0) then (({".*"} == 0) & ({".*"} > 0))*

  - *(({".*"} + {".*"} + {".*"}) == {".*"})*

  - *({"Own funds"} <= quantile({"Own funds"}, 0.95))*

  - *(substr({"Type"}, 0, 1) in ["a", "b"])*
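
As an illustration of what a rule such as the quantile template expresses, the following sketch evaluates the same condition directly in pandas. It assumes that `quantile` corresponds to pandas' `Series.quantile` with its default linear interpolation, which may not match the package exactly; the data is hypothetical.

```python
import pandas as pd

# Hypothetical data with one value far above the rest.
df = pd.DataFrame({"Own funds": [10.0, 20.0, 30.0, 40.0, 1000.0]})

# Condition: ({"Own funds"} <= quantile({"Own funds"}, 0.95))
threshold = df["Own funds"].quantile(0.95)
satisfied = df["Own funds"] <= threshold

# Rows that violate the condition are candidates for closer inspection.
outliers = df[~satisfied]
print(threshold)
print(outliers)
```

Only the largest value exceeds the 95% quantile here, so it is the single row flagged.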

With the dataset described in the Usage section, the first template generates rules such as:

  - *if ({"Type"} == "non-life_insurer") then ({"TP-nonlife"} > 0)*
  - *if ({"Type"} == "life_insurer") then ({"TP-life"} > 0)*
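
The idea behind this expansion can be sketched without the package: match the regexes against column names and column values, form every candidate rule, and keep those that pass a metric filter. The code below is a simplified illustration in plain pandas and `re`, not the package's implementation; the function `expand_template`, its parameters, and the toy data are hypothetical.

```python
import re

import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame(
    {
        "Type": ["life_insurer", "non-life_insurer", "life_insurer"],
        "TP-life": [100.0, 0.0, 250.0],
        "TP-nonlife": [0.0, 80.0, 0.0],
    }
)


def expand_template(df, key_column, value_regex=".*", column_regex=".*",
                    min_confidence=1.0):
    """Enumerate rules of the form: if ({key} == value) then ({column} > 0)."""
    value_pattern = re.compile(value_regex)
    column_pattern = re.compile(column_regex)
    rules = []
    for value in sorted(df[key_column].dropna().unique()):
        if not value_pattern.fullmatch(str(value)):
            continue
        antecedent = df[key_column] == value
        for column in df.columns:
            # Only numeric columns can meaningfully be compared with 0.
            if not column_pattern.fullmatch(column):
                continue
            if not pd.api.types.is_numeric_dtype(df[column]):
                continue
            consequent = df[column] > 0
            confidence = (antecedent & consequent).sum() / antecedent.sum()
            if confidence >= min_confidence:
                rules.append(
                    f'if ({{"{key_column}"}} == "{value}") then ({{"{column}"}} > 0)'
                )
    return rules


rules = expand_template(df, "Type")
for rule in rules:
    print(rule)
```

Candidates such as "life insurers have positive non-life technical provisions" are generated as well, but they are dropped because their confidence falls below the filter.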

These generated validation rules can then be used to validate new datasets.
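
A minimal sketch of such a validation step, again in plain pandas rather than through the package's API: the rule is applied to a new, hypothetical dataset, and the rows where the condition holds but the consequence does not are reported as exceptions. The `Name` column is an assumption added for readability.

```python
import pandas as pd

# Hypothetical new dataset to validate.
new_data = pd.DataFrame(
    {
        "Name": ["Insurer A", "Insurer B", "Insurer C"],
        "Type": ["life_insurer", "life_insurer", "non-life_insurer"],
        "TP-life": [120.0, 0.0, 0.0],
    }
)

# Rule: if ({"Type"} == "life_insurer") then ({"TP-life"} > 0)
antecedent = new_data["Type"] == "life_insurer"
consequent = new_data["TP-life"] > 0

# An exception is a row where the 'if' part holds and the 'then' part fails.
exceptions = new_data[antecedent & ~consequent]
print(exceptions)
```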

## Contributors

* Willem Jan Willemse <https://github.com/wjwillemse>

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "ruleminer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.9",
    "maintainer_email": null,
    "keywords": "association rules, pandas",
    "author": "Willem Jan Willemse",
    "author_email": "w.j.willemse@freedom.nl",
    "download_url": "https://files.pythonhosted.org/packages/7b/42/fcd66cea08f401620bbcd6d870f66270368ee365ba046a478aad9fc8d142/ruleminer-0.3.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT/X",
    "summary": "Python package to mine association rules in datasets",
    "version": "0.3.0",
    "project_urls": {
        "Documentation": "https://ruleminer.readthedocs.io/en/latest/",
        "Homepage": "https://github.com/DeNederlandscheBank/ruleminer",
        "Repository": "https://github.com/DeNederlandscheBank/ruleminer"
    },
    "split_keywords": [
        "association rules",
        " pandas"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "07fd7012a02ea100811caed1ddcc53b426e529e901049152a8afa130fa9c34f6",
                "md5": "d4c54fe699a1d34b0f71a9d16737ff6e",
                "sha256": "18aa1aea6c3e34a719c4baa8cfd4e9e49ea28fb523c11336192157b58c0af15f"
            },
            "downloads": -1,
            "filename": "ruleminer-0.3.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d4c54fe699a1d34b0f71a9d16737ff6e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.9",
            "size": 37113,
            "upload_time": "2025-03-21T11:38:51",
            "upload_time_iso_8601": "2025-03-21T11:38:51.949425Z",
            "url": "https://files.pythonhosted.org/packages/07/fd/7012a02ea100811caed1ddcc53b426e529e901049152a8afa130fa9c34f6/ruleminer-0.3.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7b42fcd66cea08f401620bbcd6d870f66270368ee365ba046a478aad9fc8d142",
                "md5": "02e782842e634b68bf1b129195da0231",
                "sha256": "ae02467bbce119380138deda146b9fc909be595fb00deac1265e298661659e34"
            },
            "downloads": -1,
            "filename": "ruleminer-0.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "02e782842e634b68bf1b129195da0231",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.9",
            "size": 34164,
            "upload_time": "2025-03-21T11:38:56",
            "upload_time_iso_8601": "2025-03-21T11:38:56.845328Z",
            "url": "https://files.pythonhosted.org/packages/7b/42/fcd66cea08f401620bbcd6d870f66270368ee365ba046a478aad9fc8d142/ruleminer-0.3.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-03-21 11:38:56",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "DeNederlandscheBank",
    "github_project": "ruleminer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "ruleminer"
}
