causalexplain

Name	causalexplain
Version	0.5.3
home_page	https://github.com/renero/causalgraph
Summary	A package to extract the causal graph from continuous tabular data.
upload_time	2024-12-18 09:41:07
maintainer	None
docs_url	None
author	J. Renero
requires_python	>=3.10
license	MIT License
keywords	causal inference causal graph data science
requirements	numpy scipy pytorch-lightning platformdirs prompt_toolkit causal_learn colorama daft Deprecated dowhy future gadjid hyppo joblib kneed matplotlib mlforge networkx optuna pandas pydot pydotplus pygam pygraphviz pytest pytest-mock python_igraph rich scikit-learn seaborn shap statsmodels tensorboardX torch tqdm sphinx sphinx_rtd_theme myst-parser numpydoc

![logo](https://raw.githubusercontent.com/renero/causalgraph/main/docs/_static/logo-light.png)

[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/downloads/release/python-31012/)
[![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS-lightgrey.svg)](#)
[![PyPI version](https://badge.fury.io/py/causalexplain.svg)](https://badge.fury.io/py/causalexplain)
[![Build Status](https://github.com/renero/causalgraph/actions/workflows/build.yaml/badge.svg)](https://github.com/renero/causalgraph/actions/workflows/build.yaml)
[![codecov](https://codecov.io/gh/renero/causalgraph/graph/badge.svg?token=HCV0IJDFLQ)](https://codecov.io/gh/renero/causalgraph)
[![Documentation](https://img.shields.io/badge/docs-GitHub%20Pages-blue.svg)](https://renero.github.io/causalgraph/)


# causalexplain - A library to infer causal-effect relationships from tabular data

**causalexplain** is a library that implements methods to extract the causal graph
from tabular data: primarily the **ReX** method, alongside the methods it is
compared against, namely GES, PC, FCI, LiNGAM, CAM, and NOTEARS.

**ReX** is a causal discovery method that leverages machine learning (ML) models 
coupled with explainability techniques, specifically Shapley values, to 
identify and interpret significant causal relationships among variables. 
Comparative evaluations on synthetic datasets comprising tabular data reveal that 
**ReX** outperforms state-of-the-art causal discovery methods across diverse data 
generation processes, including non-linear and additive noise models. Moreover, 
**ReX** was tested on the Sachs single-cell protein-signaling dataset, achieving a 
precision of 0.952 and recovering 
key causal relationships with no incorrect edges. Taken together, these 
results showcase **ReX**'s effectiveness in accurately recovering true causal 
structures while minimizing false positive predictions, its robustness 
across diverse datasets, and its applicability to real-world problems. 
By combining ML and explainability techniques with causal discovery, **ReX** 
bridges the gap between predictive modeling and causal inference, offering an 
effective tool for understanding complex causal structures.

![ReX Schema](https://raw.githubusercontent.com/renero/causalgraph/main/docs/_static/REX.png)

The library is built on scikit-learn estimators, so it can be used in 
scikit-learn pipelines and (hyper)parameter search. This design also 
facilitates testing (including some API compliance checks), documentation, 
open source development, packaging, and continuous integration.

The datasets used in the examples can be generated with the `generators` 
module, which is part of this library. To reproduce results from the 
reference articles, use the datasets in the `data` folder.

## Prerequisites without Docker

- Operating System: Linux or macOS
- Environment Manager: PyEnv or Conda
- Programming Language: Python 3.10.12 or higher
- Hardware: CPU

## Installation

The project can be installed using pip:

```bash
$ pip install causalexplain
```

## Data

The datasets used to reproduce the results presented in the manuscript are 
available under the `data` folder. The datasets were generated using the
`generators` module.

## Executing `causalexplain`

To run `causalexplain` on your data, invoke the package as a module:

```
$ python -m causalexplain
   ___                      _                 _       _       
  / __\__ _ _   _ ___  __ _| | _____  ___ __ | | __ _(_)_ __  
 / /  / _` | | | / __|/ _` | |/ _ \ \/ / '_ \| |/ _` | | '_ \ 
/ /__| (_| | |_| \__ \ (_| | |  __/>  <| |_) | | (_| | | | | |
\____/\__,_|\__,_|___/\__,_|_|\___/_/\_\ .__/|_|\__,_|_|_| |_|
                                       |_|                                        
usage: causalexplain [-h] -d DATASET [-m {rex,pc,fci,ges,lingam,cam,notears}] 
                   [-t TRUE_DAG] [-l LOAD_MODEL] [-T THRESHOLD] [-u UNION] 
                   [-i ITERATIONS] [-b BOOTSTRAP] [-r REGRESSOR] [-S SEED] 
                   [-s [SAVE_MODEL]] [-n] [-v] [-q] [-o OUTPUT]
```

Run without arguments, as shown above, the command prints its usage summary, 
which lists the options for choosing the dataset, the method used to infer the 
causal graph, and the hyperparameters.

The minimum required to run `causalexplain` is a dataset file in CSV format
(option `-d`), with the first row containing the variable names and the
remaining rows containing their values. The default method is ReX, but you can
select PC, FCI, GES, LiNGAM, CAM, or NOTEARS with the `-m` option. 
At the end of the execution, the edges of the plausible causal graph are 
displayed, along with the metrics obtained if the true DAG is provided 
(option `-t`).
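
Which metrics are reported, and how the tool defines them, is not spelled out here. As a rough illustration of what comparing a predicted graph against a true DAG involves, the sketch below computes edge-level precision, recall, and F1 from two sets of directed edges. The function name `edge_metrics` and the edge representation (tuples of `(parent, child)`) are assumptions made for this example, not part of the `causalexplain` API.

```python
def edge_metrics(predicted, truth):
    """Compare two collections of directed edges given as (parent, child) tuples.

    Illustrative only: a common definition of edge precision/recall,
    not necessarily the one causalexplain uses internally.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)  # edges with matching direction

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical three-variable example: two of the three predicted edges are correct.
true_edges = {("x", "y"), ("y", "z"), ("x", "z")}
predicted_edges = {("x", "y"), ("y", "z"), ("z", "w")}
print(edge_metrics(predicted_edges, true_edges))
```

Under this definition, a precision of 1.0 means every predicted edge appears in the true DAG with the same orientation; a reversed edge counts as an error.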

## Example commands

The following command illustrates how to run `causalexplain` on the toy dataset
using the ReX method:

```bash
$ python -m causalexplain -d /path/to/toy_dataset.csv -t /path/to/toy_dataset.dot
```
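
The paths in the command above are placeholders. If no dataset is at hand, a pair of input files in the expected shape can be produced with a short script. The sketch below writes a CSV with a header row of variable names followed by numeric rows, plus a DOT file describing the graph that generated the data. The variable names, the data-generating equations, and the exact DOT layout are assumptions for illustration; the library's own `generators` module is the intended route for producing datasets.

```python
import csv
import random


def write_toy_dataset(csv_path, dot_path, n_rows=200, seed=0):
    """Write a toy CSV (header + numeric rows) and a DOT file with its true DAG."""
    rng = random.Random(seed)
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x", "y", "z"])  # first row: variable names
        for _ in range(n_rows):
            x = rng.gauss(0.0, 1.0)
            y = 2.0 * x + rng.gauss(0.0, 0.1)  # y depends linearly on x
            z = y ** 2 + rng.gauss(0.0, 0.1)   # z depends non-linearly on y
            writer.writerow([x, y, z])

    # Assumed DOT layout for the true DAG: one "parent -> child" statement per edge.
    with open(dot_path, "w") as handle:
        handle.write("digraph {\n  x -> y;\n  y -> z;\n}\n")


write_toy_dataset("toy_dataset.csv", "toy_dataset.dot")
```

The resulting files can then be passed to the `-d` and `-t` options respectively.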

To run the CAM method on the same toy dataset, add the `-m cam` option:

```bash
$ python -m causalexplain -d /path/to/toy_dataset.csv -m cam -t /path/to/toy_dataset.dot
```
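
When several methods need to be compared on the same dataset, the commands above can be driven from a script. The following is a minimal sketch that assembles the argument list and hands it to `subprocess`; the helper names `build_command` and `run_causalexplain` are hypothetical, and only the `-d`, `-m`, and `-t` options shown in the usage summary are used.

```python
import subprocess
import sys


def build_command(dataset, method=None, true_dag=None):
    """Assemble the `python -m causalexplain` argument list."""
    cmd = [sys.executable, "-m", "causalexplain", "-d", str(dataset)]
    if method is not None:
        cmd += ["-m", method]        # omitted: the tool falls back to its default (ReX)
    if true_dag is not None:
        cmd += ["-t", str(true_dag)]  # optional true DAG, enables metrics
    return cmd


def run_causalexplain(dataset, method=None, true_dag=None):
    """Run the command; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(dataset, method, true_dag), check=True)


# Inspect the arguments without launching anything (interpreter path skipped).
print(build_command("toy_dataset.csv", method="cam", true_dag="toy_dataset.dot")[1:])
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems when paths contain spaces.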

For more information on command line options, run `python -m causalexplain -h` or go to 
the [Quickstart](https://renero.github.io/causalgraph/quickstart.html) section in the documentation.

## Additional Information

WIP

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/renero/causalgraph",
    "name": "causalexplain",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "causal inference, causal graph, data science",
    "author": "J. Renero",
    "author_email": "jesus.renero@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/18/22/f8ec6e7e58a077773dd92491644747c6caf9bc28fabd3e268e272d326d3b/causalexplain-0.5.3.tar.gz",
    "platform": null,
    "description": "![logo](https://raw.githubusercontent.com/renero/causalgraph/main/docs/_static/logo-light.png)\n\n[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/downloads/release/python-31012/)\n[![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS-lightgrey.svg)](#)\n[![PyPI version](https://badge.fury.io/py/causalexplain.svg)](https://badge.fury.io/py/causalexplain)\n[![Build Status](https://github.com/renero/causalgraph/actions/workflows/build.yaml/badge.svg)](https://github.com/renero/causalgraph/actions/workflows/build.yaml)\n[![codecov](https://codecov.io/gh/renero/causalgraph/graph/badge.svg?token=HCV0IJDFLQ)](https://codecov.io/gh/renero/causalgraph)\n[![Documentation](https://img.shields.io/badge/docs-GitHub%20Pages-blue.svg)](https://renero.github.io/causalgraph/)\n\n\n# causalexplain - A library to infer causal-effect relationships from tabular data\n\n'**causalexplain**' is a library that implements methods to extract the causal graph,\nfrom tabular data, specifically the **ReX** method, and other compared methods\nlike GES, PC, FCI, LiNGAM, CAM, and NOTEARS.\n\n**ReX** is a causal discovery method that leverages machine learning (ML) models \ncoupled with explainability techniques, specifically Shapley values, to \nidentify and interpret significant causal relationships among variables. \nComparative evaluations on synthetic datasets comprising tabular data reveal that \n**ReX** outperforms state-of-the-art causal discovery methods across diverse data \ngeneration processes, including non-linear and additive noise models. Moreover, \n**ReX** was tested on the Sachs single-cell protein-signaling dataset, achieving a \nprecision of 0.952 and recovering \nkey causal relationships with no incorrect edges. 
Taking together, these \nresults showcase **ReX**'s effectiveness in accurately recovering true causal \nstructures while minimizing false positive pre- dictions, its robustness \nacross diverse datasets, and its applicability to real-world problems. \nBy combining ML and explainability techniques with causal discovery, **ReX** \nbridges the gap between predictive modeling and causal inference, offering an \neffective tool for understanding complex causal structures.\n\n![ReX Schema](https://raw.githubusercontent.com/renero/causalgraph/main/docs/_static/REX.png)\n\nIt is built using SKLearn estimators, so that it can be used in scikit-learn \npipelines and (hyper)parameter search, while facilitating testing (including \nsome API compliance), documentation, open source development, packaging, \nand continuous integration.\n\nThe datasets used in the examples can be generated using the `generators` \nmodule, which is also part of this library. But in case you want to \nreproduce results from the articles that we used as reference, you can find \nthe datasets in the `data` folder.\n\n## Prerequisites without Docker\n\n- Operating System: Linux or macOS\n- Environment Manager: PyEnv or Conda\n- Programming Language: Python 3.10.12 or higher\n- Hardware: CPU\n\n## Installation\n\nThe project can be installed using pip:\n\n```bash\n$ pip install causalexplain\n```\n\n## Data\n\nThe datasets used to reproduce the results presented in the manuscript are \navailable under the `data` folder. 
The datasets were generated using the\n`generators` module.\n\n## Executing `causalexplain`\n\nTo run `causalexplain` on your data, you can use the `causalexplain` command:\n\n```\n$ python -m causalexplain\n   ___                      _                 _       _       \n  / __\\__ _ _   _ ___  __ _| | _____  ___ __ | | __ _(_)_ __  \n / /  / _` | | | / __|/ _` | |/ _ \\ \\/ / '_ \\| |/ _` | | '_ \\ \n/ /__| (_| | |_| \\__ \\ (_| | |  __/>  <| |_) | | (_| | | | | |\n\\____/\\__,_|\\__,_|___/\\__,_|_|\\___/_/\\_\\ .__/|_|\\__,_|_|_| |_|\n                                       |_|                                        \nusage: causalexplain [-h] -d DATASET [-m {rex,pc,fci,ges,lingam,cam,notears}] \n                   [-t TRUE_DAG] [-l LOAD_MODEL] [-T THRESHOLD] [-u UNION] \n                   [-i ITERATIONS] [-b BOOTSTRAP] [-r REGRESSOR] [-S SEED] \n                   [-s [SAVE_MODEL]] [-n] [-v] [-q] [-o OUTPUT]\n```\n\nthat will present you with a menu to choose the dataset you want to use, the \nmethod you want to use to infer the causal graph, and the hyperparameters you\nwant to use.\n\nThe minimum required to run `causalexplain` is a dataset file in CSV format,\nwith the first row containing the names of the variables, and the rest of\nthe rows containing the values of the variables. The method selected by default\nis ReX, but you can also choose between PC, FCI, GES, LiNGAM, CAM, NOTEARS. 
\nAt the end of the execution, the edges of the plausible causal graph will be \ndisplayed along with the metrics obtained, if the true dag is provided \n(argument `-t`).\n\n## Example commands\n\nThe following command illustrates how to run `causalexplain` on the toy dataset\nusing the ReX method:\n\n```bash\n$ python -m causalexplain -d /path/to/toy_dataset.csv -t /path/to/toy_dataset.dot\n```\n\nThe same command can be used to run `causalexplain` on the toy dataset using the\nCAM method:\n\n```bash\n$ python -m causalexplain -d /path/to/toy_dataset.csv -m cam -t /path/to/toy_dataset.dot\n```\n\nFor more information on command line options, run `causalexplain -h` or go to \nthe [Quickstart](https://renero.github.io/causalgraph/quickstart.html) section in the documentation.\n\n## Additional Information\n\nWIP\n",
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A package to extract the causal graph from continuous tabular data.",
    "version": "0.5.3",
    "project_urls": {
        "Homepage": "https://github.com/renero/causalgraph"
    },
    "split_keywords": [
        "causal inference",
        " causal graph",
        " data science"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "17ab3262cc5d1507bdc0ec247b6ab6bf468ff4eb456f78df24b6e82116d24287",
                "md5": "7405ab25c9828cb71dd26d08d5900194",
                "sha256": "951dc6d16af9457d1502e1ee2c4568871ffe498880ea2bcb5d46b3b98e4d1a2d"
            },
            "downloads": -1,
            "filename": "causalexplain-0.5.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7405ab25c9828cb71dd26d08d5900194",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 253503,
            "upload_time": "2024-12-18T09:41:03",
            "upload_time_iso_8601": "2024-12-18T09:41:03.449533Z",
            "url": "https://files.pythonhosted.org/packages/17/ab/3262cc5d1507bdc0ec247b6ab6bf468ff4eb456f78df24b6e82116d24287/causalexplain-0.5.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1822f8ec6e7e58a077773dd92491644747c6caf9bc28fabd3e268e272d326d3b",
                "md5": "f421cb9d0423327f77b1b32bf34fd1c9",
                "sha256": "8e1bd21f5fbb0e4af4a64c9e6968993a0e9d4bd2b83be220622818476ee39171"
            },
            "downloads": -1,
            "filename": "causalexplain-0.5.3.tar.gz",
            "has_sig": false,
            "md5_digest": "f421cb9d0423327f77b1b32bf34fd1c9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 211417,
            "upload_time": "2024-12-18T09:41:07",
            "upload_time_iso_8601": "2024-12-18T09:41:07.066735Z",
            "url": "https://files.pythonhosted.org/packages/18/22/f8ec6e7e58a077773dd92491644747c6caf9bc28fabd3e268e272d326d3b/causalexplain-0.5.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-18 09:41:07",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "renero",
    "github_project": "causalgraph",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.24.2"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.11.2"
                ]
            ]
        },
        {
            "name": "pytorch-lightning",
            "specs": [
                [
                    "==",
                    "2.0.9"
                ]
            ]
        },
        {
            "name": "platformdirs",
            "specs": [
                [
                    "<",
                    "4.0.0"
                ],
                [
                    ">=",
                    "3.9.1"
                ]
            ]
        },
        {
            "name": "prompt_toolkit",
            "specs": [
                [
                    ">=",
                    "3.0.41"
                ],
                [
                    "<",
                    "3.1.0"
                ]
            ]
        },
        {
            "name": "causal_learn",
            "specs": []
        },
        {
            "name": "colorama",
            "specs": []
        },
        {
            "name": "daft",
            "specs": []
        },
        {
            "name": "Deprecated",
            "specs": []
        },
        {
            "name": "dowhy",
            "specs": []
        },
        {
            "name": "future",
            "specs": []
        },
        {
            "name": "gadjid",
            "specs": []
        },
        {
            "name": "hyppo",
            "specs": []
        },
        {
            "name": "joblib",
            "specs": []
        },
        {
            "name": "kneed",
            "specs": []
        },
        {
            "name": "matplotlib",
            "specs": []
        },
        {
            "name": "mlforge",
            "specs": []
        },
        {
            "name": "networkx",
            "specs": []
        },
        {
            "name": "optuna",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "pydot",
            "specs": []
        },
        {
            "name": "pydotplus",
            "specs": []
        },
        {
            "name": "pygam",
            "specs": [
                [
                    "==",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "pygraphviz",
            "specs": []
        },
        {
            "name": "pytest",
            "specs": []
        },
        {
            "name": "pytest-mock",
            "specs": []
        },
        {
            "name": "python_igraph",
            "specs": []
        },
        {
            "name": "rich",
            "specs": []
        },
        {
            "name": "scikit-learn",
            "specs": []
        },
        {
            "name": "seaborn",
            "specs": []
        },
        {
            "name": "shap",
            "specs": []
        },
        {
            "name": "statsmodels",
            "specs": []
        },
        {
            "name": "tensorboardX",
            "specs": []
        },
        {
            "name": "torch",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "sphinx",
            "specs": []
        },
        {
            "name": "sphinx_rtd_theme",
            "specs": []
        },
        {
            "name": "myst-parser",
            "specs": []
        },
        {
            "name": "numpydoc",
            "specs": []
        }
    ],
    "lcname": "causalexplain"
}
