![logo](https://raw.githubusercontent.com/renero/causalgraph/main/docs/_static/logo-light.png)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/downloads/release/python-31012/)
[![Platform](https://img.shields.io/badge/Platform-Linux%20%7C%20macOS-lightgrey.svg)](#)
[![PyPI version](https://badge.fury.io/py/causalexplain.svg)](https://badge.fury.io/py/causalexplain)
[![Build Status](https://github.com/renero/causalgraph/actions/workflows/build.yaml/badge.svg)](https://github.com/renero/causalgraph/actions/workflows/build.yaml)
[![codecov](https://codecov.io/gh/renero/causalgraph/graph/badge.svg?token=HCV0IJDFLQ)](https://codecov.io/gh/renero/causalgraph)
[![Documentation](https://img.shields.io/badge/docs-GitHub%20Pages-blue.svg)](https://renero.github.io/causalgraph/)
# causalexplain - A library to infer causal-effect relationships from tabular data
**causalexplain** is a library that implements methods to extract the causal graph
from tabular data, specifically the **ReX** method, along with the methods it is
compared against: GES, PC, FCI, LiNGAM, CAM, and NOTEARS.
**ReX** is a causal discovery method that leverages machine learning (ML) models
coupled with explainability techniques, specifically Shapley values, to
identify and interpret significant causal relationships among variables.
Comparative evaluations on synthetic datasets comprising tabular data reveal that
**ReX** outperforms state-of-the-art causal discovery methods across diverse data
generation processes, including non-linear and additive noise models. Moreover,
**ReX** was tested on the Sachs single-cell protein-signaling dataset, achieving a
precision of 0.952 and recovering
key causal relationships with no incorrect edges. Taken together, these
results showcase **ReX**'s effectiveness in accurately recovering true causal
structures while minimizing false positive predictions, its robustness
across diverse datasets, and its applicability to real-world problems.
By combining ML and explainability techniques with causal discovery, **ReX**
bridges the gap between predictive modeling and causal inference, offering an
effective tool for understanding complex causal structures.
![ReX Schema](https://raw.githubusercontent.com/renero/causalgraph/main/docs/_static/REX.png)
The library is built on scikit-learn estimators, so that it can be used in scikit-learn
pipelines and (hyper)parameter search, while facilitating testing (including
some API compliance), documentation, open source development, packaging,
and continuous integration.
The datasets used in the examples can be generated using the `generators`
module, which is also part of this library. If you want to
reproduce results from the articles that we used as reference, you can find
the datasets in the `data` folder.
## Prerequisites without Docker
- Operating System: Linux or macOS
- Environment Manager: PyEnv or Conda
- Programming Language: Python 3.10.12 or higher
- Hardware: CPU
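The prerequisites above can be checked programmatically before installing. The following is a minimal sketch using only the standard library; the function name `meets_prerequisites` and the constants are illustrative and not part of `causalexplain`.

```python
import platform
import sys

# platform.system() reports macOS as "Darwin".
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}
# Minimum version taken from the prerequisites list above.
MIN_PYTHON = (3, 10, 12)


def meets_prerequisites(system, python_version):
    """Return True if the OS and Python version match the listed prerequisites."""
    return system in SUPPORTED_SYSTEMS and tuple(python_version[:3]) >= MIN_PYTHON


ok = meets_prerequisites(platform.system(), sys.version_info[:3])
print("prerequisites met" if ok else "unsupported platform or Python version")
```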
## Installation
The project can be installed using pip:
```bash
$ pip install causalexplain
```
## Data
The datasets used to reproduce the results presented in the manuscript are
available under the `data` folder. The datasets were generated using the
`generators` module.
## Executing `causalexplain`
To run `causalexplain` on your data, you can use the `causalexplain` command:
```
$ python -m causalexplain
___ _ _ _
/ __\__ _ _ _ ___ __ _| | _____ ___ __ | | __ _(_)_ __
/ / / _` | | | / __|/ _` | |/ _ \ \/ / '_ \| |/ _` | | '_ \
/ /__| (_| | |_| \__ \ (_| | | __/> <| |_) | | (_| | | | | |
\____/\__,_|\__,_|___/\__,_|_|\___/_/\_\ .__/|_|\__,_|_|_| |_|
|_|
usage: causalexplain [-h] -d DATASET [-m {rex,pc,fci,ges,lingam,cam,notears}]
[-t TRUE_DAG] [-l LOAD_MODEL] [-T THRESHOLD] [-u UNION]
[-i ITERATIONS] [-b BOOTSTRAP] [-r REGRESSOR] [-S SEED]
[-s [SAVE_MODEL]] [-n] [-v] [-q] [-o OUTPUT]
```
Run without arguments, the command prints the usage message above, which lists
the options for choosing the dataset, the method used to infer the causal graph,
and the hyperparameters.
The minimum required to run `causalexplain` is a dataset file in CSV format,
with the first row containing the names of the variables, and the rest of
the rows containing the values of the variables. The method selected by default
is ReX, but you can also choose PC, FCI, GES, LiNGAM, CAM, or NOTEARS with the `-m` option.
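As an illustration of the expected input layout, the sketch below writes a small CSV file whose first row holds the variable names and whose remaining rows hold numeric values. The variable names `x`, `y`, `z`, the toy dependencies between them, and the helper `write_toy_dataset` are assumptions made for this example; this is not the library's `generators` module.

```python
import csv
import random
import tempfile
from pathlib import Path


def write_toy_dataset(path, n_rows=100, seed=0):
    """Write a CSV with a header row of variable names followed by numeric rows."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x", "y", "z"])  # first row: variable names
        for _ in range(n_rows):
            x = rng.gauss(0.0, 1.0)
            y = 2.0 * x + rng.gauss(0.0, 0.1)  # y depends linearly on x
            z = y ** 2 + rng.gauss(0.0, 0.1)   # z depends non-linearly on y
            writer.writerow([x, y, z])
    return path


dataset_path = write_toy_dataset(Path(tempfile.mkdtemp()) / "toy_dataset.csv", n_rows=5)
print(dataset_path.read_text().splitlines()[0])  # prints "x,y,z"
```

The resulting file can then be passed to the `-d` option.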
At the end of the execution, the edges of the plausible causal graph will be
displayed, along with the metrics obtained if the true DAG is provided
(argument `-t`).
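To give an idea of what comparing an inferred graph against a true DAG involves, here is a minimal sketch that computes precision, recall, and F1 over directed edges. The exact set of metrics that `causalexplain` reports is not listed here, so treat these as standard examples; the function `edge_metrics` and the edge sets are hypothetical.

```python
def edge_metrics(predicted, true):
    """Compare two collections of directed edges given as (cause, effect) tuples."""
    predicted, true = set(predicted), set(true)
    true_positives = len(predicted & true)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(true) if true else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


true_edges = {("x", "y"), ("y", "z")}
predicted_edges = {("x", "y"), ("x", "z")}  # one correct edge, one incorrect
print(edge_metrics(predicted_edges, true_edges))
```

A precision of 1.0 means that every predicted edge is present in the true DAG, regardless of how many true edges were missed.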
## Example commands
The following command illustrates how to run `causalexplain` on the toy dataset
using the ReX method:
```bash
$ python -m causalexplain -d /path/to/toy_dataset.csv -t /path/to/toy_dataset.dot
```
The same command can be used to run `causalexplain` on the toy dataset using the
CAM method:
```bash
$ python -m causalexplain -d /path/to/toy_dataset.csv -m cam -t /path/to/toy_dataset.dot
```
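When several methods need to be run on the same dataset, the command line can be assembled from Python. The sketch below only builds the argument list, using the `-d`, `-m`, and `-t` options shown in the usage message; the helper `build_command` is hypothetical and not part of the package.

```python
import shlex
import sys

# Method names accepted by the -m option, as listed in the usage message.
METHODS = ("rex", "pc", "fci", "ges", "lingam", "cam", "notears")


def build_command(dataset, method="rex", true_dag=None):
    """Return the argument list for one `python -m causalexplain` run."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    command = [sys.executable, "-m", "causalexplain", "-d", str(dataset), "-m", method]
    if true_dag is not None:
        command += ["-t", str(true_dag)]
    return command


command = build_command(
    "/path/to/toy_dataset.csv", method="cam", true_dag="/path/to/toy_dataset.dot"
)
print(shlex.join(command))
# To execute it, pass the list to subprocess.run(command, check=True).
```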
For more information on command line options, run `causalexplain -h` or go to
the [Quickstart](https://renero.github.io/causalgraph/quickstart.html) section in the documentation.
## Additional Information
WIP
Raw data
{
"_id": null,
"home_page": "https://github.com/renero/causalgraph",
"name": "causalexplain",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "causal inference, causal graph, data science",
"author": "J. Renero",
"author_email": "jesus.renero@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/18/22/f8ec6e7e58a077773dd92491644747c6caf9bc28fabd3e268e272d326d3b/causalexplain-0.5.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "A package to extract the causal graph from continuous tabular data.",
"version": "0.5.3",
"project_urls": {
"Homepage": "https://github.com/renero/causalgraph"
},
"split_keywords": [
"causal inference",
" causal graph",
" data science"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "17ab3262cc5d1507bdc0ec247b6ab6bf468ff4eb456f78df24b6e82116d24287",
"md5": "7405ab25c9828cb71dd26d08d5900194",
"sha256": "951dc6d16af9457d1502e1ee2c4568871ffe498880ea2bcb5d46b3b98e4d1a2d"
},
"downloads": -1,
"filename": "causalexplain-0.5.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7405ab25c9828cb71dd26d08d5900194",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 253503,
"upload_time": "2024-12-18T09:41:03",
"upload_time_iso_8601": "2024-12-18T09:41:03.449533Z",
"url": "https://files.pythonhosted.org/packages/17/ab/3262cc5d1507bdc0ec247b6ab6bf468ff4eb456f78df24b6e82116d24287/causalexplain-0.5.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1822f8ec6e7e58a077773dd92491644747c6caf9bc28fabd3e268e272d326d3b",
"md5": "f421cb9d0423327f77b1b32bf34fd1c9",
"sha256": "8e1bd21f5fbb0e4af4a64c9e6968993a0e9d4bd2b83be220622818476ee39171"
},
"downloads": -1,
"filename": "causalexplain-0.5.3.tar.gz",
"has_sig": false,
"md5_digest": "f421cb9d0423327f77b1b32bf34fd1c9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 211417,
"upload_time": "2024-12-18T09:41:07",
"upload_time_iso_8601": "2024-12-18T09:41:07.066735Z",
"url": "https://files.pythonhosted.org/packages/18/22/f8ec6e7e58a077773dd92491644747c6caf9bc28fabd3e268e272d326d3b/causalexplain-0.5.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-18 09:41:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "renero",
"github_project": "causalgraph",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": [
[
"==",
"1.24.2"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.11.2"
]
]
},
{
"name": "pytorch-lightning",
"specs": [
[
"==",
"2.0.9"
]
]
},
{
"name": "platformdirs",
"specs": [
[
"<",
"4.0.0"
],
[
">=",
"3.9.1"
]
]
},
{
"name": "prompt_toolkit",
"specs": [
[
"<",
"3.1.0"
],
[
">=",
"3.0.41"
]
]
},
{
"name": "causal_learn",
"specs": []
},
{
"name": "colorama",
"specs": []
},
{
"name": "daft",
"specs": []
},
{
"name": "Deprecated",
"specs": []
},
{
"name": "dowhy",
"specs": []
},
{
"name": "future",
"specs": []
},
{
"name": "gadjid",
"specs": []
},
{
"name": "hyppo",
"specs": []
},
{
"name": "joblib",
"specs": []
},
{
"name": "kneed",
"specs": []
},
{
"name": "matplotlib",
"specs": []
},
{
"name": "mlforge",
"specs": []
},
{
"name": "networkx",
"specs": []
},
{
"name": "optuna",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "pydot",
"specs": []
},
{
"name": "pydotplus",
"specs": []
},
{
"name": "pygam",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "pytest",
"specs": []
},
{
"name": "python_igraph",
"specs": []
},
{
"name": "rich",
"specs": []
},
{
"name": "scikit-learn",
"specs": []
},
{
"name": "seaborn",
"specs": []
},
{
"name": "shap",
"specs": []
},
{
"name": "statsmodels",
"specs": []
},
{
"name": "tensorboardX",
"specs": []
},
{
"name": "torch",
"specs": []
},
{
"name": "tqdm",
"specs": []
},
{
"name": "sphinx",
"specs": []
},
{
"name": "sphinx_rtd_theme",
"specs": []
},
{
"name": "myst-parser",
"specs": []
},
{
"name": "numpydoc",
"specs": []
}
],
"lcname": "causalexplain"
}