# Fairness Checker
This Python module, `fairness_checker`, provides a set of methods for evaluating the fairness of a predictive model's outcomes across different demographic groups, working either directly on a CSV file or on a model object.
## Dependencies
* Python >= 3.8
## Installation
```bash
pip3 install fairness-checker
```
## Usage
### As a library
#### CSV checker
First set up the checker using a benchmark dataset:
```python
from fairness_checker import fairness_csv_checker
c = fairness_csv_checker("compas-scores-two-years.csv")
```
Then you can call fairness measure functions. For example:
```python
c.demographic_parity(0.2, lambda row: row['sex'] == 'Male', lambda row: row['score_text'] in {'Medium', 'High'})
```
Output:
```txt
demographic parity
fair: 0.04 < 0.2
```
Note the function signature of `demographic_parity`:
```python
def demographic_parity(ratio: float,
                       privileged_predicate: Callable[[csv_row], bool],
                       positive_predicate: Callable[[T], bool]) -> bool:
```
Here the `privileged_predicate` is
```python
lambda row: row['sex'] == 'Male'
```
meaning the privileged group is the male group, and the `positive_predicate` is
```python
lambda row: row['score_text'] in {'Medium', 'High'}
```
meaning the row is positive if the score is categorized as medium or high.
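To make the measure concrete, here is a rough sketch of how demographic parity can be checked from these two predicates. It is illustrative only (not the library's implementation) and assumes each row behaves like a dict keyed by column name, as produced by `csv.DictReader`:
```python
import csv

def demographic_parity_sketch(rows, ratio, privileged_predicate, positive_predicate):
    # Split rows into privileged and unprivileged groups.
    privileged = [r for r in rows if privileged_predicate(r)]
    unprivileged = [r for r in rows if not privileged_predicate(r)]
    # Positive-outcome rate within each group.
    p_priv = sum(positive_predicate(r) for r in privileged) / len(privileged)
    p_unpriv = sum(positive_predicate(r) for r in unprivileged) / len(unprivileged)
    # Considered fair if the rates differ by less than the given threshold.
    return abs(p_priv - p_unpriv) < ratio

with open("compas-scores-two-years.csv") as f:
    rows = list(csv.DictReader(f))

demographic_parity_sketch(rows, 0.2,
                          lambda row: row['sex'] == 'Male',
                          lambda row: row['score_text'] in {'Medium', 'High'})
```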
For a more complicated example involving parameters:
```python
c.conditional_statistical_parity(0.2, lambda row: row['sex'] == 'Male', lambda row: row['score_text'] in {'Medium', 'High'}, lambda x: (lambda row: int(row['priors_count']) > x), (0,))
```
Output:
```txt
conditional statistical parity
fair: 0.04 < 0.2
```
Note the function signature of `conditional_statistical_parity`:
```python
def conditional_statistical_parity(ratio: float,
                                   privileged_predicate: Callable[[csv_row], bool],
                                   positive_predicate: Callable[[csv_row], bool],
                                   legitimate_predicate_h: Callable[..., Callable[[csv_row], bool]],
                                   legitimate_arg: Tuple[Any, ...]) -> bool:
Here the higher order function `legitimate_predicate_h` is
```python
lambda x: (lambda row: int(row['priors_count']) > x)
```
and the argument to it, `legitimate_arg`, is `(0,)`.
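The checker ultimately needs a plain row predicate for the legitimate attribute; splitting it into a higher-order function plus an argument tuple just makes the parameter explicit. For illustration (not necessarily the library's exact mechanism), unpacking the tuple into the function produces the predicate "has at least one prior":
```python
legitimate_predicate_h = lambda x: (lambda row: int(row['priors_count']) > x)
legitimate_arg = (0,)

# Unpacking the argument tuple yields an ordinary row predicate.
legitimate_predicate = legitimate_predicate_h(*legitimate_arg)

legitimate_predicate({'priors_count': '3'})   # True
legitimate_predicate({'priors_count': '0'})   # False
```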
#### Model checker
Alternatively, you can run the checker on a model. The model is expected to have a `predict` method that takes a CSV filename as input and returns an iterable of results.
```python
from fairness_checker import fairness_model_checker
c = fairness_model_checker("compas-scores-two-years.csv")
```
The measure functions then take the model as an additional argument:
```python
c.demographic_parity(0.2, model, lambda row: row['sex'] == 'Male', lambda Y: Y == 1)
```
The last predicate is applied to the model's prediction rather than to a CSV row.
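Any object exposing such a `predict` method will do. Below is a minimal sketch of a toy model satisfying that interface; the cutoff and the use of the `decile_score` column are illustrative assumptions, not part of the library:
```python
import csv

class ThresholdModel:
    """Toy model: predicts 1 when the decile score exceeds a cutoff."""

    def __init__(self, cutoff=5):
        self.cutoff = cutoff

    def predict(self, csv_filename):
        # Takes a CSV filename and returns an iterable of results,
        # one per row, as the checker expects.
        with open(csv_filename) as f:
            return [int(int(row['decile_score']) > self.cutoff)
                    for row in csv.DictReader(f)]

model = ThresholdModel()
c.demographic_parity(0.2, model, lambda row: row['sex'] == 'Male', lambda Y: Y == 1)
```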
### As a command line tool
Prepare your dataset file and create a predicate definition file containing the arguments to the measure function. For example, to calculate negative balance, create a file `test_predicates1.py` containing the following:
```python
def privileged_predicate(row):
    return row['sex'] == 'Male'

def score_predicate(row):
    return int(row['decile_score'])

def truth_predicate(row):
    return row['is_recid'] == '1'
```
Make sure the definitions appear in the same order as the parameters in the measure function's signature.
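As a rough guide to how these three predicates feed into the measure: balance for the negative class, as commonly defined, compares the average score each group receives among individuals whose true outcome is negative. The sketch below is illustrative only; the library's exact comparison and normalization may differ:
```python
import csv
from test_predicates1 import privileged_predicate, score_predicate, truth_predicate

with open("compas-scores-two-years.csv") as f:
    rows = list(csv.DictReader(f))

# Restrict to the negative class (not recidivated), then average scores per group.
negatives = [r for r in rows if not truth_predicate(r)]
priv = [score_predicate(r) for r in negatives if privileged_predicate(r)]
unpriv = [score_predicate(r) for r in negatives if not privileged_predicate(r)]
avg_priv = sum(priv) / len(priv)
avg_unpriv = sum(unpriv) / len(unpriv)
print(avg_priv, avg_unpriv)  # the measure asks how far apart these averages are
```
Then run the checker from the command line: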
```bash
python3 -m fairness_checker
```
You'll be asked a few questions about which fairness measure you want to calculate and what ratio you want to set, like so:
```txt
Input dataset file name: compas-scores-two-years.csv
Input ratio: 0.2
Input the fairness measure: negative balance
Input the predicate definitions file name: test_predicates1.py
```
Output:
```txt
negative balance
fair: 0.07 < 0.2
```
For another example, let's calculate equal calibration with a predicate file `test_predicates2.py` containing the following:
```python
def privileged_predicate(row):
    return row['sex'] == 'Male'

def truth_predicate(row):
    return row['is_recid'] == '1'

def calib_predicate_h(u, l):
    def tmp(row):
        return l <= int(row['decile_score']) <= u
    return tmp

calib_arg = (7, 5)
```
Again, the order of the definitions matters: it must match that of the function signature.
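As with `legitimate_arg` earlier, `calib_arg` supplies the arguments to the higher-order predicate. Unpacking it (for illustration only) yields a row predicate selecting decile scores between 5 and 7:
```python
from test_predicates2 import calib_predicate_h, calib_arg

# u=7, l=5: selects rows whose decile score lies in [5, 7].
calib_predicate = calib_predicate_h(*calib_arg)

calib_predicate({'decile_score': '6'})   # True
calib_predicate({'decile_score': '2'})   # False
```
Run the checker again from the command line: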
```bash
python3 -m fairness_checker
```
You'll be prompted with the same questions:
```txt
Input dataset file name: compas-scores-two-years.csv
Input ratio: 0.2
Input the fairness measure: equal calibration
Input the predicate definitions file name: test_predicates2.py
```
Output:
```txt
equal calibration
fair: 0.10 < 0.2
```