wittgenstein 0.3.4

- Summary: Ruleset covering algorithms for explainable machine learning
- Home page: https://github.com/imoscovitz/wittgenstein
- Author: Ilan Moscovitz
- Uploaded: 2023-04-03 22:00:12
- Keywords: classification, decision rule, machine learning, explainable machine learning, data science, machine learning interpretability, transparent machine learning, ml, ruleset

# wittgenstein

_And is there not also the case where we play and--make up the rules as we go along?  
  -Ludwig Wittgenstein_

![the duck-rabbit](https://github.com/imoscovitz/wittgenstein/blob/master/duck-rabbit.jpg)

## Summary

This package implements two interpretable coverage-based ruleset algorithms: IREP and RIPPERk, as well as additional features for model interpretation.

Performance is similar to sklearn's DecisionTree CART implementation (see [Performance Tests](https://github.com/imoscovitz/wittgenstein/blob/master/examples/performance_tests.ipynb)).

For an explanation of the algorithms, see my article in _Towards Data Science_, or the papers listed below under [Useful References](https://github.com/imoscovitz/wittgenstein#useful-references).

## Installation

To install, use
```bash
$ pip install wittgenstein
```

To uninstall, use
```bash
$ pip uninstall wittgenstein
```

## Requirements
- pandas
- numpy
- Python >= 3.6

## Usage
Usage syntax is similar to sklearn's.

### Training

Once you have loaded and split your data...
```python
>>> import pandas as pd
>>> df = pd.read_csv(dataset_filename)
>>> from sklearn.model_selection import train_test_split # Or any other mechanism you want to use for data partitioning
>>> train, test = train_test_split(df, test_size=.33)
```
Use the `fit` method to train a `RIPPER` or `IREP` classifier:

```python
>>> import wittgenstein as lw
>>> ripper_clf = lw.RIPPER() # Or irep_clf = lw.IREP() to build a model using IREP
>>> ripper_clf.fit(train, class_feat='Poisonous/Edible', pos_class='p') # Or pass X and y data to .fit
>>> ripper_clf
<RIPPER(max_rules=None, random_state=2, max_rule_conds=None, verbosity=0, max_total_conds=None, k=2, prune_size=0.33, dl_allowance=64, n_discretize_bins=10) with fit ruleset> # Hyperparameter details available in the docstrings and TDS article below
```

Access the underlying trained model with the `ruleset_` attribute, or output it with `out_model()`. A ruleset is a disjunction of conjunctions -- 'V' represents 'or'; '^' represents 'and'.

In other words, the model predicts the positive class if, for any one of the rules, all of that rule's conditions hold:
```python
>>> ripper_clf.out_model() # or ripper_clf.ruleset_
[[Odor=f] V
[Gill-size=n ^ Gill-color=b] V
[Gill-size=n ^ Odor=p] V
[Odor=c] V
[Spore-print-color=r] V
[Stalk-surface-below-ring=y ^ Stalk-surface-above-ring=k] V
[Habitat=l ^ Cap-color=w] V
[Stalk-color-above-ring=y]]
```
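
To make those semantics concrete, here is a tiny standalone sketch (illustrative only, not wittgenstein's internal code) of how a disjunction of conjunctions classifies a single example represented as a dict of feature values:
```python
# Illustrative only: each rule is a dict of feature=value conditions,
# and the ruleset is a list of such rules.
ruleset = [
    {"Odor": "f"},
    {"Gill-size": "n", "Gill-color": "b"},
    {"Odor": "c"},
]

def rule_covers(rule, example):
    # A rule fires only if *all* of its conditions match (conjunction, '^').
    return all(example.get(feature) == value for feature, value in rule.items())

def ruleset_predicts_positive(ruleset, example):
    # The ruleset predicts positive if *any* rule fires (disjunction, 'V').
    return any(rule_covers(rule, example) for rule in ruleset)

print(ruleset_predicts_positive(ruleset, {"Odor": "c", "Cap-color": "w"}))   # True
print(ruleset_predicts_positive(ruleset, {"Odor": "n", "Gill-size": "n"}))   # False
```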

`IREP` models tend to be higher bias, `RIPPER`'s higher variance.
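
A quick way to see this tendency on your own data is to train both learners on the same split and compare ruleset size and test accuracy. This is just a sketch: it reuses the `train`/`test` split from the Training section, and it assumes `ruleset_.rules` exposes the list of fitted rules.
```python
>>> import wittgenstein as lw
>>> for Learner in (lw.IREP, lw.RIPPER):
>>>     clf = Learner(random_state=42)
>>>     clf.fit(train, class_feat='Poisonous/Edible', pos_class='p')
>>>     acc = clf.score(test.drop('Poisonous/Edible', axis=1), test['Poisonous/Edible'])
>>>     # Assumption: ruleset_.rules holds the fitted Rule objects
>>>     print(Learner.__name__, 'rules:', len(clf.ruleset_.rules), 'accuracy:', round(acc, 3))
```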

### Scoring
To score a trained model, use the `score` method:
```python
>>> X_test = test.drop('Poisonous/Edible', axis=1)
>>> y_test = test['Poisonous/Edible']
>>> ripper_clf.score(X_test, y_test)
0.9985686906328078
```

The default scoring metric is accuracy. You can pass in alternate scoring functions, including those available through sklearn:
```python
>>> from sklearn.metrics import precision_score, recall_score
>>> precision = ripper_clf.score(X_test, y_test, precision_score)
>>> recall = ripper_clf.score(X_test, y_test, recall_score)
>>> print(f'precision: {precision} recall: {recall}')
precision: 0.9914... recall: 0.9953...
```
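
Any callable with the sklearn metric signature `(y_true, y_pred)` should also work. The example below is a hypothetical custom metric, written under the assumption that (as with the sklearn metrics above) the labels passed to it have already been converted to booleans:
```python
>>> def error_rate(y_true, y_pred):  # hypothetical custom metric
>>>     # Fraction of examples where prediction and label disagree
>>>     return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
>>> ripper_clf.score(X_test, y_test, error_rate)
```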

### Prediction
To perform predictions, use `predict`:
```python
>>> ripper_clf.predict(new_data)[:5]
[True, True, False, True, False]
```

Predict class probabilities with `predict_proba`:
```python
>>> ripper_clf.predict_proba(test)
# Pairs of negative and positive class probabilities
array([[0.01212121, 0.98787879],
       [0.01212121, 0.98787879],
       [0.77777778, 0.22222222],
       [0.2       , 0.8       ],
       ...
```
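
Because the positive-class probability is the second column, you can also apply your own decision threshold to it; the 0.8 cutoff below is an arbitrary example:
```python
>>> probas = ripper_clf.predict_proba(test)
>>> confident_positives = probas[:, 1] > 0.8  # flag positive only above a 0.8 probability
```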

We can also ask our model to tell us why it made each positive prediction using `give_reasons`:
```python
>>> ripper_clf.predict(new_data[:5], give_reasons=True)
([True, True, False, True, True],
 [[<Rule [physician-fee-freeze=n]>],
  [<Rule [physician-fee-freeze=n]>,
   <Rule [synfuels-corporation-cutback=y^adoption-of-the-budget-resolution=y^anti-satellite-test-ban=n]>], # This example met multiple sufficient conditions for a positive prediction
  [],
  [<Rule [physician-fee-freeze=n]>],
  []])
```

### Model selection
wittgenstein is compatible with sklearn model_selection tools such as `cross_val_score` and `GridSearchCV`, as well
as ensemblers like `StackingClassifier`.

Cross validation:
```python
>>> # First dummify your categorical features and booleanize your class values to make sklearn happy
>>> X_train = pd.get_dummies(X_train, columns=X_train.select_dtypes('object').columns)
>>> y_train = y_train.map(lambda x: 1 if x=='p' else 0)
>>> cross_val_score(ripper_clf, X_train, y_train)
```

Grid search:
```python
>>> from sklearn.model_selection import GridSearchCV
>>> param_grid = {"prune_size": [0.33, 0.5], "k": [1, 2]}
>>> grid = GridSearchCV(estimator=ripper_clf, param_grid=param_grid)
>>> grid.fit(X_train, y_train)
```

Ensemble:
```python
>>> from sklearn.ensemble import StackingClassifier
>>> from sklearn.tree import DecisionTreeClassifier
>>> from sklearn.naive_bayes import GaussianNB
>>> from sklearn.linear_model import LogisticRegression
>>> tree = DecisionTreeClassifier(random_state=42)
>>> nb = GaussianNB()
>>> estimators = [("rip", ripper_clf), ("tree", tree), ("nb", nb)]
>>> ensemble_clf = StackingClassifier(estimators=estimators, final_estimator=LogisticRegression())
>>> ensemble_clf.fit(X_train, y_train)
```

### Defining and altering models
You can directly specify a new model, modify a preexisting model, or train from a preexisting model -- whether to take into account subject matter expertise, to create a baseline for scoring, or for insight into what the model is doing.

To specify a new model, use `init_ruleset`:
```python
>>> ripper_clf = lw.RIPPER(random_state=42)
>>> ripper_clf.init_ruleset("[[Cap-shape=x^Cap-color=n] V [Odor=c] V ...]", class_feat=..., pos_class=...)
>>> ripper_clf.predict(df)
...
```
To modify a preexisting model, use `add_rule`, `replace_rule`, `remove_rule`, or `insert_rule`. To alter a model by index, use `replace_rule_at`, `remove_rule_at`, or `insert_rule_at`:
```python
>>> ripper_clf.replace_rule_at(1, '[Habitat=l]')
>>> ripper_clf.insert_rule(insert_before_rule='[Habitat=l]', new_rule='[Gill-size=n ^ Gill-color=b]')
>>> ripper_clf.out_model()
[[delicious=y^spooky-looking=y] V
[Gill-size=n ^ Gill-color=b] V
[Habitat=l]]
```
To specify a starting point for training, use `initial_model` when calling `fit`:
```python
>>> ripper_clf.fit(
>>>   X_train,
>>>   y_train,
>>>   initial_model="[[delicious=y^spooky-looking=y] V [Odor=c]]")
```
Expected string syntax is `[<Rule1> V <Rule2> V ...]` for a Ruleset, `[<Cond1>^<Cond2>^...]` for a Rule, and `feature=value` for a Cond. '^' represents 'and'; 'V' represents 'or'. (See the [Training](https://github.com/imoscovitz/wittgenstein#training) section above.)
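
For instance, a minimal sketch of assembling a Ruleset string from individual Rule strings before handing it to `init_ruleset` or `initial_model`:
```python
>>> rules = ["[delicious=y^spooky-looking=y]", "[Odor=c]"]
>>> ruleset_str = "[" + " V ".join(rules) + "]"
>>> ruleset_str
'[[delicious=y^spooky-looking=y] V [Odor=c]]'
```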

### Interpreter models
Use the `interpret` module to interpret non-wittgenstein models. `interpret_model` generates a ruleset that approximates a black-box model. It does so by fitting a wittgenstein classifier to the predictions of the other model.
```python
# Train the model we want to interpret
>>> from tensorflow.keras import Sequential
>>> from tensorflow.keras.layers import Dense
>>> mlp = Sequential()
>>> mlp.add(Dense(60, input_dim=13, activation='relu'))
>>> mlp.add(Dense(30, activation='relu'))
>>> mlp.add(Dense(1, activation='sigmoid'))
>>> mlp.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
>>> mlp.fit(
>>>   X_train,
>>>   y_train,
>>>   batch_size=1,
>>>   epochs=10)

# Create and fit wittgenstein classifier to use as a model interpreter.
>>> from wittgenstein.interpret import interpret_model, score_fidelity
>>> interpreter = lw.RIPPER(random_state=42)
>>> interpret_model(model=mlp, X=X_train, interpreter=interpreter).out_pretty()
[[Proline=>1227.0] V
[Proline=880.0-1048.0] V
[Proline=1048.0-1227.0] V
[Proline=736.0-880.0] V
[Alcalinityofash=16.8-17.72]]
```
We can also use the now-fitted interpreter to approximate the reasons behind the underlying model's positive predictions. (See [Prediction](https://github.com/imoscovitz/wittgenstein#prediction)).
```python
>>> preds = (mlp.predict(X_test.tail()) > .5).flatten()
>>> _, interpretation = interpreter.predict(X_test.tail(), give_reasons=True)
>>> print(f'tf preds: {preds}\n')
>>> interpretation
tf preds: [ True False False  True False]
[[<Rule [Proline=880.0-1048.0]>],
 [],
 [],
 [<Rule [Proline=736.0-880.0]>, <Rule [Alcalinityofash=16.8-17.72]>],
 []]
```
Score how faithfully the interpreter fits the underlying model with `score_fidelity`.
```python
>>> from sklearn.metrics import precision_score, recall_score, f1_score
>>> score_fidelity(
>>>    X_test,
>>>    interpreter,
>>>    model=mlp,
>>>    score_function=[precision_score, recall_score, f1_score])
[1.0, 0.7916666666666666, 0.8837209302325582]
```
## Issues
If you encounter any issues, or if you have feedback or improvement requests for how wittgenstein could be more helpful for you, please post them to [issues](https://github.com/imoscovitz/wittgenstein/issues), and I'll respond.

## Contributing
Contributions are welcome! If you are interested in contributing, let me know at ilan.moscovitz@gmail.com or on [linkedin](https://www.linkedin.com/in/ilan-moscovitz/).

## Useful references
- [My article in _Towards Data Science_ explaining IREP, RIPPER, and wittgenstein](https://towardsdatascience.com/how-to-perform-explainable-machine-learning-classification-without-any-trees-873db4192c68)
- [Furnkrantz-Widmer IREP paper](https://pdfs.semanticscholar.org/f67e/bb7b392f51076899f58c53bf57d5e71e36e9.pdf)
- [Cohen's RIPPER paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.107.2612&rep=rep1&type=pdf)
- [Partial decision trees](https://researchcommons.waikato.ac.nz/bitstream/handle/10289/1047/uow-cs-wp-1998-02.pdf?sequence=1&isAllowed=y)
- [Bayesian Rulesets](https://pdfs.semanticscholar.org/bb51/b3046f6ff607deb218792347cb0e9b0b621a.pdf)
- [C4.5 paper including all the gory details on MDL](https://pdfs.semanticscholar.org/cb94/e3d981a5e1901793c6bfedd93ce9cc07885d.pdf)
- [_Philosophical Investigations_](https://static1.squarespace.com/static/54889e73e4b0a2c1f9891289/t/564b61a4e4b04eca59c4d232/1447780772744/Ludwig.Wittgenstein.-.Philosophical.Investigations.pdf)

## Changelog

#### v0.3.4: 4/3/2023
- Improvements to predict_proba calculation, including smoothing

#### v0.3.2: 8/8/2021
- Speedup for binning continuous features (~several orders of magnitude)
- Add support for expert feedback: Ability to explicitly specify and alter models.
- Add surrogate interpreter
- Add support for non-pandas datasets (e.g. numpy arrays)

#### v0.2.3: 5/21/2020
- Minor bugfixes and optimizations

#### v0.2.0: 5/4/2020
- Algorithmic optimizations to improve training speed (~10x - ~100x)
- Support for training on iterable datatypes besides DataFrames, such as numpy arrays and python lists
- Compatibility with sklearn ensembling metalearners and sklearn model_selection
- `.predict_proba` returns probas in neg, pos order
- Certain parameters (hyperparameters, random_state, etc.) should now be passed into IREP/RIPPER constructors rather than the .fit method.
- Sundry bugfixes

            
