pytabkit

Name	pytabkit
Version	1.6.0
home_page	None
Summary	ML models + benchmark for tabular data classification and regression
upload_time	2025-07-27 17:40:09
maintainer	None
docs_url	None
author	David Holzmüller, Léo Grinsztajn, Ingo Steinwart
requires_python	>=3.9
license	Apache-2.0
keywords	realmlp deep learning gradient boosting scikit-learn tabular data

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dholzmueller/pytabkit/blob/main/examples/tutorial_notebook.ipynb)
[![](https://readthedocs.org/projects/pytabkit/badge/?version=latest&style=flat-default)](https://pytabkit.readthedocs.io/en/latest/)
[![test](https://github.com/dholzmueller/pytabkit/actions/workflows/testing.yml/badge.svg)](https://github.com/dholzmueller/pytabkit/actions/workflows/testing.yml)
[![Downloads](https://img.shields.io/pypi/dm/pytabkit)](https://pypistats.org/packages/pytabkit)

# PyTabKit: Tabular ML models and benchmarking (NeurIPS 2024)

| [Paper](https://arxiv.org/abs/2407.04491) | [Documentation](https://pytabkit.readthedocs.io) | [RealMLP-TD-S standalone implementation](https://github.com/dholzmueller/realmlp-td-s_standalone) | [Grinsztajn et al. benchmark code](https://github.com/LeoGrin/tabular-benchmark/tree/better_by_default) | [Data archive](https://doi.org/10.18419/darus-4555) |
|-------------------------------------------|--------------------------------------------------|---------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|-----------------------------------------------------|

PyTabKit provides **scikit-learn interfaces for modern tabular classification and regression methods**
benchmarked in our [paper](https://arxiv.org/abs/2407.04491) (see the results below).
It also contains the code we used for **benchmarking** these methods.

![Meta-test benchmark results](./figures/meta-test_benchmark_results.png)

## When (not) to use pytabkit

- **To get the best possible results**: 
  - Generally we recommend AutoGluon for the best possible results, 
    though it does not include all the models from pytabkit.
    It will probably include RealMLP in the upcoming 1.4 version. 
  - To get the best possible results from `pytabkit`, 
    we recommend using 
    `Ensemble_HPO_Classifier(n_cv=8, use_full_caruana_ensembling=True, use_tabarena_spaces=True, n_hpo_steps=50)` 
    with a `val_metric_name` corresponding to your target metric 
    (e.g., `class_error`, `cross_entropy`, `brier`, `1-auc_ovr`), or the corresponding `Regressor`. 
    (This might take very long to fit.)
  - For only a single model, we recommend using 
    `RealMLP_HPO_Classifier(n_cv=8, hpo_space_name='tabarena', use_caruana_ensembling=True, n_hyperopt_steps=50)`,
    also with `val_metric_name` as above, or the corresponding `Regressor`.
- **Models**: [TabArena](https://github.com/AutoGluon/tabrepo) 
  also includes some newer models like RealMLP and TabM 
  with more general preprocessing (missing numericals, text, etc.),  
  as well as very good boosted tree implementations.
  `pytabkit` is currently still easier to use 
  and supports vectorized cross-validation for RealMLP, 
  which can significantly speed up the training.
- **Benchmarking**: While pytabkit can be good for quick benchmarking for development, 
  for method evaluation we recommend [TabArena](https://github.com/AutoGluon/tabrepo).

## Installation (new in 1.4.0: optional model dependencies)

```bash
pip install "pytabkit[models]"  # quotes keep shells like zsh from expanding the brackets
```

- RealMLP (and TabM) can be used without the `[models]` part.
- If you want to use **TabR**, you have to manually install
  [faiss](https://github.com/facebookresearch/faiss/blob/main/INSTALL.md),
  which is only available on **conda**.
- Please install torch separately if you want to control the version (CPU/GPU etc.).
- Use `pytabkit[models,autogluon,extra,hpo,bench,dev]` to install additional dependencies for the other models,
  AutoGluon models, extra preprocessing,
  hyperparameter optimization methods beyond random search (hyperopt/SMAC),
  the benchmarking part, and testing/documentation. For the hpo part,
  you might need to install *swig* (e.g. via pip) if the build of *pyrfr* fails.
  See also the [documentation](https://pytabkit.readthedocs.io).
  To run the data download for the meta-train benchmark, you need one of rar, unrar, or 7-zip
  to be installed on the system.

## Using the ML models

Most of our machine learning models are directly available via scikit-learn interfaces.
For example, you can use RealMLP-TD for classification as follows:

```python
from pytabkit import RealMLP_TD_Classifier

model = RealMLP_TD_Classifier()  # or TabR_S_D_Classifier, CatBoost_TD_Classifier, etc.
model.fit(X_train, y_train)
model.predict(X_test)
```

The code above will automatically select a GPU if available,
try to detect categorical columns in dataframes,
preprocess numerical variables and regression targets (no standardization required),
and use a training-validation split for early stopping.
All of this (and much more) can be configured through the constructor
and the parameters of the `fit()` method.
For example, bagging
(ensembling the models from 5-fold cross-validation)
can be enabled simply by passing `n_cv=5` to the constructor.
Here is an example of some of the parameters that can be set explicitly:

```python
from pytabkit import RealMLP_TD_Classifier

model = RealMLP_TD_Classifier(device='cpu', random_state=0, n_cv=1, n_refit=0,
                              n_epochs=256, batch_size=256, hidden_sizes=[256] * 3,
                              val_metric_name='cross_entropy',
                              use_ls=False,  # for metrics like AUC / log-loss
                              lr=0.04, verbosity=2)
model.fit(X_train, y_train, X_val, y_val, cat_col_names=['Education'])
model.predict_proba(X_test)
```

See [this notebook](https://colab.research.google.com/github/dholzmueller/pytabkit/blob/main/examples/tutorial_notebook.ipynb)
for more examples. Missing numerical values are currently *not* allowed and need to be imputed beforehand.
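
Since missing numerical values have to be imputed before fitting, here is a minimal, dependency-free sketch of median imputation. The helper names `fit_medians` and `apply_medians` are hypothetical and not part of `pytabkit`. The point it illustrates is that the medians are computed on the training data only and then reused for the test data.

```python
import math
from statistics import median


def fit_medians(rows):
    """Compute one median per column from the observed (non-NaN) training values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # Assumption: fall back to 0.0 if a column has no observed values at all.
        medians.append(median(observed) if observed else 0.0)
    return medians


def apply_medians(rows, medians):
    """Replace every NaN entry by the stored median of its column."""
    return [[m if math.isnan(v) else v for v, m in zip(row, medians)] for row in rows]


nan = float("nan")
X_train_raw = [[1.0, 10.0], [nan, 30.0], [3.0, nan]]
X_test_raw = [[nan, 5.0]]

medians = fit_medians(X_train_raw)             # statistics from training data only
X_train = apply_medians(X_train_raw, medians)
X_test = apply_medians(X_test_raw, medians)    # reuse the training medians
```

In practice, `sklearn.impute.SimpleImputer(strategy="median")` or pandas' `DataFrame.fillna` would typically do the same job on arrays and dataframes.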

### Available ML models

Our ML models are available in up to three variants, all with best-epoch selection:

- library defaults (D)
- our tuned defaults (TD)
- random search hyperparameter optimization (HPO),
  sometimes also with the Tree-structured Parzen Estimator (HPO-TPE) or weighted ensembling (Ensemble)

We provide the following ML models:

- **RealMLP** (TD, HPO, Ensemble): Our new neural net models with tuned defaults (TD),
  random search hyperparameter optimization (HPO), or weighted ensembling (Ensemble)
- **XGB**, **LGBM**, **CatBoost** (D, TD, HPO, HPO-TPE): Interfaces for gradient-boosted
  tree libraries XGBoost, LightGBM, CatBoost
- **MLP**, **ResNet**, **FTT** (D, HPO): Models
  from [Revisiting Deep Learning Models for Tabular Data](https://proceedings.neurips.cc/paper_files/paper/2021/hash/9d86d83f925f2149e9edb0ac3b49229c-Abstract.html)
- **MLP-PLR** (D, HPO): MLP with numerical embeddings
  from [On Embeddings for Numerical Features in Tabular Deep Learning](https://proceedings.neurips.cc/paper_files/paper/2022/hash/9e9f0ffc3d836836ca96cbf8fe14b105-Abstract-Conference.html)
- **TabR** (D, HPO): TabR model
  from [TabR: Tabular Deep Learning Meets Nearest Neighbors](https://openreview.net/forum?id=rhgIgTSSxW)
- **TabM** (D): TabM model
  from [TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling](https://arxiv.org/abs/2410.24210)
- **RealTabR** (D): Our new TabR variant with default parameters
- **Ensemble-TD**: Weighted ensemble of all TD models (RealMLP, XGB, LGBM, CatBoost)
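
To give an idea of what weighted ensembling means here, the following is a minimal sketch of greedy ensemble selection in the style of Caruana et al., which the `use_caruana_ensembling` option is named after. It is not the `pytabkit` implementation: the function name, the model names, the toy data, and the squared-error loss are all illustrative assumptions.

```python
def mse(pred, target):
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def greedy_ensemble_weights(val_preds, y_val, n_steps=10, loss=mse):
    """Greedy selection with replacement on validation predictions.

    val_preds maps a (hypothetical) model name to its validation predictions.
    In each step, the model whose addition gives the lowest validation loss
    of the averaged prediction is added. Weights are selection frequencies.
    """
    names = list(val_preds)
    counts = {name: 0 for name in names}
    running_sum = [0.0] * len(y_val)
    for step in range(1, n_steps + 1):
        best_name, best_loss = None, float("inf")
        for name in names:
            candidate = [(s + p) / step for s, p in zip(running_sum, val_preds[name])]
            cand_loss = loss(candidate, y_val)
            if cand_loss < best_loss:
                best_name, best_loss = name, cand_loss
        counts[best_name] += 1
        running_sum = [s + p for s, p in zip(running_sum, val_preds[best_name])]
    return {name: c / n_steps for name, c in counts.items()}


# Toy data: model_a overshoots by 1, model_b undershoots by 1, model_c is poor.
y_val = [0.0, 1.0, 2.0, 3.0]
val_preds = {
    "model_a": [1.0, 2.0, 3.0, 4.0],
    "model_b": [-1.0, 0.0, 1.0, 2.0],
    "model_c": [5.0, 5.0, 5.0, 5.0],
}
weights = greedy_ensemble_weights(val_preds, y_val, n_steps=10)
```

In this toy example the errors of `model_a` and `model_b` cancel, so the selection alternates between them and `model_c` is never picked. The final prediction of such an ensemble is the weighted average of the individual models' predictions.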

## Post-hoc calibration and refinement stopping

To use post-hoc temperature scaling and refinement stopping from our 
paper [Rethinking Early Stopping: Refine, Then Calibrate](https://arxiv.org/abs/2501.19195),
pass the following parameters to the scikit-learn interfaces:
```python
from pytabkit import RealMLP_TD_Classifier
clf = RealMLP_TD_Classifier(
    val_metric_name='ref-ll-ts',  # short for 'refinement_logloss_ts-mix_all'
    calibration_method='ts-mix',  # temperature scaling with laplace smoothing
    use_ls=False  # recommended for cross-entropy loss
)
```
Other calibration methods and validation metrics
from [probmetrics](https://github.com/dholzmueller/probmetrics)
can be used as well.
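
For readers unfamiliar with temperature scaling, the sketch below shows the basic idea in plain Python: a single temperature `T` is fitted on validation logits by minimizing the negative log-likelihood of `softmax(logits / T)`. This is only an illustration and not the `ts-mix` implementation from `probmetrics`, which additionally applies Laplace smoothing. All names, the search interval, and the golden-section search are assumptions chosen for brevity.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def nll(logits_list, labels, temperature):
    """Average negative log-likelihood after dividing logits by the temperature."""
    total = 0.0
    for logits, y in zip(logits_list, labels):
        total -= log_softmax([z / temperature for z in logits])[y]
    return total / len(labels)


def fit_temperature(logits_list, labels, lo=0.05, hi=20.0, n_iter=60):
    """Golden-section search over log(T) for the NLL-minimizing temperature."""
    a, b = math.log(lo), math.log(hi)
    inv_phi = (math.sqrt(5) - 1) / 2
    for _ in range(n_iter):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if nll(logits_list, labels, math.exp(c)) < nll(logits_list, labels, math.exp(d)):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return math.exp((a + b) / 2)


# Overconfident toy model: about 98% confidence for class 0, but only 75% accuracy.
val_logits = [[4.0, 0.0]] * 4
val_labels = [0, 0, 0, 1]
T = fit_temperature(val_logits, val_labels)
```

A fitted temperature above 1 softens the predicted probabilities, which is what one would expect for an overconfident model like the one in this toy example.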

To reproduce the results from this paper, see the
[documentation](https://pytabkit.readthedocs.io/en/latest/bench/refine_then_calibrate.html).

## Benchmarking code

Our benchmarking code has functionality for

- dataset download
- running methods in parallel on single-node/multi-node/multi-GPU hardware,
  with automatic scheduling that tries to respect RAM constraints
- analyzing/plotting results

For more details, we refer to the [documentation](https://pytabkit.readthedocs.io).

## Preprocessing code

While many preprocessing methods are implemented in this repository,
a standalone version of our robust scaling + smooth clipping
can be found [here](https://github.com/dholzmueller/realmlp-td-s_standalone/blob/main/preprocessing.py#L65C7-L65C37).
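
As a rough, dependency-free illustration of robust scaling followed by smooth clipping for a single numerical column, consider the sketch below. The specific choices (centering at the median, scaling by the inverse interquartile range with a fallback to the min-max range, and the clipping function `x / sqrt(1 + (x/3)^2)`) are assumptions based on the description in the paper and should be checked against the linked standalone file, which remains the reference. The function names are hypothetical.

```python
import math
from statistics import median, quantiles


def fit_robust_scale(column):
    """Return (center, scale) computed from the training values of one column."""
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    center = median(column)
    if q3 != q1:
        scale = 1.0 / (q3 - q1)
    elif max(column) != min(column):
        # Assumed fallback when the interquartile range is zero.
        scale = 2.0 / (max(column) - min(column))
    else:
        scale = 0.0  # constant column: map everything to zero
    return center, scale


def smooth_clip(x, bound=3.0):
    """Smoothly squash x into the open interval (-bound, bound)."""
    return x / math.sqrt(1.0 + (x / bound) ** 2)


def transform(column, center, scale):
    """Apply robust scaling and then smooth clipping to every value."""
    return [smooth_clip(scale * (v - center)) for v in column]


column = [0.0, 1.0, 2.0, 3.0, 4.0, 1000.0]  # contains one extreme outlier
center, scale = fit_robust_scale(column)
scaled = transform(column, center, scale)
```

The outlier ends up just below the clipping bound instead of dominating the feature's range, while values near the median are almost unchanged apart from the linear rescaling.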

## Citation

If you use this repository for research purposes, please cite our [paper](https://arxiv.org/abs/2407.04491):

```
@inproceedings{holzmuller2024better,
  title={Better by default: {S}trong pre-tuned {MLPs} and boosted trees on tabular data},
  author={Holzm{\"u}ller, David and Grinsztajn, L{\'e}o and Steinwart, Ingo},
  booktitle = {Neural {Information} {Processing} {Systems}},
  year={2024}
}
```

## Contributors

- David Holzmüller (main developer)
- Léo Grinsztajn (deep learning baselines, plotting)
- Ingo Steinwart (UCI dataset download)
- Katharina Strecker (PyTorch-Lightning interface)
- Lennart Purucker (some features/fixes)
- Jérôme Dockès (deployment, continuous integration)

## Acknowledgements

Code from other repositories is acknowledged as well as possible in code comments.
In particular, we used code from https://github.com/yandex-research/rtdl
and sub-packages (Apache 2.0 license),
code from https://github.com/catboost/benchmarks/
(Apache 2.0 license),
and https://docs.ray.io/en/latest/cluster/vms/user-guides/community/slurm.html
(Apache 2.0 license).

## Releases (see git tags)

- v1.6.0:
    - Added support for other training losses in TabM through the `train_metric_name` parameter, 
      for example, (multi)quantile regression via `train_metric_name='multi_pinball(0.05,0.95)'`.
    - RealMLP-TD now has an `n_ens` hyperparameter, which can be set to values >1 
      to train ensembles per train-validation split (called PackedEnsemble in the TabM paper). 
      This is especially useful when using holdout validation instead of cross-validation ensembles, 
      and to get more reliable validation predictions and scores for tuning/ensembling.
    - fixed RealMLP TabArena search space (`hpo_space_name='tabarena'`) for classification 
      (allow no label smoothing through `use_ls=False` instead of `use_ls="auto"`).
- v1.5.2: fixed more device bugs for HPO and ensembling
- v1.5.1: fixed a device bug in TabM for GPU
- v1.5.0:
    - added `n_repeats` parameter to scikit-learn interfaces for repeated cross-validation
    - HPO sklearn interfaces (the ones using random search)
      can now do weighted ensembling instead by setting `use_caruana_ensembling=True`.
      Removed the `RealMLP_Ensemble_Classifier` and `RealMLP_Ensemble_Regressor` from v1.4.2 
      since they are now redundant through this feature.
    - renamed `space` parameter of GBDT HPO interface 
      to `hpo_space_name` so now it also works with non-TPE versions.
    - Added new [TabArena](https://tabarena.ai) search spaces for boosted trees (not TPE), 
      which should be almost equivalent to the ones from TabArena 
      except for the early stopping logic. 
    - TabM now supports `val_metric_name` for early stopping on different metrics.
    - fixed issues #20 and #21 regarding HPO
    - small updates for the ["Rethinking Early Stopping" paper](https://arxiv.org/abs/2501.19195)
- v1.4.2:
    - fixed handling of custom `val_metric_name` in HPO models and `Ensemble_TD_Regressor`.
    - if `tmp_folder` is specified in HPO models, 
      save each model to disk immediately instead of holding all of them in memory.
      This can considerably reduce RAM/VRAM usage.
      In this case, pickled HPO models will still rely on the models stored in the `tmp_folder`.
    - We now provide `RealMLP_Ensemble_Classifier` and `RealMLP_Ensemble_Regressor`,
      which will use weighted ensembling and usually perform better than HPO 
      (but have slower inference time). We recommend using the new `hpo_space_name='tabarena'`
      for best results.
- v1.4.1: 
    - moved dill to optional dependencies
    - updated TabM code to a newer version: 
      added the option `share_training_batches=False` (old version: `True`), 
      excluded certain parameters from weight decay.
    - added [documentation](https://pytabkit.readthedocs.io/en/latest/bench/using_the_scheduler.html) for using the scheduler with custom jobs.
    - fixed bug in RealMLP refitting.
    - updated process start method for scheduler to speed up benchmarking
- v1.4.0:
    - moved some imports to the new `models` optional dependencies
      to have a more light-weight RealMLP installation
    - Added GPU support for CatBoost with help from 
      [Maximilian Schambach](https://github.com/MaxSchambach) 
      in #16 (not guaranteed to produce exactly the same results).
    - Ensembling now saves models after training if a path is supplied, to reduce memory usage
    - Added more search spaces
    - fixed error in multiquantile output when the passed y was one-dimensional 
      instead of having shape `(n_samples, 1)`
    - Added some examples to the documentation
- v1.3.0: 
    - Added multiquantile regression for RealMLP: 
      see the [documentation](https://pytabkit.readthedocs.io/en/latest/models/quantile_reg.html)
    - More hyperparameters for RealMLP
    - Added [TabICL](https://github.com/soda-inria/tabicl) wrapper
    - Small fixes
- v1.2.1: avoid error for older skorch versions
- v1.2.0:
    - Included post-hoc calibration and more metrics through 
      [probmetrics](https://github.com/dholzmueller/probmetrics).
    - Added benchmarking code for [Rethinking Early Stopping: Refine, Then Calibrate](https://arxiv.org/abs/2501.19195).
    - Updated format for saving predictions, 
      allowing stopping on multiple metrics during the same training 
      in the benchmark.
    - Better categorical handling, 
      avoiding an error for string and object columns,
      not ignoring boolean columns by default but treating them as 
      categorical.
    - Added Ensemble_HPO_Classifier and Ensemble_HPO_Regressor.
- v1.1.3:
    - Fixed a bug where the categorical encoding was incorrect if categories 
      were missing in the training or validation set. The bug affected XGBoost 
      and potentially many other models except RealMLP.
    - Scikit-learn interfaces now accept and auto-detect categorical datatypes
      (category, string, object) in dataframes.
- v1.1.2:
    - Some compatibility improvements for scikit-learn 1.6
      (but disabled 1.6 since skorch is not compatible with it).
    - Improved documentation for PyTorch-Lightning interface.
    - Other small bugfixes and improvements.
- v1.1.1:
    - Added parameters `weight_decay`, `tfms`,
      and `gradient_clipping_norm` to TabM.
      The updated default parameters now apply the RTDL quantile transform.
- v1.1.0:
    - Included TabM
    - Replaced `__` by `_` in parameter names for MLP, MLP-PLR, ResNet, and FTT,
      to comply with scikit-learn interface requirements.
    - Fixed non-determinism in NN baselines
      by initializing the random state of quantile (and KDI)
      preprocessing transforms.
    - The `n_threads` parameter is no longer ignored by NNs.
    - Changes by [Lennart Purucker](https://github.com/LennartPurucker):
      Add time limit for RealMLP,
      add support for `lightning` (but also still allowing `pytorch-lightning`),
      making skorch a lazy import, removed msgpack\_numpy dependency.
- v1.0.0: Release for the NeurIPS version and arXiv v2+v3.
    - More baselines (MLP-PLR, FT-Transformer, TabR-HPO, RF-HPO),
      also some un-polished internal interfaces for other methods,
      esp. the ones in AutoGluon.
    - Updated benchmarking code (configurations, plots)
      including the new version of the Grinsztajn et al. benchmark
    - Updated fit() parameters in scikit-learn interfaces, etc.
- v0.0.1: First release for arXiv v1.
  Code and data are archived at [DaRUS](https://doi.org/10.18419/darus-4255).

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "pytabkit",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "RealMLP, deep learning, gradient boosting, scikit-learn, tabular data",
    "author": "David Holzm\u00fcller, L\u00e9o Grinsztajn, Ingo Steinwart",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/09/d9/592d267413e9a596768677221184722d01e501a95606f463a1ea88decd84/pytabkit-1.6.0.tar.gz",
    "platform": null,
    "description": "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dholzmueller/pytabkit/blob/main/examples/tutorial_notebook.ipynb)\n[![](https://readthedocs.org/projects/pytabkit/badge/?version=latest&style=flat-default)](https://pytabkit.readthedocs.io/en/latest/)\n[![test](https://github.com/dholzmueller/pytabkit/actions/workflows/testing.yml/badge.svg)](https://github.com/dholzmueller/pytabkit/actions/workflows/testing.yml)\n[![Downloads](https://img.shields.io/pypi/dm/pytabkit)](https://pypistats.org/packages/pytabkit)\n\n# PyTabKit: Tabular ML models and benchmarking (NeurIPS 2024)\n\n [Paper](https://arxiv.org/abs/2407.04491) | [Documentation](https://pytabkit.readthedocs.io) | [RealMLP-TD-S standalone implementation](https://github.com/dholzmueller/realmlp-td-s_standalone) | [Grinsztajn et al. benchmark code](https://github.com/LeoGrin/tabular-benchmark/tree/better_by_default) | [Data archive](https://doi.org/10.18419/darus-4555) |\n|-------------------------------------------|--------------------------------------------------|---------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|-----------------------------------------------------|\n\nPyTabKit provides **scikit-learn interfaces for modern tabular classification and regression methods**\nbenchmarked in our [paper](https://arxiv.org/abs/2407.04491), see below.\nIt also contains the code we used for **benchmarking** these methods\non our benchmarks.\n\n![Meta-test benchmark results](./figures/meta-test_benchmark_results.png)\n\n## When (not) to use pytabkit\n\n- **To get the best possible results**: \n  - Generally we recommend AutoGluon for the best possible results, \n    though it does not include all the models from pytabkit.\n    It will probably include RealMLP in the upcoming 1.4 
version. \n  - To get the best possible results from `pytabkit`, \n    we recommend using \n    `Ensemble_HPO_Classifier(n_cv=8, use_full_caruana_ensembling=True, use_tabarena_spaces=True, n_hpo_steps=50)` \n    with a `val_metric_name` corresponding to your target metric \n    (e.g., `class_error`, `cross_entropy`, `brier`, `1-auc_ovr`), or the corresponding `Regressor`. \n    (This might take very long to fit.)\n  - For only a single model, we recommend using \n    `RealMLP_HPO_Classifier(n_cv=8, hpo_space_name='tabarena', use_caruana_ensembling=True, n_hyperopt_steps=50)`,\n    also with `val_metric_name` as above, or the corresponding `Regressor`.\n- **Models**: [TabArena](https://github.com/AutoGluon/tabrepo) \n  also includes some newer models like RealMLP and TabM \n  with more general preprocessing (missing numericals, text, etc.),  \n  as well as very good boosted tree implementations.\n  `pytabkit` is currently still easier to use \n  and supports vectorized cross-validation for RealMLP, \n  which can significantly speed up the training.\n- **Benchmarking**: While pytabkit can be good for quick benchmarking for development, \n  for method evaluation we recommend [TabArena](https://github.com/AutoGluon/tabrepo).\n\n## Installation (new in 1.4.0: optional model dependencies)\n\n```bash\npip install pytabkit[models]\n```\n\n- RealMLP (and TabM) can be used without the `[models]` part.\n- If you want to use **TabR**, you have to manually install\n  [faiss](https://github.com/facebookresearch/faiss/blob/main/INSTALL.md),\n  which is only available on **conda**.\n- Please install torch separately if you want to control the version (CPU/GPU etc.)\n- Use `pytabkit[models,autogluon,extra,hpo,bench,dev]` to install additional dependencies for the other models,\n  AutoGluon models, extra preprocessing,\n  hyperparameter optimization methods beyond random search (hyperopt/SMAC),\n  the benchmarking part, and testing/documentation. 
For the hpo part,\n  you might need to install *swig* (e.g. via pip) if the build of *pyrfr* fails.\n  See also the [documentation](https://pytabkit.readthedocs.io).\n  To run the data download for the meta-train benchmark, you need one of rar, unrar, or 7-zip\n  to be installed on the system.\n\n## Using the ML models\n\nMost of our machine learning models are directly available via scikit-learn interfaces.\nFor example, you can use RealMLP-TD for classification as follows:\n\n```python\nfrom pytabkit import RealMLP_TD_Classifier\n\nmodel = RealMLP_TD_Classifier()  # or TabR_S_D_Classifier, CatBoost_TD_Classifier, etc.\nmodel.fit(X_train, y_train)\nmodel.predict(X_test)\n```\n\nThe code above will automatically select a GPU if available,\ntry to detect categorical columns in dataframes,\npreprocess numerical variables and regression targets (no standardization required),\nand use a training-validation split for early stopping.\nAll of this (and much more) can be configured through the constructor\nand the parameters of the fit() method.\nFor example, it is possible to do bagging\n(ensembling of models on 5-fold cross-validation)\nsimply by passing `n_cv=5` to the constructor.\nHere is an example for some of the parameters that can be set explicitly:\n\n```python\nfrom pytabkit import RealMLP_TD_Classifier\n\nmodel = RealMLP_TD_Classifier(device='cpu', random_state=0, n_cv=1, n_refit=0,\n                              n_epochs=256, batch_size=256, hidden_sizes=[256] * 3,\n                              val_metric_name='cross_entropy',\n                              use_ls=False,  # for metrics like AUC / log-loss\n                              lr=0.04, verbosity=2)\nmodel.fit(X_train, y_train, X_val, y_val, cat_col_names=['Education'])\nmodel.predict_proba(X_test)\n```\n\nSee [this notebook](https://colab.research.google.com/github/dholzmueller/pytabkit/blob/main/examples/tutorial_notebook.ipynb)\nfor more examples. 
Missing numerical values are currently *not* allowed and need to be imputed beforehand.\n\n### Available ML models\n\nOur ML models are available in up to three variants, all with best-epoch selection:\n\n- library defaults (D)\n- our tuned defaults (TD)\n- random search hyperparameter optimization (HPO), \n  sometimes also tree parzen estimator (HPO-TPE) or weighted ensembling (Ensemble)\n\nWe provide the following ML models:\n\n- **RealMLP** (TD, HPO, Ensemble): Our new neural net models with tuned defaults (TD),\n  random search hyperparameter optimization (HPO), or Ensembling\n- **XGB**, **LGBM**, **CatBoost** (D, TD, HPO, HPO-TPE): Interfaces for gradient-boosted\n  tree libraries XGBoost, LightGBM, CatBoost\n- **MLP**, **ResNet**, **FTT** (D, HPO): Models\n  from [Revisiting Deep Learning Models for Tabular Data](https://proceedings.neurips.cc/paper_files/paper/2021/hash/9d86d83f925f2149e9edb0ac3b49229c-Abstract.html)\n- **MLP-PLR** (D, HPO): MLP with numerical embeddings\n  from [On Embeddings for Numerical Features in Tabular Deep Learning](https://proceedings.neurips.cc/paper_files/paper/2022/hash/9e9f0ffc3d836836ca96cbf8fe14b105-Abstract-Conference.html)\n- **TabR** (D, HPO): TabR model\n  from [TabR: Tabular Deep Learning Meets Nearest Neighbors](https://openreview.net/forum?id=rhgIgTSSxW)\n- **TabM** (D): TabM model\n  from [TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling](https://arxiv.org/abs/2410.24210)\n- **RealTabR** (D): Our new TabR variant with default parameters\n- **Ensemble-TD**: Weighted ensemble of all TD models (RealMLP, XGB, LGBM, CatBoost)\n\n## Post-hoc calibration and refinement stopping\n\nFor using post-hoc temperature scaling and refinement stopping from our \npaper [Rethinking Early Stopping: Refine, Then Calibrate](https://arxiv.org/abs/2501.19195),\nyou can pass the following parameters to the scikit-learn interfaces:\n```python\nfrom pytabkit import RealMLP_TD_Classifier\nclf = RealMLP_TD_Classifier(\n  
  val_metric_name='ref-ll-ts',  # short for 'refinement_logloss_ts-mix_all'\n    calibration_method='ts-mix',  # temperature scaling with laplace smoothing\n    use_ls=False  # recommended for cross-entropy loss\n)\n```\nOther calibration methods and validation metrics\nfrom [probmetrics](https://github.com/dholzmueller/probmetrics)\ncan be used as well.\n\nFor reproducing the results from this paper, we refer to the\n[documentation](https://pytabkit.readthedocs.io/en/latest/bench/refine_then_calibrate.html).\n\n## Benchmarking code\n\nOur benchmarking code has functionality for\n\n- dataset download\n- running methods highly parallel on single-node/multi-node/multi-GPU hardware,\n  with automatic scheduling and trying to respect RAM constraints\n- analyzing/plotting results\n\nFor more details, we refer to the [documentation](https://pytabkit.readthedocs.io).\n\n## Preprocessing code\n\nWhile many preprocessing methods are implemented in this repository,\na standalone version of our robust scaling + smooth clipping\ncan be found [here](https://github.com/dholzmueller/realmlp-td-s_standalone/blob/main/preprocessing.py#L65C7-L65C37).\n\n## Citation\n\nIf you use this repository for research purposes, please cite our [paper](https://arxiv.org/abs/2407.04491):\n\n```\n@inproceedings{holzmuller2024better,\n  title={Better by default: {S}trong pre-tuned {MLPs} and boosted trees on tabular data},\n  author={Holzm{\\\"u}ller, David and Grinsztajn, Leo and Steinwart, Ingo},\n  booktitle = {Neural {Information} {Processing} {Systems}},\n  year={2024}\n}\n```\n\n## Contributors\n\n- David Holzm\u00fcller (main developer)\n- L\u00e9o Grinsztajn (deep learning baselines, plotting)\n- Ingo Steinwart (UCI dataset download)\n- Katharina Strecker (PyTorch-Lightning interface)\n- Lennart Purucker (some features/fixes)\n- J\u00e9r\u00f4me Dock\u00e8s (deployment, continuous integration)\n\n## Acknowledgements\n\nCode from other repositories is acknowledged as well as possible in 
code comments.\nEspecially, we used code from https://github.com/yandex-research/rtdl\nand sub-packages (Apache 2.0 license),\ncode from https://github.com/catboost/benchmarks/\n(Apache 2.0 license),\nand https://docs.ray.io/en/latest/cluster/vms/user-guides/community/slurm.html\n(Apache 2.0 license).\n\n## Releases (see git tags)\n\n- v1.6.0:\n    - Added support for other training losses in TabM through the `train_metric_name` parameter, \n      for example, (multi)quantile regression via `train_metric_name='multi_pinball(0.05,0.95)'`.\n    - RealMLP-TD now adds the `n_ens` hyperparameter, which can be set to values >1 \n      to train ensembles per train-validation split (called PackedEnsemble in the TabM paper). \n      This is especially useful when using holdout validation instead of cross-validation ensembles, \n      and to get more reliable validation predictions and scores for tuning/ensembling.\n    - fixed RealMLP TabArena search space (`hpo_space_name='tabarena'`) for classification \n      (allow no label smoothing through `use_ls=False` instead of `use_ls=\"auto\"`).\n- v1.5.2: fixed more device bugs for HPO and ensembling\n- v1.5.1: fixed a device bug in TabM for GPU\n- v1.5.0:\n    - added `n_repeats` parameter to scikit-learn interfaces for repeated cross-validation\n    - HPO sklearn interfaces (the ones using random search)\n      can now do weighted ensembling instead by setting `use_caruana_ensembling=True`.\n      Removed the `RealMLP_Ensemble_Classifier` and `RealMLP_Ensemble_Regressor` from v1.4.2 \n      since they are now redundant through this feature.\n    - renamed `space` parameter of GBDT HPO interface \n      to `hpo_space_name` so now it also works with non-TPE versions.\n    - Added new [TabArena](https://tabarena.ai) search spaces for boosted trees (not TPE), \n      which should be almost equivalent to the ones from TabArena \n      except for the early stopping logic. 
\n    - TabM now supports `val_metric_name` for early stopping on different metrics.\n    - fixed issues #20 and #21 regarding HPO\n    - small updates for the [\"Rethinking Early Stopping\" paper](https://arxiv.org/abs/2501.19195)\n- v1.4.2:\n    - fixed handling of custom `val_metric_name` HPO models and `Ensemble_TD_Regressor`.\n    - if `tmp_folder` is specified in HPO models, \n      save each model to disk immediately instead of holding all of them in memory.\n      This can considerably reduce RAM/VRAM usage.\n      In this case, pickled HPO models will still rely on the models stored in the `tmp_folder`.\n    - We now provide `RealMLP_Ensemble_Classifier` and `RealMLP_Ensemble_Regressor`,\n      which will use weighted ensembling and usually perform better than HPO \n      (but have slower inference time). We recommend using the new `hpo_space_name='tabarena'`\n      for best results.\n- v1.4.1: \n    - moved dill to optional dependencies\n    - updated TabM code to a newer version: \n      added option share_training_batches=False (old version: True), \n      exclude certain parameters from weight decay.\n    - added [documentation](https://pytabkit.readthedocs.io/en/latest/bench/using_the_scheduler.html) for using the scheduler with custom jobs.\n    - fixed bug in RealMLP refitting.\n    - updated process start method for scheduler to speed up benchmarking\n- v1.4.0:\n    - moved some imports to the new `models` optional dependencies\n      to have a more light-weight RealMLP installation\n    - Added GPU support for CatBoost with help from \n      [Maximilian Schambach](https://github.com/MaxSchambach) \n      in #16 (not guaranteed to produce exactly the same results).\n    - Ensembling now saves models after training if a path is supplied, to reduce memory usage\n    - Added more search spaces\n    - fixed error in multiquantile output when the passed y was one-dimensional \n      instead of having shape `(n_samples, 1)`\n    - Added some examples to 
the documentation\n- v1.3.0: \n    - Added multiquantile regression for RealMLP: \n      see the [documentation](https://pytabkit.readthedocs.io/en/latest/models/quantile_reg.html)\n    - More hyperparameters for RealMLP\n    - Added [TabICL](github.com/soda-inria/tabicl) wrapper\n    - Small fixes\n- v1.2.1: avoid error for older skorch versions\n- v1.2.0:\n    - Included post-hoc calibration and more metrics through \n      [probmetrics](https://github.com/dholzmueller/probmetrics).\n    - Added benchmarking code for [Rethinking Early Stopping: Refine, Then Calibrate](https://arxiv.org/abs/2501.19195).\n    - Updated format for saving predictions, \n      allow to stop on multiple metrics during the same training \n      in the benchmark.\n    - Better categorical handling, \n      avoiding an error for string and object columns,\n      not ignoring boolean columns by default but treating them as \n      categorical.\n    - Added Ensemble_HPO_Classifier and Ensemble_HPO_Regressor.\n- v1.1.3:\n  - Fixed a bug where the categorical encoding was incorrect if categories \n    were missing in the training or validation set. 
The bug affected XGBoost \n    and potentially many other models except RealMLP.\n  - Scikit-learn interfaces now accept and auto-detect categorical datatypes\n    (category, string, object) in dataframes.\n- v1.1.2:\n    - Some compatibility improvements for scikit-learn 1.6\n      (but disabled 1.6 since skorch is not compatible with it).\n    - Improved documentation for PyTorch Lightning interface.\n    - Other small bugfixes and improvements.\n- v1.1.1:\n    - Added parameters `weight_decay`, `tfms`,\n      and `gradient_clipping_norm` to TabM.\n      The updated default parameters now apply the RTDL quantile transform.\n- v1.1.0:\n    - Included TabM\n    - Replaced `__` by `_` in parameter names for MLP, MLP-PLR, ResNet, and FTT,\n      to comply with scikit-learn interface requirements.\n    - Fixed non-determinism in NN baselines\n      by initializing the random state of quantile (and KDI)\n      preprocessing transforms.\n    - `n_threads` parameter is no longer ignored by NNs.\n    - Changes by [Lennart Purucker](https://github.com/LennartPurucker):\n      added a time limit for RealMLP,\n      added support for `lightning` (while still allowing `pytorch-lightning`),\n      made skorch a lazy import, removed msgpack\\_numpy dependency.\n- v1.0.0: Release for the NeurIPS version and arXiv v2+v3.\n    - More baselines (MLP-PLR, FT-Transformer, TabR-HPO, RF-HPO),\n      also some unpolished internal interfaces for other methods,\n      esp. the ones in AutoGluon.\n    - Updated benchmarking code (configurations, plots)\n      including the new version of the Grinsztajn et al. benchmark\n    - Updated fit() parameters in scikit-learn interfaces, etc.\n- v0.0.1: First release for arXiv v1.\n  Code and data are archived at [DaRUS](https://doi.org/10.18419/darus-4255).\n\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "ML models + benchmark for tabular data classification and regression",
    "version": "1.6.0",
    "project_urls": {
        "Documentation": "https://github.com/dholzmueller/pytabkit#readme",
        "Issues": "https://github.com/dholzmueller/pytabkit/issues",
        "Source": "https://github.com/dholzmueller/pytabkit"
    },
    "split_keywords": [
        "realmlp",
        " deep learning",
        " gradient boosting",
        " scikit-learn",
        " tabular data"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fb3e9157667d93dbf9e7c7315934c7a1d87847009eb8c55d02ca874161bcdc98",
                "md5": "055af10c91508472d50e1b9db176a8b6",
                "sha256": "5fe9effcb99868dac680479d8ca4858465e1f708f992d8093b51e688d8d015b0"
            },
            "downloads": -1,
            "filename": "pytabkit-1.6.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "055af10c91508472d50e1b9db176a8b6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 350759,
            "upload_time": "2025-07-27T17:40:01",
            "upload_time_iso_8601": "2025-07-27T17:40:01.030228Z",
            "url": "https://files.pythonhosted.org/packages/fb/3e/9157667d93dbf9e7c7315934c7a1d87847009eb8c55d02ca874161bcdc98/pytabkit-1.6.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "09d9592d267413e9a596768677221184722d01e501a95606f463a1ea88decd84",
                "md5": "a5f82ee4252a5632a6a582da582e956d",
                "sha256": "6f30576e18bd03cc700791005d71b7275cebd7091db08e1e5e1ac347daf9450f"
            },
            "downloads": -1,
            "filename": "pytabkit-1.6.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a5f82ee4252a5632a6a582da582e956d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 303362,
            "upload_time": "2025-07-27T17:40:09",
            "upload_time_iso_8601": "2025-07-27T17:40:09.105291Z",
            "url": "https://files.pythonhosted.org/packages/09/d9/592d267413e9a596768677221184722d01e501a95606f463a1ea88decd84/pytabkit-1.6.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-27 17:40:09",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dholzmueller",
    "github_project": "pytabkit#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pytabkit"
}

David Holzmüller, Léo Grinsztajn, Ingo Steinwart