# TanML: Automated Model Validation Toolkit for Tabular Machine Learning
**TanML** validates tabular ML models with a zero-config **Streamlit UI** and exports an audit-ready, **editable Word report (.docx)**. It covers data quality, correlation/VIF, performance, explainability (SHAP), and robustness/stress tests—built for regulated settings (MRM, credit risk, insurance, etc.).
* **Status:** Beta (`0.x`)
* **License:** MIT
* **Python:** 3.8–3.12
* **OS:** Linux / macOS / Windows (incl. WSL)
---
## Why TanML?
* **Zero-config UI:** run `tanml ui`, upload data, and click **Refit & validate**; no YAML needed.
* **Audit-ready outputs:** tables and plots, plus a DOCX report your stakeholders can edit.
* **Regulatory alignment:** covers common Model Risk Management (MRM) themes, such as those in SR 11-7-style guidance.
* **Works with your stack:** scikit-learn, XGBoost/LightGBM/CatBoost, etc.
---
## Install
```bash
pip install tanml
```
## Quick Start (UI)
```bash
tanml ui
```
* Opens at **[http://127.0.0.1:8501](http://127.0.0.1:8501)**
* **Upload limit ~1 GB** (preconfigured)
* **Telemetry disabled by default**
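When `tanml ui` is started from a script or CI job, it can help to wait until the UI is reachable before continuing. The following is a minimal sketch that relies only on the address shown above; `wait_for_ui` is a hypothetical helper, not part of TanML.

```python
import socket
import time


def wait_for_ui(host="127.0.0.1", port=8501, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

The check only confirms that a process is listening on the port; it does not verify that the process is TanML.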
### In the app
1. **Load data** — upload a cleaned CSV/XLSX/Parquet (optional: raw or separate Train/Test).
2. **Select target & features** — target auto-suggested; features default to all non-target columns.
3. **Pick a model** — choose library/algorithm (scikit-learn, XGBoost, LightGBM, CatBoost) and tweak params.
4. **Run validation** — click **▶️ Refit & validate**.
5. **Export** — click **⬇️ Download report** to get a **DOCX** (auto-selects classification/regression template).
**Outputs**
* Report: `./.ui_runs/<session>/tanml_report_*.docx`
* Artifacts (CSV/PNGs): `./.ui_runs/<session>/artifacts/*`
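To pick up the newest report from a script, something like the sketch below may be enough. It assumes only the output layout listed above; `latest_report` is a hypothetical helper, not a TanML function.

```python
from pathlib import Path


def latest_report(root=".ui_runs"):
    """Return the most recently modified tanml_report_*.docx under root, or None."""
    root = Path(root)
    if not root.is_dir():
        return None
    # One directory level per session, as in ./.ui_runs/<session>/
    reports = list(root.glob("*/tanml_report_*.docx"))
    if not reports:
        return None
    return max(reports, key=lambda p: p.stat().st_mtime)
```

Calling `latest_report()` from the directory where `tanml ui` was started returns a `Path`, or `None` if no report has been exported yet.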
---
## What TanML Checks
* **Raw Data (optional):** rows/cols, missingness, duplicates, constant columns
* **Data Quality & EDA:** summaries, distributions
* **Correlation & Multicollinearity:** heatmap, top-pairs CSV, **VIF** table
* **Performance**
  * **Classification:** AUC, PR-AUC, KS, decile lift, confusion matrix
  * **Regression:** R², MAE, MSE/RMSE, error stats
* **Explainability:** SHAP (auto explainer; configurable background size)
* **Robustness/Stress Tests:** feature perturbations → delta-metrics
* **Model Metadata:** model class, hyperparameters, features, training info
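For readers unfamiliar with some of the metrics above, here is a standard-library sketch of what KS, decile lift, and a perturbation delta-metric measure. It illustrates the general definitions only and is not TanML's implementation; all function names and parameters are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(y_true, y_score):
    """Largest gap between the score CDFs of positives (1) and negatives (0)."""
    pos = sorted(s for y, s in zip(y_true, y_score) if y == 1)
    neg = sorted(s for y, s in zip(y_true, y_score) if y == 0)
    if not pos or not neg:
        raise ValueError("KS needs at least one positive and one negative")
    return max(
        abs(bisect_right(pos, t) / len(pos) - bisect_right(neg, t) / len(neg))
        for t in set(y_score)
    )


def decile_lift(y_true, y_score, n_bins=10):
    """Positive rate in each score-ranked bin divided by the overall rate."""
    n = len(y_true)
    if n < n_bins:
        raise ValueError("need at least one row per bin")
    # Labels ordered from highest to lowest score.
    pairs = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    ranked = [y for _, y in pairs]
    base_rate = sum(ranked) / n
    if base_rate == 0:
        raise ValueError("no positives, so lift is undefined")
    lifts = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        lifts.append((sum(chunk) / len(chunk)) / base_rate)
    return lifts


def perturbation_delta(predict, rows, y_true, metric, col, sigma, seed=0):
    """Change in `metric` after adding Gaussian noise to one feature column."""
    rng = random.Random(seed)
    baseline = metric(y_true, [predict(r) for r in rows])
    noisy = []
    for r in rows:
        r = list(r)  # copy so the caller's data is left untouched
        r[col] += rng.gauss(0.0, sigma)
        noisy.append(r)
    return metric(y_true, [predict(r) for r in noisy]) - baseline
```

For example, `perturbation_delta(model_fn, rows, y, ks_statistic, col=0, sigma=0.1)` reports how much KS moves when the first feature is jittered; a large negative value suggests the model is sensitive to that feature.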
---
## Templates
TanML ships DOCX templates (packaged in wheel & sdist):
* `tanml/report/templates/report_template_cls.docx`
* `tanml/report/templates/report_template_reg.docx`
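To find where the templates landed in an installed environment (for example, to inspect or copy them), a sketch like the following may help. It assumes the package-relative paths listed above; `find_templates` is a hypothetical helper, not part of TanML.

```python
import importlib.util
from pathlib import Path


def find_templates(package="tanml"):
    """Return report_template_*.docx paths inside an installed package, if any."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a regular package
    root = Path(list(spec.submodule_search_locations)[0])
    template_dir = root / "report" / "templates"
    if not template_dir.is_dir():
        return []
    return sorted(template_dir.glob("report_template_*.docx"))
```

With TanML installed, the result should contain the `_cls` and `_reg` templates; an empty list means the package or its template directory was not found.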
---
## License & Citation
**License:** MIT. See [LICENSE](https://github.com/tdlabs-ai/tanml/blob/main/LICENSE).
SPDX-License-Identifier: MIT
© 2025 Tanmay Sah and Dolly Sah. You may use, modify, and distribute this software with appropriate attribution.
### How to cite
If TanML helps your work or publications, please cite:
> Sah, T., & Sah, D. (2025). *TanML: Automated Model Validation Toolkit for Tabular Machine Learning* [Software]. Available at https://github.com/tdlabs-ai/tanml
Or in BibTeX (version-agnostic):
```bibtex
@misc{tanml,
  author = {Sah, Tanmay and Sah, Dolly},
  title  = {TanML: Automated Model Validation Toolkit for Tabular Machine Learning},
  year   = {2025},
  note   = {Software; MIT License},
  url    = {https://github.com/tdlabs-ai/tanml}
}
```
A machine-readable citation file (`CITATION.cff`) is included for citation tools and GitHub’s “Cite this repository” button.