# lmdiag
**Python Library providing Diagnostic Plots for Linear Regression Models.** (Like
[plot.lm](https://www.rdocumentation.org/packages/stats/versions/3.5.0/topics/plot.lm)
in R.)
I built this because I missed R's diagnostic plots during a university project.
Python has substitutes for some of the individual charts, but they are spread across
different libraries and sometimes don't show exactly the same thing. My implementation
tries to mimic the R plots, but I didn't reimplement the R code: the charts are based
only on available documentation.
## Installation
`pip install lmdiag`
## Usage
lmdiag generates plots for _fitted_ linear regression models from
[`statsmodels`](https://www.statsmodels.org/stable/index.html),
[`linearmodels`](https://bashtage.github.io/linearmodels/doc/index.html) and
[`scikit-learn`](https://scikit-learn.org/stable/).
You can find some usage examples in
[this jupyter notebook](https://github.com/dynobo/lmdiag/blob/master/example.ipynb).
### Example
```python
import numpy as np
import statsmodels.api as sm
import lmdiag
# Fit model with random sample data
np.random.seed(20)
X = np.random.normal(size=30, loc=20, scale=3)
y = 5 + 5 * X + np.random.normal(size=30)
X = sm.add_constant(X)  # statsmodels does not add an intercept automatically
lm = sm.OLS(y, X).fit()
# Plot lmdiag facet chart
lmdiag.style.use(style="black_and_red") # Mimic R's plot.lm style
fig = lmdiag.plot(lm)
fig.show()
```
![image](https://raw.githubusercontent.com/dynobo/lmdiag/master/example.png)
### Methods
- Draw matrix of all plots:
  - `lmdiag.plot(lm)`
- Draw individual plots:
  - `lmdiag.resid_fit(lm)`
  - `lmdiag.q_q(lm)`
  - `lmdiag.scale_loc(lm)`
  - `lmdiag.resid_lev(lm)`
- Print description to aid plot interpretation:
  - `lmdiag.help()` (for all plots)
  - `lmdiag.help('<method name>')` (for individual plot)
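
To give a feel for what the four individual plots display, here is a minimal sketch that computes the underlying quantities for a simple one-predictor regression using only the standard library. It is not lmdiag's implementation: it assumes the plots follow the usual definitions from R's `plot.lm`, the helper name `diagnostics` is hypothetical, and the `(i - 0.5) / n` plotting positions for the Q-Q plot are just one common choice.

```python
import math
import random
from statistics import NormalDist, mean


def diagnostics(x, y):
    """Per-observation quantities behind the four diagnostic plots for
    a simple linear regression y = b0 + b1 * x (p = 2 parameters)."""
    n, p = len(x), 2
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # Leverage: diagonal of the hat matrix (closed form for one predictor)
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Residual standard error with n - p degrees of freedom
    sigma = math.sqrt(sum(e ** 2 for e in resid) / (n - p))
    # Standardized (internally studentized) residuals
    std_resid = [e / (sigma * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]
    # Theoretical normal quantiles (assumed plotting positions: (i - 0.5) / n)
    normal = NormalDist()
    quantiles = [normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    # Cook's distance, drawn as contour lines in the leverage plot
    cooks = [r ** 2 * h / (p * (1 - h)) for r, h in zip(std_resid, leverage)]

    return {
        "resid_fit": (fitted, resid),                  # residuals vs. fitted
        "q_q": (quantiles, sorted(std_resid)),         # normal Q-Q
        "scale_loc": (fitted, [math.sqrt(abs(r)) for r in std_resid]),
        "resid_lev": (leverage, std_resid),            # residuals vs. leverage
        "cooks": cooks,
    }


random.seed(20)
x = [random.gauss(20, 3) for _ in range(30)]
y = [5 + 5 * xi + random.gauss(0, 1) for xi in x]
d = diagnostics(x, y)
```

Each tuple holds the x and y coordinates of one scatter plot, so `d["scale_loc"]` pairs the fitted values with the square root of the absolute standardized residuals.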
### Increase performance
Plotting models fitted on large datasets might be slow. There are some things you can
try to speed it up:
#### 1. Tune LOWESS-parameters
The red smoothing lines are calculated using the "Locally Weighted Scatterplot
Smoothing" algorithm, which can be quite expensive. Try a _lower_ value for `lowess_it`
and a _higher_ value for `lowess_delta` to gain speed at the cost of accuracy:
```python
lmdiag.plot(lm, lowess_it=1, lowess_delta=0.02)
# Defaults are: lowess_it=2, lowess_delta=0.005
```
(For details about those parameters, see
[statsmodels docs](https://www.statsmodels.org/stable/generated/statsmodels.nonparametric.smoothers_lowess.lowess.html).)
#### 2. Change matplotlib backend
Try a different
[matplotlib backend](https://matplotlib.org/stable/users/explain/figure/backends.html).
Static (non-interactive) backends such as `AGG` or `Cairo` in particular should be
faster, e.g.:
```python
import matplotlib
matplotlib.use('agg')
```
### Set up development environment
```sh
python -m venv .venv
source .venv/bin/activate
pip install -e '.[dev]'
pre-commit install
```
### Certification
![image](https://raw.githubusercontent.com/dynobo/lmdiag/master/badge.png)
Raw data
{
"_id": null,
"home_page": null,
"name": "lmdiag",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "lm, lineare, regression, diagnostics, plot, chart, linearmodels, statsmodels, scikit-learn",
"author": null,
"author_email": "dynobo <dynobo@mailbox.org>",
"download_url": "https://files.pythonhosted.org/packages/cd/a2/5f040f1a49525eae2fd85393aea9df21319f276c83b7100759dd0b51c370/lmdiag-0.4.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Diagnostic Plots for Lineare Regression Models. Similar to plot.lm in R.",
"version": "0.4.1",
"project_urls": {
"Homepage": "http://github.com/dynobo/lmdiag"
},
"split_keywords": [
"lm",
" lineare",
" regression",
" diagnostics",
" plot",
" chart",
" linearmodels",
" statsmodels",
" scikit-learn"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7f03e05d405178fcc8554909af25937e79aaaaa7cbe3477b40542d90e4b81062",
"md5": "8dee1dbfe996002f41c6a7353d5f56dd",
"sha256": "1c7413463ba443d798e882abdf22dd4786bb39fe1f032dd745dadd2540b0a954"
},
"downloads": -1,
"filename": "lmdiag-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8dee1dbfe996002f41c6a7353d5f56dd",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 13479,
"upload_time": "2024-06-25T09:11:19",
"upload_time_iso_8601": "2024-06-25T09:11:19.231163Z",
"url": "https://files.pythonhosted.org/packages/7f/03/e05d405178fcc8554909af25937e79aaaaa7cbe3477b40542d90e4b81062/lmdiag-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "cda25f040f1a49525eae2fd85393aea9df21319f276c83b7100759dd0b51c370",
"md5": "0b3dde71450e9ad44cc15277276914a5",
"sha256": "06bd1fad0b1fe66027254e159ba298c6a89c2ace8d3466895c528166604b3b9d"
},
"downloads": -1,
"filename": "lmdiag-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "0b3dde71450e9ad44cc15277276914a5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 14277,
"upload_time": "2024-06-25T09:11:20",
"upload_time_iso_8601": "2024-06-25T09:11:20.989084Z",
"url": "https://files.pythonhosted.org/packages/cd/a2/5f040f1a49525eae2fd85393aea9df21319f276c83b7100759dd0b51c370/lmdiag-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-25 09:11:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dynobo",
"github_project": "lmdiag",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "lmdiag"
}