Name | datanerd |
Version | 0.5 |
home_page | |
Summary | Contains two functions, stats() and iv_woe(). stats() takes a dataframe and returns statistics such as count, percentiles and number of unique values; iv_woe() calculates the Weight of Evidence (WoE) and Information Value (IV) for a dataframe. |
upload_time | 2023-01-08 03:52:28 |
maintainer | |
docs_url | None |
author | Sunil Aleti |
requires_python | |
license | |
keywords | python, describe, stats, unique values, information value, woe, iv |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
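This package provides two functions, <b>stats()</b> and <b>iv_woe()</b>: one produces a statistical summary of a given dataframe, the other scores its predictor variables against a target.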
The <b>stats()</b> function takes in a dataframe and returns the following statistics:
<ul>
<li>count</li>
<li>mean</li>
<li>std</li>
<li>min</li>
<li>10th percentile</li>
<li>20th percentile</li>
<li>25th percentile</li>
<li>30th percentile</li>
<li>40th percentile</li>
<li>50th percentile (median)</li>
<li>60th percentile</li>
<li>70th percentile</li>
<li>75th percentile</li>
<li>80th percentile</li>
<li>90th percentile</li>
<li>95th percentile</li>
<li>99th percentile</li>
<li>max</li>
<li>% of missing values</li>
<li>% of non-zero values</li>
<li>number of unique values</li>
</ul>
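To make the list above concrete, here is a minimal pandas-only sketch that produces the same kind of summary. It is not the package's implementation: the helper name `summary_sketch`, the output column names, and the choice to treat missing values as not non-zero are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Percentiles from the list above, written as fractions for pandas.
PERCENTILES = [0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60,
               0.70, 0.75, 0.80, 0.90, 0.95, 0.99]


def summary_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for stats(): one row per numeric column."""
    numeric = df.select_dtypes(include="number")
    # describe() gives count, mean, std, min, the requested percentiles, max.
    out = numeric.describe(percentiles=PERCENTILES).T
    out["pct_missing"] = numeric.isna().mean() * 100
    # Assumption: a missing value is not counted as non-zero.
    out["pct_non_zero"] = (numeric.fillna(0) != 0).mean() * 100
    out["n_unique"] = numeric.nunique()
    return out


df = pd.DataFrame({"age": [25, 32, 47, np.nan, 51],
                   "balance": [0.0, 120.5, 0.0, 310.0, 42.0]})
summary = summary_sketch(df)
print(summary[["count", "50%", "pct_missing", "pct_non_zero", "n_unique"]])
```

Non-numeric columns are dropped here for simplicity; the description does not say how <b>stats()</b> handles them.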
The <b>iv_woe()</b> function is used to calculate the Weight of Evidence (WoE) and Information Value (IV) for a given dataframe. The WoE is a measure of how much the presence or absence of a predictor (feature) contributes to the probability of a response (target). The IV is a measure of the strength of the relationship between the predictor and the response.
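As a worked illustration of these two measures, the sketch below computes WoE and IV for a feature that has already been binned. It assumes a binary target (1 = event, 0 = non-event) and the convention WoE = ln(share of non-events / share of events), with IV summed over bins as (share of non-events - share of events) × WoE. The description does not state which sign convention the package uses, and the helper name `woe_iv_table` is hypothetical.

```python
import numpy as np
import pandas as pd


def woe_iv_table(binned: pd.Series, target: pd.Series) -> pd.DataFrame:
    """WoE and per-bin IV contribution for a binary target (1 = event)."""
    counts = pd.crosstab(binned, target)
    events, non_events = counts[1], counts[0]
    # Share of all events / non-events that falls into each bin.
    dist_events = events / events.sum()
    dist_non_events = non_events / non_events.sum()
    table = pd.DataFrame({"events": events, "non_events": non_events})
    # Assumed convention; a bin with zero events or non-events gives +/-inf.
    table["woe"] = np.log(dist_non_events / dist_events)
    table["iv"] = (dist_non_events - dist_events) * table["woe"]
    return table


binned = pd.Series(["low"] * 4 + ["high"] * 4, name="income_band")
target = pd.Series([0, 0, 0, 1, 0, 1, 1, 1], name="default")
table = woe_iv_table(binned, target)
print(table)
print("IV:", table["iv"].sum())
```

In this toy data each bin contributes 0.5 × ln 3, so the total IV is ln 3 (about 1.0986).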
The <b>iv_woe()</b> function takes in the following arguments:
<ul>
<li><b>data:</b> a dataframe containing the predictor variables and the target variable</li>
<li><b>target:</b> the name of the target variable</li>
<li><b>bins:</b> the number of bins to use for discretizing continuous variables</li>
<li><b>optimize:</b> a boolean indicating whether to optimize the binning of the continuous variables</li>
<li><b>thresold:</b> the minimum percentage of non-events (negative outcomes) required in each bin when optimize is set to True</li>
</ul>
If optimize is set to True, the function iterates over the number of bins from 20 down to 1 and calculates the WoE and IV for each binning. If the percentage of non-events in every bin is greater than or equal to the specified <b>thresold</b> value, it returns the WoE and IV for that binning. If no binning meets that value, it returns the WoE and IV for the best binning it could find.
If optimize is set to False, the function will calculate the WoE and IV for the specified number of bins.
The function returns a dataframe containing the WoE and IV for each predictor variable.
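The binning search described above can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the helper name `pick_binning` is hypothetical, quantile binning via `pd.qcut` is assumed, "percentage of non-events in each bin" is read as the share of all non-events that lands in the bin, and the `threshold` parameter here stands in for the <b>thresold</b> argument.

```python
import numpy as np
import pandas as pd


def pick_binning(x, y, bins=10, optimize=False, threshold=5.0):
    """Hypothetical helper: return (bin count tried, binned feature)."""
    candidates = range(20, 0, -1) if optimize else [bins]
    best = None  # (smallest non-event share, bin count, binned feature)
    for n in candidates:
        # Assumption: quantile bins; the description does not say how bins are cut.
        binned = pd.qcut(x, q=n, duplicates="drop")
        non_events = (y == 0).groupby(binned, observed=True).sum()
        smallest_share = (non_events / non_events.sum() * 100).min()
        # Without optimize, the requested bin count is used as-is.
        if not optimize or smallest_share >= threshold:
            return n, binned
        if best is None or smallest_share > best[0]:
            best = (smallest_share, n, binned)
    # No candidate met the threshold: fall back to the closest one.
    return best[1], best[2]


x = pd.Series(np.arange(100), name="score")
y = pd.Series(np.arange(100) % 2, name="target")  # even scores are non-events

n_fixed, _ = pick_binning(x, y, bins=4)
n_opt, binned = pick_binning(x, y, optimize=True, threshold=5.0)
print(n_fixed, n_opt)
```

Searching from 20 bins downward returns the finest binning that satisfies the constraint, which matches the order described above. Under this reading a single bin always holds 100% of the non-events, so the fallback branch only matters for thresholds above 100.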
Raw data
{
"_id": null,
"home_page": "",
"name": "datanerd",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "python,describe,stats,unique values,information value,woe,iv",
"author": "Sunil Aleti",
"author_email": "iam@sunilaleti.dev",
"download_url": "",
"platform": null,
"description": "\nThis function provides some statistical summary of a given dataframe.\n\nThe <b>stats()</b> function takes in a dataframe and returns the following statistics:\n<ul>\n<li>count</li>\n<li>mean</li>\n<li>std</li>\n<li>min</li>\n<li>10th percentile</li>\n<li>20th percentile</li>\n<li>25th percentile</li>\n<li>30th percentile</li>\n<li>40th percentile</li>\n<li>50th percentile (median)</li>\n<li>60th percentile</li>\n<li>70th percentile</li>\n<li>75th percentile</li>\n<li>80th percentile</li>\n<li>90th percentile</li>\n<li>95th percentile</li>\n<li>99th percentile</li>\n<li>max</li>\n<li>% of missing values</li>\n<li>% of non-zero values</li>\n<li>#numberofunique_values</li>\n</ul>\n\nThe <b>iv_woe()</b> function is used to calculate the Weight of Evidence (WoE) and Information Value (IV) for a given dataframe. The WoE is a measure of how much the presence or absence of a predictor (feature) contributes to the probability of a response (target). The IV is a measure of the strength of the relationship between the predictor and the response.\n\nThe <b>iv_woe()</b> function takes in the following arguments:\n\n<b>data:</b> a dataframe containing the predictor variables and the target variable\n<b>target:</b> the name of the target variable\n<b>bins:</b> the number of bins to use for discretizing continuous variables\n<b>optimize:</b> a boolean indicating whether to optimize the binning of the continuous variables\n<b>thresold:</b> the minimum percentage of non-events (negative outcome) in each bin for optimization.\nIf optimize is set to True, the function will iterate over the number of bins from 20 to 1 and calculate the WoE and IV for each bin. If the percentage of non-events in each bin is greater than or equal to the specified thresold, it will return the WoE and IV for that bin. If it cannot find a binning that meets the thresold, it will return the WoE and IV for the best bin it could find.\n\nIf optimize is set to False, the function will calculate the WoE and IV for the specified number of bins.\n\nThe function returns a dataframe containing the WoE and IV for each predictor variable.\n\n\n\n\n",
"bugtrack_url": null,
"license": "",
"summary": "Contains multiple functions, stats() and iv_woe(), stats() function takes the dataframe and returns statistics i.e count, percentiles, unique_values etc and iv_woe() function is used to calculate the Weight of Evidence (woe) and Information Value (iv) for a dataframe",
"version": "0.5",
"split_keywords": [
"python",
"describe",
"stats",
"unique values",
"information value",
"woe",
"iv"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8b4954c36d3dfc8e5328297cd88fd4ef10eeb4331d9f278e438e104a78845436",
"md5": "d789c791cd30ddea4e3d23257cfb61af",
"sha256": "eba39791a89832ab91187187470b01150f42a701f78ecc13697509e21ffe4adb"
},
"downloads": -1,
"filename": "datanerd-0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d789c791cd30ddea4e3d23257cfb61af",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 4685,
"upload_time": "2023-01-08T03:52:28",
"upload_time_iso_8601": "2023-01-08T03:52:28.817839Z",
"url": "https://files.pythonhosted.org/packages/8b/49/54c36d3dfc8e5328297cd88fd4ef10eeb4331d9f278e438e104a78845436/datanerd-0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-08 03:52:28",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "datanerd"
}