datanerd


Name: datanerd
Version: 0.5
Summary: Contains two functions, stats() and iv_woe(). stats() takes a dataframe and returns summary statistics (count, percentiles, number of unique values, etc.); iv_woe() calculates the Weight of Evidence (WoE) and Information Value (IV) for a dataframe.
upload_time: 2023-01-08 03:52:28
docs_url: None
author: Sunil Aleti
keywords: python, describe, stats, unique values, information value, woe, iv
requirements: No requirements were recorded.
            
This package provides a statistical summary of a given dataframe, along with Weight of Evidence (WoE) and Information Value (IV) calculations.

The <b>stats()</b> function takes in a dataframe and returns the following statistics:
<ul>
<li>count</li>
<li>mean</li>
<li>std</li>
<li>min</li>
<li>10th percentile</li>
<li>20th percentile</li>
<li>25th percentile</li>
<li>30th percentile</li>
<li>40th percentile</li>
<li>50th percentile (median)</li>
<li>60th percentile</li>
<li>70th percentile</li>
<li>75th percentile</li>
<li>80th percentile</li>
<li>90th percentile</li>
<li>95th percentile</li>
<li>99th percentile</li>
<li>max</li>
<li>% of missing values</li>
<li>% of non-zero values</li>
<li>number of unique values</li>
</ul>
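
For orientation, the statistics listed above can be approximated with plain pandas. The following is a minimal sketch, not datanerd's implementation: the function name summarize(), the output column names pct_missing, pct_non_zero and n_unique, and the choice to measure non-zero values against all rows (missing values not counted as non-zero) are assumptions made for illustration.

```python
import pandas as pd

# Percentiles named in the list above, expressed as fractions.
PERCENTILES = [0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60,
               0.70, 0.75, 0.80, 0.90, 0.95, 0.99]


def summarize(df):
    """Hypothetical stand-in for a stats()-style summary of numeric columns."""
    numeric = df.select_dtypes(include="number")
    # describe() returns count, mean, std, min, the requested percentiles, max.
    summary = numeric.describe(percentiles=PERCENTILES).T
    # Percentage of rows that are missing in each column.
    summary["pct_missing"] = numeric.isna().mean() * 100
    # Percentage of rows that are non-zero (assumption: NaN is not non-zero).
    summary["pct_non_zero"] = (numeric.fillna(0) != 0).mean() * 100
    # Number of distinct non-missing values.
    summary["n_unique"] = numeric.nunique()
    return summary


if __name__ == "__main__":
    demo = pd.DataFrame({"a": [0, 1, 2, None], "b": [1.0, 1.0, 1.0, 1.0]})
    print(summarize(demo))
```

Each row of the result corresponds to one numeric column of the input, so the output can be compared with what stats() reports for the same dataframe.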

The <b>iv_woe()</b> function is used to calculate the Weight of Evidence (WoE) and Information Value (IV) for a given dataframe. The WoE is a measure of how much the presence or absence of a predictor (feature) contributes to the probability of a response (target). The IV is a measure of the strength of the relationship between the predictor and the response.
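
To make the two quantities concrete, here is a minimal sketch of a common textbook definition for a feature that has already been binned, with a binary target where 1 marks an event and 0 a non-event. It is not taken from datanerd's source: the function name woe_iv() and the sign convention ln(share of non-events / share of events) are assumptions, and some references use the inverse ratio, which flips the sign of WoE but leaves IV unchanged.

```python
import numpy as np
import pandas as pd


def woe_iv(binned, target):
    """Per-bin WoE and total IV for a binary target (1 = event, 0 = non-event).

    Illustrative only. Bins with zero events or zero non-events produce
    infinite values here; real implementations usually smooth or merge them.
    """
    counts = pd.crosstab(binned, target)
    events = counts[1]
    non_events = counts[0]
    # Share of all events / non-events that fall into each bin.
    pct_events = events / events.sum()
    pct_non_events = non_events / non_events.sum()
    woe = np.log(pct_non_events / pct_events)
    iv = (pct_non_events - pct_events) * woe
    table = pd.DataFrame(
        {"events": events, "non_events": non_events, "woe": woe, "iv": iv}
    )
    return table, float(iv.sum())


if __name__ == "__main__":
    bins = pd.Series(["lo", "lo", "lo", "lo", "hi", "hi", "hi", "hi"])
    y = pd.Series([0, 0, 0, 1, 0, 1, 1, 1])
    table, total_iv = woe_iv(bins, y)
    print(table)
    print("IV:", total_iv)
```

In the small demo, the "lo" bin holds three of the four non-events and one of the four events, so its WoE is ln(3), the "hi" bin mirrors it with -ln(3), and the total IV is ln(3).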

The <b>iv_woe()</b> function takes in the following arguments:

<ul>
<li><b>data:</b> a dataframe containing the predictor variables and the target variable</li>
<li><b>target:</b> the name of the target variable</li>
<li><b>bins:</b> the number of bins to use for discretizing continuous variables</li>
<li><b>optimize:</b> a boolean indicating whether to optimize the binning of the continuous variables</li>
<li><b>thresold:</b> the minimum percentage of non-events (negative outcomes) required in each bin when optimizing</li>
</ul>

If optimize is set to True, the function iterates over bin counts from 20 down to 1 and calculates the WoE and IV for each binning. If, for a given bin count, the percentage of non-events in every bin is greater than or equal to the specified threshold, it returns the WoE and IV for that binning. If no binning meets the threshold, it returns the WoE and IV for the best binning it could find.
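
The search described above can be sketched as a simple loop. This is an illustration under stated assumptions, not datanerd's code: the helper name choose_bins(), the use of quantile binning via pd.qcut, the reading of "percentage of non-events in each bin" as each bin's share of all non-events, and "best" meaning the binning with the highest minimum share are all hypothetical choices.

```python
import pandas as pd


def choose_bins(values, target, threshold, max_bins=20):
    """Pick the largest bin count whose every bin meets the threshold.

    `threshold` is a percentage. Returns (n_bins, binned_series).
    Hypothetical helper for illustration only.
    """
    non_event = target == 0
    total_non_events = non_event.sum()
    best = None  # (min_pct, n_bins, binned) of the best binning seen so far
    for n_bins in range(max_bins, 0, -1):
        # Assumption: equal-frequency (quantile) bins.
        binned = pd.qcut(values, q=n_bins, duplicates="drop")
        # Share of all non-events that falls into each bin, in percent.
        pct = non_event.groupby(binned, observed=True).sum() / total_non_events * 100
        min_pct = pct.min()
        if min_pct >= threshold:
            return n_bins, binned
        if best is None or min_pct > best[0]:
            best = (min_pct, n_bins, binned)
    # No binning met the threshold: fall back to the best one found.
    return best[1], best[2]


if __name__ == "__main__":
    x = pd.Series(range(100))
    y = pd.Series([0, 1] * 50)
    n_bins, binned = choose_bins(x, y, threshold=9.5)
    print(n_bins, binned.nunique())
```

Starting from the largest bin count keeps the binning as fine-grained as the threshold allows, which is one plausible reason for iterating from 20 downward.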

If optimize is set to False, the function will calculate the WoE and IV for the specified number of bins.

The function returns a dataframe containing the WoE and IV for each predictor variable.





            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "datanerd",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "python,describe,stats,unique values,information value,woe,iv",
    "author": "Sunil Aleti",
    "author_email": "iam@sunilaleti.dev",
    "download_url": "",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Contains multiple functions, stats() and iv_woe(), stats() function takes the dataframe and returns statistics i.e count, percentiles, unique_values etc and iv_woe() function is used to calculate the Weight of Evidence (woe) and Information Value (iv) for a dataframe",
    "version": "0.5",
    "split_keywords": [
        "python",
        "describe",
        "stats",
        "unique values",
        "information value",
        "woe",
        "iv"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8b4954c36d3dfc8e5328297cd88fd4ef10eeb4331d9f278e438e104a78845436",
                "md5": "d789c791cd30ddea4e3d23257cfb61af",
                "sha256": "eba39791a89832ab91187187470b01150f42a701f78ecc13697509e21ffe4adb"
            },
            "downloads": -1,
            "filename": "datanerd-0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d789c791cd30ddea4e3d23257cfb61af",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 4685,
            "upload_time": "2023-01-08T03:52:28",
            "upload_time_iso_8601": "2023-01-08T03:52:28.817839Z",
            "url": "https://files.pythonhosted.org/packages/8b/49/54c36d3dfc8e5328297cd88fd4ef10eeb4331d9f278e438e104a78845436/datanerd-0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-08 03:52:28",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "datanerd"
}
        