datanerd


- Name: datanerd
- Version: 1.2
- Summary: Contains multiple functions: stats(), iv_woe(), pushdb(), teams_webhook(), and ntfy()
- Upload time: 2024-12-18 04:07:48
- Author: Sunil Aleti
- Keywords: python, describe, stats, unique values, information value, woe, iv
- Requirements: No requirements were recorded.
            
# DataNerd

This package provides various functions for data analysis, statistical calculations, database operations, and sending notifications.

## Installation

These functions require Python plus the pandas, NumPy, SQLAlchemy, and requests libraries. Install the package and the required libraries with pip:

```
pip install datanerd
pip install pandas numpy sqlalchemy requests
```

## Functions

### 1. stats()

This function returns a statistical summary of a given dataframe.

#### Parameters:
- `df` (pandas.DataFrame): The input dataframe

#### Returns:
- A dataframe containing various statistics for each column

#### Statistics provided:
- count
- mean
- std
- min
- 10th, 20th, 25th, 30th, 40th, 50th (median), 60th, 70th, 75th, 80th, 90th, 95th, 99th percentiles
- max
- % of missing values
- number of unique values

#### Usage:

```python
import pandas as pd
import datanerd as dn

df = pd.read_csv('titanic.csv')
summary_stats = dn.stats(df)
```
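
To make the listed statistics concrete, here is a minimal sketch of how a comparable summary could be built with plain pandas. It is not datanerd's implementation: the function name `summarize`, the column names `missing_pct` and `unique`, and the restriction to numeric columns are all assumptions made for illustration.

```python
import pandas as pd

# Percentiles named in the list above, expressed as fractions.
PERCENTILES = [0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60,
               0.70, 0.75, 0.80, 0.90, 0.95, 0.99]


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for a stats()-style summary (numeric columns only)."""
    numeric = df.select_dtypes(include="number")
    # describe() yields count, mean, std, min, the requested percentiles, and max;
    # transposing gives one row per input column.
    summary = numeric.describe(percentiles=PERCENTILES).T
    # Share of missing values per column, as a percentage.
    summary["missing_pct"] = numeric.isna().mean() * 100
    # Number of distinct non-missing values per column.
    summary["unique"] = numeric.nunique()
    return summary
```

The sketch skips non-numeric columns because mean and percentiles are undefined for them; how `dn.stats()` treats such columns is not stated here.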

### 2. iv_woe()

This function calculates the Weight of Evidence (WoE) and Information Value (IV) for a given dataframe.

#### Parameters:
- `data` (pandas.DataFrame): The input dataframe
- `target` (str): The name of the target variable
- `bins` (int): The number of bins to use for discretizing continuous variables
- `optimize` (bool): Whether to optimize the binning of continuous variables
- `threshold` (float): The minimum percentage of non-events required in each bin when `optimize` is enabled

#### Returns:
- A tuple containing two dataframes: (iv, woe)

#### Usage:

```python
import pandas as pd
import datanerd as dn

df = pd.read_csv('cancer.csv')
iv, woe = dn.iv_woe(data=df, target='Diagnosis', bins=20, optimize=True, threshold=0.05)
```
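
For readers unfamiliar with the two measures, the sketch below computes WoE and IV for a single numeric feature. It assumes a binary target where 1 marks an event, quantile binning, and the convention WoE = ln(% of non-events / % of events); the function name `woe_iv_single` is hypothetical, and datanerd's own binning and optimization logic may differ.

```python
import numpy as np
import pandas as pd


def woe_iv_single(feature: pd.Series, target: pd.Series, bins: int = 10):
    """Illustrative WoE/IV for one numeric feature (not datanerd's code)."""
    # Quantile bins; duplicate edges are dropped for low-cardinality features.
    binned = pd.qcut(feature, q=bins, duplicates="drop")
    grouped = pd.DataFrame({"bin": binned, "event": target}).groupby(
        "bin", observed=True
    )["event"]
    table = grouped.agg(events="sum", total="count")
    table["non_events"] = table["total"] - table["events"]
    # Each bin's share of all events and of all non-events.
    pct_events = table["events"] / table["events"].sum()
    pct_non_events = table["non_events"] / table["non_events"].sum()
    # A bin with zero events or zero non-events produces an infinite WoE.
    table["woe"] = np.log(pct_non_events / pct_events)
    table["iv"] = (pct_non_events - pct_events) * table["woe"]
    return table, table["iv"].sum()
```

Bins with no events or no non-events give infinite values in this sketch, which is one reason a minimum-share rule such as the `threshold` parameter is useful.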

### 3. pushdb()

This function writes a pandas dataframe to a table in a Microsoft SQL Server database.

#### Parameters:
- `data` (pandas.DataFrame): The dataframe to be pushed
- `tablename` (str): The name of the table in the database
- `server` (str): The name of the SQL Server instance to connect to
- `database` (str): The name of the database
- `schema` (str): The name of the schema

#### Usage:

```python
import pandas as pd
import datanerd as dn

df = pd.read_csv('day.csv')
dn.pushdb(df, tablename='day', server='SQL', database='schedule', schema='analysis')
```
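
A write like this can also be done directly with `DataFrame.to_sql` and a SQLAlchemy engine. The sketch below is an assumption about the general approach, not a copy of `pushdb()`: the helper names `mssql_engine` and `push_frame`, the ODBC driver name, Windows-authenticated login, and the `if_exists="append"` default are all illustrative choices.

```python
import pandas as pd


def mssql_engine(server: str, database: str):
    """Build a SQL Server engine (assumes pyodbc and ODBC Driver 17 are installed)."""
    # Imported here so push_frame() can be used with any other connection type.
    from sqlalchemy import create_engine
    from sqlalchemy.engine import URL

    url = URL.create(
        "mssql+pyodbc",
        host=server,
        database=database,
        query={
            "driver": "ODBC Driver 17 for SQL Server",  # assumed driver name
            "trusted_connection": "yes",  # assumed Windows authentication
        },
    )
    return create_engine(url)


def push_frame(df: pd.DataFrame, tablename: str, con, schema=None,
               if_exists: str = "append") -> None:
    """Write df to tablename using any connection that pandas accepts."""
    df.to_sql(name=tablename, con=con, schema=schema,
              if_exists=if_exists, index=False)
```

With these helpers, the call above would correspond roughly to `push_frame(df, "day", mssql_engine("SQL", "schedule"), schema="analysis")`.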

### 4. teams_webhook()

This function sends a formatted message to a Microsoft Teams channel using a webhook URL.

#### Parameters:
- `webhook_url` (str): The webhook URL for the Teams channel
- `title` (str): The title of the message
- `message` (str): The body of the message

#### Usage:

```python
import datanerd as dn

webhook_url = "https://outlook.office.com/webhook/..."
title = "Important Notification"
message = "This is a test message sent from Python!"

dn.teams_webhook(webhook_url, title, message)
```
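
Under the hood, a Teams incoming webhook is an HTTP POST with a JSON body. The sketch below shows one way to send a title and message using the legacy MessageCard payload; the payload shape, the helper names `build_card` and `post_card`, and the use of the standard library instead of requests are assumptions, and `teams_webhook()` may format its message differently.

```python
import json
import urllib.request


def build_card(title: str, message: str) -> dict:
    """Return a minimal MessageCard payload (assumed format)."""
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "title": title,
        "text": message,
    }


def post_card(webhook_url: str, title: str, message: str,
              timeout: float = 10.0) -> int:
    """POST the card to the webhook and return the HTTP status code."""
    body = json.dumps(build_card(title, message)).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping the payload construction separate from the network call makes the message format easy to inspect without contacting Teams.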

### 5. ntfy()

This function sends a notification message to an ntfy.sh server.

#### Parameters:
- `server` (str): The name of the ntfy.sh server/topic to send the message to
- `message` (str): The message to be sent


#### Usage:

```python
import datanerd as dn

server = "your_server_name"
message = "This is a test notification from Python!"

dn.ntfy(server, message)
```
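
Publishing to ntfy amounts to an HTTP POST of the message text to a topic URL. The following sketch illustrates that request with the standard library; the helper names and the `base_url` default of `https://ntfy.sh` are assumptions, and whether `dn.ntfy()` accepts a self-hosted server address is not stated here.

```python
import urllib.parse
import urllib.request


def build_ntfy_request(topic: str, message: str,
                       base_url: str = "https://ntfy.sh") -> urllib.request.Request:
    """Prepare a POST whose body is the plain-text message."""
    # Quote the topic so it is always a single, safe path segment.
    url = f"{base_url.rstrip('/')}/{urllib.parse.quote(topic, safe='')}"
    return urllib.request.Request(url, data=message.encode("utf-8"), method="POST")


def send_ntfy(topic: str, message: str, timeout: float = 10.0) -> int:
    """Send the notification and return the HTTP status code."""
    request = build_ntfy_request(topic, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Anyone who knows a topic name on a public server can subscribe to it, so a hard-to-guess topic is advisable for anything sensitive.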






            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "datanerd",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, describe, stats, unique values, information value, woe, iv",
    "author": "Sunil Aleti",
    "author_email": "iam@sunilaleti.dev",
    "download_url": "https://files.pythonhosted.org/packages/bc/75/9d128d0b8c0728004d995defe9fc8492135215e57c064a9feb0ec7d5e9c3/datanerd-1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Contains multiple functions stats(), iv_woe(), pushdb(), teams_webhook(), and ntfy()",
    "version": "1.2",
    "project_urls": null,
    "split_keywords": [
        "python",
        " describe",
        " stats",
        " unique values",
        " information value",
        " woe",
        " iv"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5184ec0c366685b453261f5b3b3520ea69bca959e9981022f6c28d94aed4e1f4",
                "md5": "039e620410710c310011094a8da91cbf",
                "sha256": "62eb0ec18d16823fca7c83e37e7b5291a3d2691e4b085a8fdbddfa0f5ee46c60"
            },
            "downloads": -1,
            "filename": "datanerd-1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "039e620410710c310011094a8da91cbf",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 6257,
            "upload_time": "2024-12-18T04:07:45",
            "upload_time_iso_8601": "2024-12-18T04:07:45.855860Z",
            "url": "https://files.pythonhosted.org/packages/51/84/ec0c366685b453261f5b3b3520ea69bca959e9981022f6c28d94aed4e1f4/datanerd-1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bc759d128d0b8c0728004d995defe9fc8492135215e57c064a9feb0ec7d5e9c3",
                "md5": "3937498b42965ff36a6abad9a0335605",
                "sha256": "55e8543044255d7756cfe716d8b99b7402b9d5eada93607ee155a7248ce768ce"
            },
            "downloads": -1,
            "filename": "datanerd-1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "3937498b42965ff36a6abad9a0335605",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 5683,
            "upload_time": "2024-12-18T04:07:48",
            "upload_time_iso_8601": "2024-12-18T04:07:48.074851Z",
            "url": "https://files.pythonhosted.org/packages/bc/75/9d128d0b8c0728004d995defe9fc8492135215e57c064a9feb0ec7d5e9c3/datanerd-1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-18 04:07:48",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "datanerd"
}
        