# exploratory
Exploratory Data Analysis
## Description
This project, exploratory, was created to perform Exploratory Data Analysis on any structured dataset. The dataset can have categorical or numerical data types.
This project takes a pandas DataFrame and produces summary statistics and individual plots: category counts for categorical variables, and PDFs and CDFs with mean, median, and mode for numerical variables. Both results are stored in a PDF and a CSV file in your current directory/path.
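
To make the statistics mentioned above concrete, here is a minimal sketch of how an empirical CDF and the mean, median, and mode of one numerical column could be computed with pandas and NumPy. It is not taken from the exploratory source code; the helper name `distribution_stats` is hypothetical, and the sketch assumes the column has at least one non-null value.

```python
import numpy as np
import pandas as pd


def distribution_stats(values: pd.Series) -> dict:
    """Empirical CDF plus mean, median and mode for one numeric column.

    Hypothetical helper for illustration; not part of the exploratory package.
    """
    clean = values.dropna()
    x = np.sort(clean.to_numpy())
    # cdf[i] is the fraction of observations up to and including position i
    # in sorted order, so the last entry is always 1.0
    cdf = np.arange(1, len(x) + 1) / len(x)
    return {
        "x": x,
        "cdf": cdf,
        "mean": clean.mean(),
        "median": clean.median(),
        "mode": clean.mode().iloc[0],  # smallest value when several modes tie
    }
```

Plotting `x` against `cdf` gives the CDF curve, and the three scalar values can be drawn as vertical reference lines on it.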
## Installation:
Use the package manager [pip](https://pypi.org/project/exploratory/) to install exploratory:
```bash
pip install exploratory
```
## Usage:
```python
import pandas as pd
from exploratory import EDA

df = pd.read_csv("data.csv")  # placeholder path; any pandas DataFrame works
EDA(df)
# Please input the DPI value. As the DPI value increases, runtime increases. Default DPI value: 150
```
## Example Run:
![Exploratory Run](https://user-images.githubusercontent.com/114361354/207988464-669b7eec-5119-4ef3-88fe-bd094917ef16.gif)
## Expected Outputs:
* CSV file, DataFrame containing:

| Column | Description |
|-----------------|---------------------------------------------------------------------|
| Variable | Variable Name in the dataset provided |
| Cardinality | Number of levels/classes in each variable |
| total_count | Count of total records (non-null) |
| unique_rate | Cardinality / total_count; a unique rate of 1 indicates an ID variable |
| percent_missing | Percentage of missing values across each column |
| mean | Average of column (Ignores Object/String variables) |
| std | Standard deviation of column (Ignores Object/String variables) |
| min | Minimum of column (Ignores Object/String variables) |
| 25% | 25th percentile value of column (Ignores Object/String variables) |
| median | 50th percentile value of column (Ignores Object/String variables) |
| 75% | 75th percentile value of column (Ignores Object/String variables) |
| max | Maximum of column (Ignores Object/String variables) |
| data_types | Data type of column (Int / Float / Object, etc.) |
| range | Max Value - Min Value (Ignores Object/String variables) |
* PDF with statistical summary and variable distribution graphs (categorical & continuous)
![Exported PDF](https://user-images.githubusercontent.com/114361354/207987618-ea144695-f18e-4f38-9bee-d7e313152bf6.gif)
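
The summary columns in the table above can be approximated with plain pandas. The following is a minimal sketch under stated assumptions, not the package's actual implementation: the function name `summarize` is hypothetical, and it assumes the DataFrame has at least one numeric column (otherwise `describe()` raises an error).

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Build a per-column summary similar to the table above.

    Hypothetical helper for illustration; not part of the exploratory package.
    """
    total_count = df.count()    # non-null records per column
    cardinality = df.nunique()  # distinct non-null values per column
    summary = pd.DataFrame({
        "Variable": df.columns,
        "Cardinality": cardinality.values,
        "total_count": total_count.values,
        "unique_rate": (cardinality / total_count).values,
        "percent_missing": (df.isna().mean() * 100).values,
        "data_types": df.dtypes.astype(str).values,
    })
    # describe() on numeric columns yields count, mean, std, min, 25%, 50%, 75%, max
    numeric = df.select_dtypes(include="number").describe().T
    numeric = numeric.drop(columns="count").rename(columns={"50%": "median"})
    numeric["range"] = numeric["max"] - numeric["min"]
    # The left join keeps object/string columns, with NaN in the numeric statistics
    return summary.merge(numeric, how="left", left_on="Variable", right_index=True)
```

Calling `summarize(df).to_csv("summary.csv", index=False)` would write a comparable CSV file; the file name here is only an example.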
## Contributing
Pull requests are welcome. Please submit them at https://github.com/Abhilash-MS/exploratory.
Feel free to contact the authors with any suggestions or issues: Ram <kakarlaramcharan@gmail.com>, Abhilash <abhilashmaspalli1996@gmail.com>