# DFAnalyzer
DFAnalyzer Python is a Python package for data analysis, built on top of the popular DFAnalyzer for Excel. It provides a powerful set of tools for importing, exploring, cleaning, transforming, and visualizing data. It also offers features such as filtering, sorting, grouping, and performing calculations on data. DFAnalyzer Python is designed to enable users to quickly and easily analyze large amounts of data and **extract meaningful insights.**
* Find details and insights about each column.
* Easy to run analysis cycles over PySpark.
* Percentage stats for NaN, blank, and null values.
* Describes the data types of a PySpark DataFrame.
* Helps with a proof of concept (POC) on data.
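
To make the null, NaN, and blank statistics listed above concrete, here is a minimal plain-Python sketch of what such per-column percentages could mean. It is an illustration only, not DFAnalyzer's implementation: the function name `column_stats`, the result keys, and the sample rows are all hypothetical, and it works on a list of dicts rather than a PySpark DataFrame.

```python
import math


def column_stats(rows, column):
    """Return null/NaN/blank percentages and observed type names for one column.

    Hypothetical helper for illustration; not part of DFAnalyzer.
    """
    total = len(rows)
    values = [row.get(column) for row in rows]

    # Null: the value is missing entirely.
    nulls = sum(v is None for v in values)
    # NaN: a float that is "not a number" (distinct from null).
    nans = sum(isinstance(v, float) and math.isnan(v) for v in values)
    # Blank: a string that is empty or only whitespace.
    blanks = sum(isinstance(v, str) and v.strip() == "" for v in values)

    def pct(count):
        return 100.0 * count / total if total else 0.0

    return {
        "null_pct": pct(nulls),
        "nan_pct": pct(nans),
        "blank_pct": pct(blanks),
        "types": sorted({type(v).__name__ for v in values if v is not None}),
    }


# Made-up sample data: one null, one NaN, and one blank value.
rows = [
    {"name": "a", "score": 1.0},
    {"name": "", "score": float("nan")},
    {"name": None, "score": None},
    {"name": "d", "score": 4.0},
]

name_stats = column_stats(rows, "name")
score_stats = column_stats(rows, "score")
print(name_stats)
print(score_stats)
```

The three counts are kept separate because a null, a NaN, and a blank string are different conditions, which is why the feature list names them individually.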
## Who Should Use DFAnalyzer
* Developers working with big data.
* Developers using PySpark for data exploration.
* Developers who need to run a POC on raw data.
## Usage
### PySpark
DFAnalyzer is installed with pip from a terminal window. Once the installation is complete, you can import it and start using it from Python.
1. Install the package:
```sh
pip install dfanalyzer
```
2. Import it:
```python
import DFAnalyzer as dfa
```
3. Use it on an existing PySpark DataFrame:
```python
# Flag order:
# [isHavingNullData, %NullData, isHavingNanValues, %NanValues, isHavingBlankValues, %BlankValues, DataType]
options = [1, 1, 1, 1, 1, 1, 1]  # flags for the kinds of analysis you need
dfa.analyze(df, options)  # df is an existing PySpark DataFrame
```
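
Because the options list is positional, it is easy to set the wrong flag. The sketch below builds the list from flag names instead. The names and their order are copied from the comment in the usage example; the helper `build_options` is hypothetical and not part of DFAnalyzer, and it assumes that `0` switches a check off, which the original example does not state.

```python
# Flag order copied from the comment in the usage example.
FLAG_ORDER = [
    "isHavingNullData",
    "%NullData",
    "isHavingNanValues",
    "%NanValues",
    "isHavingBlankValues",
    "%BlankValues",
    "DataType",
]


def build_options(enabled):
    """Return a 0/1 list in FLAG_ORDER, with 1 for each name in `enabled`.

    Hypothetical helper; assumes 0 disables a check.
    """
    unknown = set(enabled) - set(FLAG_ORDER)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    return [1 if name in enabled else 0 for name in FLAG_ORDER]


options = build_options({"%NullData", "%NanValues", "DataType"})
print(options)
```

The resulting list can then be passed as the second argument in place of a hand-written `options` list.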
> More is coming soon. Stay tuned.
Raw data
{
"_id": null,
"home_page": "",
"name": "dfanalyzer",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "python,pyspark,spark,ingestion,data,dataframe,analysis,schema,pandas",
"author": "Neetish Singh(AAYS Anaytics)",
"author_email": "<neetishsingh97@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e1/96/a7fd59b6041305ea52f36b39d6083d478381412ad43ff940af9eef85464f/dfanalyzer-0.0.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Pyspark Dataframe Analyzer - Smartest DataFrame Analysis",
"version": "0.0.4",
"split_keywords": [
"python",
"pyspark",
"spark",
"ingestion",
"data",
"dataframe",
"analysis",
"schema",
"pandas"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7e5438a3ef27336e56c79a610eae092c104af388993e57c867acbcc09bfcd93f",
"md5": "4d840a175a8020be15dd3d961899d0d0",
"sha256": "0903a2aa1bcb2a010b6c54e9890fff7d5f8973b2bc960b93280b0c391560d316"
},
"downloads": -1,
"filename": "dfanalyzer-0.0.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4d840a175a8020be15dd3d961899d0d0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 3508,
"upload_time": "2023-01-09T10:35:31",
"upload_time_iso_8601": "2023-01-09T10:35:31.580809Z",
"url": "https://files.pythonhosted.org/packages/7e/54/38a3ef27336e56c79a610eae092c104af388993e57c867acbcc09bfcd93f/dfanalyzer-0.0.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e196a7fd59b6041305ea52f36b39d6083d478381412ad43ff940af9eef85464f",
"md5": "b6c369d735022f8392e9a28faaaa6cab",
"sha256": "9227c71619e1f6ffb579c080c5cc079cdeea7b29ceaff0d146437a750984c8f6"
},
"downloads": -1,
"filename": "dfanalyzer-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "b6c369d735022f8392e9a28faaaa6cab",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3052,
"upload_time": "2023-01-09T10:35:33",
"upload_time_iso_8601": "2023-01-09T10:35:33.420180Z",
"url": "https://files.pythonhosted.org/packages/e1/96/a7fd59b6041305ea52f36b39d6083d478381412ad43ff940af9eef85464f/dfanalyzer-0.0.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-09 10:35:33",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "dfanalyzer"
}