# Automating Data Preprocessing
**ADP** (Automating Data Preprocessing) is now a Python library. Install it with the following command:
```pip install autodatap```
This installs the package on your system.
## Purpose of autodatap:
- To help you with data preprocessing.
To use it:
- Import the package:
```import autodatap as adp```
### The main function in the autodatap package is `mainMethod`:
```adp.mainMethod("link to data set")```
That's it, you are good to go.
From this point on, everything you do happens in the console where the script runs.
### Currently supported functions
- Categorical Values (One-Hot-Encoding)
- Normalization
- Check for Imbalanced Data
- Finding null values and filling them with 0 (filling with the mean is planned)
- Dropping duplicates
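To make the listed steps concrete, here is a minimal, dependency-free sketch of what each one typically means. It is not autodatap's implementation: the sample rows, the helper names (`fill_nulls`, `drop_duplicates`, `min_max_normalize`, `class_ratio`) and the choice of min-max scaling for normalization are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample data: each row is (age, label); None marks a missing value.
rows = [(20, "yes"), (None, "no"), (40, "yes"), (20, "yes"), (30, "yes")]


def fill_nulls(rows, fill_value=0):
    """Replace every None with fill_value (0 here, as in the list above)."""
    return [tuple(fill_value if v is None else v for v in row) for row in rows]


def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def min_max_normalize(values):
    """Scale numbers to the range [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def class_ratio(labels):
    """Return the share of each class, a simple check for imbalanced data."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


cleaned = drop_duplicates(fill_nulls(rows))
ages = min_max_normalize([age for age, _ in cleaned])
ratios = class_ratio([label for _, label in cleaned])

print(cleaned)
print(ages)
print(ratios)
```

A strongly skewed ratio between classes is the kind of result an imbalance check is meant to surface before a model is trained.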
## Categorical Values (One-Hot-Encoding):
Categorical values are values that fall into one of a fixed set of categories within the same class, as in the example below.
Say we have a gender column. It is a categorical variable because it takes two or more distinct values (male, female, etc.).
example1

| gender |
|--------|
| male   |
| female |
| male   |
| female |

Most machine learning algorithms accept only numerical values, not strings, so we have to convert the strings to numbers.
There are two (or more) options to achieve that. Either we assign custom values with the replace function
```data.replace("male",1,inplace=True)```
or we can use a built-in technique such as **label encoding** or **One-Hot-Encoding**.
This library uses **One-Hot-Encoding**.
So, the example above becomes:

| gender_male | gender_female |
|-------------|---------------|
| 1           | 0             |
| 0           | 1             |
| 1           | 0             |
| 0           | 1             |

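The transformation shown in the tables can be sketched in a few lines of plain Python. This illustrates one-hot encoding in general, not autodatap's own code; the helper name `one_hot_encode` and the `column_value` naming of the output columns are assumptions chosen to match the tables. In pandas, `pandas.get_dummies` performs the same kind of conversion.

```python
def one_hot_encode(values, column_name):
    """Return one 0/1 indicator column per distinct value.

    The result maps "<column_name>_<value>" to a list of 0/1 flags,
    one flag per input row.
    """
    categories = sorted(set(values))  # sorted for a stable column order
    return {
        f"{column_name}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


gender = ["male", "female", "male", "female"]
encoded = one_hot_encode(gender, "gender")

for name, flags in encoded.items():
    print(name, flags)
```

Unlike replacing `male` with 1 and another value with 2, the indicator columns do not imply any order between the categories.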
## How to use:
To use this function, you have to enter the exact column name at the corresponding preprocessing step. For example, given the columns
```[u'name', u'age', u'class', u'code']```
the column name should be entered as **u'name'**.
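Since the name has to match exactly, a small check like the following can catch typos before preprocessing starts. This is only a sketch: `find_column` is a hypothetical helper, not part of autodatap, and it assumes the column names are available as a list of strings. In Python 3 the literal `u'name'` is the same string as `'name'`, so the sketch compares plain strings.

```python
def find_column(columns, requested):
    """Return the requested name if it matches a column exactly, else raise."""
    if requested in columns:
        return requested
    # Suggest near misses that differ only by case or surrounding spaces.
    hints = [c for c in columns if c.lower() == requested.strip().lower()]
    hint_text = f" Did you mean {hints[0]!r}?" if hints else ""
    raise ValueError(f"No column named {requested!r}.{hint_text}")


columns = [u'name', u'age', u'class', u'code']

print(find_column(columns, "name"))

try:
    find_column(columns, "Name ")
except ValueError as error:
    print(error)
```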
## Licence
MIT License
Copyright (c) 2023 Syed Syab Ahmad
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
## Contribution
To contribute to the package, visit the following link:
https://github.com/SyabAhmad/Automating-Data-Preprocessing
Raw data
{
"_id": null,
"home_page": "",
"name": "autodatap",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "python,machine learning,data science,data,preprocessing,AI",
"author": "SyabAhmad",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/b2/95/ab8329fde94a86de2dc25a2cbafc3817fe3278a82ae55d73d72a4528159f/autodatap-1.5.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Automating Data Preprocessing",
"version": "1.5.2",
"project_urls": null,
"split_keywords": [
"python",
"machine learning",
"data science",
"data",
"preprocessing",
"ai"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0eeb902fb4ad824e930d7ae7eb158e3c327baa2c50a673ac336c8a6145ab2d7e",
"md5": "2bd3eb5fd2cca502f40d578c5b832a5d",
"sha256": "21a62b549432fb47e8d9cf37cb4ac8c4cb57b929cd385326cefa25288bd3f72d"
},
"downloads": -1,
"filename": "autodatap-1.5.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "2bd3eb5fd2cca502f40d578c5b832a5d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 8297,
"upload_time": "2023-09-24T17:42:55",
"upload_time_iso_8601": "2023-09-24T17:42:55.807211Z",
"url": "https://files.pythonhosted.org/packages/0e/eb/902fb4ad824e930d7ae7eb158e3c327baa2c50a673ac336c8a6145ab2d7e/autodatap-1.5.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b295ab8329fde94a86de2dc25a2cbafc3817fe3278a82ae55d73d72a4528159f",
"md5": "a6cfc0066cd754377e2189963bf18f80",
"sha256": "b35ea56a5e0bd01c2cca9734a081448f54d6b8093cf49d8b3635aa6c5d6556d7"
},
"downloads": -1,
"filename": "autodatap-1.5.2.tar.gz",
"has_sig": false,
"md5_digest": "a6cfc0066cd754377e2189963bf18f80",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5204,
"upload_time": "2023-09-24T17:42:57",
"upload_time_iso_8601": "2023-09-24T17:42:57.854260Z",
"url": "https://files.pythonhosted.org/packages/b2/95/ab8329fde94a86de2dc25a2cbafc3817fe3278a82ae55d73d72a4528159f/autodatap-1.5.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-24 17:42:57",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "autodatap"
}