# veda_lib
**A Python library designed to streamline the transition from raw data to machine learning models.**
veda_lib automates and simplifies data preprocessing, cleaning, and balancing, addressing the time-consuming and complex aspects of these tasks to provide clean, ready-to-use data for your models.
********************************************************
## Installation
First, install `veda_lib` using pip:
```bash
pip install veda_lib
```
**************************************
## How to use?
After installing `veda_lib`, import it into your project and start utilizing its modules to prepare your data. Below is a summary of the key functionalities provided by each module:
**1. Preprocessor Module**
- Functions:
  - Removing null values
  - Handling duplicates
  - Imputing missing values with appropriate methods
- Usage: Ideal for initial data cleaning and preprocessing steps.
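
To make these cleaning steps concrete, here is a minimal standard-library sketch of duplicate removal followed by median imputation. It is not `veda_lib`'s implementation: the function name `clean_rows`, the use of `None` for missing values, and the choice of the median are all assumptions made for illustration.

```python
from statistics import median


def clean_rows(rows):
    """Drop exact duplicate rows, then fill missing (None) values with the column median."""
    # Remove duplicates while preserving the original row order.
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))

    # Impute each column from the values that are actually present.
    n_cols = len(unique[0]) if unique else 0
    for col in range(n_cols):
        present = [r[col] for r in unique if r[col] is not None]
        if not present:
            continue  # nothing to impute from; leave the column untouched
        fill = median(present)
        for r in unique:
            if r[col] is None:
                r[col] = fill
    return unique


# Hypothetical toy data: one duplicate row and one missing value.
raw = [[1.0, 10.0], [1.0, 10.0], [3.0, None], [5.0, 30.0]]
cleaned = clean_rows(raw)
print(cleaned)
```

The median is used here only because it is less sensitive to extreme values than the mean; a real preprocessor may pick a different method per column.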
**2. OutlierHandler Module**
- Functions:
  - Handling outliers by either removing or capping them
  - Customizable based on the nature of your data
- Usage: Useful for managing data skewness and ensuring robust model performance.
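
As an illustration of what capping means, the sketch below clips values to the common interquartile-range (IQR) fences using only the standard library. This is one widely used rule, not necessarily the one `veda_lib` applies; `cap_outliers_iqr` and the multiplier `k` are hypothetical names chosen for this example.

```python
from statistics import quantiles


def cap_outliers_iqr(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR] instead of dropping them."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # Values inside the fences are returned unchanged.
    return [min(max(v, low), high) for v in values]


# Hypothetical toy data with one extreme value.
data = [10, 12, 11, 13, 12, 11, 100]
capped = cap_outliers_iqr(data)
print(capped)
```

Capping keeps the row count unchanged, which matters when the target vector must stay aligned with the features; removing outliers instead requires dropping the same rows from both.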
**3. FeatureSelector Module**
- Functions:
  - Selecting important features from the dataset
  - Tailored selection based on the nature of the data
- Usage: Helps in reducing dimensionality and focusing on the most impactful features.
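
One of the simplest selection rules is to drop features whose variance does not exceed a threshold, since a constant column carries no information. The sketch below shows that rule with the standard library. It is only an example of the general idea, not a description of how `veda_lib` ranks features; `select_by_variance` and the column names are hypothetical.

```python
from statistics import pvariance


def select_by_variance(rows, names, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold."""
    columns = list(zip(*rows))  # transpose rows into columns
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    selected_rows = [[row[i] for i in keep] for row in rows]
    return selected_rows, [names[i] for i in keep]


# Hypothetical toy data: the middle column is constant.
rows = [[1.0, 5.0, 0.1], [2.0, 5.0, 0.2], [3.0, 5.0, 0.1]]
names = ["age", "constant", "noise"]
selected, kept = select_by_variance(rows, names)
print(kept)
```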
**4. DimensionReducer Module**
- Functions:
  - Reducing data dimensionality using appropriate techniques
- Usage: Crucial for addressing the curse of dimensionality and improving model efficiency.
**5. BalanceData Module**
- Functions:
  - Balancing class distribution in imbalanced datasets
  - Methods chosen based on data characteristics
- Usage: Essential for improving model fairness and performance on imbalanced datasets.
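
For intuition about class balancing, the following sketch performs plain random oversampling: minority-class samples are duplicated until every class matches the largest one. This is a generic technique shown under stated assumptions, not the strategy `veda_lib` selects; `random_oversample` and its `seed` parameter are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority samples until all classes are equally frequent."""
    rng = random.Random(seed)  # seeded for reproducibility
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Candidate samples to duplicate for this class.
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: four samples of class 0, one of class 1.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling should be applied only to training data; duplicating samples before a train/test split leaks copies of the same row into both sets.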
**6. Veda Module**
- Functions:
  - Integrates all the above functionalities into a single pipeline
- Usage: Pass your raw data through this module to perform comprehensive EDA and get fully preprocessed, cleaned, and balanced data ready for model training.
*******************************************************
## Importing
- Use the `Veda` class from the `veda_lib.Veda` module to run the full pipeline. Set `classification=True` if the problem is a classification problem; otherwise set it to `False`.
```python
from veda_lib import Veda

# classification=True for classification problems, False otherwise
eda = Veda.Veda(classification=True)
eda.fit_transform(X, Y)
```
- Use `DataPreprocessor` from `veda_lib.Preprocessor`, here with default parameter values.
```python
from veda_lib import Preprocessor

# Default parameters
preprocessor = Preprocessor.DataPreprocessor()
X, y = preprocessor.fit_transform(X, Y)
```
- Use `OutlierPreprocessor` from `veda_lib.OutlierHandler`, here with default parameter values.
```python
from veda_lib import OutlierHandler

# Default parameters
outlier_preprocessor = OutlierHandler.OutlierPreprocessor()
X, y = outlier_preprocessor.fit_transform(X, Y)
```
- Use `FeatureSelection` from `veda_lib.FeatureSelector`, here with default parameter values.
```python
from veda_lib import FeatureSelector

# Default parameters
selector = FeatureSelector.FeatureSelection()
X, y = selector.fit_transform(X, y)
```
- Use `DimensionReducer` from `veda_lib.DimensionReducer`, here with default parameter values.
```python
from veda_lib import DimensionReducer

# Default parameters
reducer = DimensionReducer.DimensionReducer()
X, y = reducer.fit_transform(X, y)
```
- Use `AdaptiveBalancer` from `veda_lib.BalanceData`, here with default values for the remaining parameters.
```python
from veda_lib import BalanceData

# fit_transform returns the balanced data along with the chosen strategy and model
balancer = BalanceData.AdaptiveBalancer(classification=True)
X, y, strategy, model = balancer.fit_transform(X, y)
```
****************************************************************
## Contributing
I welcome contributions to `veda_lib`! If you have a bug report or feature suggestion, or want to contribute code, please open an issue or pull request on GitHub.
*************************************************************
## License
`veda_lib` is licensed under the Apache License Version 2.0. See the [LICENSE](https://github.com/vishallmaurya/VEDA?tab=Apache-2.0-1-ov-file) file for more details.
Raw data
{
"_id": null,
"home_page": "https://github.com/vishallmaurya/VEDA",
"name": "veda-lib",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "Automated Data Preprocessing, Data Cleaning, Data Balancing, Machine Learning, Data Transformation, Feature Engineering, Data Wrangling, Data Preparation, Exploratory Data Analysis",
"author": "Vishal Maurya",
"author_email": "vishallmaurya210@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/eb/1f/39f85ae5770fa5f7f4b86c7abc6545e06e5b50fed448d4b7ba7b59fbc4cb/veda_lib-0.0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache License 2.0",
"summary": "veda_lib is a Python library designed to streamline the data preprocessing and cleaning workflow for machine learning projects. It offers a comprehensive set of tools to handle common data preparation tasks",
"version": "0.0.5",
"project_urls": {
"Bug Tracker": "https://github.com/vishallmaurya/VEDA/issues",
"Homepage": "https://github.com/vishallmaurya/VEDA"
},
"split_keywords": [
"automated data preprocessing",
" data cleaning",
" data balancing",
" machine learning",
" data transformation",
" feature engineering",
" data wrangling",
" data preparation",
" exploratory data analysis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ed516922bb0cda58125e55946b9624f8db1a20522eb60f546cd53853bb0a16b5",
"md5": "ffba9a528e4981aacf34a9434ff595fe",
"sha256": "5c3aa49ac53603ccc890254d30541fc8caf2fa0199ee2d2b756178b0f28e793c"
},
"downloads": -1,
"filename": "veda_lib-0.0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ffba9a528e4981aacf34a9434ff595fe",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 27008,
"upload_time": "2024-08-21T13:36:03",
"upload_time_iso_8601": "2024-08-21T13:36:03.852926Z",
"url": "https://files.pythonhosted.org/packages/ed/51/6922bb0cda58125e55946b9624f8db1a20522eb60f546cd53853bb0a16b5/veda_lib-0.0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "eb1f39f85ae5770fa5f7f4b86c7abc6545e06e5b50fed448d4b7ba7b59fbc4cb",
"md5": "d8d19bcf98c16a4ffa42ad6736e17ac3",
"sha256": "ece8d2d98352b0f5f1b71969a24db44423706b2eaa42b28ff091346a52224997"
},
"downloads": -1,
"filename": "veda_lib-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "d8d19bcf98c16a4ffa42ad6736e17ac3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 23878,
"upload_time": "2024-08-21T13:36:05",
"upload_time_iso_8601": "2024-08-21T13:36:05.654936Z",
"url": "https://files.pythonhosted.org/packages/eb/1f/39f85ae5770fa5f7f4b86c7abc6545e06e5b50fed448d4b7ba7b59fbc4cb/veda_lib-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-21 13:36:05",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "vishallmaurya",
"github_project": "VEDA",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "veda-lib"
}