veda-lib


- Name: veda-lib
- Version: 0.0.5
- Home page: https://github.com/vishallmaurya/VEDA
- Bug tracker: https://github.com/vishallmaurya/VEDA/issues
- Summary: veda_lib is a Python library designed to streamline the data preprocessing and cleaning workflow for machine learning projects. It offers a comprehensive set of tools to handle common data preparation tasks.
- Upload time: 2024-08-21 13:36:05
- Author: Vishal Maurya
- Requires Python: >=3.9
- License: Apache License 2.0
- Keywords: automated data preprocessing, data cleaning, data balancing, machine learning, data transformation, feature engineering, data wrangling, data preparation, exploratory data analysis

# veda_lib

**A Python library designed to streamline the transition from raw data to machine learning models.**  
veda_lib automates and simplifies data preprocessing, cleaning, and balancing, addressing the time-consuming and complex aspects of these tasks to provide clean, ready-to-use data for your models.

********************************************************

## Installation

First, install `veda_lib` using pip:

```bash
pip install veda_lib
```

**************************************

## How to use?

After installing `veda_lib`, import it into your project and start utilizing its modules to prepare your data. Below is a summary of the key functionalities provided by each module:

**1. Preprocessor Module**
- Functions:
   - Removing null values
   - Handling duplicates
   - Imputing missing values with appropriate methods
- Usage: Ideal for initial data cleaning and preprocessing steps.
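
To make the steps listed for this module concrete, here is a minimal sketch in plain Python of duplicate removal followed by median imputation. It is not veda_lib's implementation: the helper names `drop_duplicate_rows` and `impute_median` are hypothetical, and the choice of the median as the imputation method is an assumption made for illustration.

```python
from statistics import median


def drop_duplicate_rows(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_median(rows):
    """Replace None in each column with the median of that column's observed values."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        medians.append(median(observed))  # assumes each column has at least one value
    return [[medians[j] if v is None else v for j, v in enumerate(row)] for row in rows]


raw = [
    [1.0, 10.0],
    [1.0, 10.0],  # exact duplicate of the first row
    [2.0, None],  # missing value in the second column
    [3.0, 30.0],
]
cleaned = impute_median(drop_duplicate_rows(raw))
print(cleaned)  # [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
```

Removing duplicates before imputing keeps repeated rows from skewing the column median.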

**2. OutlierHandler Module**
- Functions:
   - Handling outliers by either removing or capping them
   - Customizable based on the nature of your data
- Usage: Useful for managing data skewness and ensuring robust model performance.
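
Capping, as opposed to removing, can be illustrated with the common interquartile-range (IQR) rule. The sketch below uses only the standard library. The function name `cap_outliers_iqr` and the multiplier `k=1.5` are assumptions for illustration, and the rule veda_lib actually applies may differ.

```python
from statistics import quantiles


def cap_outliers_iqr(values, k=1.5):
    """Clip each value to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]


data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
capped = cap_outliers_iqr(data)
print(capped)  # the last value is pulled down to the upper fence, 13.0
```

Capping keeps the row count unchanged, which matters when the target column has to stay aligned with the features. Removal shrinks the dataset instead.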

**3. FeatureSelector Module**
- Functions:
   - Selecting important features from the dataset
   - Tailored selection based on the nature of the data
- Usage: Helps in reducing dimensionality and focusing on the most impactful features.
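
One of the simplest selection criteria is to drop features that never vary. The sketch below shows that idea in plain Python. The name `select_by_variance` and the variance criterion itself are assumptions chosen for illustration, since the README does not state which selection methods veda_lib uses.

```python
from statistics import pvariance


def select_by_variance(rows, threshold=0.0):
    """Return the indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))
    return [j for j, col in enumerate(columns) if pvariance(col) > threshold]


X = [
    [1.0, 5.0, 0.1],
    [2.0, 5.0, 0.2],
    [3.0, 5.0, 0.1],
]
kept = select_by_variance(X)  # the constant middle column is dropped
X_selected = [[row[j] for j in kept] for row in X]
print(kept)  # [0, 2]
```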

**4. DimensionReducer Module**
- Functions:
   - Reducing data dimensionality using appropriate techniques
- Usage: Crucial for addressing the curse of dimensionality and improving model efficiency.
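
As an illustration of what reducing dimensionality means, the sketch below projects 2-D points onto their first principal component, estimated by power iteration on the covariance matrix. This is a generic PCA-style example in plain Python, not veda_lib's code. The function names are hypothetical, and the README does not say which techniques the module applies.

```python
import math


def first_principal_component(rows, iterations=200):
    """Estimate the leading principal direction by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d  # start vector; must not be orthogonal to the leading direction
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Map each row to its coordinate along the given direction."""
    return [
        sum((row[j] - means[j]) * direction[j] for j in range(len(direction)))
        for row in rows
    ]


points = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # all on the line y = 2x
means, direction = first_principal_component(points)
reduced = project(points, means, direction)  # one coordinate per point instead of two
print(direction)
```

Because the sample points lie on a single line, one coordinate per point retains all of their variance.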

**5. BalanceData Module**
- Functions:
   - Balancing class distribution in imbalanced datasets
   - Methods chosen based on data characteristics
- Usage: Essential for improving model fairness and performance on imbalanced datasets.
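
Random oversampling is one of the simplest ways to balance classes, and the sketch below shows it in plain Python. It is illustrative only: `random_oversample` is a hypothetical helper, and the README does not specify which balancing methods veda_lib chooses between.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have 4 samples
```

Oversampling should be applied to the training split only. Duplicating rows before splitting leaks copies of the same sample into the evaluation data.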

**6. Veda Module**
- Functions:
   - Integrates all the above functionalities into a single pipeline
- Usage: Pass your raw data through this module to perform comprehensive EDA and get fully preprocessed, cleaned, and balanced data ready for model training.
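
The idea of chaining steps into a single pipeline can be sketched as follows, assuming every step exposes `fit_transform(X, y)` and returns the transformed pair. All three classes below are hypothetical stand-ins written for this example. They are not part of veda_lib and do not describe how `Veda` is built internally.

```python
class DropMissingRows:
    """Hypothetical step: drop rows that contain None."""

    def fit_transform(self, X, y):
        kept = [(row, label) for row, label in zip(X, y) if None not in row]
        return [row for row, _ in kept], [label for _, label in kept]


class DropDuplicateRows:
    """Hypothetical step: drop repeated (row, label) pairs."""

    def fit_transform(self, X, y):
        seen, X_out, y_out = set(), [], []
        for row, label in zip(X, y):
            key = (tuple(row), label)
            if key not in seen:
                seen.add(key)
                X_out.append(row)
                y_out.append(label)
        return X_out, y_out


class SimplePipeline:
    """Run each step in order, feeding its output into the next step."""

    def __init__(self, steps):
        self.steps = steps

    def fit_transform(self, X, y):
        for step in self.steps:
            X, y = step.fit_transform(X, y)
        return X, y


X = [[1, 2], [1, 2], [3, None], [4, 5]]
y = [0, 0, 1, 1]
pipeline = SimplePipeline([DropMissingRows(), DropDuplicateRows()])
X_out, y_out = pipeline.fit_transform(X, y)
print(X_out, y_out)  # [[1, 2], [4, 5]] [0, 1]
```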

*******************************************************

## Importing

- Import `Veda` from `veda_lib`. Set `classification=True` if the problem is a classification problem; otherwise set it to `False`.
```python
from veda_lib import Veda

# X holds the features and y the target.
eda = Veda.Veda(classification=True)
eda.fit_transform(X, y)
```


- Import `DataPreprocessor` from `veda_lib.Preprocessor` and use it with its default parameter values.
```python
from veda_lib import Preprocessor

preprocessor = Preprocessor.DataPreprocessor()
X, y = preprocessor.fit_transform(X, y)
```


- Import `OutlierPreprocessor` from `veda_lib.OutlierHandler` and use it with its default parameter values.
```python
from veda_lib import OutlierHandler

outlier_preprocessor = OutlierHandler.OutlierPreprocessor()
X, y = outlier_preprocessor.fit_transform(X, y)
```


- Import `FeatureSelection` from `veda_lib.FeatureSelector` and use it with its default parameter values.
```python
from veda_lib import FeatureSelector

selector = FeatureSelector.FeatureSelection()
X, y = selector.fit_transform(X, y)
```


- Import `DimensionReducer` from `veda_lib.DimensionReducer` and use it with its default parameter values.
```python
from veda_lib import DimensionReducer

reducer = DimensionReducer.DimensionReducer()
X, y = reducer.fit_transform(X, y)
```


- Import `AdaptiveBalancer` from `veda_lib.BalanceData`. The example sets `classification=True` and leaves the other parameters at their default values. Note that `fit_transform` returns four values here rather than two.
```python
from veda_lib import BalanceData

balancer = BalanceData.AdaptiveBalancer(classification=True)
X, y, strategy, model = balancer.fit_transform(X, y)
```

**************************************************************** 

## Contributing

I welcome contributions to `veda_lib`! If you have a bug report, feature suggestion, or want to contribute code, please open an issue or pull request on GitHub.

*************************************************************

## License

`veda_lib` is licensed under the Apache License Version 2.0. See the [LICENSE](https://github.com/vishallmaurya/VEDA?tab=Apache-2.0-1-ov-file) file for more details.


            
