preprocessinglib-tonga-gumustakim


Name: preprocessinglib-tonga-gumustakim
Version: 0.5
Home page: https://github.com/Fzehzeh/mypreprocessinglib , https://github.com/ayserragm/mypreprocessinglib
Summary: A comprehensive data preprocessing library for Python
Upload time: 2024-05-24 11:38:42
Maintainer: None
Docs URL: None
Author: Zehra Tonga-Ayse Serra Gumustakim
Requires Python: >=3.6
License: None
Requirements: No requirements were recorded.

PreprocessingLib
PreprocessingLib is a Python library designed to facilitate data preprocessing steps. It provides various classes and functions to automate the process of cleaning, transforming, and engineering features in datasets.

Features
1. Missing Value Handling
Detect missing values in a dataset.
Fill missing values using mean, median, or a constant value.
Remove rows or columns with missing values.
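
To make the fill strategies concrete, here is a minimal sketch in plain Python that runs without any dependencies. It is not the library's implementation; the function name `fill_missing` and its parameters are assumptions chosen for illustration, and missing values are represented as `None`.

```python
from statistics import mean, median


def fill_missing(values, strategy="mean", constant=0):
    """Return a copy of `values` with None entries replaced.

    Hypothetical helper for illustration; not part of PreprocessingLib.
    """
    # Only the observed (non-missing) values feed the statistic.
    present = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(present)
    elif strategy == "median":
        fill = median(present)
    elif strategy == "constant":
        fill = constant
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


ages = [20, None, 30, 40, None]
print(fill_missing(ages, "mean"))
print(fill_missing(ages, "constant", constant=-1))
```

Note that the mean and median strategies need at least one observed value; a column that is entirely missing can only be filled with a constant.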
2. Feature Engineering
Create new features based on existing ones.
3. Date and Time Handling
Extract features like year, month, day, and day of the week from datetime columns.
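
As a rough illustration of what date feature extraction produces, the sketch below uses only the standard `datetime` module. The function name `extract_date_parts` and the default format string are assumptions, not the library's API.

```python
from datetime import datetime


def extract_date_parts(timestamp, fmt="%Y-%m-%d"):
    """Split a date string into numeric features.

    Hypothetical helper for illustration; not part of PreprocessingLib.
    """
    parsed = datetime.strptime(timestamp, fmt)
    return {
        "year": parsed.year,
        "month": parsed.month,
        "day": parsed.day,
        # weekday() counts from Monday = 0 to Sunday = 6.
        "day_of_week": parsed.weekday(),
    }


print(extract_date_parts("2024-05-24"))
```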
4. Data Type Conversion
Convert columns to numeric or categorical data types.
5. Categorical Encoding
Perform one-hot encoding or label encoding on categorical variables.
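
The difference between the two encodings can be sketched in a few lines of plain Python. The helpers `label_encode` and `one_hot_encode` below are hypothetical and assume categories are ordered alphabetically; the library may order or name things differently.

```python
def label_encode(values):
    """Map each distinct category to an integer code (sorted order).

    Hypothetical helper for illustration; not part of PreprocessingLib.
    """
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Turn each value into a 0/1 row with one column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories


colors = ["red", "green", "red", "blue"]
codes, mapping = label_encode(colors)
rows, categories = one_hot_encode(colors)
print(codes, mapping)
print(categories, rows)
```

Label encoding keeps a single column but implies an order between categories, while one-hot encoding avoids that at the cost of one column per category.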
6. Outlier Handling
Detect outliers in numerical data.
Handle outliers by removing or replacing them.
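
This README does not say which detection rule the library uses. One common choice is the interquartile range (IQR) rule, sketched below under that assumption with hypothetical helper names. It relies on `statistics.quantiles`, which requires Python 3.8 or later.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits using the IQR rule (an assumed method)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Remove values that fall outside the IQR limits."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


def clip_outliers(values, k=1.5):
    """Replace values outside the IQR limits with the nearest limit."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


readings = [10, 12, 11, 13, 12, 100]
print(drop_outliers(readings))
print(clip_outliers(readings))
```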
7. Data Scaling
Standardize or normalize numerical data.
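
For reference, standardization rescales values to zero mean and unit standard deviation, while normalization (min-max scaling) maps them onto the range 0 to 1. The sketch below shows both in plain Python; the function names are assumptions, and whether the library uses the population or sample standard deviation is not stated here, so the population form is assumed.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def normalize(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
print(normalize([10, 20, 30]))
```

Both helpers divide by the spread of the data, so a constant column would raise `ZeroDivisionError` and needs separate handling.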
8. Text Cleaning
Clean text data by removing punctuation, stop words, and lemmatizing words.
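
A simplified view of the text cleaning step is sketched below using only the standard library. The stop-word set is a tiny hypothetical sample, the function is not the library's `TextCleaner`, and lemmatization is left out because it normally requires an external NLP package.

```python
import string

# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def clean_text(text, stop_words=STOP_WORDS):
    """Lowercase, strip ASCII punctuation, and drop stop words.

    Hypothetical helper for illustration; not part of PreprocessingLib.
    """
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(word for word in no_punct.split() if word not in stop_words)


print(clean_text("The quick, brown fox is in the garden!"))
```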
Installation
You can install PreprocessingLib using pip:

pip install preprocessinglib-tonga-gumustakim
Usage
Here's how you can use PreprocessingLib in your Python projects:

from mypreprocessinglib import (
    CategoricalEncoder,
    DataTypeConverter,
    DateTimeHandler,
    FeatureEngineer,
    MissingValueHandler,
    OutlierHandler,
    Scaler,
    TextCleaner,
)
import pandas as pd

# Load sample dataset
data = pd.read_csv("sample_dataset.csv")

# Example usage of preprocessing functions
missing_handler = MissingValueHandler()
filled_data = missing_handler.fill_missing_values(data)

data_with_new_features = FeatureEngineer.create_new_features(data, column1='Column1', column2='Column2')

date_with_features = DateTimeHandler.extract_date_features(data, column='DateColumn')

numeric_data = DataTypeConverter.convert_to_numeric(data, columns=['Column1', 'Column2'])

encoded_data = CategoricalEncoder.one_hot_encode(data, columns=['CategoricalColumn'])

outliers_removed_data = OutlierHandler.handle_outliers(data, method='drop')

scaled_data = Scaler.standardize_data(data)

cleaned_text = TextCleaner.clean_text("example text")
Testing
You can run the unit tests to ensure the proper functioning of the library:

python -m unittest test_data_preprocessing.py
Contributing
Contributions are welcome! If you find any issues or have suggestions for improvements, please feel free to open an issue or submit a pull request.

License
This project is licensed under the MIT License - see the LICENSE file for details.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/Fzehzeh/mypreprocessinglib , https://github.com/ayserragm/mypreprocessinglib",
    "name": "preprocessinglib-tonga-gumustakim",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": null,
    "author": "Zehra Tonga-Ayse Serra Gumustakim",
    "author_email": "tongafatmazehra@gmail.com-ayseserra.gumustakim@stu.fsm.edu.tr",
    "download_url": "https://files.pythonhosted.org/packages/06/5e/fc99f92b1e4361edb8745285635395c35fd44a37d12317f474c06657f00a/preprocessinglib_tonga_gumustakim-0.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A comprehensive data preprocessing library for Python",
    "version": "0.5",
    "project_urls": {
        "Homepage": "https://github.com/Fzehzeh/mypreprocessinglib , https://github.com/ayserragm/mypreprocessinglib"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2bdb819cac21d9cb95752e6828a07a37f4c60428ca595737a8c6b6ac5d838457",
                "md5": "88a3596d690ced9ba112b24cd75f649e",
                "sha256": "febde82e84e6a2c4173120af7534782a67b9e70fcce827977c2e2d352b5c4933"
            },
            "downloads": -1,
            "filename": "preprocessinglib_tonga_gumustakim-0.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "88a3596d690ced9ba112b24cd75f649e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 8505,
            "upload_time": "2024-05-24T11:38:36",
            "upload_time_iso_8601": "2024-05-24T11:38:36.329918Z",
            "url": "https://files.pythonhosted.org/packages/2b/db/819cac21d9cb95752e6828a07a37f4c60428ca595737a8c6b6ac5d838457/preprocessinglib_tonga_gumustakim-0.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "065efc99f92b1e4361edb8745285635395c35fd44a37d12317f474c06657f00a",
                "md5": "39694c7468a5e88a114517f286f7cd47",
                "sha256": "0a5b4f808c273f45b5d86e7dbdc782ec972b27c11edad029ccb2870a758c209d"
            },
            "downloads": -1,
            "filename": "preprocessinglib_tonga_gumustakim-0.5.tar.gz",
            "has_sig": false,
            "md5_digest": "39694c7468a5e88a114517f286f7cd47",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 6740,
            "upload_time": "2024-05-24T11:38:42",
            "upload_time_iso_8601": "2024-05-24T11:38:42.448036Z",
            "url": "https://files.pythonhosted.org/packages/06/5e/fc99f92b1e4361edb8745285635395c35fd44a37d12317f474c06657f00a/preprocessinglib_tonga_gumustakim-0.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-24 11:38:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Fzehzeh",
    "github_project": "mypreprocessinglib , https:",
    "github_not_found": true,
    "lcname": "preprocessinglib-tonga-gumustakim"
}
        