mlwizard


- **Name:** mlwizard
- **Version:** 1.0.1
- **Summary:** Let's make building machine learning models the complex way, easy.
- **Upload time:** 2024-01-08 03:22:36
- **Author:** TechLeo (Onyiriuba Leonard Chukwubuikem)
- **Requires Python:** >=3.0
- **License:** MIT
- **Keywords:** machine learning, data science, data preprocessing, supervised learning, data exploration, ml framework, data cleaning, regression, classification, machine learning toolkit
- **Requirements:** No requirements were recorded.

# MLWizard

MLWizard is a Python machine learning library designed to simplify data preparation, feature engineering, model building, and evaluation. It provides tools for both classification and regression tasks, as well as for data exploration and manipulation. MLWizard is distributed by the [TechLeo](https://www.linkedin.com/in/techleo/) community with the aim of making the complex easy.

## Author

**TechLeo**

- **Email:** techleo.ng@outlook.com
- **GitHub:** [TechLeo GitHub](https://github.com/TechLeo)
- **LinkedIn:** [TechLeo LinkedIn](https://www.linkedin.com/in/techleo/)

## Contact

For inquiries, suggestions, or feedback, please feel free to reach out to the author:

- **Email:** techleo.ng@outlook.com
- **GitHub Issues:** [MLWizard Issues](https://github.com/TechLeo-Libraries/mlwizard/issues)
- **LinkedIn Messages:** [TechLeo LinkedIn](https://www.linkedin.com/in/techleo/)

Your feedback is valuable and contributes to the continuous improvement of MLWizard. The author welcomes collaboration and looks forward to hearing from the users of MLWizard.


## Features
Features in the current release:

### Data Loading and Handling
- `get_dataset`: Load a dataset.
- `get_training_test_data`: Split the dataset into training and test sets.
- `load_large_dataset`: Load a large dataset efficiently.
- `reduce_data_memory_useage`: Reduce memory usage of the dataset.
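
To give a sense of what a memory-reduction step like `reduce_data_memory_useage` might involve, here is a minimal sketch in plain pandas. It is not MLWizard's implementation; the helper name `downcast_numeric` and the sample data are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def downcast_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with numeric columns stored in smaller dtypes.

    Hypothetical helper for illustration; not part of MLWizard.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # Downcasting floats trades precision for memory.
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


df = pd.DataFrame({
    "a": np.arange(1000, dtype="int64"),
    "b": np.arange(1000) * 0.5,
})
small = downcast_numeric(df)

print(df.memory_usage(deep=True).sum(), "->", small.memory_usage(deep=True).sum())
```

Downcasting floats can lose precision, so it is worth checking that the smaller dtype is acceptable for your data before applying it.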

### Data Cleaning and Manipulation
- `drop_columns`: Drop specified columns from the dataset.
- `fix_missing_values`: Handle missing values in the dataset.
- `fix_unbalanced_dataset`: Address class imbalance in a classification dataset.
- `filter_data`: Filter data based on specified conditions.
- `remove_duplicates`: Remove duplicate rows from the dataset.
- `rename_columns`: Rename columns in the dataset.
- `replace_values`: Replace specified values in the dataset.
- `reset_index`: Reset the index of the dataset.
- `set_index`: Set a specific column as the index.
- `sort_index`: Sort the index of the dataset.
- `sort_values`: Sort the values of the dataset.
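
The cleaning steps listed above correspond to familiar pandas operations. The sketch below shows one possible strategy for missing values and duplicates, using plain pandas rather than MLWizard's own methods; the column names and the fill choices (median, mode) are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 25.0],
    "city": ["Lagos", "Abuja", None, "Lagos"],
})

cleaned = df.copy()
# Fill numeric gaps with the median and categorical gaps with the most frequent value.
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())
cleaned["city"] = cleaned["city"].fillna(cleaned["city"].mode()[0])
# Remove exact duplicate rows and rebuild a clean 0..n-1 index.
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

print(cleaned)
```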

### Data Formatting and Transformation
- `categorical_to_datetime`: Convert categorical columns to datetime format.
- `categorical_to_numerical`: Convert categorical columns to numerical format.
- `numerical_to_categorical`: Convert numerical columns to categorical format.
- `column_binning`: Bin values in a column into specified bins.
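
As a rough illustration of these transformations, the following sketch performs each conversion with plain pandas. It does not use MLWizard's API, and the column names, bin edges, and labels are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "signup": ["2024-01-08", "2023-12-25"],
    "plan": ["basic", "pro"],
    "age": [17, 42],
})

# Categorical (string) column -> datetime
df["signup"] = pd.to_datetime(df["signup"])

# Categorical column -> integer codes (categories are ordered alphabetically)
df["plan_code"] = df["plan"].astype("category").cat.codes

# Numerical column -> labelled bins; intervals are (0, 18], (18, 65], (65, 120]
df["age_group"] = pd.cut(
    df["age"], bins=[0, 18, 65, 120], labels=["minor", "adult", "senior"]
)

print(df)
```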

### Exploratory Data Analysis
- `eda`: Perform exploratory data analysis on the dataset.
- `eda_visual`: Visualize exploratory data analysis results.
- `pandas_profiling`: Generate a Pandas Profiling report for the dataset.
- `sweetviz_profile_report`: Generate a Sweetviz Profile Report for the dataset.
- `count_column_categories`: Count the categories in a categorical column.
- `unique_elements_in_columns`: Get the unique elements that exist in each column in the dataset.
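
For orientation, counting categories and listing unique elements can be done in plain pandas as shown below. This is only an approximation of what `count_column_categories` and `unique_elements_in_columns` report; their exact output format is not documented here, and the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["red", "blue", "red"],
    "size": ["S", "M", "S"],
})

# How often each category occurs in one column
category_counts = df["color"].value_counts()

# Unique values per column, in order of first appearance
unique_values = {col: df[col].unique().tolist() for col in df.columns}

print(category_counts)
print(unique_values)
```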

### Feature Engineering
- `extract_date_features`: Extract date-related features from a datetime column.
- `polyreg_x`: Get the polynomial-regression form of the independent variables (x) for a specified degree.
- `select_features`: Select relevant features for modeling.
- `select_dependent_and_independent`: Select dependent and independent variables.
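
Two of these steps, date-feature extraction and polynomial expansion, can be sketched with pandas and scikit-learn directly. This is an illustration of the underlying ideas under assumed inputs, not the code behind `extract_date_features` or `polyreg_x`.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

# Date-related features from a datetime column
dates = pd.Series(pd.to_datetime(["2024-01-08", "2024-02-29"]))
date_features = pd.DataFrame({
    "year": dates.dt.year,
    "month": dates.dt.month,
    "day": dates.dt.day,
    "day_of_week": dates.dt.dayofweek,  # Monday = 0
})

# Degree-2 polynomial expansion of one independent variable: [1, x, x^2]
X = np.array([[2.0], [3.0]])
X_poly = PolynomialFeatures(degree=2).fit_transform(X)

print(date_features)
print(X_poly)
```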

### Data Preprocessing
- `scale_independent_variables`: Scale independent variables in the dataset.
- `remove_outlier`: Remove outliers from the dataset.
- `split_data`: Split the dataset into training and test sets.
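
The preprocessing steps above can be sketched with pandas and scikit-learn as follows. The 1.5 × IQR outlier rule and `StandardScaler` are assumptions chosen for the example; MLWizard's methods may use different rules or offer other options.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 100.0],
    "y": [0, 1, 0, 1, 0, 1],
})

# Keep rows whose "x" lies within 1.5 * IQR of the first and third quartiles.
q1, q3 = df["x"].quantile([0.25, 0.75])
iqr = q3 - q1
inliers = df[df["x"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Split before scaling so the test set does not influence the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    inliers[["x"]], inliers["y"], test_size=0.2, random_state=0
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split only, then reusing it for the test split, avoids leaking test-set statistics into training.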

### Model Building and Evaluation
- `get_bestK_KNNregressor`: Find the best K value for KNN regression.
- `train_model_regressor`: Train a regression model.
- `regressor_predict`: Make predictions using a regression model.
- `regressor_evaluation`: Evaluate the performance of a regression model.
- `regressor_model_testing`: Test a regression model.
- `polyreg_graph`: Visualize a polynomial regression graph.
- `simple_linregres_graph`: Visualize a regression graph.
- `build_multiple_regressors`: Build multiple regression models.
- `build_multiple_regressors_from_features`: Build regression models using selected features.
- `build_single_regressor_from_features`: Build a single regression model using selected features.
- `get_bestK_KNNclassifier`: Find the best K value for KNN classification.
- `train_model_classifier`: Train a classification model.
- `classifier_predict`: Make predictions using a classification model.
- `classifier_evaluation`: Evaluate the performance of a classification model.
- `classifier_model_testing`: Test a classification model.
- `classifier_graph`: Visualize a classification graph.
- `build_multiple_classifiers`: Build multiple classification models.
- `build_multiple_classifiers_from_features`: Build classification models using selected features.
- `build_single_classifier_from_features`: Build a single classification model using selected features.
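
To make the idea behind the "best K" and "build multiple" helpers concrete, here is a minimal sketch using scikit-learn alone. It assumes 5-fold cross-validated accuracy as the selection criterion and uses the bundled iris dataset; MLWizard's own methods may evaluate models differently.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Pick the K with the highest mean 5-fold cross-validated accuracy.
k_scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in range(1, 16)
}
best_k = max(k_scores, key=k_scores.get)
print("best K:", best_k)

# Compare several classifiers on the same folds.
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000, random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "SVC": SVC(),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=best_k),
}
results = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

# Report models from best to worst mean accuracy.
for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```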

### Data Aggregation and Summarization
- `group_data`: Group and summarize data based on specified conditions.
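
Grouping and summarizing follows the usual pandas pattern. The sketch below uses `groupby` directly, with invented column names, and is not a reproduction of `group_data`.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "sales": [10, 20, 30, 40],
})

# Total and average sales per region
summary = df.groupby("region")["sales"].agg(["sum", "mean"])
print(summary)
```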

### Data Type Handling
- `select_datatype`: Select columns of a specific datatype in the dataset.
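
Selecting columns by datatype can be approximated in plain pandas with `select_dtypes`. The example below is a sketch with made-up columns, not the implementation of `select_datatype`.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [30, 41],
    "height": [1.7, 1.8],
    "name": ["Ada", "Bo"],
})

numeric = df.select_dtypes(include="number")      # integer and float columns
non_numeric = df.select_dtypes(exclude="number")  # everything else

print(list(numeric.columns), list(non_numeric.columns))
```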

## Installation

You can install MLWizard using pip:

```bash
pip install mlwizard
```


## Usage

```python
# Example usage
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

from mlwizard import SupervisedLearning

# Load your dataset into a pandas DataFrame
dataset = pd.read_csv("Your_file_path")
data = SupervisedLearning(dataset)

# Exploratory Data Analysis
eda = data.eda()
eda_visual = data.eda_visual()

# Build and Evaluate Classifier
# NOTE: this README does not show how `classifiers` is passed to the method;
# check the method's docstring for its parameters.
classifiers = ["LogisticRegression(random_state = 0)", "RandomForestClassifier(random_state = 0)", "DecisionTreeClassifier(random_state = 0)", "SVC()"]
build_model = data.build_multiple_classifiers()
```

## Acknowledgments
MLWizard relies on several open-source libraries to provide its functionality. We would like to express our gratitude to the developers and contributors of the following libraries:

- [NumPy](https://numpy.org/)
- [Pandas](https://pandas.pydata.org/)
- [Matplotlib](https://matplotlib.org/)
- [Seaborn](https://seaborn.pydata.org/)
- [yData Profiling](https://github.com/ydataai/ydata-profiling)
- [Sweetviz](https://github.com/fbdesignpro/sweetviz)
- [Imbalanced-Learn (imblearn)](https://imbalanced-learn.org/)
- [Scikit-learn](https://scikit-learn.org/)
- [Warnings](https://docs.python.org/3/library/warnings.html)
- [Datatable](https://datatable.readthedocs.io/en/latest/)

The MLWizard library builds upon the functionality provided by these excellent tools. We sincerely thank the maintainers and contributors of these libraries for their valuable contributions to the open-source community.


## License
MLWizard is distributed under the MIT License. Feel free to use, modify, and distribute it according to the terms of the license.


## Changelog

### v1.0.1 (January 2024):

- First release


## Contributors
We'd like to express our gratitude to the following contributors who have influenced and supported MLWizard:

- [Onyiriuba Leonard](https://www.linkedin.com/in/chukwubuikem-leonard-onyiriuba/): for overseeing the entire project development lifecycle.
  - Role: Project Lead and Maintainer.
  - Email: workwithtechleo@gmail.com

- [The TechLeo Community](https://www.linkedin.com/in/techleo/): for allowing this project to be used as a way to explain, learn, test, understand, and simplify the machine learning process.
  - Role: Testers.
  - Email: techleo.ng@gmail.com

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "mlwizard",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.0",
    "maintainer_email": "",
    "keywords": "machine learning,data science,data preprocessing,supervised learning,data exploration,ML framework,data cleaning,regression,classification,machine learning toolkit",
    "author": "TechLeo (Onyiriuba Leonard Chukwubuikem)",
    "author_email": "<techleo.ng@outlook.com>",
    "download_url": "https://files.pythonhosted.org/packages/6a/4c/800157480f3f36a3bc28b99c3bff0e4b373568b8a388f2d6ed89e7a2027d/mlwizard-1.0.1.tar.gz",
    "platform": null,
    "description": "# MLWizard\r\n\r\nMLWizard is a Python machine learning library designed to simplify the process of data preparation, feature engineering, model building, and evaluation. It provides a collection of tools for both classification and regression tasks, as well as functionalities for data exploration and manipulation. MLWizard is a distribution of the [TechLeo](https://www.linkedin.com/in/techleo/) community with the aim of making the complex, easy.\r\n\r\n## Author\r\n\r\n**TechLeo**\r\n\r\n- **Email:** techleo.ng@outlook.com\r\n- **GitHub:** [TechLeo GitHub](https://github.com/TechLeo)\r\n- **LinkedIn:** [TechLeo LinkedIn](https://www.linkedin.com/in/techleo/)\r\n\r\n## Contact\r\n\r\nFor inquiries, suggestions, or feedback, please feel free to reach out to the author:\r\n\r\n- **Email:** techleo.ng@outlook.com\r\n- **GitHub Issues:** [MLWizard Issues](https://github.com/TechLeo-Libraries/mlwizard/issues)\r\n- **LinkedIn Messages:** [TechLeo LinkedIn](https://www.linkedin.com/in/techleo/)\r\n\r\nYour feedback is valuable and contributes to the continuous improvement of MLWizard. 
The author welcomes collaboration and looks forward to hearing from the users of MLWizard.\r\n\r\n\r\n## Features\r\nFeatures from current release\r\n\r\n### Data Loading and Handling\r\n- `get_dataset`: Load a dataset.\r\n- `get_training_test_data`: Split the dataset into training and test sets.\r\n- `load_large_dataset`: Load a large dataset efficiently.\r\n- `reduce_data_memory_useage`: Reduce memory usage of the dataset.\r\n\r\n### Data Cleaning and Manipulation\r\n- `drop_columns`: Drop specified columns from the dataset.\r\n- `fix_missing_values`: Handle missing values in the dataset.\r\n- `fix_unbalanced_dataset`: Address class imbalance in a classification dataset.\r\n- `filter_data`: Filter data based on specified conditions.\r\n- `remove_duplicates`: Remove duplicate rows from the dataset.\r\n- `rename_columns`: Rename columns in the dataset.\r\n- `replace_values`: Replace specified values in the dataset.\r\n- `reset_index`: Reset the index of the dataset.\r\n- `set_index`: Set a specific column as the index.\r\n- `sort_index`: Sort the index of the dataset.\r\n- `sort_values`: Sort the values of the dataset.\r\n\r\n### Data Formatting and Transformation\r\n- `categorical_to_datetime`: Convert categorical columns to datetime format.\r\n- `categorical_to_numerical`: Convert categorical columns to numerical format.\r\n- `numerical_to_categorical`: Convert numerical columns to categorical format.\r\n- `column_binning`: Bin values in a column into specified bins.\r\n\r\n### Exploratory Data Analysis\r\n- `eda`: Perform exploratory data analysis on the dataset.\r\n- `eda_visual`: Visualize exploratory data analysis results.\r\n- `pandas_profiling`: Generate a Pandas Profiling report for the dataset.\r\n- `sweetviz_profile_report`: Generate a Sweetviz Profile Report for the dataset.\r\n- `count_column_categories`: Count the categories in a categorical column.\r\n- `unique_elements_in_columns`: Get the unique elements that exist in each column in the 
dataset.\r\n\r\n### Feature Engineering\r\n- `extract_date_features`: Extract date-related features from a datetime column.\r\n- `polyreg_x`: Get the polynomial regression x for independent variables after specifying the degree.\r\n- `select_features`: Select relevant features for modeling.\r\n- `select_dependent_and_independent`: Select dependent and independent variables.\r\n\r\n### Data Preprocessing\r\n- `scale_independent_variables`: Scale independent variables in the dataset.\r\n- `remove_outlier`: Remove outliers from the dataset.\r\n- `split_data`: Split the dataset into training and test sets.\r\n\r\n### Model Building and Evaluation\r\n- `get_bestK_KNNregressor`: Find the best K value for KNN regression.\r\n- `train_model_regressor`: Train a regression model.\r\n- `regressor_predict`: Make predictions using a regression model.\r\n- `regressor_evaluation`: Evaluate the performance of a regression model.\r\n- `regressor_model_testing`: Test a regression model.\r\n- `polyreg_graph`: Visualize a polynomial regression graph.\r\n- `simple_linregres_graph`: Visualize a regression graph.\r\n- `build_multiple_regressors`: Build multiple regression models.\r\n- `build_multiple_regressors_from_features`: Build regression models using selected features.\r\n- `build_single_regressor_from_features`: Build a single regression model using selected features.\r\n- `get_bestK_KNNclassifier`: Find the best K value for KNN classification.\r\n- `train_model_classifier`: Train a classification model.\r\n- `classifier_predict`: Make predictions using a classification model.\r\n- `classifier_evaluation`: Evaluate the performance of a classification model.\r\n- `classifier_model_testing`: Test a classification model.\r\n- `classifier_graph`: Visualize a classification graph.\r\n- `build_multiple_classifiers`: Build multiple classification models.\r\n- `build_multiple_classifiers_from_features`: Build classification models using selected features.\r\n- 
`build_single_classifier_from_features`: Build a single classification model using selected features.\r\n\r\n### Data Aggregation and Summarization\r\n- `group_data`: Group and summarize data based on specified conditions.\r\n\r\n### Data Type Handling\r\n- `select_datatype`: Select columns of a specific datatype in the dataset.\r\n\r\n## Installation\r\n\r\nYou can install MLWizard using pip:\r\n\r\n```bash\r\npip install mlwizard\r\n```\r\n\r\n\r\n## Useage\r\nfrom mlwizard import SupervisedLearning\r\n\r\n# Example usage\r\n```bash\r\nimport numpy as np\r\nimport pandas as pd\r\nfrom sklearn.linear_model import LogisticRegression\r\nfrom sklearn.ensemble import RandomForestClassifier, DecisionTreeClassifier\r\nfrom sklearn.snm import SVC\r\n\r\n\r\ndataset = pd.read_csv(\"Your_file_path\")  # Load your dataset(e.g Pandas DataFrame)\r\ndata = SupervisedLearning(dataset)\r\n\r\n# Exploratory Data Analysis\r\neda = data.eda()\r\neda_visual = data.eda_visual()\r\n\r\n# Build and Evaluate Classifier\r\nclassifiers = [\"LogisticRegression(random_state = 0)\", \"RandomForestClassifier(random_state = 0)\", \"DecisionTreeClassifier(random_state = 0)\", \"SVC()\"]\r\nbuild_model = data.build_multiple_classifiers()\r\n```\r\n\r\n## Acknowledgments\r\nMLWizard relies on several open-source libraries to provide its functionality. 
We would like to express our gratitude to the developers and contributors of the following libraries:\r\n\r\n- [NumPy](https://numpy.org/)\r\n- [Pandas](https://pandas.pydata.org/)\r\n- [Matplotlib](https://matplotlib.org/)\r\n- [Seaborn](https://seaborn.pydata.org/)\r\n- [yData Profiling](https://github.com/ydataai/ydata-profiling)\r\n- [Sweetviz](https://github.com/fbdesignpro/sweetviz)\r\n- [Imbalanced-Learn (imblearn)](https://imbalanced-learn.org/)\r\n- [Scikit-learn](https://scikit-learn.org/)\r\n- [Warnings](https://docs.python.org/3/library/warnings.html)\r\n- [Datatable](https://datatable.readthedocs.io/en/latest/)\r\n\r\nThe MLWizard library builds upon the functionality provided by these excellent tools, We sincerely thank the maintainers and contributors of these libraries for their valuable contributions to the open-source community.\r\n\r\n\r\n## License\r\nMLWizard is distributed under the MIT License. Feel free to use, modify, and distribute it according to the terms of the license.\r\n\r\n\r\n## Changelog\r\n\r\n### v1.0.1 (January 2024):\r\n\r\n- First release\r\n\r\n\r\n## Contributors\r\nWe'd like to express our gratitude to the following contributors that have influenced and supported MLWizard:\r\n\r\n- [Onyiriuba Leonard](https://www.linkedin.com/in/chukwubuikem-leonard-onyiriuba/): for overseeing the entire project development lifecycle.\r\n- Role: Project Lead and Maintainer.\r\n- Email: workwithtechleo@gmail.com.\r\n<br>\r\n\r\n\r\n- [The TechLeo Community](https://www.linkedin.com/in/techleo/): for allowing the use of this project as a way to explain, learn, test, understand, and make easy, the machine learning process. \r\n- Role: Testers.\r\n- Email: techleo.ng@gmail.com.\r\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Let's make building machine learning models the complex way, easy.",
    "version": "1.0.1",
    "project_urls": null,
    "split_keywords": [
        "machine learning",
        "data science",
        "data preprocessing",
        "supervised learning",
        "data exploration",
        "ml framework",
        "data cleaning",
        "regression",
        "classification",
        "machine learning toolkit"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ed80f2d2ea1212bd4183aea150bf68e37e099e28c2cc3b9ec6671d8533b359bc",
                "md5": "2572f9bbe9c3a2d7b052f8da9c7d46e6",
                "sha256": "b688e8e33f3d20f03cda11744f7e6de3a727fb11e51c932c636afa26590f3b20"
            },
            "downloads": -1,
            "filename": "mlwizard-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2572f9bbe9c3a2d7b052f8da9c7d46e6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.0",
            "size": 49155,
            "upload_time": "2024-01-08T03:22:35",
            "upload_time_iso_8601": "2024-01-08T03:22:35.475752Z",
            "url": "https://files.pythonhosted.org/packages/ed/80/f2d2ea1212bd4183aea150bf68e37e099e28c2cc3b9ec6671d8533b359bc/mlwizard-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6a4c800157480f3f36a3bc28b99c3bff0e4b373568b8a388f2d6ed89e7a2027d",
                "md5": "0fe8a61be123b1aefa16e88d7cfb31e7",
                "sha256": "af15b185f33097bbbf62fbe3bb3791f4ad17f00c8f4c4e6b24a022365309924d"
            },
            "downloads": -1,
            "filename": "mlwizard-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0fe8a61be123b1aefa16e88d7cfb31e7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.0",
            "size": 48481,
            "upload_time": "2024-01-08T03:22:36",
            "upload_time_iso_8601": "2024-01-08T03:22:36.611783Z",
            "url": "https://files.pythonhosted.org/packages/6a/4c/800157480f3f36a3bc28b99c3bff0e4b373568b8a388f2d6ed89e7a2027d/mlwizard-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-08 03:22:36",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "mlwizard"
}
        