# Machine Learning Classification Python Package
![Python](https://img.shields.io/badge/Python-3776AB?style=for-the-badge&logo=python&logoColor=white)
![Scikit-Learn](https://img.shields.io/badge/scikit_learn-F7931E?style=for-the-badge&logo=scikit-learn&logoColor=white)
## Overview
This Python package provides a comprehensive solution for performing classification tasks using various popular machine learning algorithms. It allows you to read a dataset, preprocess it, train multiple classifiers, perform hyperparameter tuning, and visualize model performance. Additionally, it provides options for scaling data, saving trained models, and customizing the output display.
## Features
1. **Classification Algorithms**:
- Logistic Regression
- K-Nearest Neighbors
- Decision Tree
- Random Forest
- Gradient Boosting
- Support Vector Classifier
- Gaussian Naive Bayes
- Bernoulli Naive Bayes
2. **Advanced Functionality**:
- **Data Scaling**: Options for Min-Max Scaling or Standard Normalization.
- **Hyperparameter Tuning**: Option to perform Grid Search for finding the best model parameters.
- **Model Persistence**: Save and load trained models using `joblib` (a loading sketch follows this list).
- **Visualization**: Option to plot confusion matrices and print detailed classification reports.
- **Cross-Validation**: Evaluate models using cross-validation scores.
- **Configurable Outputs**: Options to control the display of confusion matrices and classification reports.
3. **Results**:
- Returns a DataFrame with model names, accuracy, F1-score, and optionally the best hyperparameters.
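For the model-persistence option above, here is a minimal sketch of reloading a saved model with `joblib`. The file name and the feature values are hypothetical placeholders; check which files the package actually writes when `save_models=True`.

```python
import joblib

# Hypothetical file name -- inspect the files the package writes when
# save_models=True (typically one file per trained classifier).
model = joblib.load("RandomForestClassifier.pkl")

# Reuse the reloaded model on new samples (feature values are made up here
# to match the 8 input columns of the diabetes example in the Usage section).
new_samples = [[2, 120, 70, 30, 80, 32.0, 0.45, 29]]
print(model.predict(new_samples))
```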
## Parameters
The package takes the following parameters as input:
- `dataset`: Path to a CSV or Excel dataset file, or a pandas DataFrame (a DataFrame example follows this list).
- `output_column`: Name of the output column containing the target variable.
- `train_test_ratio`: Proportion used to split the dataset into train and test sets (must be between 0 and 1).
- `scaling_method` (optional): Method to scale the data ('minmax' or 'normalize').
- `perform_grid_search` (optional): Whether to perform grid search for hyperparameter tuning (default is `False`).
- `save_models` (optional): Whether to save trained models to disk (default is `False`).
- `show_confusion_matrix` (optional): Whether to display confusion matrix plots (default is `False`).
- `show_classification_report` (optional): Whether to print classification reports (default is `False`).
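Because `dataset` also accepts a pandas DataFrame, you can preprocess the data yourself before handing it over. A minimal sketch, under the assumption that the optional arguments can be passed by keyword using the names listed above:

```python
import pandas as pd
from classifier_agent import classifier_agent

# Load and clean the data yourself, then pass the DataFrame directly
# instead of a file path.
df = pd.read_csv("diabetes.csv")
df = df.dropna()

# Keyword name follows the parameter list above (an assumption, not
# verified against the function signature).
results = classifier_agent(df, "Outcome", 0.25, scaling_method="minmax")
print(results)
```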
## Installation
Make sure you have Python `3.10` or later installed on your system. You can install the package using pip:
```sh
pip install classifier_agent
```
## Usage
Here's an example of how to use the package:
```python
from classifier_agent import classifier_agent

dataset_path = "diabetes.csv"
output_column = "Outcome"
train_test_ratio = 0.25
scaling_method = "minmax"            # 'minmax' or 'normalize'
perform_grid_search = True           # Perform grid search for hyperparameter tuning
save_models = True                   # Save trained models to disk
show_confusion_matrix = True         # Plot confusion matrices
show_classification_report = True    # Print classification reports

results = classifier_agent(
    dataset_path,
    output_column,
    train_test_ratio,
    scaling_method,
    perform_grid_search,
    save_models,
    show_confusion_matrix,
    show_classification_report,
)
print(results)
```
## Example Output
The output is a DataFrame that looks like this:
| Classifier | Accuracy | F1-Score | Best Parameters |
|-------------------------|----------|----------|-----------------|
| KNeighborsClassifier | 0.78 | 0.76 | {'n_neighbors': 5, 'weights': 'uniform'} |
| LogisticRegression | 0.80 | 0.79 | {'C': 0.1, 'solver': 'liblinear'} |
| DecisionTreeClassifier | 0.72 | 0.70 | {'criterion': 'entropy', 'max_depth': 20} |
| RandomForestClassifier | 0.85 | 0.84 | {'n_estimators': 200, 'max_depth': 20} |
| GradientBoostingClassifier | 0.83 | 0.82 | {'n_estimators': 200, 'learning_rate': 0.1} |
| SVC | 0.81 | 0.80 | {'C': 1, 'kernel': 'rbf'} |
| GaussianNB | 0.75 | 0.73 | {} |
| BernoulliNB | 0.73 | 0.72 | {} |
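Since the return value is a plain pandas DataFrame, the usual pandas API applies. A small follow-up sketch, continuing from the Usage example and assuming the column names shown in the table above:

```python
# Rank the models by accuracy (column names as in the example table)
# and keep the best performers at the top.
best_first = results.sort_values(by="Accuracy", ascending=False)
print(best_first.head(3))

# Persist the comparison for later reference.
best_first.to_csv("model_comparison.csv", index=False)
```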
## Notes
- The package is actively developed and may receive updates.
- The project is developed with Python `3.10` and requires Python `3.10` or later.
- If you encounter any issues or have questions, feel free to contact me on [LinkedIn](https://www.linkedin.com/in/adnan-karol-aa1666179/).
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Publishing to PyPI
To publish this package to PyPI, follow these steps:
1. **Ensure Your Package is Ready:**
- Make sure your `setup.py` and `README.md` are correctly configured.
- Verify that your package is properly structured and tested.
2. **Create Distribution Archives:**
Run the following command to create distribution archives of your package (note that `setup.py`-based builds are deprecated in current setuptools; `python -m build`, from the PyPA `build` package, is the recommended replacement):
```sh
python setup.py sdist bdist_wheel
```
3. **Install Twine:**
If you haven't already, install Twine, a utility for publishing packages to PyPI:
```sh
pip install twine
```
4. **Upload to PyPI:**
Use Twine to upload your package to PyPI:
```sh
twine upload dist/*
```
You will be prompted for credentials. PyPI uploads require an API token, so enter `__token__` as the username and the token value (including the `pypi-` prefix) as the password.
5. **Verify Upload:**
After uploading, check your package on [PyPI](https://pypi.org/) to ensure it appears correctly.
For more detailed instructions, refer to the [PyPI documentation](https://packaging.python.org/tutorials/packaging-projects/).