gms


- Name: gms
- Version: 0.4.0
- Home page: https://github.com/plugg1N/gms-module
- Summary: General Model Selection Module
- Upload time: 2023-11-09 08:12:46
- Author: plugg1N (Nikita Zhamkov Dmitrievich)
- Requires Python: >=3.7
- Keywords: python, machine-learning, ml, models, ai
- Requirements: No requirements were recorded.

![](https://github.com/plugg1N/gms-module/blob/main/images/chart1.png?raw=true)
*Chart 1: Basic GMS Workflow*

- [This README.md in Russian](https://github.com/plugg1N/gms-module/blob/main/README_Russian.md) ❤️

# Brief Description

**<ins>General Model Selection Module</ins>** *(hereafter: GMS-Module)* is a simple yet neat model selection tool that helps machine learning developers find the most efficient model or pipeline for their specific task. *This project earned me 5 additional points on the IT General State Exam (ЕГЭ по информатике)* 😌

The user only needs to pass:
- *Models and/or pipelines* of their choice
- *Metrics* for evaluation
- *Pivot*, if a certain metric is more important than the others
- *Data* to train and evaluate on

The module automatically runs the evaluations, stores them, and gives a verbose description of each model's performance!

# Installation

To install GMSModule, make sure that python3 and pip are installed. Then, in a terminal, run:

```bash
pip install gms  # or: pip3 install gms
```

# How to use?

1. Make sure that all the variables GMS needs are prepared:
	- `mode`: a string naming your ML task: `'regression'` or `'classification'`
	- `include`: a list of model objects of your choice: `[LinearRegression(), SVR()]`
	- `metrics`: a list of metric names to evaluate on:
		- `classification = ['accuracy', 'f1-score', 'precision', 'recall', 'roc-auc']`
		- `regression = ['mae', 'mape', 'mse', 'rmse', 'r2-score']`
	- `data`: a list of your data to train/validate on: `[X_train, X_test, y_train, y_test]`
	- `pivot` (*optional*): one of the metric names passed in `metrics`, e.g. `'accuracy'` (the pivot is the metric that matters most for evaluation)

2. Import GMSModule into your project:

```python
from gms.GMSModule import GMSModule
```


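3. Create a **GMSModule** object with your data:
```python
# Example models from scikit-learn; pass the models or pipelines of your choice
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

GMSPipe = GMSModule(mode="classification",
	pivot='f1-score',
	metrics=['accuracy', 'f1-score'],
	include=[LogisticRegression(), RandomForestClassifier()],
	data=[X_train, X_test, y_train, y_test])
```

4. Use any of the methods provided:
```python
# best_model() returns the best model and its predictions
best_model, _ = GMSPipe.best_model()
print(best_model)
```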

```python
RandomForestClassifier()
```
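
The rules from step 1 can be checked before the object is built. The sketch below is only an illustration: `check_gms_args` and `VALID_METRICS` are hypothetical names that are not part of gms, and the metric names are copied from the lists above.

```python
# Hypothetical pre-flight check; `check_gms_args` is NOT part of gms.
# Metric names are copied from the lists in step 1.
VALID_METRICS = {
    "classification": ["accuracy", "f1-score", "precision", "recall", "roc-auc"],
    "regression": ["mae", "mape", "mse", "rmse", "r2-score"],
}


def check_gms_args(mode, metrics, data, pivot=None):
    """Raise ValueError if the arguments contradict the rules in step 1."""
    if mode not in VALID_METRICS:
        raise ValueError(f"mode must be one of {sorted(VALID_METRICS)}, got {mode!r}")
    unknown = [m for m in metrics if m not in VALID_METRICS[mode]]
    if unknown:
        raise ValueError(f"metrics not valid for {mode}: {unknown}")
    if len(data) != 4:
        raise ValueError("data must be [X_train, X_test, y_train, y_test]")
    if pivot is not None and pivot not in metrics:
        raise ValueError(f"pivot {pivot!r} must be one of the metrics passed")
    return True


# Placeholder data: four empty lists stand in for the real splits.
ok = check_gms_args("classification", ["accuracy", "f1-score"],
                    [[], [], [], []], pivot="f1-score")
```

Running a check like this first would surface a mistyped metric name or a pivot missing from `metrics` before any model is trained.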


# Why this module?

Every machine learning developer, especially after extensive data analysis, has to pick **the most accurate machine learning model**. Some engineers already know which model will fit best, either because the task is simple or because the right model is evident.

> **But some engineers might struggle with a BLIND choice between dozens, if not HUNDREDS, of ML models / pipelines that they have built. That's where GMS Module can help!**

The user doesn't have to write a custom function that evaluates each model one by one on their metrics. **The user just passes in the models and names the metrics of their choice**, and *voila!*

The user can then call `GMSModule.description()` to get verbose information about the models' evaluations and see which models perform better than the others.
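
For comparison, the kind of hand-written evaluation loop that GMS replaces might look like the sketch below. It is not GMS's code: the two toy models, the `accuracy` and `recall` helpers, and `evaluate_all` are hypothetical stand-ins written in plain Python so that the example runs without scikit-learn.

```python
# Illustrative only: a hand-rolled evaluation loop of the kind GMS automates.
# All names here (AlwaysOne, Threshold, evaluate_all) are hypothetical.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    positives = sum(t == 1 for t in y_true)
    hits = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    return hits / positives if positives else 0.0


class AlwaysOne:
    """Toy model: predicts class 1 for every sample."""
    def fit(self, X, y):
        return self

    def predict(self, X):
        return [1 for _ in X]


class Threshold:
    """Toy model: predicts 1 when the single feature exceeds the training mean."""
    def fit(self, X, y):
        self.cut = sum(X) / len(X)
        return self

    def predict(self, X):
        return [int(x > self.cut) for x in X]


def evaluate_all(models, metrics, data, pivot):
    """Fit every model, score it on every metric, and pick the best by pivot."""
    X_train, X_test, y_train, y_test = data
    scores = {}
    for model in models:
        preds = model.fit(X_train, y_train).predict(X_test)
        scores[type(model).__name__] = {
            name: fn(y_test, preds) for name, fn in metrics.items()
        }
    best = max(scores, key=lambda name: scores[name][pivot])
    return scores, best


# Tiny one-feature dataset: [X_train, X_test, y_train, y_test]
data = [[1, 2, 8, 9], [0, 3, 7, 10], [0, 0, 1, 1], [0, 0, 1, 1]]
scores, best = evaluate_all(
    [AlwaysOne(), Threshold()],
    {"accuracy": accuracy, "recall": recall},
    data,
    pivot="accuracy",
)
print(best)  # Threshold
```

Both toy models reach the same recall here, so the pivot metric is what decides the winner.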

Users can also store the results in variables for further use, as in this <ins>example</ins>:

```python
import pandas as pd

# Get predictions of the best model from the list
# (GMSPipe is the GMSModule object created earlier)
_, preds = GMSPipe.best_model()

# Data for the DataFrame: one id per prediction
data = {
	'id': range(len(preds)),
	'value': preds
}

# Create a DataFrame and save it as a CSV file
df = pd.DataFrame(data)
df.to_csv('submit.csv', index=False)
```

# Project History


This project started as a **fun side project** for experimenting with scikit-learn tools. It has helped me become more focused on programming overall and taught me how to write *my own PyPI module for others to use!*

> The idea was born on `16.10.2023`, and the first draft of the project was so inefficient that I had to rewrite almost everything.

The module used to re-evaluate every model each time I requested the evaluations. The evaluation code relied on chains of `if` statements, which looked hideous and unprofessional.

With the 5th version, finished on `20.10.2023`, everything changed. The re-evaluation problem was fixed, the module could catch the most obvious user-caused exceptions, and the `if` statements were replaced with neat dictionaries.
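
The two changes described above, a dictionary lookup instead of `if` chains and evaluating only once, can be sketched roughly as follows. This is not the module's actual source: `METRIC_FUNCS`, `CachedEvaluator` and its attributes are hypothetical names, and the metrics are written in plain Python.

```python
import math

# Hypothetical sketch: metric names map to functions, so no if/elif chain is needed.
METRIC_FUNCS = {
    "mae": lambda y, p: sum(abs(a - b) for a, b in zip(y, p)) / len(y),
    "mse": lambda y, p: sum((a - b) ** 2 for a, b in zip(y, p)) / len(y),
    "rmse": lambda y, p: math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)),
}


class CachedEvaluator:
    """Computes the requested metrics once and reuses the stored result."""

    def __init__(self, metrics, y_true, y_pred):
        self.metrics = metrics
        self.y_true = y_true
        self.y_pred = y_pred
        self._cache = None
        self.runs = 0  # counts real evaluations, for demonstration only

    def evaluations(self):
        if self._cache is None:
            self.runs += 1
            self._cache = {
                name: METRIC_FUNCS[name](self.y_true, self.y_pred)
                for name in self.metrics
            }
        return self._cache


ev = CachedEvaluator(["mae", "mse"], y_true=[1.0, 2.0, 3.0], y_pred=[1.0, 2.0, 5.0])
first = ev.evaluations()
second = ev.evaluations()  # served from the cache, no recomputation
print(first, ev.runs)
```

Adding a metric in this design means adding one dictionary entry, and repeated calls to `evaluations()` return the stored result instead of recomputing it.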

As of `22.10.2023`, I am creating the first version of this Markdown (README.md) file. The project is polished. All I need to do is:

- Create a module file (`*.py`)
- Write a bunch of documentation: this doc in Russian, code run-through, basic use-cases and much more!
- Get a license
- Post this module on PyPI

`21:54. 22.10.2023` I've posted my project to PyPI. Everything seems to work fine.

`23:28. 07.11.2023` I've released version 0.3.0, which fixes some bugs I've encountered. The module now has fewer bugs. New feature added: `GMSModule.to_df()` :)

`11:02. 09.11.2023` New version 0.4.0. Now you can get the predictions of each model provided! Most of the comments were removed because they were unnecessary.


# Quick Message

**I WON'T UPDATE THE PYPI FILES IN THIS REPO** because those commits:

1. Are unnecessary
2. Break auto-merge for git

So please don't look at files that are not related to the gms-module code itself!


# TO DO:

- Create a Markdown file for Usage description and examples

# Fixed:

- Fixed issue with `pivot = None` error
- Fixed issue with non-binary classification support 
- Added: `GMSModule.get_predictions()` function. Now you can evaluate each model provided!



# My Socials

- Full Name:  Nikita Zhamkov (Dmitrievich)
- Country, city:  Russia, Saint-Petersburg
- Phone number: +79119109210
- Email: nikitazhamkov@gmail.com
- GitHub: https://github.com/plugg1N
- Telegram: https://t.me/jeberkarawita

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/plugg1N/gms-module",
    "name": "gms",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "python machine-learning ml models ai",
    "author": "plugg1N (Nikita Zhamkov Dmitrievich)",
    "author_email": "nikitazhamkov@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/67/81/cd3087c6e1fc1fb6bf540c5ee525ec6f3193803532fb14ebedefdb2b5927/gms-0.4.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "General Model Selection Module",
    "version": "0.4.0",
    "project_urls": {
        "Documentation": "https://github.com/plugg1N/gms-module/blob/main/README.md",
        "Homepage": "https://github.com/plugg1N/gms-module",
        "Project_github": "https://github.com/plugg1N/gms-module"
    },
    "split_keywords": [
        "python",
        "machine-learning",
        "ml",
        "models",
        "ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9ea1ab385bfb5cb9609bc1af87d772246963ca1113b246a1b49492127fce0367",
                "md5": "dd06d5583d6f228684f7be2ff0a818e9",
                "sha256": "47426954d0fdde847b2e9adfa835d216f6fcffe50d0a9e4c636b9547be34870e"
            },
            "downloads": -1,
            "filename": "gms-0.4.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dd06d5583d6f228684f7be2ff0a818e9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 9388,
            "upload_time": "2023-11-09T08:12:43",
            "upload_time_iso_8601": "2023-11-09T08:12:43.862902Z",
            "url": "https://files.pythonhosted.org/packages/9e/a1/ab385bfb5cb9609bc1af87d772246963ca1113b246a1b49492127fce0367/gms-0.4.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6781cd3087c6e1fc1fb6bf540c5ee525ec6f3193803532fb14ebedefdb2b5927",
                "md5": "2cbbac7f6c27281a3639d3895c641950",
                "sha256": "f41cef6ac7a51ef55bb4741395b43b1e789c5759571a52119b32de5831387659"
            },
            "downloads": -1,
            "filename": "gms-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "2cbbac7f6c27281a3639d3895c641950",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 9113,
            "upload_time": "2023-11-09T08:12:46",
            "upload_time_iso_8601": "2023-11-09T08:12:46.811584Z",
            "url": "https://files.pythonhosted.org/packages/67/81/cd3087c6e1fc1fb6bf540c5ee525ec6f3193803532fb14ebedefdb2b5927/gms-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-09 08:12:46",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "plugg1N",
    "github_project": "gms-module",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "gms"
}
        