![](https://github.com/plugg1N/gms-module/blob/main/images/chart1.png?raw=true)
*Chart 1: Basic GMS Workflow*
- [This README.md in Russian](https://github.com/plugg1N/gms-module/blob/main/README_Russian.md) ❤️
# Brief Description
**<ins>General Model Selection Module</ins>** *(hereafter: GMS-Module)* is a simple yet neat model selection tool that helps machine learning developers find the most efficient model/pipeline for their specific task. *This project earned me 5 additional points for the IT Unified State Exam (ЕГЭ по информатике)* 😌
The user only needs to pass:
- *Models AND/OR Pipelines* of their choice
- *Metrics* for evaluation
- *Pivot*, if a certain metric is more important than the others
- *Data* to train and evaluate on
The module automatically runs the evaluations, stores them, and gives a verbose description of each model's performance!
# Installation
To install GMSModule, make sure that python3 and pip are installed. Then type in a terminal:
`pip install gms` OR `pip3 install gms`
```bash
pip3 install gms
```
# How to use?
1. Make sure that all the variables are ready to be used by GMS:
    - `mode`: A string naming your ML task: `'regression'` OR `'classification'`
    - `include`: A list of model objects of your choice: `[LinearRegression(), SVR()]`
    - `metrics`: A list of metric names (strings) to evaluate on:
        - `classification = ['accuracy', 'f1-score', 'precision', 'recall', 'roc-auc']`
        - `regression = ['mae', 'mape', 'mse', 'rmse', 'r2-score']`
    - `data`: A list of your data to train/validate on: `[X_train, X_test, y_train, y_test]`
    - `pivot`: *optional*: A string naming one of the metrics provided, e.g. `'accuracy'` (the pivot is the metric that matters most for evaluation)
2. Import GMSModule into your project:
```python
from gms.GMSModule import GMSModule
```
3. Create a **GMSModule** object with your data:
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

GMSPipe = GMSModule(mode="classification",
                    pivot='f1-score',
                    metrics=['accuracy', 'f1-score'],
                    include=[LogisticRegression(), RandomForestClassifier()],
                    data=[X_train, X_test, y_train, y_test])
```
4. Use any of the methods provided:
```python
best_model, _ = GMSPipe.best_model()
print(best_model)
```
```python
RandomForestClassifier()
```
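For completeness, here is a minimal sketch of how the inputs from step 1 could be prepared with scikit-learn. The synthetic dataset, the split ratio, and the model settings are assumptions chosen for illustration; replace them with your own data and models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (a stand-in for your own dataset)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hold out 25% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The arguments described in step 1
mode = "classification"
metrics = ["accuracy", "f1-score"]
pivot = "f1-score"  # must be one of the names listed in `metrics`
include = [LogisticRegression(max_iter=1000), RandomForestClassifier(random_state=0)]
data = [X_train, X_test, y_train, y_test]
```

These variables can then be passed to `GMSModule(mode=mode, pivot=pivot, metrics=metrics, include=include, data=data)` as shown in the steps above.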
# Why this module?
Every Machine Learning developer, especially after extensive data analysis, has to pick **the best-performing Machine Learning model**. Some engineers already know which model will fit best, either because the task is simple or because the right model is evident.
> **But some engineers might struggle with the BLIND choice between dozens if not HUNDREDS of ML models / pipelines that they have built. That's where GMS Module could help!**
The user doesn't have to build a custom function that evaluates each model one by one on their metrics. **The user just has to pass in the models and name the metrics of their choice** and *voilà!*
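To make the comparison concrete, this is roughly the kind of hand-written evaluation loop that would otherwise be needed. It is only a sketch built on plain scikit-learn, not the code GMS runs internally; `evaluate_models` and `metric_functions` are hypothetical names, and the synthetic dataset is an assumption for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic data standing in for a real dataset
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Map metric names to scikit-learn scoring functions (hypothetical helper dict)
metric_functions = {"accuracy": accuracy_score, "f1-score": f1_score}


def evaluate_models(models, metric_functions, X_train, X_test, y_train, y_test):
    """Fit every model once and score its test-set predictions on every metric."""
    results = {}
    for model in models:
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        results[type(model).__name__] = {
            name: score(y_test, predictions)
            for name, score in metric_functions.items()
        }
    return results


results = evaluate_models(
    [LogisticRegression(max_iter=1000), RandomForestClassifier(random_state=0)],
    metric_functions, X_train, X_test, y_train, y_test,
)
for model_name, scores in results.items():
    print(model_name, scores)
```

Every new metric or model means touching this function again, which is the boilerplate the module is meant to remove.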
Then the user can look at `GMSModule.description()` to get verbose information about the models' evaluations and see which models are better than the others.
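The role of the pivot in such a comparison can be sketched in a few lines of plain Python. This illustrates the idea only and is not GMS's actual ranking logic: `rank_by_pivot` is a hypothetical helper, the scores are made up, and falling back to the mean of all metrics when no pivot is given is an assumption of this sketch.

```python
def rank_by_pivot(results, pivot=None):
    """Return model names sorted from best to worst.

    With a pivot, models are ordered by that single metric.
    Without one, they are ordered by the mean of all their metrics
    (an assumption made for this sketch).
    """
    def sort_key(name):
        scores = results[name]
        if pivot is not None:
            return scores[pivot]
        return sum(scores.values()) / len(scores)

    return sorted(results, key=sort_key, reverse=True)


# Made-up scores, for illustration only
example_results = {
    "LogisticRegression": {"accuracy": 0.90, "f1-score": 0.80},
    "RandomForestClassifier": {"accuracy": 0.88, "f1-score": 0.86},
}

print(rank_by_pivot(example_results, pivot="accuracy"))
print(rank_by_pivot(example_results, pivot="f1-score"))
```

With these numbers the two pivots disagree about the winner, which is exactly the case where naming a pivot matters. The sketch assumes higher scores are better; error metrics such as `'mae'` or `'rmse'` would need the opposite sort order.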
Users can also store the results in variables for further use, as in this <ins>example</ins>:
```python
import pandas as pd

# Get the predictions of the best model from the list
# (GMSPipe is the GMSModule object created in "How to use?")
_, preds = GMSPipe.best_model()

# Build one row per prediction
data = {
    'id': range(len(preds)),
    'value': preds
}

# Create a DataFrame and write it to a CSV file
df = pd.DataFrame(data)
df.to_csv('submit.csv', index=False)
```
# Project History
This project was created as a **fun side project for me** to experiment with scikit-learn tools. The project has helped me become more focused on programming overall and taught me how to write *my own PyPI module for others to use!*
> The idea was born on `16.10.2023`, and the first draft of the project was so inefficient that I had to rewrite almost everything.
The module used to re-evaluate every model each time I tried to get the evaluations. The evaluations relied on 'if-statements', which looked hideous and unprofessional.
With the 5th version, done on `20.10.2023`, everything changed. The re-evaluation problem was fixed, the module could catch the most obvious exceptions caused by the user, and the 'if-statements' were replaced with neat dictionaries.
As of `22.10.2023`, I am creating the first version of this Markdown (README.md) file. The project is polished. All I need to do is:
- Create a module file (`*.py`)
- Write a bunch of documentation: this doc in Russian, code run-through, basic use-cases and much more!
- Get a license
- Post this module on PyPI
`21:54. 22.10.2023` I've posted my project to PyPI. Everything seems to work fine.
`23:28. 07.11.2023` I've created a new 0.3.0 version that fixes some bugs I encountered. Now the module has fewer bugs. New feature added: `GMSModule.to_df()` :)
`11:02. 09.11.2023` New version 0.4.0. Now you can get the predictions of each model provided! Most of the comments were removed because they were unnecessary.
# Quick Message
**I WON'T UPDATE THE PYPI FILES IN THIS REPO** because those commits:
1. Are unnecessary
2. Break auto-merge for git
So please, don't look at files that are not related to the gms-module code itself!
# TO DO:
- Create a Markdown file for Usage description and examples
# Fixed:
- Fixed issue with `pivot = None` error
- Fixed issue with non-binary classification support
- Added: `GMSModule.get_predictions()` function. Now you can evaluate each model provided!
# My Socials
- Full Name: Nikita Zhamkov (Dmitrievich)
- Country, city: Russia, Saint-Petersburg
- Phone number: +79119109210
- Email: nikitazhamkov@gmail.com
- GitHub: https://github.com/plugg1N
- Telegram: https://t.me/jeberkarawita