<div align="center">
<p>
<a align="center" href="" target="_blank">
<img
width="1280"
src="https://raw.githubusercontent.com/shyam1326/autopilotml/main/images/autopilotml.png">
</a>
</p>
[PyPI version](https://badge.fury.io/py/autopilotml)
<a href="https://pepy.tech/project/autopilotml"><img src="https://pepy.tech/badge/autopilotml" alt="total autopilotml downloads"></a>
[License: MIT](LICENSE)
[Open in Colab](https://colab.research.google.com/github/shyam1326/autopilotml/blob/main/autopilotml/research/autopilotml_examples.ipynb)
</div>
# Autopilotml
> Automated machine learning library for analytics
## Installation
- `pip install autopilotml`
- Requires Python 3.8 to 3.11 (`>=3.8,<3.12`).
## Usage
### Load data
```python
from autopilotml import load_data, load_database
# For CSV files (additional keyword arguments can be supplied via **kwargs)
df = load_data(path="dataset/titanic_train.csv", csv=True)
# For Excel files
df = load_data(path="dataset/titanic_train.xlsx", excel=True)
# To load data from a database
# Supported database types: 'sqlite', 'mysql', 'postgres', 'MongoDB'
df = load_database(database_type='sqlite', sqlite_db_path='database.db', query='select * from employee_table')
```
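
To try the SQLite example without an existing database, a small file can be created with Python's built-in `sqlite3` module. This is a minimal sketch: the helper names, the `employee_table` columns, and the rows are made up for illustration and are not part of autopilotml.

```python
import sqlite3


def create_sample_db(db_path="database.db"):
    """Create a tiny SQLite database with a hypothetical employee_table.

    Note: any existing employee_table in db_path is dropped first.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("DROP TABLE IF EXISTS employee_table")
        conn.execute(
            "CREATE TABLE employee_table "
            "(id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
        )
        conn.executemany(
            "INSERT INTO employee_table (name, salary) VALUES (?, ?)",
            [("Asha", 52000.0), ("Ben", 48000.0), ("Chen", 61000.0)],
        )
        conn.commit()
    finally:
        conn.close()
    return db_path


def read_rows(db_path, query="select * from employee_table"):
    """Run a query and return all rows, to check the file before loading it."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

After `create_sample_db()` has run, the `load_database` call above has a `database.db` file and an `employee_table` to query.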
### Data Preprocessing
```python
from autopilotml import preprocessing
# To change any value in a dictionary, pass the whole dictionary, not only the changed keys.
df = preprocessing(dataframe=df, label_column='Survived',
                   missing={
                       'type': 'impute',
                       'drop_columns': False,
                       'threshold': 0.25,
                       'strategy_numerical': 'knn',
                       'strategy_categorical': 'most_frequent',
                       'fill_value': None},
                   outlier={
                       'method': 'None',
                       'zscore_threshold': 3,
                       'iqr_threshold': 1.5,
                       'Lc': 0.05,
                       'Uc': 0.95,
                       'cap': False})
```
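
The `zscore_threshold` and `iqr_threshold` keys take their names from two common outlier rules. The sketch below shows those textbook rules in plain Python with the same default values. It illustrates the general technique only and is not autopilotml's internal code; the function names are hypothetical.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


def iqr_outliers(values, threshold=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], with k = threshold."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - threshold * iqr, q3 + threshold * iqr
    return [v for v in values if v < low or v > high]
```

Raising either threshold makes the rule more tolerant, so fewer points are flagged.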
### Data Transformation
```python
from autopilotml import transformation
# If target_transform is True, the function returns three objects:
# the dataframe, the feature encoder, and the target encoder.
# Otherwise it returns two objects: the dataframe and the feature encoder.
df, encoder = transformation(dataframe=df,
                             label_column='Survived',
                             type='ordinal',
                             target_transform=False,
                             cardinality=True,
                             Cardinality_threshold=0.3)
```
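
With `type = 'ordinal'`, categorical values are replaced by integer codes. A minimal sketch of the general idea in plain Python follows. The helper names are hypothetical, and the code ordering (sorted category order) and the `-1` code for unseen categories are assumptions made for illustration, not statements about autopilotml's encoder.

```python
def fit_ordinal(values):
    """Map each distinct category to an integer code (sorted order, assumed)."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


def apply_ordinal(values, mapping, unknown=-1):
    """Encode values with a fitted mapping; unseen categories get `unknown`."""
    return [mapping.get(v, unknown) for v in values]


# Example with a hypothetical categorical column
embarked = ["S", "C", "S", "Q"]
mapping = fit_ordinal(embarked)
encoded = apply_ordinal(embarked, mapping)
```

Keeping the fitted mapping is what allows the same codes to be applied to new data, which is presumably why `transformation` returns the encoder alongside the dataframe.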
### Scaling
```python
from autopilotml import scaling
# If target_scaling is True (applicable only to regression), the function returns three objects:
# the dataframe, the feature scaler, and the target scaler.
df, scaler = scaling(df, label_column='Survived', type='standard', target_scaling=False)
```
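
The name `'standard'` suggests standardization, which rescales a feature to zero mean and unit variance. The following is a minimal plain-Python sketch of that general technique under this assumption; the function names are hypothetical and do not come from autopilotml.

```python
import statistics


def fit_standard(values):
    """Return (mean, std) for a feature; std falls back to 1.0 if constant."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return mean, (std if std > 0 else 1.0)


def transform_standard(values, mean, std):
    """Standardize: subtract the mean, then divide by the standard deviation."""
    return [(v - mean) / std for v in values]


def inverse_standard(scaled, mean, std):
    """Undo the standardization to recover values in the original units."""
    return [z * std + mean for z in scaled]
```

The inverse step is why a fitted scaler is worth keeping: for a scaled regression target, predictions can be mapped back to the original units.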
### Feature Selection
```python
from autopilotml import feature_selection
df, selector = feature_selection(dataframe=df, label_column='Survived',
                                 estimator='RandomForestClassifier',
                                 type='rfe', max_features=10,
                                 min_features=2, scoring='accuracy',
                                 cv=5)
```
### Model Training
```python
from autopilotml import training
model = training(dataframe=df, label_column='Survived', model_name='SVC',
                 problem_type='Classification', target_scaler=None,
                 test_split=0.15, hypertune=True, n_epochs=100)
```
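
The `test_split = 0.15` argument presumably reserves about 15% of the rows for evaluation. A hold-out split of that kind can be sketched in plain Python as below. The function name, the shuffling, and the `seed` parameter are assumptions for illustration and do not describe how `training` splits the data internally.

```python
import random


def holdout_split(rows, test_split=0.15, seed=0):
    """Shuffle rows and return (train, test) with about test_split held out."""
    if not 0 < test_split < 1:
        raise ValueError("test_split must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_split))
    return shuffled[n_test:], shuffled[:n_test]
```

With 100 rows and the default fraction, this yields 85 training rows and 15 test rows.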
### MLflow: Track Model Training and Parameters
```python
# Notebook (Jupyter/Colab) cell syntax; from a terminal, run `mlflow ui` without the leading "!"
!mlflow ui
```
Raw data
{
"_id": null,
"home_page": "https://github.com/shyam1326",
"name": "autopilotml",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.8",
"maintainer_email": null,
"keywords": "autopilotml",
"author": "Shyam Prasath",
"author_email": "shshyam96@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/90/1f/e8235132eadca59508078830bc4d07a774fae988047796fdc75446030a7e/autopilotml-1.0.14.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A package for automating machine learning tasks",
"version": "1.0.14",
"project_urls": {
"Bug Reports": "https://github.com/shyam1326/autopilotml/issues",
"Documentation": "https://github.com/shyam1326/autopilotml/blob/main/README.md",
"Homepage": "https://github.com/shyam1326",
"Source": "https://github.com/shyam1326/autopilotml"
},
"split_keywords": [
"autopilotml"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f9461a039a56c60672c6fd84339804c3c06102d902ef563cd597e15675d45bd7",
"md5": "d5954e8b79b1b3da9aa7d1444cb36e6c",
"sha256": "68dd199f1c658fbc0d76728d66b15ab9a820a2a71bfd654762aabdf5930a0f30"
},
"downloads": -1,
"filename": "autopilotml-1.0.14-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d5954e8b79b1b3da9aa7d1444cb36e6c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.8",
"size": 208230,
"upload_time": "2024-07-16T17:30:01",
"upload_time_iso_8601": "2024-07-16T17:30:01.599670Z",
"url": "https://files.pythonhosted.org/packages/f9/46/1a039a56c60672c6fd84339804c3c06102d902ef563cd597e15675d45bd7/autopilotml-1.0.14-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "901fe8235132eadca59508078830bc4d07a774fae988047796fdc75446030a7e",
"md5": "ddb1ad77591ebd087bf073ec32f2a98c",
"sha256": "a4ec1d70b5f7473556b589b0ee81141010c9f70de4de4bad1de023bd1e8d125c"
},
"downloads": -1,
"filename": "autopilotml-1.0.14.tar.gz",
"has_sig": false,
"md5_digest": "ddb1ad77591ebd087bf073ec32f2a98c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.8",
"size": 202857,
"upload_time": "2024-07-16T17:30:13",
"upload_time_iso_8601": "2024-07-16T17:30:13.370870Z",
"url": "https://files.pythonhosted.org/packages/90/1f/e8235132eadca59508078830bc4d07a774fae988047796fdc75446030a7e/autopilotml-1.0.14.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-07-16 17:30:13",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "shyam1326",
"github_project": "autopilotml",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "alembic",
"specs": [
[
"==",
"1.12.1"
]
]
},
{
"name": "anyio",
"specs": [
[
"==",
"4.0.0"
]
]
},
{
"name": "appnope",
"specs": [
[
"==",
"0.1.3"
]
]
},
{
"name": "argon2-cffi",
"specs": [
[
"==",
"23.1.0"
]
]
},
{
"name": "argon2-cffi-bindings",
"specs": [
[
"==",
"21.2.0"
]
]
},
{
"name": "arrow",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "asttokens",
"specs": [
[
"==",
"2.4.1"
]
]
},
{
"name": "async-lru",
"specs": [
[
"==",
"2.0.4"
]
]
},
{
"name": "attrs",
"specs": [
[
"==",
"23.1.0"
]
]
},
{
"name": "Babel",
"specs": [
[
"==",
"2.13.1"
]
]
},
{
"name": "beautifulsoup4",
"specs": [
[
"==",
"4.12.2"
]
]
},
{
"name": "bleach",
"specs": [
[
"==",
"6.1.0"
]
]
},
{
"name": "blinker",
"specs": [
[
"==",
"1.6.3"
]
]
},
{
"name": "certifi",
"specs": [
[
"==",
"2023.7.22"
]
]
},
{
"name": "cffi",
"specs": [
[
"==",
"1.16.0"
]
]
},
{
"name": "charset-normalizer",
"specs": [
[
"==",
"3.3.1"
]
]
},
{
"name": "click",
"specs": [
[
"==",
"8.1.7"
]
]
},
{
"name": "cloudpickle",
"specs": [
[
"==",
"2.2.1"
]
]
},
{
"name": "colorlog",
"specs": [
[
"==",
"6.7.0"
]
]
},
{
"name": "comm",
"specs": [
[
"==",
"0.1.4"
]
]
},
{
"name": "contourpy",
"specs": [
[
"==",
"1.1.1"
]
]
},
{
"name": "cycler",
"specs": [
[
"==",
"0.12.1"
]
]
},
{
"name": "databricks-cli",
"specs": [
[
"==",
"0.18.0"
]
]
},
{
"name": "debugpy",
"specs": [
[
"==",
"1.8.0"
]
]
},
{
"name": "decorator",
"specs": [
[
"==",
"5.1.1"
]
]
},
{
"name": "defusedxml",
"specs": [
[
"==",
"0.7.1"
]
]
},
{
"name": "dnspython",
"specs": [
[
"==",
"2.4.2"
]
]
},
{
"name": "docker",
"specs": [
[
"==",
"6.1.3"
]
]
},
{
"name": "entrypoints",
"specs": [
[
"==",
"0.4"
]
]
},
{
"name": "exceptiongroup",
"specs": [
[
"==",
"1.1.3"
]
]
},
{
"name": "executing",
"specs": [
[
"==",
"2.0.1"
]
]
},
{
"name": "fastjsonschema",
"specs": [
[
"==",
"2.18.1"
]
]
},
{
"name": "Flask",
"specs": [
[
"==",
"3.0.0"
]
]
},
{
"name": "fonttools",
"specs": [
[
"==",
"4.43.1"
]
]
},
{
"name": "fqdn",
"specs": [
[
"==",
"1.5.1"
]
]
},
{
"name": "gitdb",
"specs": [
[
"==",
"4.0.11"
]
]
},
{
"name": "GitPython",
"specs": [
[
"==",
"3.1.40"
]
]
},
{
"name": "greenlet",
"specs": [
[
"==",
"3.0.1"
]
]
},
{
"name": "gunicorn",
"specs": [
[
"==",
"21.2.0"
]
]
},
{
"name": "idna",
"specs": [
[
"==",
"3.4"
]
]
},
{
"name": "importlib-metadata",
"specs": [
[
"==",
"6.8.0"
]
]
},
{
"name": "ipykernel",
"specs": [
[
"==",
"6.26.0"
]
]
},
{
"name": "ipython",
"specs": [
[
"==",
"8.17.2"
]
]
},
{
"name": "ipython-genutils",
"specs": [
[
"==",
"0.2.0"
]
]
},
{
"name": "ipywidgets",
"specs": [
[
"==",
"8.1.1"
]
]
},
{
"name": "isoduration",
"specs": [
[
"==",
"20.11.0"
]
]
},
{
"name": "itsdangerous",
"specs": [
[
"==",
"2.1.2"
]
]
},
{
"name": "jedi",
"specs": [
[
"==",
"0.19.1"
]
]
},
{
"name": "Jinja2",
"specs": [
[
"==",
"3.1.2"
]
]
},
{
"name": "joblib",
"specs": [
[
"==",
"1.3.2"
]
]
},
{
"name": "json5",
"specs": [
[
"==",
"0.9.14"
]
]
},
{
"name": "jsonpointer",
"specs": [
[
"==",
"2.4"
]
]
},
{
"name": "jsonschema",
"specs": [
[
"==",
"4.19.2"
]
]
},
{
"name": "jsonschema-specifications",
"specs": [
[
"==",
"2023.7.1"
]
]
},
{
"name": "jupyter",
"specs": [
[
"==",
"1.0.0"
]
]
},
{
"name": "jupyter-console",
"specs": [
[
"==",
"6.6.3"
]
]
},
{
"name": "jupyter-events",
"specs": [
[
"==",
"0.8.0"
]
]
},
{
"name": "jupyter-lsp",
"specs": [
[
"==",
"2.2.0"
]
]
},
{
"name": "jupyter_client",
"specs": [
[
"==",
"8.5.0"
]
]
},
{
"name": "jupyter_core",
"specs": [
[
"==",
"5.5.0"
]
]
},
{
"name": "jupyter_server",
"specs": [
[
"==",
"2.9.1"
]
]
},
{
"name": "jupyter_server_terminals",
"specs": [
[
"==",
"0.4.4"
]
]
},
{
"name": "jupyterlab",
"specs": [
[
"==",
"4.0.7"
]
]
},
{
"name": "jupyterlab-pygments",
"specs": [
[
"==",
"0.2.2"
]
]
},
{
"name": "jupyterlab-widgets",
"specs": [
[
"==",
"3.0.9"
]
]
},
{
"name": "jupyterlab_server",
"specs": [
[
"==",
"2.25.0"
]
]
},
{
"name": "kiwisolver",
"specs": [
[
"==",
"1.4.5"
]
]
},
{
"name": "Mako",
"specs": [
[
"==",
"1.2.4"
]
]
},
{
"name": "Markdown",
"specs": [
[
"==",
"3.5.1"
]
]
},
{
"name": "MarkupSafe",
"specs": [
[
"==",
"2.1.3"
]
]
},
{
"name": "matplotlib",
"specs": [
[
"==",
"3.8.0"
]
]
},
{
"name": "matplotlib-inline",
"specs": [
[
"==",
"0.1.6"
]
]
},
{
"name": "mistune",
"specs": [
[
"==",
"3.0.2"
]
]
},
{
"name": "mlflow",
"specs": [
[
"==",
"2.8.0"
]
]
},
{
"name": "mysql-connector-python",
"specs": [
[
"==",
"8.2.0"
]
]
},
{
"name": "nbclient",
"specs": [
[
"==",
"0.8.0"
]
]
},
{
"name": "nbconvert",
"specs": [
[
"==",
"7.10.0"
]
]
},
{
"name": "nbformat",
"specs": [
[
"==",
"5.9.2"
]
]
},
{
"name": "nest-asyncio",
"specs": [
[
"==",
"1.5.8"
]
]
},
{
"name": "notebook",
"specs": [
[
"==",
"7.0.6"
]
]
},
{
"name": "notebook_shim",
"specs": [
[
"==",
"0.2.3"
]
]
},
{
"name": "numpy",
"specs": [
[
"==",
"1.26.1"
]
]
},
{
"name": "oauthlib",
"specs": [
[
"==",
"3.2.2"
]
]
},
{
"name": "optuna",
"specs": [
[
"==",
"3.4.0"
]
]
},
{
"name": "overrides",
"specs": [
[
"==",
"7.4.0"
]
]
},
{
"name": "packaging",
"specs": [
[
"==",
"23.2"
]
]
},
{
"name": "pandas",
"specs": [
[
"==",
"2.1.2"
]
]
},
{
"name": "pandocfilters",
"specs": [
[
"==",
"1.5.0"
]
]
},
{
"name": "parso",
"specs": [
[
"==",
"0.8.3"
]
]
},
{
"name": "pexpect",
"specs": [
[
"==",
"4.8.0"
]
]
},
{
"name": "Pillow",
"specs": [
[
"==",
"10.1.0"
]
]
},
{
"name": "platformdirs",
"specs": [
[
"==",
"3.11.0"
]
]
},
{
"name": "prometheus-client",
"specs": [
[
"==",
"0.18.0"
]
]
},
{
"name": "prompt-toolkit",
"specs": [
[
"==",
"3.0.39"
]
]
},
{
"name": "protobuf",
"specs": [
[
"==",
"4.21.12"
]
]
},
{
"name": "psutil",
"specs": [
[
"==",
"5.9.6"
]
]
},
{
"name": "psycopg2-binary",
"specs": [
[
"==",
"2.9.9"
]
]
},
{
"name": "ptyprocess",
"specs": [
[
"==",
"0.7.0"
]
]
},
{
"name": "pure-eval",
"specs": [
[
"==",
"0.2.2"
]
]
},
{
"name": "pyarrow",
"specs": [
[
"==",
"13.0.0"
]
]
},
{
"name": "pycparser",
"specs": [
[
"==",
"2.21"
]
]
},
{
"name": "Pygments",
"specs": [
[
"==",
"2.16.1"
]
]
},
{
"name": "PyJWT",
"specs": [
[
"==",
"2.8.0"
]
]
},
{
"name": "pymongo",
"specs": [
[
"==",
"4.5.0"
]
]
},
{
"name": "pyparsing",
"specs": [
[
"==",
"3.1.1"
]
]
},
{
"name": "python-dateutil",
"specs": [
[
"==",
"2.8.2"
]
]
},
{
"name": "python-json-logger",
"specs": [
[
"==",
"2.0.7"
]
]
},
{
"name": "pytz",
"specs": [
[
"==",
"2023.3.post1"
]
]
},
{
"name": "PyYAML",
"specs": [
[
"==",
"6.0.1"
]
]
},
{
"name": "pyzmq",
"specs": [
[
"==",
"25.1.1"
]
]
},
{
"name": "qtconsole",
"specs": [
[
"==",
"5.4.4"
]
]
},
{
"name": "QtPy",
"specs": [
[
"==",
"2.4.1"
]
]
},
{
"name": "querystring-parser",
"specs": [
[
"==",
"1.2.4"
]
]
},
{
"name": "referencing",
"specs": [
[
"==",
"0.30.2"
]
]
},
{
"name": "requests",
"specs": [
[
"==",
"2.31.0"
]
]
},
{
"name": "rfc3339-validator",
"specs": [
[
"==",
"0.1.4"
]
]
},
{
"name": "rfc3986-validator",
"specs": [
[
"==",
"0.1.1"
]
]
},
{
"name": "rpds-py",
"specs": [
[
"==",
"0.10.6"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
"==",
"1.3.2"
]
]
},
{
"name": "scipy",
"specs": [
[
"==",
"1.11.3"
]
]
},
{
"name": "Send2Trash",
"specs": [
[
"==",
"1.8.2"
]
]
},
{
"name": "six",
"specs": [
[
"==",
"1.16.0"
]
]
},
{
"name": "smmap",
"specs": [
[
"==",
"5.0.1"
]
]
},
{
"name": "sniffio",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "soupsieve",
"specs": [
[
"==",
"2.5"
]
]
},
{
"name": "SQLAlchemy",
"specs": [
[
"==",
"2.0.22"
]
]
},
{
"name": "sqlparse",
"specs": [
[
"==",
"0.4.4"
]
]
},
{
"name": "stack-data",
"specs": [
[
"==",
"0.6.3"
]
]
},
{
"name": "streamlit",
"specs": []
},
{
"name": "tabulate",
"specs": [
[
"==",
"0.9.0"
]
]
},
{
"name": "terminado",
"specs": [
[
"==",
"0.17.1"
]
]
},
{
"name": "threadpoolctl",
"specs": [
[
"==",
"3.2.0"
]
]
},
{
"name": "tinycss2",
"specs": [
[
"==",
"1.2.1"
]
]
},
{
"name": "tomli",
"specs": [
[
"==",
"2.0.1"
]
]
},
{
"name": "tornado",
"specs": [
[
"==",
"6.3.3"
]
]
},
{
"name": "tqdm",
"specs": [
[
"==",
"4.66.1"
]
]
},
{
"name": "traitlets",
"specs": [
[
"==",
"5.13.0"
]
]
},
{
"name": "types-python-dateutil",
"specs": [
[
"==",
"2.8.19.14"
]
]
},
{
"name": "typing_extensions",
"specs": [
[
"==",
"4.8.0"
]
]
},
{
"name": "tzdata",
"specs": [
[
"==",
"2023.3"
]
]
},
{
"name": "uri-template",
"specs": [
[
"==",
"1.3.0"
]
]
},
{
"name": "urllib3",
"specs": [
[
"==",
"2.0.7"
]
]
},
{
"name": "wcwidth",
"specs": [
[
"==",
"0.2.9"
]
]
},
{
"name": "webcolors",
"specs": [
[
"==",
"1.13"
]
]
},
{
"name": "webencodings",
"specs": [
[
"==",
"0.5.1"
]
]
},
{
"name": "websocket-client",
"specs": [
[
"==",
"1.6.4"
]
]
},
{
"name": "Werkzeug",
"specs": [
[
"==",
"3.0.1"
]
]
},
{
"name": "widgetsnbextension",
"specs": [
[
"==",
"4.0.9"
]
]
},
{
"name": "xgboost",
"specs": [
[
"==",
"2.0.1"
]
]
},
{
"name": "zipp",
"specs": [
[
"==",
"3.17.0"
]
]
}
],
"lcname": "autopilotml"
}