# Keras Cross-validation
`keras-model-cv` lets you cross-validate a `keras` model.
## Installation
```shell
pip install keras-model-cv
```
or
```shell
pip install git+https://github.com/dubovikmaster/keras-model-cv.git
```
## Quickstart
```python
from keras_model_cv import KerasCV
from sklearn.model_selection import KFold
import tensorflow as tf

tf.get_logger().setLevel("INFO")

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()


def build_model(hidden_units, dropout):
    model = tf.keras.models.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(hidden_units, activation="relu"),
            tf.keras.layers.Dropout(dropout),
            tf.keras.layers.Dense(10),
        ]
    )
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model


PARAMS = {'hidden_units': 16, 'dropout': .3}

if __name__ == '__main__':
    cv = KerasCV(
        build_model,
        KFold(n_splits=3, random_state=1234, shuffle=True),
        PARAMS,
        preprocessor=tf.keras.layers.Normalization(),
        save_history=True,
        directory='my_awesome_project',
        name='my_cv',
    )
    cv.fit(x_train, y_train, verbose=0, epochs=3)
    print(cv.get_cv_score())
```
```python
Out:
          loss  accuracy
mean  0.283194  0.919783
std   0.004215  0.002887
```
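To see what the `KFold` splitter from the Quickstart hands out, the sketch below runs it on its own over a stand-in index array of 60,000 rows (the size of the MNIST training set). It only illustrates the splitter from scikit-learn; how `KerasCV` consumes these indices internally is not shown here and is not assumed.

```python
import numpy as np
from sklearn.model_selection import KFold

# Stand-in for the 60,000 MNIST training samples; only the row count matters here.
n_samples = 60_000
indices = np.arange(n_samples)

splitter = KFold(n_splits=3, random_state=1234, shuffle=True)

# Each item is a (train_indices, validation_indices) pair for one split.
folds = list(splitter.split(indices))

for split, (train_idx, val_idx) in enumerate(folds):
    print(f"split {split}: train={len(train_idx)}, validation={len(val_idx)}")
```

With three splits, every sample lands in exactly one validation fold, so each model is trained on two thirds of the data and scored on the remaining third.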
You can pass other aggregation functions via `agg_func` (for details, see [pandas.DataFrame.agg](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.agg.html)):
```python
print(cv.get_cv_score(agg_func={'loss': min, 'accuracy': max}))
```
```python
Out:
loss        0.27959
accuracy    0.92010
```
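As a rough illustration of what a per-column aggregation dict does, here is plain `pandas.DataFrame.agg` applied to a hand-built frame holding the per-split scores shown later in this README. This is a sketch of the pandas behaviour only; that `get_cv_score` forwards `agg_func` to `DataFrame.agg` in exactly this way is an assumption based on the link above.

```python
import pandas as pd

# Per-split scores copied from the `get_split_scores()` output in this README.
split_scores = pd.DataFrame(
    {
        "accuracy": [0.9201, 0.9198, 0.9173],
        "loss": [0.282442, 0.290500, 0.279590],
        "split": [0, 1, 2],
    }
)

# A dict maps each column to its own aggregation: best (lowest) loss, best (highest) accuracy.
best = split_scores.agg({"loss": "min", "accuracy": "max"})
print(best)
```

String names such as `"min"` and `"max"` are used here instead of the Python built-ins; both forms are accepted by pandas.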
You can also get the full training history for every split as a `pandas` DataFrame:
```python
cv.get_train_history()
```
```python
Out:
       loss  accuracy  split  epochs
0  0.957261  0.679375      0       1
1  0.595646  0.809850      0       2
2  0.541124  0.824850      0       3
3  0.835493  0.722475      1       1
4  0.574581  0.810925      1       2
5  0.526098  0.829200      1       3
6  0.813172  0.736200      2       1
7  0.556871  0.816875      2       2
8  0.512916  0.829550      2       3
```
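Because the history comes back as an ordinary DataFrame, the usual pandas tools apply. The sketch below rebuilds the table above by hand (so it runs without training anything) and averages the metrics across splits for each epoch; the variable names are illustrative only.

```python
import pandas as pd

# Values copied from the `get_train_history()` output above.
history = pd.DataFrame(
    {
        "loss": [0.957261, 0.595646, 0.541124,
                 0.835493, 0.574581, 0.526098,
                 0.813172, 0.556871, 0.512916],
        "accuracy": [0.679375, 0.809850, 0.824850,
                     0.722475, 0.810925, 0.829200,
                     0.736200, 0.816875, 0.829550],
        "split": [0, 0, 0, 1, 1, 1, 2, 2, 2],
        "epochs": [1, 2, 3, 1, 2, 3, 1, 2, 3],
    }
)

# Mean training loss and accuracy over the three splits, one row per epoch.
per_epoch = history.groupby("epochs")[["loss", "accuracy"]].mean()
print(per_epoch)
```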
You can plot the training history with matplotlib:
```python
cv.show_train_history()
```
![](img/my_plot.png)
To get the metrics for each split:
```python
cv.get_split_scores()
```
```python
Out:
   accuracy      loss  split
0    0.9201  0.282442      0
1    0.9198  0.290500      1
2    0.9173  0.279590      2
```
If `save_history=True`, the training history, validation metrics, and split information are saved to the specified directory.
In our example:
```python
my_awesome_project/
    |--my_cv/
        |--split_0/
            |--history.yml
            |--validation_metric.yml
            |--split_info.yml
        |--split_1/
        |--split_2/
```
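To check what was written after a run, a small standard-library helper can walk that layout. This is a sketch: `list_split_files` is a hypothetical helper, not part of `keras-model-cv`, and it only lists file names because the contents of the `.yml` files are not documented here.

```python
from pathlib import Path


def list_split_files(project_dir, cv_name):
    """Map each split directory name to the sorted file names found inside it."""
    cv_dir = Path(project_dir) / cv_name
    if not cv_dir.is_dir():
        # Nothing saved yet (or a wrong path): return an empty mapping.
        return {}
    return {
        split_dir.name: sorted(p.name for p in split_dir.iterdir() if p.is_file())
        for split_dir in sorted(cv_dir.glob("split_*"))
        if split_dir.is_dir()
    }


if __name__ == "__main__":
    # Directory and name match the Quickstart arguments.
    for split, files in list_split_files("my_awesome_project", "my_cv").items():
        print(split, files)
```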
## Licence
MIT license
Raw data
{
"_id": null,
"home_page": "https://github.com/dubovikmaster/keras-model-cv",
"name": "keras-model-cv",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "keras cross-validate,validation keras modelscross-validation",
"author": "Pavel Dubovik",
"author_email": "geometryk@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/b4/70/a29dba9d33bd949a0ae5249d2fb8f498b9e20a4b665987d1204d0fd04688/keras_model_cv-0.5.4.tar.gz",
"platform": "any",
"bugtrack_url": null,
"license": "MIT",
"summary": "Cross-validation for keras models",
"version": "0.5.4",
"project_urls": {
"Homepage": "https://github.com/dubovikmaster/keras-model-cv"
},
"split_keywords": [
"keras cross-validate",
"validation keras modelscross-validation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "12d59d06e9b433c953c7da7891fb007de86b4b92f16e1747f25674b1af4bf9bd",
"md5": "f48a287cc52a9f442ce8bea84ab370c6",
"sha256": "512c89101f1999b449f0bcb5655c7ef9e8eab7708ca7f3ce08ce62a9605f65e4"
},
"downloads": -1,
"filename": "keras_model_cv-0.5.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f48a287cc52a9f442ce8bea84ab370c6",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 6204,
"upload_time": "2023-07-19T11:22:55",
"upload_time_iso_8601": "2023-07-19T11:22:55.745151Z",
"url": "https://files.pythonhosted.org/packages/12/d5/9d06e9b433c953c7da7891fb007de86b4b92f16e1747f25674b1af4bf9bd/keras_model_cv-0.5.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b470a29dba9d33bd949a0ae5249d2fb8f498b9e20a4b665987d1204d0fd04688",
"md5": "f4c44dbe311834da3add952519d3aada",
"sha256": "23983835c43f2af35eda7a308238b55e77000505678836201e0c80628110b37f"
},
"downloads": -1,
"filename": "keras_model_cv-0.5.4.tar.gz",
"has_sig": false,
"md5_digest": "f4c44dbe311834da3add952519d3aada",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 5930,
"upload_time": "2023-07-19T11:22:56",
"upload_time_iso_8601": "2023-07-19T11:22:56.918658Z",
"url": "https://files.pythonhosted.org/packages/b4/70/a29dba9d33bd949a0ae5249d2fb8f498b9e20a4b665987d1204d0fd04688/keras_model_cv-0.5.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-07-19 11:22:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dubovikmaster",
"github_project": "keras-model-cv",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "keras-model-cv"
}