Build a deep learning model to detect five gestures from videos captured by a smart TV's webcam.
These gestures are used to control TV functionality.
Each of the five gestures maps to a specific TV control.
The dataset contains train and val folders with 663 and 100 videos respectively.
Two CSV files list the video folders and their corresponding class labels.
Each video is made of 30 frames. Frames come in two sizes, 120x160 and 360x360 (see the cropping logic in the generator below), as they are recorded from two different sources.
Develop a deep learning model that is able to classify the gesture based on the video frames.
The deep learning model should have:
- high accuracy
- a low memory footprint (to fit in a webcam's memory, typically < 50 MB)
Two model architectures have been experimented with (a sketch of the alternative is given below).
A final accuracy of 88% has been achieved using a retrained MobileNet 2D CNN + RNN.
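The alternative architecture is not shown in this notebook; given the Conv3D/MaxPooling3D imports further down, it was presumably a 3D-CNN baseline. A minimal, hypothetical sketch of such a model is given here for context (layer sizes are illustrative, not the author's):
# Hypothetical Conv3D baseline -- illustrative only, not the architecture actually trained.
from keras.models import Sequential
from keras.layers import Dense, Flatten, BatchNormalization, Dropout
from keras.layers.convolutional import Conv3D, MaxPooling3D

def build_conv3d_baseline(frames=30, height=160, width=160, channels=3, num_classes=5):
    # Treat the whole video as one 5D tensor: (frames, height, width, channels)
    model = Sequential()
    model.add(Conv3D(16, (3, 3, 3), activation='relu', padding='same',
                     input_shape=(frames, height, width, channels)))
    model.add(MaxPooling3D(pool_size=(2, 2, 2)))
    model.add(BatchNormalization())
    model.add(Conv3D(32, (3, 3, 3), activation='relu', padding='same'))
    model.add(MaxPooling3D(pool_size=(2, 2, 2)))
    model.add(BatchNormalization())
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(num_classes, activation='softmax'))
    return model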
# Note: this model was trained on a Kaggle GPU. A requirements.txt is provided along with this file.
# Change train_csv_path, val_csv_path, train_path, and val_path to point to the appropriate data locations.
!pip install scikit-image
!pip install opencv-python
import numpy as np
import os
import skimage
from skimage.io import imread
from skimage.transform import resize
import datetime
import os
import warnings
warnings.filterwarnings("ignore")
np.random.seed(1)
import random as rn
rn.seed(1)
from keras import backend as K
import tensorflow as tf
tf.random.set_seed(1)
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation
from keras.layers.convolutional import Conv3D, MaxPooling3D, Conv2D, MaxPooling2D
from keras.layers.recurrent import LSTM
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.layers import Dropout
# path with csv containing folder names
train_csv_path = './Project_data/train.csv'
val_csv_path = './Project_data/val.csv'
# path of train and val folders
train_path = './Project_data/train'
val_path = './Project_data/val'
# image size
image_shape = (160,160,3) # an input size of 160x160 is chosen as it is one of the standard input resolutions supported by the MobileNet architecture
# batch_size
batch_size = 16
# number of epochs
num_epochs = 30
# image augmentation
augmentation = False
# retrain cnn
retrain = True
# index of frames processed in each video
def video_frames(mode='alternate', length=None):
    if mode == 'alternate':
        return [0,2,4,6,8,10,12,14,16,18,20,22,24,26,28]
    elif mode == 'all':
        return list(range(30))
    elif mode == 'middle':
        return list(range(5,25))
    elif (mode == 'random') and length:
        return sorted(np.random.randint(0, 30, length).tolist())
frames_to_sample = video_frames(mode='all')
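For reference (not in the original notebook), a quick check of what each sampling mode returns; the 'random' mode needs a length argument:
# Quick sanity check of the frame-sampling modes defined above.
print(len(video_frames('all')))            # 30 frame indices
print(video_frames('alternate'))           # every other frame: [0, 2, ..., 28]
print(len(video_frames('middle')))         # the 20 central frames (indices 5..24)
print(video_frames('random', length=10))   # 10 sorted random frame indices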
# image augmentation
# Detecting skin tones: since gestures are performed by humans, masking the background and keeping only skin-coloured pixels can be a useful preprocessing step.
# The function below is a skin-tone filter.
def skin_rules(R_Frame, G_Frame, B_Frame):
    BRG_Max = np.maximum.reduce([B_Frame, G_Frame, R_Frame])
    BRG_Min = np.minimum.reduce([B_Frame, G_Frame, R_Frame])
    # under uniform daylight, the skin-colour rule is:
    Rule_1 = np.logical_and.reduce([R_Frame > 95, G_Frame > 40, B_Frame > 20,
                                    BRG_Max - BRG_Min > 15, abs(R_Frame - G_Frame) > 15,
                                    R_Frame > G_Frame, R_Frame > B_Frame])
    # under flashlight or lateral daylight illumination, the skin-colour rule is:
    Rule_2 = np.logical_and.reduce([R_Frame > 220, G_Frame > 210, B_Frame > 170,
                                    abs(R_Frame - G_Frame) <= 15, R_Frame > B_Frame, G_Frame > B_Frame])
    # Rule_1 U Rule_2
    RGB_Rule = np.logical_or(Rule_1, Rule_2)
    return RGB_Rule
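As a quick sanity check (not part of the original notebook), the rule can be applied to a pair of synthetic pixels: a skin-like tone that should pass and a dark background pixel that should be rejected.
# Illustrative check of the skin filter on two synthetic pixels.
R = np.array([[220.0, 30.0]])
G = np.array([[170.0, 30.0]])
B = np.array([[140.0, 30.0]])
print(skin_rules(R, G, B))   # expected: [[ True False]]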
# The function below detects skin and removes other scene elements.
def detect_skin(img):
    mask = skin_rules(img[:,:,0], img[:,:,1], img[:,:,2])
    img[:,:,0] = img[:,:,0] * mask
    img[:,:,1] = img[:,:,1] * mask
    img[:,:,2] = img[:,:,2] * mask
    return img
def erode(img, kernel):
    img_erode = np.zeros_like(img)
    img_erode[:,:,0] = cv2.erode(img[:,:,0], kernel)
    img_erode[:,:,1] = cv2.erode(img[:,:,1], kernel)
    img_erode[:,:,2] = cv2.erode(img[:,:,2], kernel)
    return img_erode

def dilate(img, kernel):
    img_dilate = np.zeros_like(img)
    img_dilate[:,:,0] = cv2.dilate(img[:,:,0], kernel)
    img_dilate[:,:,1] = cv2.dilate(img[:,:,1], kernel)
    img_dilate[:,:,2] = cv2.dilate(img[:,:,2], kernel)
    return img_dilate

# Note: the definitions below follow the notebook's own naming (closing = erode then dilate,
# opening = dilate then erode), which is the reverse of the usual OpenCV convention.
def closing(img, kernel):
    return dilate(erode(img, kernel), kernel)

def opening(img, kernel):
    return erode(dilate(img, kernel), kernel)

def open_close(img, kernel):
    return closing(opening(img, kernel), kernel)

def close_open(img, kernel):
    return opening(closing(img, kernel), kernel)
# OpenCV min-max normalisation is used to scale values into [0, 1] and prevent any maths overflows.
def cv_normalise(img):
    img_new = np.zeros_like(img)
    cv2.normalize(img, img_new, 0, 1, cv2.NORM_MINMAX)
    assert round(np.max(img_new), 1) == 1, 'Normalisation error: ' + str(np.max(img_new))
    assert round(np.min(img_new), 1) == 0, 'Normalisation error: ' + str(np.min(img_new))
    return img_new
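For reference, cv2.normalize with NORM_MINMAX and bounds 0, 1 is just a min-max rescale; a pure-NumPy equivalent is sketched here (the epsilon guard against constant channels is an addition, not in the original):
# Pure-NumPy equivalent of the min-max normalisation above, for reference only.
def np_normalise(img, eps=1e-8):
    img = img.astype(np.float32)
    return (img - img.min()) / (img.max() - img.min() + eps)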
# Opening and then closing is performed to remove noise from the image, and then skin is detected.
def preprocess_image(img, kernel):
    img = open_close(img, kernel)
    img = detect_skin(img)
    return img
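A usage sketch (not part of the original notebook) that visualises this preprocessing on one frame; it assumes the train_path defined above points at the dataset and simply picks the first frame of the first video folder:
# Illustrative only: show the open/close + skin-detection pipeline on a single frame.
sample_folder = sorted(os.listdir(train_path))[0]
sample_frame = sorted(os.listdir(os.path.join(train_path, sample_folder)))[0]
frame = imread(os.path.join(train_path, sample_folder, sample_frame)).astype(np.float32)
kernel = (1/16) * np.ones((4, 4))          # same kernel as used in the generator below
processed = preprocess_image(frame.copy(), kernel)

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
axes[0].imshow(frame.astype(np.uint8)); axes[0].set_title('original')
axes[1].imshow(processed.astype(np.uint8)); axes[1].set_title('open/close + skin mask')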
# model training history plot
def plot_model_history(history):
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(15,4))
    axes[0].plot(history.history['loss'])
    axes[0].plot(history.history['val_loss'])
    axes[0].legend(['loss','val_loss'])
    axes[1].plot(history.history['categorical_accuracy'])
    axes[1].plot(history.history['val_categorical_accuracy'])
    axes[1].legend(['categorical_accuracy','val_categorical_accuracy'])
# Parsing train & validation csv
train_doc = np.random.permutation(open(train_csv_path).readlines())
val_doc = np.random.permutation(open(val_csv_path).readlines())
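The generator below reads field 0 (the video folder name) and field 2 (the numeric class label) after splitting each CSV line on ';', so each line is presumably of the form folder_name;gesture_name;class_index. A quick check of one parsed line (the middle field name is an assumption):
# Inspect one CSV line; the 'gesture_name' field name is assumed, only fields 0 and 2 are used later.
sample_line = train_doc[0].strip()
folder_name, gesture_name, class_index = sample_line.split(';')[:3]
print('folder:', folder_name, '| gesture:', gesture_name, '| label:', int(class_index))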
# Generator function
def generator(source_path, folder_list, batch_size=batch_size, augmentation=augmentation):
    print('\nSource path = ', source_path, '; batch size =', batch_size)
    img_idx = frames_to_sample
    while True:
        t = np.random.permutation(folder_list)
        num_batches = len(folder_list) // batch_size
        for batch in range(num_batches):
            batch_data = np.zeros((batch_size, len(img_idx), *image_shape))
            batch_labels = np.zeros((batch_size, 5))
            for folder in range(batch_size):
                imgs = os.listdir(source_path + '/' + t[folder + (batch * batch_size)].split(';')[0])
                for idx, item in enumerate(img_idx):
                    image = imread(source_path + '/' + t[folder + (batch * batch_size)].strip().split(';')[0] + '/' + imgs[item]).astype(np.float32)
                    # Although images come in two sizes, the 120x160 images carry little information
                    # in the 0-20 and 140-160 column bands, so they are cropped to 120x120.
                    if image.shape[0] == 120:
                        image = image[:, 20:140]
                    # Similarly, 360x360 images are centre-cropped since the gesture information is contained in the centre.
                    if image.shape[0] == 360:
                        image = image[120:240, 120:240]
                    # Both crops are then resized to the common 160x160 input size.
                    image = resize(image, (160, 160))
                    # If augmentation is on, randomly mask the scene in roughly five of the frames.
                    if augmentation and idx in np.random.randint(0, len(img_idx), 5):
                        kernel = (1/16) * np.ones((4, 4))  # kernel for morphological transformations
                        image = preprocess_image(image, kernel)
                    # Normalisation
                    batch_data[folder, idx, :, :, 0] = cv_normalise(image[:, :, 0])
                    batch_data[folder, idx, :, :, 1] = cv_normalise(image[:, :, 1])
                    batch_data[folder, idx, :, :, 2] = cv_normalise(image[:, :, 2])
                batch_labels[folder, int(t[folder + (batch * batch_size)].strip().split(';')[2])] = 1
            yield batch_data, batch_labels

        # Remaining data after the integral number of batches
        remainder_size = len(folder_list) % batch_size
        if remainder_size > 0:
            remainder_folders = t[num_batches * batch_size:]
            assert remainder_size == len(remainder_folders), 'Take care of the remainder folders'
            # The remaining folders are still loaded onto a tensor of batch_size.
            # It has been noted that this does not affect performance.
            batch_data = np.zeros((batch_size, len(img_idx), *image_shape))
            batch_labels = np.zeros((batch_size, 5))
            for folder in range(remainder_size):
                imgs = os.listdir(source_path + '/' + remainder_folders[folder].split(';')[0])
                for idx, item in enumerate(img_idx):
                    image = imread(source_path + '/' + remainder_folders[folder].strip().split(';')[0] + '/' + imgs[item]).astype(np.float32)
                    if image.shape[0] == 120:
                        image = image[:, 20:140]
                    if image.shape[0] == 360:
                        image = image[120:240, 120:240]
                    image = resize(image, (160, 160))
                    if augmentation and idx in np.random.randint(0, len(img_idx), 5):
                        kernel = (1/16) * np.ones((4, 4))  # kernel for morphological transformations
                        image = preprocess_image(image, kernel)
                    batch_data[folder, idx, :, :, 0] = cv_normalise(image[:, :, 0])
                    batch_data[folder, idx, :, :, 1] = cv_normalise(image[:, :, 1])
                    batch_data[folder, idx, :, :, 2] = cv_normalise(image[:, :, 2])
                batch_labels[folder, int(remainder_folders[folder].strip().split(';')[2])] = 1
            yield batch_data, batch_labels
# Generator check
x,y = next(generator(val_path, val_doc, batch_size,augmentation=False))
print(x.shape)
print(y.shape)
x,y = next(generator(train_path, train_doc, batch_size))
print(x.shape)
print(y.shape)
Source path =  ../input/gesture-recognition/val ; batch size = 16
(16, 30, 160, 160, 3)
(16, 5)

Source path =  ../input/gesture-recognition/train ; batch size = 16
(16, 30, 160, 160, 3)
(16, 5)
# Sequence lengths
curr_dt_time = datetime.datetime.now()
num_train_sequences = len(train_doc)
print('# training sequences =', num_train_sequences)
num_val_sequences = len(val_doc)
print('# validation sequences =', num_val_sequences)
print ('# epochs =', num_epochs)
# training sequences = 663
# validation sequences = 100
# epochs = 30
# Final Model : CNN(MobileNet) + RNN
from keras.models import Sequential, Model
from keras.layers import Dense, GRU, Flatten, TimeDistributed, BatchNormalization, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras import optimizers
from keras.layers import Dropout
from keras.applications import mobilenet
# Model Parameters
gru_cells = 64
dense_layer=64
dropout_ratio = 0.25
retrain_cnn = False  # note: unused; the `retrain` flag from the config cell above (True) is what is applied below
# Retrained MobileNet Conv2D architecture followed by a GRU (RNN)
mobilenet_transfer = mobilenet.MobileNet(weights='imagenet', include_top=False)
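For a 160x160 input, MobileNet downsamples by a factor of 32 and ends in a 5x5x1024 feature map, which is where the (None, 30, 5, 5, 1024) shape in the model summary below comes from. An optional sanity check (not in the original notebook), building a throwaway MobileNet with a fixed input shape:
# Optional sanity check: 160 / 32 = 5, so each frame becomes a 5x5x1024 feature map.
probe = mobilenet.MobileNet(weights=None, include_top=False, input_shape=image_shape)
print(probe.output_shape)   # expected: (None, 5, 5, 1024)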
# CNN-RNN model
model = Sequential()
model.add(TimeDistributed(mobilenet_transfer,input_shape=(len(frames_to_sample),*image_shape)))
for layer in model.layers:
    layer.trainable = retrain   # `retrain` is True in the config cell above, so the MobileNet base is fine-tuned
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Flatten()))
model.add(GRU(gru_cells))
model.add(Dropout(dropout_ratio))
model.add(Dense(dense_layer,activation='relu'))
model.add(Dropout(dropout_ratio))
model.add(Dense(5, activation='softmax'))
optimiser = optimizers.Adam(lr=0.0005)
model.compile(optimizer=optimiser, loss='categorical_crossentropy', metrics=['categorical_accuracy'])
model.summary()
Model: "sequential_6" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= time_distributed_24 (TimeDis (None, 30, 5, 5, 1024) 3228864 _________________________________________________________________ time_distributed_25 (TimeDis (None, 30, 5, 5, 1024) 4096 _________________________________________________________________ time_distributed_26 (TimeDis (None, 30, 2, 2, 1024) 0 _________________________________________________________________ time_distributed_27 (TimeDis (None, 30, 4096) 0 _________________________________________________________________ gru_6 (GRU) (None, 64) 799104 _________________________________________________________________ dropout_12 (Dropout) (None, 64) 0 _________________________________________________________________ dense_12 (Dense) (None, 64) 4160 _________________________________________________________________ dropout_13 (Dropout) (None, 64) 0 _________________________________________________________________ dense_13 (Dense) (None, 5) 325 ================================================================= Total params: 4,036,549 Trainable params: 4,012,613 Non-trainable params: 23,936 _________________________________________________________________
model.count_params()
4036549
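This parameter count ties back to the < 50 MB requirement: with float32 weights the model needs roughly 4,036,549 x 4 bytes, i.e. about 15-16 MB, comfortably within budget. This rough estimate ignores optimizer state and activation memory:
# Rough weight-memory estimate: parameters x 4 bytes (float32).
approx_mb = model.count_params() * 4 / (1024 ** 2)
print('approximate weight memory: %.1f MB' % approx_mb)   # ~15.4 MB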
train_generator = generator(train_path, train_doc, batch_size)
val_generator = generator(val_path, val_doc, batch_size,augmentation=False)
model_name = 'mobilenet_cnn_rnn' + '_' + str(curr_dt_time).replace(' ','').replace(':','_') + '/'
if not os.path.exists(model_name):
os.mkdir(model_name)
filepath = model_name + 'model-{epoch:05d}-{loss:.5f}-{categorical_accuracy:.5f}-{val_loss:.5f}-{val_categorical_accuracy:.5f}.h5'
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=False, save_weights_only=False, mode='auto', period=1)
LR = tf.keras.callbacks.ReduceLROnPlateau(
monitor="val_loss",
factor=0.2,
patience=4,
)
# earlyStopping = tf.keras.callbacks.EarlyStopping(
# monitor="val_loss",
# min_delta=0.00001,
# )
callbacks = [checkpoint,LR]
if (num_train_sequences % batch_size) == 0:
    steps_per_epoch = int(num_train_sequences / batch_size)
else:
    steps_per_epoch = (num_train_sequences // batch_size) + 1

if (num_val_sequences % batch_size) == 0:
    validation_steps = int(num_val_sequences / batch_size)
else:
    validation_steps = (num_val_sequences // batch_size) + 1
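With 663 training sequences and a batch size of 16 this works out to ceil(663/16) = 42 steps per epoch (matching the 42 steps per epoch reported in the training log below) and ceil(100/16) = 7 validation steps. The same computation, more compactly:
# Equivalent ceiling-division check of the step counts computed above.
import math
assert steps_per_epoch == math.ceil(num_train_sequences / batch_size)   # 663 / 16 -> 42
assert validation_steps == math.ceil(num_val_sequences / batch_size)    # 100 / 16 -> 7
print(steps_per_epoch, validation_steps)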
# Checking GPU
n_gpus = len(tf.config.experimental.list_physical_devices('GPU'))
assert n_gpus > 0 , 'No GPU available'
with tf.device('/GPU:0'):
    model_history = model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=num_epochs, verbose=1,
                                        callbacks=callbacks, validation_data=val_generator,
                                        validation_steps=validation_steps, class_weight=None, workers=1, initial_epoch=0)
Source path =  ../input/gesture-recognition/train ; batch size = 16
Source path =  ../input/gesture-recognition/val ; batch size = 16
42 steps per epoch; each epoch took roughly 195-215 s on the Kaggle GPU.
Checkpoints were saved to mobilenet_cnn_rnn_2021-02-0814_41_36.375578/ after every epoch.

Epoch  loss    categorical_accuracy  val_loss  val_categorical_accuracy
 1     1.5429  0.3631                1.0572    0.4821
 2     0.8368  0.6993                0.6464    0.6786
 3     0.4415  0.8791                0.5613    0.6964
 4     0.2518  0.9395                0.4892    0.6786
 5     0.1764  0.9574                0.3074    0.7679
 6     0.1453  0.9606                0.6818    0.7054
 7     0.1456  0.9542                0.5451    0.7143
 8     0.0989  0.9709                0.5016    0.7321
 9     0.0863  0.9810                0.3908    0.7411
10     0.0319  0.9967                0.2576    0.8214
11     0.0498  0.9900                0.3263    0.7857
12     0.0276  0.9994                0.4957    0.7679
13     0.0202  0.9985                0.3958    0.7946
14     0.0183  0.9975                0.3790    0.7768
15     0.0179  0.9994                0.3609    0.7857
16     0.0194  0.9981                0.3904    0.7768
17     0.0142  0.9994                0.4156    0.7857
18     0.0151  0.9994                0.3187    0.7946
19     0.0171  0.9993                0.3089    0.7768
20     0.0167  0.9994                0.3820    0.7857
21     0.0111  0.9994                0.3993    0.7768
22     0.0135  0.9994                0.3867    0.7768
23     0.0136  0.9992                0.3679    0.7857
24     0.0150  0.9994                0.4236    0.7857
25     0.0143  0.9994                0.3942    0.7857
26     0.0155  0.9972                0.4399    0.7768
27     0.0142  0.9993                0.3475    0.7946
28     0.0151  0.9994                0.3379    0.7946
29     0.0138  0.9993                0.3888    0.7768
30     0.0148  0.9994                0.2654    0.8125
plot_model_history(model_history)
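To actually deploy within the webcam memory budget, the trained model could additionally be converted to TensorFlow Lite with post-training quantization. This is only a sketch and not part of the original notebook; it assumes the model is (or is rebuilt as) a tf.keras model, and the GRU layer may require enabling select TF ops to convert:
# Sketch only: shrink the trained model further for on-device use.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # post-training quantization
# If conversion complains about the GRU layer, allowing select TF ops may help:
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()
with open(model_name + 'gesture_model.tflite', 'wb') as f:
    f.write(tflite_model)
print('TFLite model size: %.1f MB' % (len(tflite_model) / (1024 ** 2)))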