Complete CNN Image Classification Models for Real Time Prediction

Have you ever wondered how machines perceive images? In this project, you will learn how to design CNN image classification models for real-time prediction. Convolutional neural networks (CNNs) are well suited to image classification because they recognize patterns and features effectively, which is a vital requirement in vision-related tasks.

This tutorial will cover all the steps from model implementation to live inference in an easy and fun way.

Project Overview

In this project, we will walk through the architecture of a CNN model that classifies images as either buildings or forests. If you are a machine learning novice or just want to sharpen your skills, you are in the right place! You will learn how CNN models work on images and why they matter in image processing, using TensorFlow and Keras.

We built the CNN model with TensorFlow/Keras and trained it on a set of images of buildings and forests. First, we trained the model on the original training dataset. After that, we applied data augmentation (flipping, rotating, and zooming) to make the data more diverse and improve model performance. When the same model was trained a second time with the augmented data, its accuracy was higher.

The CNN architecture used in this work has several convolutional layers for feature extraction and several fully connected layers for classification. This structure lets the model inspect the images and distinguish the two classes accurately. The model reached a validation accuracy of over 93%, which makes it useful for real-time prediction.

For anyone interested in setting up a similar kind of system or exploring the topic of image classification using AI, this project is detailed enough to give you a head start.

Prerequisites

Before we jump into the code, here’s what you’ll need:

An understanding of Python programming and how to use Google Colab
Basic knowledge of deep learning and image data
Familiarity with TensorFlow, Keras, NumPy, OpenCV, and Matplotlib for handling data, building models, and visualizing data and model performance
An image dataset of buildings and forests

Approach

The project is organized stepwise to create a convolutional neural network for image classification of buildings and forests. The main aim of this study is to apply deep learning so that the model learns important characteristic features of the two classes automatically.

First, we collected images of buildings and forests. To increase the reliability of the model and avoid overfitting, we applied data augmentation techniques such as random rotation, flipping, and zooming. These techniques increased the effective size of our dataset and added variety to it.

The model is a CNN built with TensorFlow/Keras. Its convolutional layers extract features from the image. With these layers, the model can capture fine details and differentiate between forests and buildings.

Finally, the trained model is demonstrated classifying images in a live setting. This is useful in fields such as urban planning, environmental monitoring, and automatic image processing. The project not only shows how effective CNNs can be in image classification but also highlights the need for proper data handling and model tuning to achieve optimal performance.

Workflow and Methodology

Workflow

Let’s walk through the project step by step:

Data Collection: We gathered a set of images and preprocessed them by resizing all images to 180 x 180 pixels. The data was then divided into training and validation sets in the ratio of 80:20.
Data Augmentation: To reduce overfitting, we applied data augmentation methods such as flipping, rotation, and zooming, which help the model generalize.
Model Design: We built a CNN model with several convolutional layers for feature extraction, pooling layers for dimensionality reduction, and dense layers for classification.
Model Training: The model was trained with the Adam optimizer and sparse categorical cross-entropy as the loss function, for 10 epochs with a batch size of 32.
Validation and Evaluation: On the validation set, we obtained an accuracy of over 93%. To review the performance for each category, we used a confusion matrix.
Prediction: Finally, we applied the model to new images for real-time image classification to show that the model works as expected.
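
The evaluation step mentions a confusion matrix. As a minimal sketch of what that matrix contains, the NumPy snippet here builds one by hand. The labels and predictions are made up for illustration (they are not results from this project), and the helper name confusion_matrix is our own.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions per class pair: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label, predicted_label] += 1
    return matrix

# Made-up labels for illustration only: 0 = buildings, 1 = forest
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_pred = np.array([0, 0, 1, 1, 1, 0, 1, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=2)

# Correct predictions sit on the diagonal, so accuracy = diagonal sum / total
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(f"accuracy: {accuracy:.2f}")
```

Off-diagonal cells show which class is mistaken for which, something a single accuracy number hides.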

Methodology

Our approach to building this model includes:

Data Augmentation: Prevents overfitting by making our dataset more diverse.
Convolutional Layers: These layers identify relevant features of the input images, including edges and textures.
Max Pooling Layers: These layers decrease the spatial dimensions of the feature maps, preserving only the critical information.
Flattening: Converts the multi-dimensional output of the convolutional layers into a one-dimensional vector for the fully connected layers.
Fully Connected Layers: These layers make the final classification decision based on the features extracted by the earlier layers.
Softmax Activation: Applied in the output layer to give a probability distribution over the classes.
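
The building blocks listed above could be assembled in Keras roughly as follows. This is a sketch under assumptions, not necessarily the exact model trained in this project: the number of layers, the filter counts (16, 32, 64), the dense layer size (128), and the augmentation strengths (0.1) are illustrative choices. Only the 180 x 180 input size, the two classes, the Adam optimizer, and the sparse categorical cross-entropy loss come from the text.

```python
import tensorflow as tf
from tensorflow.keras import layers

num_classes = 2  # buildings and forest

model = tf.keras.Sequential([
    tf.keras.Input(shape=(180, 180, 3)),
    # Data augmentation: these layers are only active during training
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),   # assumed strength
    layers.RandomZoom(0.1),       # assumed strength
    # Rescale pixel values from [0, 255] to [0, 1]
    layers.Rescaling(1.0 / 255),
    # Convolution + max pooling blocks (filter counts are assumptions)
    layers.Conv2D(16, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    # Flatten the feature maps into a vector for the dense layers
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    # Softmax output: one probability per class
    layers.Dense(num_classes, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

model.summary()
```

Because the last layer already applies softmax, the loss is used with its default from_logits=False setting.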

Data Collection and Preparation

We collected a dataset of images of buildings and forests from Kaggle. These images cannot be fed directly into the CNN; they first need some cleaning and preparation.

Data Preparation Workflow

Rescaling: Rescale all pixel values of the images to the range 0 to 1.
Splitting: Divide the complete dataset into two parts, 80% for training and 20% for validation.
Augmentation: Create new images using augmentation such as flipping and zooming.
Batching: Split the dataset into small batches so that training is faster and uses less memory.
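
To make the four steps concrete, here is a small NumPy sketch that applies each of them to random arrays standing in for real images. It is an illustration only: the array of 10 fake images and the batch size of 4 are assumptions chosen to keep the example small, and in the actual Keras pipeline augmentation is typically applied on the fly rather than by storing flipped copies.

```python
import numpy as np

rng = np.random.default_rng(seed=123)

# Stand-in for a real dataset: 10 random RGB "images", 180x180 pixels, values 0-255
images = rng.integers(0, 256, size=(10, 180, 180, 3), dtype=np.uint8)

# Rescaling: map pixel values from [0, 255] to [0, 1]
scaled = images.astype(np.float32) / 255.0

# Splitting: first 80% for training, remaining 20% for validation
split_index = int(len(scaled) * 0.8)
train, val = scaled[:split_index], scaled[split_index:]

# Augmentation: add a horizontally flipped copy of each training image
flipped = train[:, :, ::-1, :]  # reverse the width axis
train_augmented = np.concatenate([train, flipped], axis=0)

# Batching: break the training set into chunks of at most batch_size images
batch_size = 4
batches = [
    train_augmented[i:i + batch_size]
    for i in range(0, len(train_augmented), batch_size)
]

print(train.shape, val.shape, train_augmented.shape, len(batches))
```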

Code Explanation

STEP 1:

Mounting Google Drive

We mount Google Drive to access our dataset stored in the cloud.

from google.colab import drive
drive.mount('/content/drive')

Installing Packages

This code provides the required environment for your project by installing the libraries necessary for numerical computations, the design of deep learning models, and data visualization.

!pip install numpy
!pip install keras
!pip install tensorflow
!pip install matplotlib

Importing Libraries

This code block imports the libraries needed to create and train the model: NumPy for numerical work, pathlib for file paths, PIL for handling images, Matplotlib for data visualization, and TensorFlow/Keras for building the CNN.

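# Standard library
import pathlib
# Numerical computing, plotting, and image handling
import numpy as np
import matplotlib.pyplot as plt
import PIL.Image  # import the Image submodule explicitly so PIL.Image.open is available
# Deep learning
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential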

STEP 2:

Data PreProcessing

In this code, the pathlib library creates a Path object for easy manipulation of file system paths. The path points to a folder in Google Drive that contains the training dataset. The code then counts the images in the class subdirectories of data_dir: data_dir.glob('*/*') matches every entry one level below data_dir, and len() returns the total number of matches. Finally, it prints the total number of training images in the dataset.

data_dir = pathlib.Path("/content/drive/MyDrive/Aionlinecourse/dataset/training")
image_count = len(list(data_dir.glob('*/*')))
print(image_count)
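
A per-class count is often more informative than the total, because it reveals whether the two classes are balanced. The sketch here shows one way to get it with pathlib. The helper count_images_per_class is a hypothetical name, and the demonstration runs on a temporary folder with empty placeholder files rather than on the real dataset.

```python
import pathlib
import tempfile

def count_images_per_class(root):
    """Map each subdirectory name under root to the number of files it contains."""
    root = pathlib.Path(root)
    return {
        class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
        for class_dir in sorted(root.iterdir())
        if class_dir.is_dir()
    }

# Demonstration on a throwaway folder so the sketch runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_files in [("buildings", 3), ("forest", 2)]:
        class_dir = pathlib.Path(tmp) / class_name
        class_dir.mkdir()
        for i in range(n_files):
            (class_dir / f"img_{i}.jpg").touch()  # empty placeholder file
    counts = count_images_per_class(tmp)

print(counts)
```

On the real dataset, the call would be count_images_per_class(data_dir).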

These lines organize the image paths into two lists, one for building images and one for forest images, which makes them easier to access and process for model training. The last line opens the third image in the buildings list so it can be viewed.

buildings = list(data_dir.glob('buildings/*'))
forest = list(data_dir.glob('forest/*'))
PIL.Image.open(str(buildings[2]))

This line opens the eleventh image in the forest list (index 10) so it can be viewed.

PIL.Image.open(str(forest[10]))

This code defines batch_size, img_height, and img_width. With a batch size of 32, the model updates its weights after processing every 32 images. All input images are resized to the specified height and width before the model processes them. The call to image_dataset_from_directory then builds the training dataset: it loads images from the directory, resizes them, and keeps the 80% training portion of the split.

batch_size = 32
img_height = 180
img_width = 180
train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
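
Since each batch triggers one weight update, the batch size determines how many updates happen per epoch. The short calculation here works this out. The figure of 1000 images is a made-up example, not the size of the actual dataset, and batches_per_epoch is a hypothetical helper name.

```python
import math

def batches_per_epoch(num_images, validation_split, batch_size):
    """Number of training batches (weight updates) in one pass over the training data."""
    num_val = int(validation_split * num_images)
    num_train = num_images - num_val
    # The last batch may be smaller than batch_size, hence the ceiling
    return math.ceil(num_train / batch_size)

# Hypothetical dataset of 1000 images: 800 go to training
updates_per_epoch = batches_per_epoch(1000, validation_split=0.2, batch_size=32)
total_updates = updates_per_epoch * 10  # 10 epochs, as used in this project

print(updates_per_epoch, total_updates)
```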

This code reads images from the same directory and builds the validation dataset, val_ds, from the remaining 20 percent of the images. It resizes each image to img_height by img_width and groups them into batches of batch_size. Setting seed = 123, the same value used for the training dataset, guarantees that the split is the same every time and that the two subsets do not overlap.

val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
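
Why must both calls use the same seed? The pure-Python sketch here illustrates the idea: the file list is shuffled with a fixed seed, and then one slice is kept for training and the other for validation. This is a conceptual illustration with a hypothetical split_paths helper and made-up file names, not the actual Keras implementation.

```python
import random

def split_paths(paths, validation_split, subset, seed):
    """Shuffle paths with a fixed seed, then return the training or validation part."""
    shuffled = sorted(paths)               # start from a deterministic order
    random.Random(seed).shuffle(shuffled)  # same seed gives the same shuffle on every call
    num_val = int(validation_split * len(shuffled))
    cut = len(shuffled) - num_val
    if subset == "training":
        return shuffled[:cut]
    if subset == "validation":
        return shuffled[cut:]
    raise ValueError("subset must be 'training' or 'validation'")

paths = [f"img_{i:02d}.jpg" for i in range(10)]  # hypothetical file names

train_paths = split_paths(paths, 0.2, "training", seed=123)
val_paths = split_paths(paths, 0.2, "validation", seed=123)

# With the same seed, the two subsets never share a file
print(len(train_paths), len(val_paths), set(train_paths) & set(val_paths))
```

If the two calls used different seeds, each would shuffle the files differently, and some images could end up in both the training and validation subsets.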

This code obtains the class names from the training dataset using train_ds.class_names, which lists the labels inferred from the names of the training image subdirectories. Printing the class names shows the model's categories, here "buildings" and "forest".

class_names = train_ds.class_names
print(class_names)
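
The order of class_names matters because the integer labels, and later the positions in the softmax output, follow it. When labels are inferred from folder names, Keras sorts them alphanumerically. The sketch here shows how a prediction vector maps back to a class name. The probability values are made up for illustration and do not come from the trained model.

```python
import numpy as np

# Folder names sorted alphanumerically, as they appear in class_names
class_names = sorted(["forest", "buildings"])

# Integer label assigned to each class
label_of = {name: index for index, name in enumerate(class_names)}

# Made-up softmax output for a single image: one probability per class
probabilities = np.array([0.08, 0.92])

# The predicted class is the one with the highest probability
predicted_index = int(np.argmax(probabilities))
predicted_class = class_names[predicted_index]
confidence = float(probabilities[predicted_index])

print(label_of)
print(predicted_class, confidence)
```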