Introduction

In this project, we will build an image classification model on CIFAR-10, a widely used benchmark dataset. The goal is to assign each image to one of ten predefined categories. Along the way, you will see how convolutional neural networks (CNNs) are applied to image classification in practice.

Objectives

  1. Understand the basics of image classification.
  2. Learn how to preprocess image data.
  3. Build and train a convolutional neural network (CNN).
  4. Evaluate the performance of the model.
  5. Fine-tune the model for better accuracy.

Dataset

For this project, we will use the CIFAR-10 dataset, which consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. The dataset is divided into 50,000 training images and 10,000 testing images.
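
As a quick sanity check on these figures, the sketch below recomputes the totals in plain Python. The per-class counts for each split assume that both splits are class-balanced, and the constant names are illustrative only.

```python
# Sanity-check the CIFAR-10 figures quoted above (plain Python, no downloads).
NUM_CLASSES = 10
IMAGES_PER_CLASS = 6_000
TRAIN_IMAGES = 50_000
TEST_IMAGES = 10_000

total_images = NUM_CLASSES * IMAGES_PER_CLASS

# Assumption: both splits are class-balanced.
train_per_class = TRAIN_IMAGES // NUM_CLASSES
test_per_class = TEST_IMAGES // NUM_CLASSES

# Each image is 32x32 pixels with 3 color channels.
values_per_image = 32 * 32 * 3

print(total_images, train_per_class, test_per_class, values_per_image)
```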

Steps

  1. Import Libraries and Load Dataset

First, we need to import the necessary libraries and load the CIFAR-10 dataset.

import numpy as np  # used later for np.argmax when making predictions
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0
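
To see what the normalization step does to the data type, here is a small sketch on a synthetic batch. The array fake_images is a made-up stand-in for real CIFAR-10 data, and casting to float32 first is an optional variation, not something the code above does.

```python
import numpy as np

# Synthetic stand-in for a small batch of CIFAR-10 images (uint8, 0-255).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Dividing a uint8 array by 255.0 produces float64 values in [0, 1].
scaled64 = fake_images / 255.0

# Casting first keeps the result in float32, which halves memory use.
scaled32 = fake_images.astype(np.float32) / 255.0

print(scaled64.dtype, scaled32.dtype)
print(scaled32.min() >= 0.0, scaled32.max() <= 1.0)
```

Either version feeds the model values in the same range; the float32 variant only matters when memory is tight.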

  2. Explore the Dataset

Let's take a look at some sample images from the dataset to understand what we are working with.

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i])
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()
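
The plotting code indexes each label with [0] because the labels come as a column of shape (N, 1). The sketch below uses a small made-up label array (fake_labels is hypothetical, not real data) to show how to flatten the labels and count examples per class.

```python
import numpy as np

# Synthetic stand-in for train_labels: shape (N, 1), integer class ids 0-9.
fake_labels = np.array([[6], [9], [9], [4], [1], [1], [2], [7]], dtype=np.uint8)

# Flatten to shape (N,) so each entry can index class_names directly.
flat_labels = fake_labels.reshape(-1)

# Count how many examples fall into each of the 10 classes.
counts = np.bincount(flat_labels, minlength=10)

print(flat_labels.shape, counts.tolist())
```

Running the same two lines on the real train_labels is a quick way to confirm that the classes are balanced.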

  3. Build the Convolutional Neural Network (CNN)

We will build a simple CNN with three convolutional layers and two max-pooling layers, followed by two dense layers. The last dense layer outputs 10 raw scores (logits), one per class.

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
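
To make the architecture easier to reason about, the sketch below works out the feature-map sizes and parameter counts by hand. It assumes the Keras defaults used above (no padding, stride 1 for convolutions, non-overlapping 2x2 pooling); the helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output width of an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (rounds down)."""
    return size // pool


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


size = 32
size = conv_out(size)   # Conv2D(32)    -> 30
size = pool_out(size)   # MaxPooling2D  -> 15
size = conv_out(size)   # Conv2D(64)    -> 13
size = pool_out(size)   # MaxPooling2D  -> 6
size = conv_out(size)   # Conv2D(64)    -> 4
flat_units = size * size * 64  # Flatten -> 1024

total_params = (
    conv_params(3, 3, 32)
    + conv_params(3, 32, 64)
    + conv_params(3, 64, 64)
    + dense_params(flat_units, 64)
    + dense_params(64, 10)
)
print(size, flat_units, total_params)
```

If the layers above are left unchanged, model.summary() should report the same shapes and total.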

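  4. Compile and Train the Model

Next, we compile the model with the Adam optimizer and a sparse categorical cross-entropy loss, then train it for 10 epochs. The loss is configured with from_logits=True because the final Dense layer has no softmax activation. The test set is passed as validation data, so accuracy on held-out images is reported after every epoch; when you start tuning the model, a separate validation split taken from the training data is preferable.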

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10, 
                    validation_data=(test_images, test_labels))
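
To illustrate what a sparse categorical cross-entropy loss computes from raw logits, here is a minimal NumPy sketch. It is a simplified re-implementation for illustration, not the Keras code, and the function name is hypothetical.

```python
import numpy as np


def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy between integer labels and raw logits."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels).reshape(-1)
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability assigned to the true class of each example.
    picked = log_probs[np.arange(len(labels)), labels]
    return float(-picked.mean())


# With all-zero logits every class gets probability 1/10, so the loss is ln(10).
uniform_loss = sparse_ce_from_logits(np.zeros((2, 10)), [[3], [7]])
print(round(uniform_loss, 4))
```

A loss near ln(10), roughly 2.30, at the start of training therefore means the model is doing no better than guessing among the 10 classes.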

  5. Evaluate the Model

After training, we evaluate the model on the test dataset to see how well it performs.

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f'\nTest accuracy: {test_acc}')
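
Overall accuracy can hide classes that the model handles poorly. The sketch below computes a confusion matrix and per-class accuracy with NumPy; the function name and the small demo labels are hypothetical and serve only to illustrate the idea.

```python
import numpy as np


def per_class_accuracy(true_labels, predicted_labels, num_classes=10):
    """Return (confusion matrix, per-class accuracy) for integer labels."""
    true_labels = np.asarray(true_labels).reshape(-1)
    predicted_labels = np.asarray(predicted_labels).reshape(-1)
    # confusion[t, p] counts examples of true class t predicted as class p.
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(confusion, (true_labels, predicted_labels), 1)
    totals = confusion.sum(axis=1)
    # Classes with no examples get NaN rather than a misleading 0.
    with np.errstate(invalid="ignore", divide="ignore"):
        accuracy = np.where(totals > 0, np.diag(confusion) / totals, np.nan)
    return confusion, accuracy


# Hypothetical labels for illustration only.
demo_true = [0, 0, 1, 1, 1, 2]
demo_pred = [0, 1, 1, 1, 0, 2]
demo_confusion, demo_accuracy = per_class_accuracy(demo_true, demo_pred, num_classes=3)
print(demo_confusion)
print(demo_accuracy)
```

With the trained model, the predicted labels could be obtained as np.argmax(model.predict(test_images), axis=1) and passed in together with test_labels.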

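  6. Visualize Training Results

We can plot the training and validation accuracy over the epochs to visualize the training process. The corresponding loss curves are stored under history.history['loss'] and history.history['val_loss'] and can be plotted the same way.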

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0, 1])
plt.legend(loc='lower right')
plt.show()
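
Besides plotting, the history dictionary can be summarized numerically. The sketch below finds the epoch with the lowest validation loss and the final gap between training and validation accuracy, a rough indicator of overfitting. The helper name is hypothetical and the demo values are invented, not results from the model above.

```python
def summarize_history(history_dict):
    """Return the best epoch (1-based) by validation loss and the final accuracy gap."""
    val_loss = history_dict["val_loss"]
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    gap = history_dict["accuracy"][-1] - history_dict["val_accuracy"][-1]
    return {"best_epoch": best_index + 1, "final_accuracy_gap": gap}


# Hypothetical values for illustration; not results from the model above.
demo_history = {
    "accuracy": [0.45, 0.60, 0.70, 0.78],
    "val_accuracy": [0.50, 0.62, 0.66, 0.67],
    "loss": [1.50, 1.10, 0.85, 0.62],
    "val_loss": [1.40, 1.08, 0.98, 1.01],
}
summary = summarize_history(demo_history)
print(summary["best_epoch"], round(summary["final_accuracy_gap"], 2))
```

For a real run, pass history.history instead of the demo dictionary.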

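  7. Make Predictions

Finally, we use the trained model to make predictions on the test images. Because the final layer has no softmax activation, model.predict returns raw logits; the predicted class is the index of the largest value in each row.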

predictions = model.predict(test_images)

# Display the first 5 test images, their predicted labels, and the true labels
plt.figure(figsize=(10,10))
for i in range(5):
    plt.subplot(1,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(test_images[i])
    plt.xlabel(f"True: {class_names[test_labels[i][0]]}\nPred: {class_names[np.argmax(predictions[i])]}")
plt.show()
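
If you want probabilities rather than raw scores, the logits can be passed through a softmax. The sketch below does this in NumPy and lists the most likely classes for one image. The helper names and the demo logits are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Convert logits to probabilities along the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def top_k(logits_row, names, k=3):
    """Return the k most likely (class name, probability) pairs."""
    probs = softmax(logits_row)
    order = np.argsort(probs)[::-1][:k]
    return [(names[i], float(probs[i])) for i in order]


demo_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
              'dog', 'frog', 'horse', 'ship', 'truck']
# Hypothetical logits for a single image.
demo_logits = [0.1, -1.2, 0.3, 2.5, 0.0, 1.9, -0.4, 0.2, -2.0, -0.7]
demo_top3 = top_k(demo_logits, demo_names)
for name, prob in demo_top3:
    print(f"{name}: {prob:.3f}")
```

For the trained model, a row of predictions can be used in place of demo_logits.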

Conclusion

In this project, we built a convolutional neural network to classify images from the CIFAR-10 dataset. We covered the full workflow: loading and exploring the dataset, building and training the model, evaluating it, and making predictions. This project serves as a practical introduction to image classification using deep learning techniques.

Exercises

  1. Experiment with Different Architectures: Modify the CNN architecture by adding more layers or changing the number of filters. Observe how these changes affect the model's performance.
  2. Data Augmentation: Implement data augmentation techniques to increase the diversity of the training data and improve the model's robustness.
  3. Transfer Learning: Use a pre-trained model (e.g., VGG16, ResNet) and fine-tune it on the CIFAR-10 dataset. Compare its performance with the model you built from scratch.
  4. Hyperparameter Tuning: Experiment with different hyperparameters such as learning rate, batch size, and number of epochs to find the optimal settings for your model.

Solutions

  1. Experiment with Different Architectures:

    model = models.Sequential()
    model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(128, (3, 3), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation='relu'))
    model.add(layers.Dense(10))
    
  2. Data Augmentation:

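    # Note: ImageDataGenerator is deprecated in recent TensorFlow releases in
    # favor of Keras preprocessing layers; it is used here for brevity.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    
    datagen = ImageDataGenerator(
        rotation_range=20,        # rotate by up to 20 degrees
        width_shift_range=0.2,    # shift horizontally by up to 20% of the width
        height_shift_range=0.2,   # shift vertically by up to 20% of the height
        horizontal_flip=True)     # randomly mirror images left-right
    
    # datagen.fit() is only needed for featurewise normalization or ZCA
    # whitening, neither of which is enabled here, so it is omitted.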
    
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    
    history = model.fit(datagen.flow(train_images, train_labels, batch_size=32),
                        epochs=10, validation_data=(test_images, test_labels))
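
To show what augmentation does to the pixels themselves, here is a simplified NumPy sketch that randomly mirrors and shifts each image. It is not what the generator above does internally: the function name and max_shift are hypothetical, rotation is left out, and shifted pixels wrap around the border instead of being filled in.

```python
import numpy as np


def random_flip_and_shift(images, rng, max_shift=4):
    """Randomly mirror each image and shift it by up to max_shift pixels."""
    augmented = np.empty_like(images)
    for index, image in enumerate(images):
        if rng.random() < 0.5:
            image = image[:, ::-1, :]  # horizontal flip (reverse the width axis)
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        # np.roll wraps pixels around the border, a simplification of padding.
        augmented[index] = np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))
    return augmented


# Synthetic batch standing in for a few training images.
demo_rng = np.random.default_rng(0)
demo_batch = demo_rng.integers(0, 256, size=(3, 32, 32, 3), dtype=np.uint8)
demo_augmented = random_flip_and_shift(demo_batch, demo_rng)
print(demo_augmented.shape, demo_augmented.dtype)
```

The labels are unchanged by these transformations, which is why augmentation can be applied to the images alone.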
    
  3. Transfer Learning:

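    # VGG16 accepts inputs as small as 32x32. Its ImageNet weights were trained on
    # images processed with tf.keras.applications.vgg16.preprocess_input (applied
    # to 0-255 RGB pixels), so feeding pixels scaled to [0, 1] may hurt accuracy.
    base_model = tf.keras.applications.VGG16(input_shape=(32, 32, 3),
                                             include_top=False,
                                             weights='imagenet')
    base_model.trainable = False  # freeze the pre-trained convolutional layers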
    
    model = models.Sequential([
        base_model,
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(10)
    ])
    
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    
    history = model.fit(train_images, train_labels, epochs=10, 
                        validation_data=(test_images, test_labels))
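
Pre-trained VGG16 weights expect a specific input format. The NumPy sketch below mirrors the documented behavior of the VGG16 preprocessing (RGB to BGR, then subtracting per-channel ImageNet means, with no scaling). It is an illustrative re-implementation with a hypothetical function name; in practice, tf.keras.applications.vgg16.preprocess_input applied to un-normalized 0-255 images is the usual route.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by VGG-style preprocessing.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg_style_preprocess(rgb_images_0_255):
    """Convert RGB images in the 0-255 range to zero-centered BGR."""
    images = np.asarray(rgb_images_0_255, dtype=np.float32)
    bgr = images[..., ::-1]  # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS


# A single pure-red pixel, repeated over a 32x32 image.
demo_image = np.zeros((1, 32, 32, 3), dtype=np.float32)
demo_image[..., 0] = 255.0
demo_out = vgg_style_preprocess(demo_image)
print(demo_out[0, 0, 0])
```

Note that the input here is in the 0-255 range, so images that were already divided by 255 would need to be scaled back first.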
    
  4. Hyperparameter Tuning:

    # 0.001 is Adam's default learning rate; try other values such as 0.0001.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    
    history = model.fit(train_images, train_labels, epochs=20, batch_size=64, 
                        validation_data=(test_images, test_labels))
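
To try several settings systematically, one option is a small grid search. The sketch below only enumerates the combinations; the search space is hypothetical and the values are illustrative, not recommendations.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not recommendations.
search_space = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 64],
    "epochs": [10, 20],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(search_space))
print(len(configs))
print(configs[0])
```

Each configuration could then be used to rebuild the model from scratch, compile it with Adam(learning_rate=...), and call model.fit with the matching batch_size and epochs, keeping the configuration with the best validation accuracy.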
    

By completing these exercises, you will gain a deeper understanding of image classification and improve your skills in building and optimizing deep learning models.

© Copyright 2024. All rights reserved