IMAGE CLASSIFIER on the CIFAR-10 dataset

Jay Prakash Thakur
Mar 27, 2021

CIFAR-10 dataset exploration & experiments

Problem Statement — Build a model to classify images. The given dataset is the CIFAR-10 dataset.

Outcome — Given an image, our model will predict the class of that image.

So let’s start.

What is CIFAR-10?

CIFAR-10 is a dataset of 32×32 color images collected by the Canadian Institute For Advanced Research (CIFAR). It consists of 10 classes (‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’), with 50,000 training images and 10,000 test images.

CIFAR-10 data samples

Let’s see: what are the steps to train a model?

We will do the following steps in order:

1. Load and normalize the CIFAR-10 training and test datasets using torchvision
2. Define a Convolutional Neural Network
3. Define a loss function and optimizer
4. Train the network on the training data
5. Test the network on the test data

Let’s do the steps one by one.

1. Loading and Normalizing CIFAR-10

PyTorch’s torchvision package ships the CIFAR-10 dataset, so let’s load our data and transform it using the code below.

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)
print("Train Dataset : ", len(trainloader))

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)
print("Test Dataset : ", len(testloader))

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog',
           'horse', 'ship', 'truck')

Here we can see that the files have been downloaded. Note that len() of a DataLoader counts batches, not images: with a batch size of 4 there are 12,500 batches in the training loader (50,000 images) and 2,500 batches in the test loader (10,000 images).
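A quick way to see the distinction between image counts and batch counts, using the objects defined above:

print(len(trainset), len(trainloader))  # 50000 images, 12500 batches of 4
print(len(testset), len(testloader))    # 10000 images, 2500 batches of 4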

Let’s visualize some of the training images.
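Below is a minimal sketch for displaying one batch, along the lines of the official PyTorch tutorial; it assumes the trainloader and classes defined above and uses matplotlib:

import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5                         # undo the (0.5, 0.5, 0.5) normalization
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

images, labels = next(iter(trainloader))        # grab one batch of 4 images
imshow(torchvision.utils.make_grid(images))
print(' '.join(classes[labels[j]] for j in range(4)))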

Each image is of shape 32×32 with 3 channels.

Now let’s define a convolutional neural network.

2. Define a Convolutional Neural Network

Before that, let’s understand: what is a CNN?

A CNN is a type of neural network built from some specific hidden layers, including convolutional layers, pooling layers, and fully-connected layers. CNNs are mainly used in image-processing applications.

Since our images are of shape 32×32, the structure will look something like this.

A convolutional network is built from kernels (filters), max-pooling, activation functions, and fully-connected layers.

The kernel/filter slides over the input signal as shown below. You can see the filter (the green square) sliding over our input (the blue square), and the sum of the convolution goes into the feature map (the red square). You can read more about it here.

Maxpool — Max pooling is a sample-based discretization process. The objective is to down-sample an input representation (an image, a hidden-layer output matrix, etc.), reducing its dimensionality by keeping only the maximum value in each window.

Activation Functions — the activation function of a node defines the output of that node given an input or set of inputs. It performs a transformation on the input received in order to keep values within a manageable range. There are many activation functions; we will use ReLU.
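To make the shapes concrete, here is a small sketch (the layer sizes match the network defined in the next step; the variable names are just for illustration) tracing one 3-channel 32×32 image through a 5×5 convolution, ReLU, and 2×2 max-pooling:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)   # one 3-channel 32x32 image
conv = nn.Conv2d(3, 6, 5)       # 5x5 kernel, no padding: 32 -> 28
pool = nn.MaxPool2d(2, 2)       # 2x2 pooling halves each dimension: 28 -> 14

out = pool(F.relu(conv(x)))
print(out.shape)                # torch.Size([1, 6, 14, 14])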

Now let’s code our network.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Define a neural network with 2 convolution layers for 3-channel images
class Net2CL(nn.Module):
    def __init__(self):
        super(Net2CL, self).__init__()
        # conv layers
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # pooling layer
        self.pool = nn.MaxPool2d(2, 2)
        # fully connected layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 3x32x32 -> 6x14x14
        x = self.pool(F.relu(self.conv2(x)))  # 6x14x14 -> 16x5x5
        x = x.view(-1, 16 * 5 * 5)            # flatten for the FC layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net2cl = Net2CL()
net2cl = net2cl.to(device)
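As a quick sanity check (my addition, not part of the original walkthrough), we can push a dummy batch through the network and confirm we get one logit per class:

dummy = torch.randn(4, 3, 32, 32).to(device)  # a fake batch of 4 images
print(net2cl(dummy).shape)                    # torch.Size([4, 10])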

Our network shape looks like this

Likewise, we have defined networks with 3 and 4 conv layers; see the structures below. You can see the code here.

3. Define a Loss function and optimizer

Loss Function — a method of evaluating how well a specific algorithm models the given data. If predictions deviate too much from the actual results, the loss function will produce a very large number. We will use Cross-Entropy Loss, which is the most common choice for classification tasks.

Cross-Entropy Loss

It measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label. So predicting a probability of .012 when the actual observation label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0.

criterion = nn.CrossEntropyLoss()
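A tiny sketch of how the criterion behaves (the logits are made up for illustration): the closer the predicted scores are to the true label, the smaller the loss.

logits = torch.tensor([[2.5, 0.1, -1.0, 0.0, 0.3, -0.5, 0.2, 0.1, -0.2, 0.4]])
target = torch.tensor([0])               # the true class is index 0
print(criterion(logits, target).item())  # small, since the model already favors class 0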

Optimizer — The optimizer takes the parameters we want to update and the learning rate we want to use (and possibly many other hyperparameters as well), and performs the updates through its step() method.

There are many optimizers; we will use SGD or Adam.

import torch.optim as optim

learning_rate = 0.001  # a typical value; the original code leaves learning_rate undefined

optimizer = optim.SGD(net2cl.parameters(), lr=learning_rate, momentum=0.9)
# or, alternatively:
# optimizer = optim.Adam(net2cl.parameters(), lr=learning_rate)

Now it’s time to train our network.
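The training loop itself is not shown in the article; below is a minimal sketch in the style of the official PyTorch tutorial, using the net2cl, trainloader, criterion, and optimizer defined above (the epoch count is an assumption):

for epoch in range(10):  # the number of epochs is an assumption
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(trainloader):
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()              # clear gradients from the last step
        outputs = net2cl(inputs)           # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the weights

        running_loss += loss.item()
        if i % 2000 == 1999:               # print average loss every 2000 batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')
            running_loss = 0.0

print('Finished Training')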

After training the 2-conv-layer model with SGD, the training loss looks like this.

Here we can see that the loss decreases substantially as the epochs increase.

In the same way, we can train the 3-layer and 4-layer networks; you can see the code on git.

Epoch vs. loss with the 3-layer network:

Epoch vs. loss with the 4-layer network:

Training the 2-layer model gives us 66% accuracy on the test data, whereas the 3-layer model achieves 70.40% accuracy.
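For reference, here is a minimal sketch of how test accuracy can be computed with the testloader defined earlier:

correct, total = 0, 0
with torch.no_grad():                         # no gradients needed for evaluation
    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)
        outputs = net2cl(images)
        _, predicted = torch.max(outputs, 1)  # class with the highest logit
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f'Accuracy on the 10000 test images: {100 * correct / total:.2f}%')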

Here is a comparison of model performance.

Here is a comparison of the time elapsed for the networks with different numbers of layers.

We can see that the 2-layer network takes around 2,500,000 ms, the 3-conv-layer network takes around 820,000 ms, and the 4-layer network takes around 7,100,000 ms.

Challenges —

There were many challenges; some are:

1. Creating the network layers.
2. Observing accuracies with multiple optimizers, loss functions, and different network layers.
3. Overfitting.

My Contribution —

1. I implemented models with different numbers of neurons and multiple layers.
2. In the 4-conv-layer network, I added some dropout layers to reduce overfitting (see the sketch after this list).
3. I observed and experimented with multiple neurons and layers, and compared the time elapsed.
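The article doesn’t include the 4-layer network’s code, so the snippet below is only a hedged illustration of the dropout idea, not the author’s exact model (the class name DropoutDemo, the layer sizes, and p=0.25 are all assumptions). nn.Dropout randomly zeroes a fraction of activations during training, which discourages co-adaptation and reduces overfitting:

import torch.nn as nn
import torch.nn.functional as F

class DropoutDemo(nn.Module):  # hypothetical network, for illustration only
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.dropout = nn.Dropout(p=0.25)        # p=0.25 is an assumed value
        self.fc1 = nn.Linear(64 * 8 * 8, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))     # 3x32x32 -> 32x16x16
        x = self.pool(F.relu(self.conv2(x)))     # 32x16x16 -> 64x8x8
        x = self.dropout(x)                      # randomly zero activations while training
        x = self.fc1(x.view(-1, 64 * 8 * 8))
        return x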

You can find the full code here.

Thank you so much for reading.

References

https://www.cs.toronto.edu/~kriz/cifar.html

https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html

https://towardsdatascience.com/pytorch-basics-how-to-train-your-neural-net-intro-to-cnn-26a14c2ea29

https://sgugger.github.io/convolution-in-depth.html

https://www.researchgate.net/figure/Learning-hierarchy-of-visual-features-in-CNN-architecture_fig1_281607765

https://cs231n.github.io/convolutional-networks/#conv

https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148

https://medium.com/technologymadeeasy/the-best-explanation-of-convolutional-neural-networks-on-the-internet-fbb8b1ad5df8

https://pytorch.org/docs/stable/nn.html

https://datascience.stackexchange.com/questions/40906/determining-size-of-fc-layer-after-conv-layer-in-pytorch

https://stackoverflow.com/questions/56675943/meaning-of-parameters-in-torch-nn-conv2d

https://computersciencewiki.org/index.php/Max-pooling_/_Pooling

https://towardsdatascience.com/everything-you-need-to-know-about-activation-functions-in-deep-learning-models-84ba9f82c253

https://heartbeat.fritz.ai/the-right-loss-function-pytorch-58d2c0d77404

https://medium.com/udacity-pytorch-challengers/a-brief-overview-of-loss-functions-in-pytorch-c0ddb78068f7

https://neptune.ai/blog/pytorch-loss-functions

https://towardsdatascience.com/common-loss-functions-in-machine-learning-46af0ffc4d23

https://phuctrt.medium.com/loss-functions-why-what-where-or-when-189815343d3f

https://towardsdatascience.com/optimizers-for-training-neural-network-59450d71caf6

https://analyticsindiamag.com/ultimate-guide-to-pytorch-optimizers/
