Image Classifier on the CIFAR-10 Dataset
CIFAR-10 dataset exploration & experiments
Problem Statement: Build a model to classify an image. The given dataset is CIFAR-10.
Outcome: Given an image, our model will predict the class of that image.
So let’s start.
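What is CIFAR-10?
CIFAR-10 is a dataset of 32×32 color images, named after CIFAR (the Canadian Institute for Advanced Research). It contains 60,000 images (50,000 for training and 10,000 for testing) spread evenly over 10 classes (‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’).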
Let’s see what the steps to train a model are.
We will do the following steps in order:
1. Load and normalize the CIFAR10 training and test datasets using torchvision
2. Define a Convolutional Neural Network
3. Define a loss function
4. Train the network on the training data
5. Test the network on the test data
Let’s do the steps one by one.
1. Loading and Normalizing CIFAR10
PyTorch provides the CIFAR-10 dataset through torchvision, so let’s load our data and transform it using the code below.

import torch
import torchvision
import torchvision.transforms as transforms

# Convert images to tensors, then normalize each channel from [0, 1] to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
# len(trainloader) is the number of batches, not the number of images
print("Train Dataset : ", len(trainloader))

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)
print("Test Dataset : ", len(testloader))

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
Here we can see that our files have been downloaded. With a batch size of 4, there are 12500 batches in the training set (50,000 images) and 2500 batches in the test set (10,000 images).
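The batch counts can be checked with a little arithmetic. This is a minimal sketch, assuming the loader keeps the last partial batch (the DataLoader default); the helper name num_batches is made up for illustration.

```python
import math

def num_batches(num_images: int, batch_size: int) -> int:
    """Number of batches a loader yields when the last partial batch is kept."""
    return math.ceil(num_images / batch_size)

batch_size = 4
print(num_batches(50_000, batch_size))  # 12500 training batches
print(num_batches(10_000, batch_size))  # 2500 test batches
```

So the numbers printed by len(trainloader) and len(testloader) count batches of 4 images, not single images.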
Let’s visualize some of the training images.
Each image has shape 32×32 with 3 channels.
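One detail matters when displaying the images: the transform used while loading shifts pixel values into the range [-1, 1], so they need to be shifted back before plotting. The sketch below shows that round trip for a single pixel value in plain Python. The function names are illustrative only, and it assumes the mean and std of 0.5 used in the loading code.

```python
def to_unit_range(pixel: int) -> float:
    """Scale an 8-bit pixel value from 0..255 to 0.0..1.0, as ToTensor does."""
    return pixel / 255.0

def normalize(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Per-channel normalization: (value - mean) / std."""
    return (value - mean) / std

def unnormalize(value: float, mean: float = 0.5, std: float = 0.5) -> float:
    """Inverse of normalize, applied before displaying an image."""
    return value * std + mean

for pixel in (0, 255):
    normalized = normalize(to_unit_range(pixel))
    print(pixel, normalized, unnormalize(normalized))
```

The darkest pixel maps to -1.0 and the brightest to 1.0, and unnormalize brings both back to the 0..1 range that plotting libraries expect for float images.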
Now let’s define a convolutional neural network.
2. Define Convolutional Neural Network
Before that, let’s understand what a CNN is.
A CNN is a type of neural network with some specific hidden layers, including the convolutional layer, the pooling layer, and the fully connected layer. CNNs are mainly used in image processing applications.
Since our images are of shape 32×32, the structure will look something like this:
A CNN is built from a few components: kernels (filters), max pooling, activation functions, and fully connected layers.
Kernel: The kernel (or filter) slides over the input signal as shown below. You can see the filter (the green square) sliding over our input (the blue square), and the sum of the element-wise products at each position goes into the feature map (the red square). You can read more about it here.
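To make the sliding window concrete, here is a small sketch in plain Python with no padding and a stride of 1. The input and kernel values are made up for illustration. Like the convolution layers in deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the current window element-wise with the kernel and sum.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map

image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
print(conv2d_valid(image, kernel))  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

A 4×4 input with a 2×2 kernel gives a 3×3 feature map, because the kernel fits in three positions along each axis.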
Maxpool: Max pooling is a sample-based discretization process. The objective is to down-sample an input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality.
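As a rough illustration, the sketch below applies 2×2 max pooling with a stride of 2 to a small made-up feature map, using plain Python lists.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            window = (
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            )
            row.append(max(window))
        pooled.append(row)
    return pooled

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]
print(max_pool_2x2(feature_map))  # [[4, 5], [3, 7]]
```

The 4×4 input shrinks to 2×2: height and width are halved, and only the strongest response in each window survives.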
Activation Functions: The activation function of a node defines the output of that node given an input or set of inputs. It performs a transformation on the input received, for example to introduce non-linearity or to keep values within a manageable range. There are many activation functions. We will use ReLU.
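ReLU itself is very simple: it passes positive values through unchanged and replaces negative values with zero. A one-function sketch in plain Python:

```python
def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

print([relu(v) for v in (-2.0, -0.5, 0.0, 1.5)])  # [0.0, 0.0, 0.0, 1.5]
```

Note that ReLU only clips from below. Large positive values pass through as they are.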
Now let’s code our network.
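import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed definition: the snippet uses `device` without showing where it is set.
# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define a neural network with 2 convolution layers for 3-channel images
class Net2CL(nn.Module):
    def __init__(self):
        super(Net2CL, self).__init__()
        # conv layers
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # pooling layers
        self.pool = nn.MaxPool2d(2, 2)
        # fully connected layers
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        # flatten the 16 feature maps of size 5x5 into one vector per image
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net2cl = Net2CL()
net2cl = net2cl.to(device)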
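The 16 * 5 * 5 in the first fully connected layer comes from following the image size through the layers. The sketch below traces it in plain Python, assuming no padding and a stride of 1 for the 5×5 convolutions (the nn.Conv2d defaults) and 2×2 pooling with a stride of 2. The helper names are illustrative.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width/height of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size: int, kernel: int = 2, stride: int = 2) -> int:
    """Output width/height of a pooling layer along one axis."""
    return (size - kernel) // stride + 1

size = 32
size = conv_out(size, kernel=5)  # conv1: 32 -> 28
size = pool_out(size)            # pool:  28 -> 14
size = conv_out(size, kernel=5)  # conv2: 14 -> 10
size = pool_out(size)            # pool:  10 -> 5
flat_features = 16 * size * size  # 16 output channels from conv2
print(size, flat_features)  # 5 400
```

If the kernel size, padding, or number of conv layers changes, this number changes too, and the input size of the first fully connected layer has to be updated to match.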
Our network shape looks like this:
Likewise, we have defined networks with 3 and 4 conv layers; see the structure below. You can see them here.
3. Define a Loss function and optimizer
Loss Function: It’s a method of evaluating how well a specific algorithm models the given data. If the predictions deviate too much from the actual results, the loss function would cough up a very large number. We will use Cross-Entropy Loss, which is the most common choice for classification tasks.
Cross-Entropy Loss
It measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label. So predicting a probability of 0.012 when the actual observation label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0.
nn.CrossEntropyLoss()
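To see where the numbers come from, here is a hand-written sketch of cross-entropy for a single example in plain Python. nn.CrossEntropyLoss expects raw scores (logits) and applies the softmax step internally, so the sketch does the same. The logits are made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log of the probability assigned to the true class."""
    return -math.log(softmax(logits)[target])

# A probability of 0.012 for the true class gives a loss of about 4.42.
print(-math.log(0.012))

logits = [2.0, 0.5, -1.0]
print(cross_entropy(logits, target=0))  # true class has the highest score: low loss
print(cross_entropy(logits, target=2))  # true class has the lowest score: high loss
```

With 10 classes and no information at all (equal scores for every class), the loss is ln(10), roughly 2.30, which is a useful reference point for the first epochs of training.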
Optimizer: The optimizer takes the parameters we want to update and the learning rate we want to use (and possibly many other hyperparameters as well), and performs the updates through its step() method.
There are many optimizers. We will use SGD or Adam.

optim.SGD(net.parameters(), lr=learning_rate, momentum=0.9)
optim.Adam(net.parameters(), lr=learning_rate)
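What the momentum argument does can be sketched on a toy problem. The code below is plain Python, not the library implementation: it minimizes f(w) = w² with the usual momentum rule, where a buffer accumulates past gradients and the weight moves along that buffer. The learning rate and starting point are arbitrary choices for illustration.

```python
def sgd_momentum_step(w, grad, buf, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update: buf accumulates past gradients."""
    buf = momentum * buf + grad
    w = w - lr * buf
    return w, buf

w, buf = 5.0, 0.0
for step in range(100):
    grad = 2 * w  # gradient of the toy loss f(w) = w**2
    w, buf = sgd_momentum_step(w, grad, buf)

print(abs(w) < 0.5)  # True: w has moved close to the minimum at 0
```

Adam goes one step further and also adapts the step size per parameter, which is why it often needs less learning-rate tuning.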
Now it’s time to train our network.
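A training loop generally follows the same five calls on every batch. The sketch below shows them on hypothetical stand-ins so it runs on its own: one batch of random tensors instead of the CIFAR-10 loader, and a tiny linear model instead of the networks described in this post. The learning rate and number of iterations are also just illustrative.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical stand-in for one batch: 8 random "images" with random labels.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

# Tiny stand-in model, not the network from this post.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

losses = []
for epoch in range(20):
    optimizer.zero_grad()              # clear gradients from the previous step
    outputs = model(images)            # forward pass
    loss = criterion(outputs, labels)  # compare predictions with the labels
    loss.backward()                    # backpropagate
    optimizer.step()                   # update the weights
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

With the real data, the inner part of the loop would run once per batch from trainloader, with the batch moved to the same device as the model.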
After training our 2-layer model with SGD, the training loss looks like this:
Here we can see that our loss drops considerably as the number of epochs increases.
In the same way, we can train the 3-layer and 4-layer networks. You can see the code on git.
Epoch vs. loss with 3 layers:
Epoch vs. loss with 4 layers:
Training the 2-layer model gives us 66% accuracy, whereas we achieved a total of 70.40% accuracy with the 3-layer model on the test data.
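Accuracy here means the share of test images whose highest-scoring class matches the true label. A minimal sketch of that calculation in plain Python, with made-up scores for a 3-class toy case:

```python
def accuracy(scores_per_image, labels):
    """Fraction of images whose highest-scoring class matches the label."""
    correct = 0
    for scores, label in zip(scores_per_image, labels):
        predicted = max(range(len(scores)), key=lambda k: scores[k])  # argmax
        correct += int(predicted == label)
    return correct / len(labels)

scores = [
    [0.1, 2.0, -1.0],  # predicts class 1
    [1.5, 0.2, 0.3],   # predicts class 0
    [0.0, 0.1, 0.9],   # predicts class 2
    [0.4, 0.3, 0.2],   # predicts class 0
]
labels = [1, 0, 2, 1]
print(f"{100 * accuracy(scores, labels):.2f}%")  # 75.00%
```

For CIFAR-10 the same calculation runs over all 10,000 test images with 10 scores per image.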
Here is a comparison of model performance:
Here is a comparison of the time elapsed for networks with different numbers of layers:
We can see that the network with 2 layers takes around 2500000 ms, the network with 3 conv layers takes around 820000 ms, and the network with 4 layers takes around 7100000 ms.
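Timings like these can be collected with a small wrapper around the training call. The sketch below is one possible way to do it in plain Python: the helper names are made up, and a toy workload stands in for training. It also converts the figures above into minutes, which are easier to read.

```python
import time

def time_ms(fn, *args):
    """Run fn(*args) and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

def ms_to_minutes(ms: float) -> float:
    return ms / 60_000.0

# Toy workload standing in for a training run.
_, elapsed = time_ms(sum, range(1_000_000))
print(f"toy workload took {elapsed:.1f} ms")

for name, ms in (("2 layers", 2_500_000), ("3 layers", 820_000), ("4 layers", 7_100_000)):
    print(f"{name}: about {ms_to_minutes(ms):.1f} minutes")
```

In minutes, the three runs come to roughly 41.7, 13.7, and 118.3.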
Challenges:
There were many challenges. Some of them are:
- Creating the network layers.
- Observing accuracies with multiple optimizers, loss functions, and different network layers.
- Overfitting.
My Contribution:
- I have implemented models with different numbers of neurons and multiple layers.
- In the 4-conv-layer network, I added some dropout layers to reduce overfitting.
- I observed and experimented with multiple neuron counts and layers, tracking the time elapsed.
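For readers new to dropout, here is a rough sketch of the idea in plain Python. It is not the code of the 4-conv-layer network. During training, each activation is zeroed with probability p and the survivors are scaled up by 1/(1-p), so the expected value stays the same. At test time nothing is dropped. The value p = 0.5 is only an example.

```python
import random

def dropout(values, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each value with probability p, scale the rest by 1/(1-p)."""
    if not training or p == 0.0:
        return list(values)
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]

rng = random.Random(0)
activations = [1.0] * 8
print(dropout(activations, p=0.5, rng=rng))   # a random mix of 0.0 and 2.0
print(dropout(activations, training=False))   # unchanged at test time
```

Because a different random subset of neurons is silenced on every step, the network cannot rely on any single neuron, which is what helps against overfitting.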
You can find the full code here.
Thank you so much for reading.
References
https://www.cs.toronto.edu/~kriz/cifar.html
https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html
https://towardsdatascience.com/pytorch-basics-how-to-train-your-neural-net-intro-to-cnn-26a14c2ea29
https://sgugger.github.io/convolution-in-depth.html
https://cs231n.github.io/convolutional-networks/#conv
https://pytorch.org/docs/stable/nn.html
https://stackoverflow.com/questions/56675943/meaning-of-parameters-in-torch-nn-conv2d
https://computersciencewiki.org/index.php/Max-pooling_/_Pooling
https://heartbeat.fritz.ai/the-right-loss-function-pytorch-58d2c0d77404
https://neptune.ai/blog/pytorch-loss-functions
https://towardsdatascience.com/common-loss-functions-in-machine-learning-46af0ffc4d23
https://phuctrt.medium.com/loss-functions-why-what-where-or-when-189815343d3f
https://towardsdatascience.com/optimizers-for-training-neural-network-59450d71caf6
https://analyticsindiamag.com/ultimate-guide-to-pytorch-optimizers/