A Model with the Power to Discern: Using Convolutional Neural Nets to Identify Clothing Items

Ayotomiwa Salau
7 min read · Nov 3, 2018

On reaching a milestone in the AI Saturday learning track, we participants were grouped into teams and tasked with building an image classifier, using an algorithm of our choice to classify the MNIST or Fashion MNIST dataset.

Our team, Team G-4, chose the convolutional neural network (CNN) algorithm, implemented with the TensorFlow Keras framework, to classify the Fashion MNIST dataset. We fed the fashion image dataset into the algorithm to classify the images into their appropriate classes.

We started out on a WhatsApp group where we shared ideas and thoughts on how to go about the project. The initial idea was to create a Colab notebook that everyone could edit and change as they deemed fit, but there was an issue with this: we would each have a different understanding of the code, which might not necessarily stay in line. We then went searching on GitHub and Kaggle for similar projects.

After some days, I stumbled upon a notebook by Margaret Maynard-Reid where she did something similar using a CNN on the Fashion MNIST data. It was an easy-to-understand notebook that outlined the process step by step.

So we each went through the notebook and ran it on our own systems to understand the full workings of the code. Ugonna and I put our thinking caps on to suggest ways the model's accuracy could be greatly improved; we suggested increasing the batch size and the number of epochs.

While that was ongoing, Abayomi had started work on the Medium article, gathering information on the Fashion MNIST dataset and the working principles behind neural networks. Iyanda and Ayanmiayan also added valuable information to the article.

Chimelie had been very helpful at the onset of the project, but he ran into some slight technical difficulty; he wished us the best as he had to take some time to sort it out. I also helped coordinate the group's work, ensuring no end was left behind.

We were making progress as a team and communicating effectively, with work ongoing on both the code and the article content. As the project deadline neared, it became imperative to start working on our Google Slides. Ugonna did good initial work on the slides, and Iyanda joined in as well, adding valuable content. We ended up with a well-put-together deck.

Our CNN model was also bearing good fruit, classifying the fashion items properly. To significantly improve its accuracy, we doubled the number of epochs, increased the batch size, and added more layers to the neural network, which took us from 89% to 92% accuracy. Hmm, interesting? Yes, we know.

Though we had some hiccups and disagreements along the way, we were able to pull ourselves together and deliver a good presentation. We saw all smiles on the faces of the audience and got positive feedback afterwards: "I learnt some new things while listening to you guys", "You guys did good work".

It was a good one and we had so much fun while at it.

A Brief Understanding of How the Neural Network Algorithm Works

A neural network is an algorithm that attempts to simulate how the human brain learns and sees patterns. The human brain is made up of special cells called neurons. There are more than a hundred different kinds of neurons, separated into groups called networks; each network contains several thousand highly interconnected neurons. Thus, the brain can be viewed as a collection of neural networks.

Neural networks were created to process data in innovative and complex ways. Image classification calls upon a neural network to spot specific attributes in an image. The network is fed millions of images in order to build a robust foundation of attributes and classifications. As the layers develop, they begin to master specific features and build a sophisticated understanding of the image's high-level features: an early layer would notice rough or smooth edges, an intermediate layer may detect shapes or larger components, and the final layer would tie the attributes together into a logical solution.

How do we feed our Data into our Algorithm?

For us humans, identifying an image or object comes naturally; we can easily identify objects by their colours, shapes, and sizes without much effort. A computer system, however, sees an image as an array of pixels: to a computer, an image is a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates is the intensity of the image at that point.
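As a toy illustration of this idea, an image can be represented as a small NumPy array whose entries are pixel intensities (the values below are made up for the example):

```python
import numpy as np

# A toy 3x3 grayscale "image": each entry is the intensity f(x, y)
# at spatial coordinates (x, y), from 0 (black) to 255 (white).
image = np.array([[0,   64, 128],
                  [32,  96, 160],
                  [255, 200, 50]], dtype="uint8")

print(image.shape)  # (3, 3)
print(image[2, 0])  # intensity at row 2, column 0 -> 255
```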

The Fashion MNIST dataset

MNIST (Modified National Institute of Standards and Technology) is the de facto "hello world" dataset of computer vision, serving as a basis for benchmarking algorithms. The Fashion MNIST dataset contains 70,000 grayscale images in 10 categories (labelled 0–9) indicating which clothing item is displayed, at a low resolution of 28 by 28 pixels. The dataset consists of 60,000 training examples and 10,000 test examples; each is a 28 x 28 grayscale image associated with a label from one of the 10 classes.

Implementation of Our Code

We first imported our helper libraries into the environment, then imported the dataset from the Keras library using TensorFlow.
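A sketch of this step (the exact helper imports in our notebook may have differed slightly):

```python
import tensorflow as tf

# Load Fashion MNIST directly from the Keras datasets module:
# 60,000 training images and 10,000 test images, each 28x28 grayscale.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)
```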

Visualize examples in the dataset

We then define text labels for the Fashion MNIST classes; thereafter, we visualize the training image at index 90, whose text label happens to be "ankle boot".
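A sketch of the labelling and visualization step. The ten class names are the standard Fashion MNIST labels; the index-90 example follows the notebook we worked from:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import tensorflow as tf

# The ten Fashion MNIST class names, indexed by label 0-9.
fashion_mnist_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                        "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

(x_train, y_train), _ = tf.keras.datasets.fashion_mnist.load_data()

# Show the training image at index 90 with its text label as the title.
img_index = 90
plt.imshow(x_train[img_index], cmap="gray")
plt.title(fashion_mnist_labels[y_train[img_index]])
plt.savefig("example.png")
```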

Data Normalization
We normalize the training dataset so that the input values are of approximately the same scale, which helps the optimization process, i.e. helps gradient descent converge faster and reduces training time.
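A minimal sketch of the normalization, using a random stand-in array in place of the loaded training images:

```python
import numpy as np

# Stand-in for the loaded training images (real code would use x_train
# from the dataset); pixel values start in the range [0, 255].
x_train = np.random.randint(0, 256, size=(1000, 28, 28)).astype("float32")

# Scale pixel intensities from [0, 255] down to [0, 1].
x_train /= 255.0

print(x_train.min() >= 0.0, x_train.max() <= 1.0)  # True True
```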

Further break training data into train / validation sets
Thereafter, we follow the rule of thumb in ML: we break our dataset into a 55,000-example training set and a 5,000-example validation set, so that we can use the validation set to tune the hyperparameters of our model.
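A minimal sketch of the split, using zero-filled stand-ins for the full training arrays:

```python
import numpy as np

# Stand-ins for the full 60,000-example training arrays.
x_train = np.zeros((60000, 28, 28), dtype="float32")
y_train = np.zeros((60000,), dtype="int64")

# Hold out the last 5,000 examples for validation; keep 55,000 for training.
x_valid, y_valid = x_train[55000:], y_train[55000:]
x_train, y_train = x_train[:55000], y_train[:55000]

print(x_train.shape[0], x_valid.shape[0])  # 55000 5000
```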

Create the Model Architecture
In our research we found that, in order to achieve state-of-the-art results for classification on this dataset, we had to use a convolutional neural network (CNN, also called a ConvNet) algorithm.
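A sketch of a small CNN for 28x28x1 inputs, similar in spirit to the architecture in the notebook we followed; the exact filter counts and dropout rates here are our assumptions:

```python
import tensorflow as tf

# Two Conv/Pool/Dropout stages, then a dense classifier over the 10 classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(64, kernel_size=2, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Conv2D(32, kernel_size=2, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax"),  # one output per class
])
model.summary()
```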

The model's learning process is configured with three arguments — an optimizer, a loss function, and a list of metrics — using the compile() API.
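A minimal sketch of the compile step on a stand-in model; the optimizer and loss shown are common choices for this dataset and are our assumptions, not necessarily the exact ones our notebook used:

```python
import tensorflow as tf

# Stand-in model; the project compiled its CNN the same way.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# compile() configures the learning process: optimizer, loss, and metrics.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```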

We use the ModelCheckpoint API to save the model after every epoch, setting save_best_only=True so the model is saved only when the validation accuracy improves.
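A sketch of the checkpoint callback; the filepath and monitored metric name are illustrative (older Keras versions monitored "val_acc" rather than "val_accuracy"):

```python
import tensorflow as tf

# Save weights after each epoch, keeping only the checkpoint with the best
# validation accuracy seen so far.
checkpointer = tf.keras.callbacks.ModelCheckpoint(
    filepath="best.weights.h5",
    monitor="val_accuracy",
    save_best_only=True,
    save_weights_only=True,
    verbose=1)

# Later passed to training: model.fit(..., callbacks=[checkpointer])
```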

Here, the weights with the best validation accuracy are loaded back into the model.
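A minimal sketch of reloading a checkpoint on a stand-in model; here we save weights first so the snippet runs standalone, whereas the project loaded the file written by ModelCheckpoint:

```python
import tensorflow as tf

# Stand-in model; the project reloaded its CNN's best checkpoint the same way.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Write a checkpoint file so the snippet is self-contained, then load it back.
model.save_weights("best.weights.h5")
model.load_weights("best.weights.h5")
```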

The model is then evaluated on the test set, and we achieved an accuracy of 92%.
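A sketch of the evaluation step using a stand-in model and random data (which will of course score near chance, unlike the project's trained CNN, which reached about 92%):

```python
import numpy as np
import tensorflow as tf

# Stand-in model; the project evaluated its trained CNN the same way.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in test data (real code would use the normalized test images).
x_test = np.random.rand(100, 784).astype("float32")
y_test = np.random.randint(0, 10, size=(100,))

# evaluate() returns [loss, accuracy] for the compiled metrics.
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print("Test accuracy:", acc)
```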

We then get predictions from the model on the test data and print out 15 images from the test set, setting each title to the prediction (with the ground-truth label). If the prediction matches the true label, the title is green; otherwise it is displayed in red.
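A sketch of this visualization; random stand-ins replace the trained model's predictions and the test data so the snippet runs standalone (real code would use y_hat = model.predict(x_test)):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fashion_mnist_labels = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                        "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]
x_test = np.random.rand(15, 28, 28)
y_test = np.random.randint(0, 10, size=15)
y_hat = np.random.rand(15, 10)  # stand-in for model.predict(x_test)

figure = plt.figure(figsize=(20, 8))
for i in range(15):
    ax = figure.add_subplot(3, 5, i + 1, xticks=[], yticks=[])
    ax.imshow(x_test[i], cmap="gray")
    predict_index = int(np.argmax(y_hat[i]))
    true_index = int(y_test[i])
    # Title format: "predicted (true)"; green if correct, red otherwise.
    ax.set_title("{} ({})".format(fashion_mnist_labels[predict_index],
                                  fashion_mnist_labels[true_index]),
                 color="green" if predict_index == true_index else "red")
figure.savefig("predictions.png")
```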

Team G-4
Tomiwa
Abayomi
Ugonna
Iyanda
Ayanmiayan
Chimelie
