Image Classification with Single Class dataset using VGG-16 Transfer Learning

Transfer learning generally refers to a process where a model trained on one problem is used in some way on a second related problem. In deep learning, transfer learning is a technique whereby a neural network model is first trained on a problem similar to the problem that is being solved. One or more layers from the trained model are then used in a new model trained on the problem of interest.

This is typically understood in a supervised learning context, where the input is the same but the target may be of a different nature. For example, we may learn about one set of visual categories, such as cats and dogs, in the first setting, then learn about a different set of visual categories, such as ants and wasps, in the second setting.

— Page 536, Deep Learning, 2016.

Transfer learning has the benefit of decreasing the training time for a neural network model and can result in lower generalization error.

The weights in re-used layers may be used as the starting point for the training process and adapted in response to the new problem. This usage treats transfer learning as a type of weight initialization scheme. This may be useful when the first related problem has a lot more labeled data than the problem of interest and the similarity in the structure of the problem may be useful in both contexts.

… the objective is to take advantage of data from the first setting to extract information that may be useful when learning or even when directly making predictions in the second setting.

— Page 538, Deep Learning, 2016.

Step One: Install a deep-learning toolkit of your choice. they all come with nice tutorials these days.

Step Two: Grab a pre-trained imagenet model. In that model, there are already a few computer classes built into it! ( “desktop_computer”, “laptop”, ‘notebook”, and another class for hand-held computers “hand-held_computer”)

Step Three: Use model to predict. for this, you’ll need to have your images the correct size.

More Steps: further fine-tune the model…a bit more advanced but will give you some gains.

Something to think about is what is your goal? accuracy? false positives/negatives, etc? It’s always good having a goal of what you need to accomplish from the start.

import keras
import numpy as np
from keras.applications import vgg16

from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.imagenet_utils import decode_predictions
import matplotlib.pyplot as plt
from PIL import Image
import requests
from io import BytesIO

%matplotlib inline
vgg_model = vgg16.VGG16(weights='imagenet')

def predict_image(image_url, model):

    response = requests.get(image_url)
    original = Image.open(BytesIO(response.content))

    newsize = (224, 224) 
    original = original.resize(newsize) 

    # convert the PIL image to a numpy array
    # IN PIL - image is in (width, height, channel)
    # In Numpy - image is in (height, width, channel)
    numpy_image = img_to_array(original)

    # Convert the image / images into batch format
    # expand_dims will add an extra dimension to the data at a particular axis
    # We want the input matrix to the network to be of the form (batchsize, height, width, channels)
    # Thus we add the extra dimension to the axis 0.
    image_batch = np.expand_dims(numpy_image, axis=0)
    plt.imshow(np.uint8(image_batch[0]))
    plt.show()

    # prepare the image for the VGG model
    processed_image = vgg16.preprocess_input(image_batch.copy())

    # get the predicted probabilities for each class
    predictions = model.predict(processed_image)

    # convert the probabilities to class labels
    # We will get top 5 predictions which is the default
    label = decode_predictions(predictions)
    print (label[0][0:2])  #just display top 2


urls = ['https://blog.techcraft.org/desktop-computer.jpg', 'https://blog.techcraft.org/Laptop-computer.jpg']

for u in urls:
    predict_image(u, vgg_model)