Cat or Dog?

fastai
Author

Stephen Barrie

Published

October 22, 2022

This is my follow-up to Lesson 1 of Practical Deep Learning for Coders 2022, taught by Jeremy Howard, co-founder of fast.ai along with Dr. Rachel Thomas. This is my first attempt at building a classifier model.

Introducing Dúi

A puppy named Dúi went viral on Reddit a couple of years back, after some people pointed out that he looks like a mix of a dog and a cat. What do you think?

Let’s put a Machine Learning model to the test and see what it predicts.

Believe it or not, the following few lines of code represent a complete system for creating and training a state-of-the-art model for recognizing cats versus dogs. So, let’s train it now! To do so, just press Shift-Enter on your keyboard, or press the Play button on the toolbar. Then wait a few minutes while the following things happen:

  1. A dataset called the Oxford-IIIT Pet Dataset that contains 7,349 images of cats and dogs from 37 different breeds will be downloaded from the fast.ai datasets collection to the GPU server you are using, and will then be extracted.
  2. A pretrained model, built on a competition-winning architecture and already trained on 1.3 million images, will be downloaded from the internet.
  3. The pretrained model will be fine-tuned using the latest advances in transfer learning, to create a model that is specially customized for recognizing dogs and cats.

The first two steps only need to be run once on your GPU server. If you run the cell again, it will use the dataset and model that have already been downloaded, rather than downloading them again. Let’s take a look at the contents of the cell, and the results:

::: {.cell _cell_guid='936cf7ef-02f7-4a8c-bcfe-f028b63b6b4c' _kg_hide-output='true' _uuid='803d9fc5-fed7-464c-b6b6-5efdd0af1961' jupyter='{"outputs_hidden":false}' tags='[]'}

!pip install -Uqq fastbook
from fastbook import *
from fastai.vision.all import *

# Download and extract the Oxford-IIIT Pet dataset (cached after the first run)
path = untar_data(URLs.PETS)/'images'

# In this dataset, cat images have filenames starting with an upper-case letter
def is_cat(x): return x[0].isupper()

# Build DataLoaders: 20% validation split, fixed seed, images resized to 224px
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

# Fine-tune a pretrained ResNet-34 for one epoch, reporting the error rate
learn = vision_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
(Output truncated: torchvision emits deprecation warnings about the `pretrained` argument, and fine_tune displays a live progress table with columns epoch, train_loss, valid_loss, error_rate and time.)

:::
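The `is_cat` labelling function in the cell above relies on a naming convention in the dataset, which is worth seeing in isolation. The filenames below are illustrative examples of that convention:

```python
# The Oxford-IIIT Pet dataset encodes the label in the filename:
# cat-breed files are capitalised ("Birman_12.jpg") while dog-breed
# files are lower-case ("beagle_7.jpg"), so checking the first
# character is enough to label every image.
def is_cat(name):
    return name[0].isupper()

print(is_cat("Birman_12.jpg"))  # True  -> cat
print(is_cat("beagle_7.jpg"))   # False -> dog
```

This is why `from_name_func` is the right DataLoaders constructor here: the label is derived entirely from the filename.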

So, how do we know if this model is any good? The error_rate column of the training output shows the proportion of images that were incorrectly classified. The error rate serves as our metric, our measure of model quality, chosen to be intuitive and comprehensible. As you can see, the model is nearly perfect, even though training took only a few minutes (not including the one-time download of the dataset and the pretrained model). There are many sources of small random variation in training, but we generally see an error rate well below 0.02. In this example, the error rate is approximately 0.008, which equates to 99.2% accuracy.
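To make the metric concrete, here is a minimal sketch of what an error rate is: the fraction of predictions that disagree with the true labels. The predictions and labels below are made up purely for illustration:

```python
# Error rate = fraction of predictions that do not match the ground
# truth; equivalently, 1 - accuracy.
def error_rate(preds, targets):
    wrong = sum(p != t for p, t in zip(preds, targets))
    return wrong / len(targets)

preds   = [True, False, True,  True]   # model's is_cat predictions
targets = [True, False, False, True]   # ground-truth labels
print(error_rate(preds, targets))      # 1 wrong out of 4 -> 0.25
```

fastai computes the same quantity over the validation set, the 20% of images held out by `valid_pct=0.2`, so the metric reflects performance on images the model never trained on.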

Finally, let’s check that this model actually works by uploading a picture of Dúi for our model to classify:

::: {.cell _cell_guid='392c10cc-9782-4c29-b131-6f42289856f8' _uuid='eac26ec1-f23a-41ba-96e4-0eb504324959' jupyter='{"outputs_hidden":false}' tags='[]'}

import ipywidgets as widgets
uploader = widgets.FileUpload()
uploader

:::

::: {.cell _cell_guid='efa0ad57-03a5-43c1-abf1-ff8eb9d0600a' _uuid='08497abe-f829-476a-ba24-bc5079922994' jupyter='{"outputs_hidden":false}' tags='[]'}

img = PILImage.create(uploader.data[0])  # uploader.data is the ipywidgets 7 API; v8 exposes uploads via uploader.value
display(img)

:::

Now let’s see what the model predicts:

::: {.cell _cell_guid='f98c4e24-adf7-4272-a6e9-42236bd686df' _uuid='a1e1b32c-476e-4fbc-8feb-7a902f5e58e3' jupyter='{"outputs_hidden":false}' tags='[]'}

is_cat,_,probs = learn.predict(img)
print(f"Is Dúi a cat?: {is_cat}.")
print(f"Probability Dúi is a cat: {probs[1].item():.6f}")

:::
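A note on the tuple returned by `learn.predict`: it contains the decoded label, the index of that label in the vocab, and the tensor of per-class probabilities. Because `is_cat` returns a boolean, the vocab here is `[False, True]`, which is why `probs[1]` is the probability of the cat class. A plain-Python sketch, using a hypothetical probability vector:

```python
# learn.predict returns (decoded_label, label_index, probabilities).
# With a boolean label function the classes are ordered [False, True],
# so index 1 corresponds to "is a cat".
vocab = [False, True]             # False = dog, True = cat
probs = [0.064, 0.936]            # hypothetical softmax output
pred_idx = probs.index(max(probs))
print(f"Is it a cat? {vocab[pred_idx]}, p(cat) = {probs[1]:.3f}")
```

Indexing the wrong position of `probs` is a common mistake, since nothing in the printed label tells you which class each position refers to.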

Conclusion

So, it seems that the confusion out there was justified. Although our model recorded an error rate of just 0.008119 (meaning that over 99.1% of images were correctly classified), it predicts with high confidence (93.6% probability) that Dúi is in fact a cat, not a dog!