FastAI Multi-label image classification

by Gilbert Tanner on Feb 20, 2019 · 6 min read

The FastAI library allows us to build models using only a few lines of code. Furthermore, it implements some of the newest techniques from research papers, which allow you to get state-of-the-art results on almost any type of problem.

This is my second article covering how to use the FastAI library. In it, we will learn how to do multi-label image classification on the Planet Amazon satellite dataset and what differences there are between single- and multi-label classification.

If you don’t know the FastAI library and organization yet, I would highly encourage you to read my last article, which gives you some basic information about the FastAI research lab and shows you how to use the FastAI vision module to build an animal detector.

If you prefer a visual tutorial, you can check out my FastAI videos.

Getting data

The first thing we need to do is download the Planet Amazon satellite dataset from Kaggle.

After this is done we can read in the provided training dataframe using pandas.

from fastai.vision import * # import the vision module

path = Path('<path to the dataset>') # specify the path to your dataset

df = pd.read_csv(path/'train_v2.csv')
Figure 1: Training dataframe

We have two columns: the image_name column, which contains the names of the images (without their suffix), and the tags column, which contains the labels for each image, separated by spaces.
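To make that layout concrete, here is a minimal sketch with made-up rows (these are not actual entries from the dataset):

```python
import pandas as pd

# Hypothetical miniature version of train_v2.csv: image names without
# suffix, and space-separated tags (multiple labels per image).
df = pd.DataFrame({
    'image_name': ['train_0', 'train_1'],
    'tags': ['haze primary', 'agriculture clear primary water'],
})

# Splitting the tags column on spaces recovers the individual labels,
# which is what labelling with a space delimiter will do for us later.
labels = df['tags'].str.split(' ')
```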

This quick look at the dataframe provides us with enough information to load in the data, but before we do that we will specify which transformations we want to apply to the images.

Because these are satellite images, their orientation doesn’t matter, so we can enable both do_flip (enabled by default) and flip_vert. We will also vary the lighting by tuning the max_lighting parameter, change the zoom by adjusting the max_zoom parameter, and disable warping by setting max_warp to 0.

tfms = get_transforms(flip_vert=True, max_lighting=0.1, max_zoom=1.05, max_warp=0.)

Now that we have our transforms we can load in the data using FastAI’s data block API, which provides a lot of different classes and methods for loading in datasets.

np.random.seed(42) # set random seed so we always get the same validation set
src = (ImageItemList.from_csv(path, 'train_v2.csv', folder='train-jpg', suffix='.jpg')
       # Load data from csv
       .split_by_rand_pct(0.2)
       # split data into training and validation set (20% validation)
       .label_from_df(label_delim=' '))
       # label data using the tags column (second column is default)

Now that we have an ImageItemList we just need to apply our transforms, convert the data into a DataBunch object and normalize it. We didn’t do the whole process in a single step because we will first train on a smaller image size (128x128px) and then switch to the original size, which is 256x256px.

data = (src.transform(tfms, size=128)
        .databunch().normalize(imagenet_stats))

This is a little trick which was introduced by Jeremy Howard in the Practical Deep Learning for Coders course. It not only speeds up training but also helps to prevent overfitting. For more information, you should definitely check out the course, which is completely free.

Now that we have our data we can visualize a random batch of images using the show_batch method.

data.show_batch(rows=3, figsize=(12, 9))
Figure 2: Random data batch

Training model

The model creation process is just like for a “normal” single-label image classification problem. Therefore we can just use FastAI’s create_cnn method, which creates a convolutional neural network.

The method needs two arguments, the data and the architecture, but it also supports many other parameters that can be used to customize the model for a given problem.

One of these other parameters is called metrics. Metrics are the “scores” printed out during training. The Planet Amazon competition uses the F2-Score to determine the ranking, so we will specify the F2-Score and accuracy as our metrics.
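For reference, the F-beta score combines precision and recall, and beta=2 weights recall more heavily than precision. A minimal sketch of the formula (not fastai's tensor-based implementation):

```python
def fbeta_score(precision, recall, beta=2.0):
    # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    # With beta=2 (the F2-Score), recall counts twice as much as precision.
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, at equal precision and recall the score equals both of them, while high recall with mediocre precision scores better than the reverse.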

But because we are working with multi-label data, we can’t just use the normal accuracy and F2-Score functions, which are meant for single-label problems. Instead, we need to set a threshold that determines whether an image contains a class.

Normally we would need to find the right value for the threshold ourselves, but for this problem Jeremy Howard already did that, so we will just use the same threshold he used.
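The thresholding idea itself is simple. Here is a plain-Python sketch with made-up class names and probabilities (fastai applies the same logic to the model's sigmoid outputs):

```python
classes = ['agriculture', 'clear', 'haze', 'primary', 'water']  # example label set
probs = [0.91, 0.85, 0.05, 0.97, 0.33]  # hypothetical sigmoid outputs for one image
thresh = 0.2

# Keep every class whose predicted probability exceeds the threshold;
# an image can therefore receive any number of labels, including zero.
predicted = [c for c, p in zip(classes, probs) if p > thresh]
```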

# create metrics
acc_02 = partial(accuracy_thresh, thresh=0.2)
f_score = partial(fbeta, thresh=0.2)
# create cnn with the resnet50 architecture
learn = create_cnn(data, models.resnet50, metrics=[acc_02, f_score])

The partial function creates a new function that calls the passed one with a fixed parameter (thresh=0.2 in our case).
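As a quick illustration of how partial works, here is a toy example (accuracy_thresh_demo is made up for this sketch; it is not fastai's accuracy_thresh):

```python
from functools import partial

def accuracy_thresh_demo(preds, targets, thresh=0.5):
    # Toy stand-in: threshold each prediction and compare it
    # against the corresponding binary target.
    correct = sum((p > thresh) == bool(t) for p, t in zip(preds, targets))
    return correct / len(preds)

# partial fixes thresh=0.2, so the resulting function has exactly the
# (preds, targets) signature that a metrics list expects.
acc_02 = partial(accuracy_thresh_demo, thresh=0.2)
```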

Now that we have our model we can follow the standard FastAI transfer learning process, which we looked into in more depth in my last article. If you want more information about this process, you can either check out that article or the first few lessons of the course (specifically lessons 1 and 2).

The only difference is that this time we will perform this process twice. Once for the 128x128px images and then with the 256x256px images.

learn.lr_find() # find learning rate
learn.recorder.plot() # plot learning rate
Figure 3: Learning rate plot
lr = 0.01 # chosen learning rate
learn.fit_one_cycle(4, lr) # train model for 4 epochs'planet-amazon-stage-1') # save model
Figure 4: Training output

The above code only trained the top layers — which were specifically added for our problem — and it didn’t train any of the convolutional layers. In order to train these, we need to call unfreeze.

learn.unfreeze() # unfreeze all layers

learn.lr_find() # find learning rate
learn.recorder.plot() # plot learning rate

learn.fit_one_cycle(5, slice(1e-5, lr/5)) # fit model with differential learning rates'planet-amazon-stage-2') # save model

Now that we have a pretty good model it’s time to change the image size from 128 to 256px. This can be done by creating a new DataBunch object and assigning it to the learner.

# switch resolution
data = (src.transform(tfms, size=256)
        .databunch(bs=16).normalize(imagenet_stats)) = data

Now the exact same training process can be executed once again.

learn.freeze() # freeze bottom layers again

learn.fit_one_cycle(5, lr)'planet-amazon-stage-3')

learn.unfreeze() # unfreeze all layers again
learn.fit_one_cycle(5, slice(1e-5, lr/5))'planet-amazon-stage-4')

Lastly, we can export our model using the export method, which saves everything needed for inference in a pkl file.

learn.export() # saves export.pkl to the dataset path

Submitting to Kaggle

The testing data for this competition is split into two separate folders called test-jpg and test-jpg-additional. We will load the data from each folder using the ImageItemList class, get our predictions by calling the model’s get_preds method, and then loop through the predictions, keeping every class whose probability is bigger than our threshold. Lastly, we will collect the results in a dataframe.

dataframes = []

for directory in ('test-jpg', 'test-jpg-additional'):
    test = ImageItemList.from_folder(path/directory)
    learn = load_learner(path, test=test)
    preds, _ = learn.get_preds(ds_type=DatasetType.Test)
    thresh = 0.2
    labelled_preds = [' '.join([learn.data.classes[i] for i, p in enumerate(pred) if p > thresh]) for pred in preds]
    fnames = [f.name[:-4] for f in learn.data.test_ds.items]
    df = pd.DataFrame({'image_name': fnames, 'tags': labelled_preds}, columns=['image_name', 'tags'])
    dataframes.append(df)

df = pd.concat(dataframes)
df.to_csv(path/'submission.csv', index=False)
Figure 5: Submission dataframe

The created csv file can now be submitted to Kaggle, either by going to the competition page and submitting it there directly.

Figure 6: Kaggle submit results

Or by using the Kaggle CLI:

kaggle competitions submit planet-understanding-the-amazon-from-space -f {path/'submission.csv'} -m "My submission"



The FastAI library is a high-level library built on PyTorch which allows for easy prototyping and gives you access to many state-of-the-art methods and techniques.

The vision module allows us to build convolutional neural networks for different problems without changing the underlying code pipeline.

If you liked this article, consider subscribing to my YouTube channel and following me on social media.

The code covered in this article is available as a GitHub repository.

If you have any questions, recommendations or critiques, I can be reached via Twitter or the comment section.