4.2 Data description
4.3 Visualize class distribution
4.5 Plot examples
5.1 Baseline classifier from scratch
5.2 Improve baseline classifier
5.3 Regularization and model tuning
Interpreting what Convnets learn
6.1 Preprocessing an input for Xception
6.2 Get the last convolutional output
7.1 Summary of model performance
7.2 Key takeaways
The iCassava 2019 Fine-Grained Visual Categorization Challenge addresses the problem of cassava diseases in Africa through fine-grained visual recognition technology. Cassava is a vital food crop on the continent, but it is constantly threatened by diseases such as cassava brown streak disease and cassava mosaic disease. These diseases can lead to significant yield losses, putting the food security of millions of people at risk. The challenge seeks to encourage the development of innovative solutions that accurately identify and classify cassava diseases using machine learning and computer vision techniques. The initiative aims to help farmers take proactive measures to control the spread of cassava diseases and safeguard their crop yields. [1]
As of 2021, Smart Data Finance reports that most farmers, livestock keepers, and fishermen continue to use very low levels of technology, and that productivity is very low even by regional standards. Where productivity remains low, so does the level of commercialization. [2]
Given the crucial role that cassava plays in the lives and livelihoods of millions of people in Africa, and especially in Tanzania, I saw an opportunity to build on current solutions and explore possible improvements. The aim of this project is therefore to experiment with a variety of state-of-the-art computer vision models and try to improve their overall performance on the iCassava dataset. Deploying these Convnets on edge devices and scaling them to farmers across the continent is an important consideration, but it is outside the scope of this project.
The objective of image classification is to assign one or more labels or categories to an input image. In the context of the iCassava 2019 Fine-Grained Visual Categorization Challenge, the objective is to accurately identify and classify different types of cassava diseases from images of cassava plants.
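Each image is assigned one of 6 target classes: Bacterial Blight (cbb), Brown Streak Disease (cbsd), Green Mite (cgm), Mosaic Disease (cmd), Healthy (healthy), and Unknown (unknown).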
Success on this task is measured mainly by the model's precision-recall curve. This metric was chosen because of the class imbalance inherent in the dataset collected from Makerere University's AI Lab. Categorical accuracy is also monitored across the model iterations.
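To make the choice of metric concrete, the sketch below computes one-vs-rest precision and recall at every score threshold for a single class, using NumPy only. The function name `precision_recall_points` and the toy labels and scores are illustrative assumptions, not part of this notebook, and tied scores are not given any special treatment.

```python
import numpy as np

def precision_recall_points(y_true, scores):
    """One-vs-rest precision and recall at every score threshold.

    y_true: 1 for the class of interest, 0 for every other class.
    scores: predicted probability for the class of interest.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    order = np.argsort(-scores, kind="stable")        # highest score first
    hits = y_true[order]
    true_positives = np.cumsum(hits)                  # positives found so far
    predicted_positives = np.arange(1, len(hits) + 1)
    precision = true_positives / predicted_positives
    recall = true_positives / hits.sum()
    return precision, recall

# Toy data (made up for illustration): 2 positives among 5 samples
y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.1]
precision, recall = precision_recall_points(y_true, scores)
for p, r in zip(precision, recall):
    print(f"precision={p:.2f} recall={r:.2f}")
```

Repeating this for each class gives one curve per class, which keeps rare classes visible in a way that a single accuracy number does not.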
✝️ - Refers to the customizations added in the various model development iterations.
import os
import math
import numpy as np
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_hub as hub
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_addons as tfa
from sklearn.metrics import confusion_matrix
import warnings
warnings.filterwarnings('ignore')
/opt/conda/lib/python3.7/site-packages/tensorflow_addons/utils/ensure_tf_install.py:67: UserWarning: Tensorflow Addons supports using Python ops for all Tensorflow versions above or equal to 2.9.0 and strictly below 2.12.0 (nightly versions are not supported). The versions of TensorFlow you are currently using is 2.8.4 and is not supported. Some things might work, some things might not. If you were to encounter a bug, do not file an issue. If you want to make sure you're using a tested and supported configuration, either change the TensorFlow version or the TensorFlow Addons's version. You can find the compatibility matrix in TensorFlow Addon's readme: https://github.com/tensorflow/addons UserWarning,
print(tf.__version__)
2.8.4
Note: TensorFlow 2.9.3 is currently required to train EfficientNet B4.
print(tf.config.list_physical_devices('GPU'))
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Set Batch Size
strategy = tf.distribute.get_strategy()
BATCH_SIZE = 32 * strategy.num_replicas_in_sync
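As a rough illustration of how the global batch size scales with the number of replicas, the following sketch computes the resulting number of steps per epoch for the 5,656-image training split. The helper names and the 8-replica case are hypothetical and only serve the arithmetic.

```python
import math

TRAIN_EXAMPLES = 5656     # size of the training split reported by ds_info
PER_REPLICA_BATCH = 32    # per-replica batch size used in this notebook

def global_batch_size(num_replicas, per_replica=PER_REPLICA_BATCH):
    """Batch size seen by the input pipeline across all replicas."""
    return per_replica * num_replicas

def steps_per_epoch(num_examples, batch_size):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)

single = global_batch_size(1)
steps_single = steps_per_epoch(TRAIN_EXAMPLES, single)
last_batch = TRAIN_EXAMPLES - (steps_single - 1) * single
print(single, steps_single, last_batch)

# Hypothetical 8-replica setup (for example a TPU), not used in this notebook
print(steps_per_epoch(TRAIN_EXAMPLES, global_batch_size(8)))
```

With a single replica the last batch of each epoch is smaller than the others, which matters for any code that assumes a fixed batch size.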
(ds_train, ds_validation, ds_test), ds_info = tfds.load(
    'cassava',
    split=['train', 'validation', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True)
2023-03-19 03:00:28.982248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13598 MB memory: -> device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5
Here we will take a look at the description and citation information provided along with the dataset.
ds_info
tfds.core.DatasetInfo(
    name='cassava',
    full_name='cassava/0.1.0',
    description="""
    Cassava consists of leaf images for the cassava plant depicting healthy and
    four (4) disease conditions; Cassava Mosaic Disease (CMD), Cassava Bacterial
    Blight (CBB), Cassava Greem Mite (CGM) and Cassava Brown Streak Disease (CBSD).
    Dataset consists of a total of 9430 labelled images.
    The 9430 labelled images are split into a training set (5656), a test set(1885)
    and a validation set (1889). The number of images per class are unbalanced with
    the two disease classes CMD and CBSD having 72% of the images.
    """,
    homepage='https://www.kaggle.com/c/cassava-disease/overview',
    data_path='/home/jupyter/tensorflow_datasets/cassava/0.1.0',
    file_format=tfrecord,
    download_size=1.26 GiB,
    dataset_size=1.26 GiB,
    features=FeaturesDict({
        'image': Image(shape=(None, None, 3), dtype=uint8),
        'image/filename': Text(shape=(), dtype=string),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=5),
    }),
    supervised_keys=('image', 'label'),
    disable_shuffling=False,
    splits={
        'test': <SplitInfo num_examples=1885, num_shards=4>,
        'train': <SplitInfo num_examples=5656, num_shards=8>,
        'validation': <SplitInfo num_examples=1889, num_shards=4>,
    },
    citation="""@misc{mwebaze2019icassava,
        title={iCassava 2019Fine-Grained Visual Categorization Challenge},
        author={Ernest Mwebaze and Timnit Gebru and Andrea Frome and Solomon Nsumba and Jeremy Tusubira},
        year={2019},
        eprint={1908.02900},
        archivePrefix={arXiv},
        primaryClass={cs.CV}
    }""",
)
# List of label categories
ds_info.features['label'].names
['cbb', 'cbsd', 'cgm', 'cmd', 'healthy']
The .names attribute returns the string names of the integer classes in index order, so the name at position i corresponds to integer label i.
# Extend the cassava dataset classes with 'unknown'
class_names = ds_info.features['label'].names + ['unknown']
# Map the class names to human readable names
name_map = dict(
    cmd='Mosaic Disease',
    cbb='Bacterial Blight',
    cgm='Green Mite',
    cbsd='Brown Streak Disease',
    healthy='Healthy',
    unknown='Unknown')
label_map = {
    0: 'cbb',
    1: 'cbsd',
    2: 'cgm',
    3: 'cmd',
    4: 'healthy',
    5: 'unknown'
}
# print(len(class_names), 'classes:')
print(class_names)
print([name_map[name] for name in class_names])
['cbb', 'cbsd', 'cgm', 'cmd', 'healthy', 'unknown']
['Bacterial Blight', 'Brown Streak Disease', 'Green Mite', 'Mosaic Disease', 'Healthy', 'Unknown']
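Because the list order defines the integer labels, the integer-to-name mapping can also be derived from the list rather than typed by hand. This is a minimal sketch of that alternative; `dataset_names` hard-codes the list printed above so the snippet runs on its own, and `index_of` is a hypothetical helper name.

```python
# Label names in the order returned by ds_info.features['label'].names
dataset_names = ['cbb', 'cbsd', 'cgm', 'cmd', 'healthy']
class_names = dataset_names + ['unknown']

# Derive the integer-to-name mapping from the list order
label_map = dict(enumerate(class_names))

# Inverse mapping, from short name back to integer label
index_of = {name: index for index, name in label_map.items()}

print(label_map)
print(index_of['cmd'])
```

Deriving both mappings from one list avoids the two getting out of sync if a class is added or reordered.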
ds_info.splits['train'].num_examples
5656
tf.data.experimental.cardinality(ds_train).numpy()
5656
def get_label_frequency(dataset):
    class_distribution = np.array([record[1] for record in dataset.as_numpy_iterator()])
    labels, frequency = np.unique(class_distribution, return_counts=True)
    return frequency
get_label_frequency(ds_train)
array([ 466, 1443, 773, 2658, 316])
# most frequent category represents close to 50% of all training samples
get_label_frequency(ds_train)[3]/tf.data.experimental.cardinality(ds_train).numpy()
0.46994342291371993
# most frequent category represents close to 50% of all validation samples
get_label_frequency(ds_validation)[3]/tf.data.experimental.cardinality(ds_validation).numpy()
0.4695606140815246
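The same counts can be turned into per-class shares in one step, which makes the imbalance easier to read than a single ratio. The sketch below hard-codes the training counts printed above so that it runs without loading the dataset; the variable names are illustrative.

```python
import numpy as np

# Training label counts as printed by get_label_frequency(ds_train)
class_names = ['cbb', 'cbsd', 'cgm', 'cmd', 'healthy']
train_counts = np.array([466, 1443, 773, 2658, 316])

shares = train_counts / train_counts.sum()               # fraction of samples per class
imbalance_ratio = train_counts.max() / train_counts.min()

for name, share in zip(class_names, shares):
    print(f"{name:>8}: {share:.1%}")
print(f"cbsd + cmd: {shares[1] + shares[3]:.1%}")
print(f"largest / smallest class: {imbalance_ratio:.1f}x")
```

The combined share of cbsd and cmd comes out at roughly 72%, in line with the figure quoted in the dataset description.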
def plot_distribution(frequency):
    fig, ax = plt.subplots()
    bar_colors = ['tab:red', 'tab:blue', 'tab:green', 'tab:orange', 'tab:cyan']
    ax.bar(class_names[:-1], frequency, color=bar_colors)
    ax.set_ylabel('Frequency')
    ax.set_title('Class distribution')
    plt.show()
Plot training set class distribution
plot_distribution(get_label_frequency(ds_train))
Plot validation set class distribution
plot_distribution(get_label_frequency(ds_validation))
Plot test set class distribution
plot_distribution(get_label_frequency(ds_test))
The datasets are clearly highly imbalanced; however, the class distributions in the validation and test sets are representative of the training set.
Our model will process 224 x 224 images but before we feed this data to the model, we are required to transform the data appropriately.
Let's apply the following transformations:
tf.data.Dataset.map - TFDS provides images of type tf.uint8, but the model expects tf.float32, so we cast, normalize, and resize the images.
tf.one_hot - The categorical accuracy metric expects one-hot encoded labels, so we convert the integer labels using a lookup table.
tf.data.Dataset.cache - For better performance, we cache the data before shuffling.
tf.data.Dataset.shuffle - For a fully random order, the shuffle buffer should be at least as large as the dataset; the training pipeline below uses the full training set size (5,656 examples) as buffer_size.
tf.data.Dataset.batch - Batch elements of the dataset after shuffling to get unique batches at each epoch.
tf.data.Dataset.prefetch - For better performance, we end the pipeline by prefetching.
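The effect of the shuffle buffer size can be illustrated with a simplified pure-Python model of a shuffle buffer: fill a buffer, emit a random element from it, and replace that element with the next incoming one. This is a sketch of the documented behaviour, not TensorFlow's implementation, and `buffered_shuffle` is a hypothetical name.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in an order produced by a fixed-size shuffle buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)            # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)   # pick a random slot
        yield buffer[idx]                  # emit its content
        buffer[idx] = item                 # refill the slot with the new item
    rng.shuffle(buffer)                    # drain what is left in random order
    yield from buffer

data = list(range(10))
print(list(buffered_shuffle(data, buffer_size=1)))    # order is unchanged
print(list(buffered_shuffle(data, buffer_size=3)))    # only local mixing
print(list(buffered_shuffle(data, buffer_size=10)))   # full shuffle
```

With a buffer smaller than the dataset, an element can only move a limited distance forward from its original position, which is why a buffer covering the whole training set gives the most uniform shuffle.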
Prepare lookup table for mapping from integer to one hot encoding labels
indices = list(label_map.keys())
depth=6
one_hot_lookup = tf.one_hot(indices, depth)
one_hot_lookup
<tf.Tensor: shape=(6, 6), dtype=float32, numpy=
array([[1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1.]], dtype=float32)>
indices
[0, 1, 2, 3, 4, 5]
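The lookup table is simply a 6 x 6 identity matrix, so indexing it with an integer label returns that label's one-hot row. The NumPy sketch below mirrors the idea outside of TensorFlow; it is an illustration of the technique, not the code used in the pipeline, and the example labels are arbitrary.

```python
import numpy as np

depth = 6
# Row i of the identity matrix is the one-hot vector for class i
one_hot_lookup = np.eye(depth, dtype=np.float32)

labels = np.array([3, 0, 5])              # arbitrary example labels
encoded = one_hot_lookup[labels]          # shape (3, 6), one row per label
decoded = encoded.argmax(axis=1)          # argmax recovers the integer labels

print(encoded)
print(decoded)
```

The argmax step is the same operation the plotting code later uses to turn one-hot labels back into class indices.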
def data_preprocessing(image, label, img_size=(224, 224)):
    # Normalize [0, 255] to [0, 1]
    image = tf.cast(image, tf.float32)
    image = image / 255.
    # Resize the images to img_size (224 x 224 by default)
    image = tf.image.resize(image, img_size)
    # image = tf.image.resize(image, (512, 512))  # Training EfficientNetB4
    # Map the integer label to its one-hot encoded value
    label = one_hot_lookup[label]
    return image, label
# Training pipeline
ds_train = ds_train.map(data_preprocessing, num_parallel_calls=tf.data.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(BATCH_SIZE)
ds_train = ds_train.prefetch(tf.data.AUTOTUNE)
# Validation pipeline
ds_validation = ds_validation.map(data_preprocessing, num_parallel_calls=tf.data.AUTOTUNE)
# ds_validation = ds_validation.shuffle(ds_info.splits['validation'].num_examples) # no need for shuffle
ds_validation = ds_validation.batch(BATCH_SIZE)
ds_validation = ds_validation.cache() # cache after because batches can be the same between epochs
ds_validation = ds_validation.prefetch(tf.data.AUTOTUNE)  # prefetch the validation set, not ds_train
def plot(examples, predictions=None):
    # Get the images, labels, and optionally predictions
    images = examples[0]  # images
    labels = examples[1]  # one-hot labels
    # batch_size = len(images)
    # batch_size = 25
    if predictions is None:
        predictions = BATCH_SIZE * [None]
    # Configure the layout of the grid
    x = np.ceil(np.sqrt(BATCH_SIZE))
    y = np.ceil(BATCH_SIZE / x)
    fig = plt.figure(figsize=(x * 6, y * 7))
    for i, (image, label, prediction) in enumerate(zip(images, labels, predictions)):
        # Render the image
        ax = fig.add_subplot(int(x), int(y), i+1)
        ax.imshow(image, aspect='auto')
        ax.grid(False)
        ax.set_xticks([])
        ax.set_yticks([])
        # Display the label and optionally prediction
        x_label = 'Label: ' + name_map[label_map[label.argmax()]]
        if prediction is not None:
            x_label = 'Prediction: ' + name_map[label_map[prediction.argmax()]] + '\n' + x_label
            # Compare class indices; comparing the arrays directly is ambiguous
            ax.xaxis.label.set_color('green' if label.argmax() == prediction.argmax() else 'red')
        ax.set_xlabel(x_label)
    plt.show()
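The grid arithmetic used by plot can be checked in isolation. The sketch below reproduces it with the standard library only; `grid_shape` is a hypothetical helper name, and the batch sizes passed to it are examples.

```python
import math

def grid_shape(n_images):
    """Smallest near-square grid that has room for n_images subplots."""
    rows = math.ceil(math.sqrt(n_images))
    cols = math.ceil(n_images / rows)
    return rows, cols

for n in (25, 32):
    rows, cols = grid_shape(n)
    print(n, (rows, cols), rows * cols - n, "empty cells")
```

For a batch of 32 the grid has more cells than images, so the last few subplot positions stay empty.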
Notes:
as_numpy_iterator: returns an iterator that converts all elements of the dataset to NumPy. We use as_numpy_iterator to inspect the content of our dataset.
batch: combines consecutive elements of the dataset into batches of BATCH_SIZE (32 per replica here).
examples = next(ds_validation.as_numpy_iterator())
plot(examples)