Computer Vision News - May 2021

Computer Vision Tool

Class imbalance in classification tasks

by Marica Muffoletto

Dear readers, it is now time to discuss a really big topic in deep learning: how to deal with imbalanced datasets in classification tasks. We call a dataset imbalanced when the distribution of examples across classes varies heavily. This is true of many datasets and occurs often across computer vision applications. It can be due to incorrect data sampling or to an inherent property of the domain. Some domains are more subject to this issue than others, such as fraud or anomaly detection, which can also be applied in medical imaging.

Sometimes acquiring more data or applying a substantial level of augmentation can solve the problem, but in most cases it may be necessary to address it using specific tools and ad hoc libraries. Let's have a look at a few tips!

For this example, I used the labelled NIH Chest X-ray Dataset (available here) and a csv file listing all cases and corresponding labels. The techniques discussed can be applied to any classification task with its corresponding pairs of images and labels. Here we are assuming that the csv file is contained in the current directory under the archive folder and that the images are found under the archive/sample/images directory.
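To make "imbalanced" concrete, here is a minimal sketch of how the class distribution of a label table could be measured with pandas. It runs on a tiny made-up table rather than on the NIH csv; the column names `image` and `label` and the helper functions `class_distribution` and `imbalance_ratio` are assumptions introduced for illustration only, not part of the dataset or of any library.

```python
import pandas as pd


def class_distribution(labels: pd.Series) -> pd.DataFrame:
    """Return per-class counts and fractions, most frequent class first."""
    counts = labels.value_counts()
    return pd.DataFrame({"count": counts, "fraction": counts / counts.sum()})


def imbalance_ratio(labels: pd.Series) -> float:
    """Ratio between the most and the least frequent class (1.0 = balanced)."""
    counts = labels.value_counts()
    return float(counts.max() / counts.min())


# Toy label table (hypothetical data): 8 "normal" cases and 2 "abnormal" ones.
toy = pd.DataFrame(
    {
        "image": [f"img_{i}.png" for i in range(10)],
        "label": ["normal"] * 8 + ["abnormal"] * 2,
    }
)

dist = class_distribution(toy["label"])
ratio = imbalance_ratio(toy["label"])

print(dist)
print(f"Imbalance ratio (majority / minority): {ratio:.1f}")
```

A ratio close to 1 suggests a roughly balanced dataset, while a large ratio is a hint that accuracy alone will be a misleading metric and that the tips in this article may be worth applying.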
Once the data is placed, we can start importing all required libraries and read the csv as a dataframe:

    import numpy as np
    import pandas as pd
    import os
    import sys
    import datetime
    import json
    import seaborn as sns
    import matplotlib.pyplot as plt
    from sklearn.metrics import classification_report
    from keras.backend import clear_session
    from keras.callbacks import TensorBoard
    from keras.preprocessing import image
    from keras.layers import Dropout, Flatten, Activation, Dense
    from keras.constraints import maxnorm
    from keras.optimizers import SGD, RMSprop
    from keras.layers import Convolution2D, ZeroPadding2D
    from keras.layers.convolutional import MaxPooling2D
    from keras.utils import np_utils
    from keras import backend as K
    from keras.models import Sequential
    from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
    from sklearn.model_selection import
