Mr G Myburgh
Department of Geography and Environmental Studies, Stellenbosch University, 2012 Bronze medal winner (Best masters by research dissertation)
Abstract of Thesis
The impact of training set size and feature dimensionality on supervised object-based classification: a comparison of three classifiers
Myburgh, Gerhard
Date Issued: 2012-12
ENGLISH ABSTRACT: Supervised classifiers are commonly used in remote sensing to extract land cover information. They are, however, limited in their ability to produce sufficiently accurate land cover maps cost-effectively. Various factors affect the accuracy of supervised classifiers. Notably, the number of available training samples is known to influence classifier performance significantly, and obtaining a sufficient number of samples is not always practical. The support vector machine (SVM) performs well with a limited number of training samples, but little research has been done to evaluate SVM’s performance for geographical object-based image analysis (GEOBIA). GEOBIA also allows the easy integration of additional features into the classification process, a factor which may significantly influence classification accuracies. As such, two experiments were developed and implemented in this research. The first compared the performances of object-based SVM, maximum likelihood (ML) and nearest neighbour (NN) classifiers using varying training set sizes. The second investigated the effect of feature dimensionality on classifier accuracy. A SPOT 5 subscene and a four-class classification scheme were used. For the first experiment, training set sizes ranging from 4 to 20 samples per land cover class were tested. The performance of all the classifiers improved significantly as the training set size was increased. The ML classifier performed poorly when few (<10 per class) training samples were used, and the NN classifier performed poorly compared to SVM throughout the experiment. SVM was the superior classifier for all training set sizes, although ML achieved competitive results for sets of 12 or more training samples per class. For the second experiment, training set sizes were kept constant (20 and 10 samples per class) while an increasing number of features (1 to 22) was included. SVM consistently produced superior classification results.
SVM and NN were not significantly (negatively) affected by an increase in feature dimensionality, but ML’s ability to perform under conditions of large feature dimensionalities and few training areas was limited. Further investigations using a variety of imagery types, classification schemes and additional features; finding optimal combinations of training set size and number of features; and determining the effect of specific features should prove valuable in developing more cost-effective ways to process large volumes of satellite imagery.
KEYWORDS: Supervised classification, land cover, support vector machine, nearest neighbour classification, maximum likelihood classification, geographic object-based image analysis
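
The design of the first experiment (varying the number of training samples per class and comparing classifier accuracy on a fixed test set) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the thesis's data or code: the objects are synthetic Gaussian feature vectors rather than SPOT 5 image objects, the class names are hypothetical, the ML classifier is simplified to a diagonal covariance, and SVM is omitted because the sketch uses only the Python standard library.

```python
import math
import random
from statistics import mean, pvariance

# Hypothetical four-class scheme; the thesis does not name its classes here.
CLASSES = ["water", "vegetation", "bare", "built"]


def make_objects(n_per_class, n_features, rng):
    """Generate synthetic 'image objects': Gaussian features, one mean per class."""
    data = []
    for k, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            features = [rng.gauss(k * 1.5, 1.0) for _ in range(n_features)]
            data.append((features, label))
    return data


def nn_predict(train, x):
    """1-nearest-neighbour: label of the closest training object (Euclidean)."""
    return min(train, key=lambda s: math.dist(s[0], x))[1]


def ml_fit(train):
    """Estimate per-class feature means and variances (diagonal covariance)."""
    model = {}
    for label in CLASSES:
        rows = [f for f, l in train if l == label]
        cols = list(zip(*rows))
        # A small floor keeps the variance strictly positive.
        model[label] = ([mean(c) for c in cols],
                        [max(pvariance(c), 1e-6) for c in cols])
    return model


def ml_predict(model, x):
    """Assign the class with the highest Gaussian log-likelihood."""
    def loglik(label):
        mus, variances = model[label]
        return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                   for xi, m, v in zip(x, mus, variances))
    return max(CLASSES, key=loglik)


def accuracy(predict, test):
    """Overall accuracy: fraction of test objects labelled correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)


def run_experiment(sizes=(4, 8, 12, 16, 20), n_features=4, seed=0):
    """Train on n samples per class for each n in sizes; score on one test set."""
    rng = random.Random(seed)
    test = make_objects(100, n_features, rng)
    results = {}
    for n in sizes:
        train = make_objects(n, n_features, rng)
        model = ml_fit(train)
        results[n] = {
            "NN": accuracy(lambda x: nn_predict(train, x), test),
            "ML": accuracy(lambda x: ml_predict(model, x), test),
        }
    return results


if __name__ == "__main__":
    for n, acc in run_experiment().items():
        print(n, acc)
```

Changing `n_features` while holding the training set size fixed gives the analogous outline of the second experiment. The accuracies produced on this synthetic data say nothing about the results reported in the abstract.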