How do we choose the best model? The impact of cross-validation design on model evaluation for buried threat detection in ground penetrating radar

Abstract

A great deal of research has been focused on the development of computer algorithms for buried threat detection (BTD) in ground penetrating radar (GPR) data. Most recently proposed BTD algorithms are supervised, and therefore they employ machine learning models that infer their parameters using training data. Cross-validation (CV) is a popular method for evaluating the performance of such algorithms, in which the available data is systematically split into N disjoint subsets, and an algorithm is repeatedly trained on N-1 subsets and tested on the excluded subset. There are several common types of CV in BTD, which vary principally upon the spatial criterion used to partition the data: site-based, lane-based, region-based, etc. The performance metrics obtained via CV are often used to suggest the superiority of one model over others, however, most studies utilize just one type of CV, and the impact of this choice is unclear. Here we employ several types of CV to evaluate algorithms from a recent large-scale BTD study. The results indicate that the rank-order of the performance of the algorithms varies substantially depending upon which type of CV is used. For example, the rank-1 algorithm for region-based CV is the lowest ranked algorithm for site-based CV. This suggests that any algorithm results should be interpreted carefully with respect to the type of CV employed. We discuss some potential interpretations of performance, given a particular type of CV.

DOI

10.1117/12.2305793

Year

2018