If training data appears to be mislabeled, should we relabel it? Improving supervised learning algorithms for threat detection in ground penetrating radar data

Abstract

This work focuses on the development of automatic buried threat detection (BTD) algorithms using ground penetrating radar (GPR) data. Buried threats tend to exhibit unique characteristics in GPR imagery, such as high-energy hyperbolic shapes, which can be leveraged for detection. Many recent BTD algorithms are supervised, and therefore require training with exemplars of GPR data collected over both threat and non-threat locations. Frequently, non-threat GPR examples exhibit high-energy hyperbolic patterns similar to those observed over a buried threat. Is it still useful, then, to include such examples during algorithm training and encourage an algorithm to label such data as non-threat? Similarly, some true buried threat examples exhibit very few distinctive threat-like patterns. We investigate whether it is beneficial to treat such GPR examples as mislabeled, and either (i) relabel them or (ii) remove them from training. We study this problem using two algorithms that automatically identify mislabeled examples, if any are present, and examine the impact of removing or relabeling them before training. We conduct these experiments on a large collection of GPR data with several state-of-the-art GPR-based BTD algorithms.
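
The two treatments described above, relabeling and removal, can be illustrated with a minimal sketch. The abstract does not specify the two identification algorithms, so the rule below is only an assumption made for illustration: an example is flagged as suspect when a threat-confidence score in [0, 1] (imagined here as coming from some held-out detector) strongly contradicts its given label. The function names, the `margin` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def flag_suspects(labels: Sequence[int], scores: Sequence[float],
                  margin: float = 0.8) -> List[int]:
    """Return indices whose score strongly contradicts the given label.

    labels: 1 = threat, 0 = non-threat.
    scores: assumed threat confidence in [0, 1] (hypothetical input).
    margin: assumed cutoff; a non-threat scoring >= margin, or a threat
            scoring <= 1 - margin, is treated as possibly mislabeled.
    """
    suspects = []
    for i, (y, s) in enumerate(zip(labels, scores)):
        if y == 1 and s <= 1.0 - margin:
            suspects.append(i)  # labeled threat, but looks like background
        elif y == 0 and s >= margin:
            suspects.append(i)  # labeled non-threat, but looks threat-like
    return suspects


def remove_suspects(features: Sequence, labels: Sequence[int],
                    suspects: Sequence[int]) -> Tuple[list, list]:
    """Option (ii): drop suspected examples from the training set."""
    drop = set(suspects)
    keep = [i for i in range(len(labels)) if i not in drop]
    return [features[i] for i in keep], [labels[i] for i in keep]


def relabel_suspects(labels: Sequence[int],
                     suspects: Sequence[int]) -> List[int]:
    """Option (i): flip the binary label of each suspected example."""
    flipped = list(labels)
    for i in suspects:
        flipped[i] = 1 - flipped[i]
    return flipped


# Toy data: five examples with made-up labels and confidence scores.
features = ["a", "b", "c", "d", "e"]
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.1, 0.85, 0.2, 0.6]

suspects = flag_suspects(labels, scores)
relabeled = relabel_suspects(labels, suspects)
kept_features, kept_labels = remove_suspects(features, labels, suspects)
```

Either resulting training set could then be passed to a supervised BTD algorithm and compared against training on the original labels; how strict the cutoff should be is a design choice this sketch leaves open.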

DOI
10.1117/12.2305881
Year