A nonparametric Bayesian approach to multiple instance learning

Abstract

Multiple instance learning (MIL) is a type of supervised learning in which labels are available for sets of observations (bags), but not for individual observations (instances). MIL has been applied in a variety of areas, which has led to a large number of algorithms for learning from MIL data. Many of these approaches focus on maximizing class margins, performing instance selection, or developing distance metrics and kernels that can be applied directly to bags. Although these approaches have shown promise, most require cross-validation-based optimization of hyperparameters or iterative numerical optimization to determine the proper number of target concepts. This work proposes a nonparametric Bayesian approach to learning in MIL scenarios based on Dirichlet process mixture models. The nonparametric nature of the model and the use of noninformative priors remove the need for cross-validation-based optimization, while variational Bayesian inference allows for rapid parameter learning. The resulting approach generalizes to different applications by easily incorporating alternate data generation models. In a related effort [A. Manandhar et al., IEEE Trans. Geosci. Remote Sensing 53(4) (2015) 1737-1745], the proposed model has been extended to incorporate time-varying data. Results indicate that when the data generation assumption holds, the proposed approach performs competitively with existing MIL and non-MIL methods on several standard MIL datasets and on a new MIL dataset introduced in this work.

DOI

10.1142/S0218001415510015

Year

2015