Tsinghua Science and Technology


multiple-instance learning, instance selection, constructive covering algorithm, maximal Hausdorff


Multiple-Instance Learning (MIL) predicts the labels of unlabeled bags by learning from labeled positive and negative training bags. Each bag is made up of several unlabeled instances. A bag is labeled positive if at least one of its instances is positive, and negative otherwise. Existing MIL methods with instance selection ignore how representative the selected instances are. For example, an instance surrounded by many similar instances with the same label should be more representative than others. Based on this idea, this paper proposes multiple-instance learning with instance selection via a constructive covering algorithm (MilCa). MilCa first uses the maximal Hausdorff distance to select some initial positive instances from positive bags, then uses a Constructive Covering Algorithm (CCA) to reorganize the structure of the original instances of negative bags. Next, an inverse testing process excludes false positive instances from positive bags and selects highly representative instances from the training bags, ordered by the number of instances each one covers. Finally, a similarity measure function converts each training bag into a single sample, and CCA is used again to classify the converted samples. Experimental results on synthetic data and standard benchmark datasets demonstrate that MilCa reduces the number of selected instances and is competitive with state-of-the-art MIL algorithms.
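As a minimal sketch of two ingredients named in the abstract, the Python below illustrates the standard MIL bag-labeling rule and the textbook maximal Hausdorff distance between two bags. The function names (bag_label, directed_hausdorff, max_hausdorff), the Euclidean metric, and the toy bags are assumptions made for illustration. The abstract does not specify how MilCa applies the distance to pick initial positive instances, so that step is not shown.

```python
import math

def bag_label(instance_labels):
    """Standard MIL assumption: a bag is positive (1) if at least one
    instance is positive, otherwise negative (0)."""
    return 1 if any(label == 1 for label in instance_labels) else 0

def directed_hausdorff(bag_a, bag_b):
    """h(A, B): the largest distance from an instance in A to its
    nearest instance in B (Euclidean metric assumed)."""
    return max(min(math.dist(a, b) for b in bag_b) for a in bag_a)

def max_hausdorff(bag_a, bag_b):
    """Maximal Hausdorff distance H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_hausdorff(bag_a, bag_b),
               directed_hausdorff(bag_b, bag_a))

# Toy bags of 2-D instances (hypothetical data).
bag_a = [(0.0, 0.0), (1.0, 0.0)]
bag_b = [(0.0, 0.0), (4.0, 0.0)]

distance = max_hausdorff(bag_a, bag_b)
```

Because the maximal Hausdorff distance is driven by the single worst-matched instance, one outlying instance in either bag can dominate the result, which is worth keeping in mind when it is used as a selection criterion.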


Tsinghua University Press