Improve CAM with Auto-adapted Segmentation and Co-supervised Augmentation
- WSOL
Weakly Supervised Object Localization (WSOL) methods generate both classification and localization results by learning from only image category labels. Previous methods usually utilize the class activation map (CAM) to obtain target object regions. However, most of them focus only on improving the foreground object parts in CAM and ignore the important effect of its background contents. In this paper, we propose a confidence segmentation (ConfSeg) module that builds a confidence score for each pixel in CAM without introducing additional hyper-parameters. The generated sample-specific confidence mask indicates the extent of determination for each pixel in CAM, and further supervises an additional CAM extended from internal feature maps. Besides, we introduce a Co-supervised Augmentation (CoAug) module to capture feature-level representations for the foreground and background parts in CAM separately. A metric loss is then applied at the batch sample level to augment the discriminative ability of our model, which helps to localize more related object parts. Our final model, CSoA, combines the two modules and achieves superior performance, e.g. 37.69% and 48.81% Top-1 localization error on the CUB-200 and ILSVRC datasets, respectively, outperforming all previous methods and setting a new state of the art.
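For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of how a conventional class activation map is typically computed and turned into a box. It is not the ConfSeg or CoAug module, whose details the abstract does not give. The function names, array shapes, and the fixed `threshold` argument are assumptions made for illustration; the hand-set threshold in particular is the kind of hyper-parameter that the abstract says ConfSeg avoids adding.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Conventional CAM sketch (illustrative, not the paper's code).

    features:   (K, H, W) feature maps from the last conv layer (assumed layout)
    fc_weights: (C, K) weights of the classifier that follows global average pooling
    class_idx:  index of the class to visualise
    """
    # CAM_c(x, y) = sum_k w[c, k] * f_k(x, y)
    cam = np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))
    # Min-max normalise to [0, 1]; a constant map is returned as all zeros.
    cam = cam - cam.min()
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def bounding_box(cam, threshold):
    """Tightest box (x_min, y_min, x_max, y_max) around pixels >= threshold.

    `threshold` is a hand-set value in this sketch (an assumption).
    """
    ys, xs = np.where(cam >= threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy usage: two 3x3 feature maps, one class whose weight selects the first map.
features = np.zeros((2, 3, 3))
features[0, 1, 1] = 1.0
fc_weights = np.array([[2.0, 0.0]])
cam = class_activation_map(features, fc_weights, class_idx=0)
box = bounding_box(cam, threshold=0.5)
```

In this baseline every pixel is assigned to foreground or background by a single cut-off, which is the step where background content is easily ignored.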