Self-supervised Pre-training for Mirror Detection

City University of Hong Kong
ICCV 2023

State-of-the-art mirror detection methods rely on costly supervised ImageNet pre-training. They may fail even in obvious cases, e.g., (a) when the mirror clearly reflects a real object outside of the mirror. (b) and (c) are MirrorNet and VCNet, respectively, pre-trained on ImageNet with full supervision. (d) is MirrorNet with our proposed SSL framework and without supervised ImageNet pre-training. Our SSL scheme leverages the mirror reflection cue and avoids feature redundancy in the pre-training stage. It outperforms the models pre-trained on ImageNet with full supervision (i.e., (b) and (c)).


Existing mirror detection methods require supervised ImageNet pre-training to obtain good general-purpose image features. However, supervised ImageNet pre-training focuses on category-level discrimination and may not be suitable for downstream tasks like mirror detection, due to overfitting to the upstream task (e.g., supervised image classification). We observe that mirror reflection is crucial to how people perceive the presence of mirrors, and such mid-level features can be better transferred from self-supervised pre-trained models. Inspired by this observation, in this paper we aim to improve mirror detection methods by proposing a new self-supervised learning (SSL) pre-training framework that models the representation of mirror reflection progressively during pre-training. Our framework consists of three pre-training stages at different levels: 1) an image-level pre-training stage to globally incorporate mirror reflection features into the pre-trained model; 2) a patch-level pre-training stage to spatially simulate and learn local mirror reflection from image patches; and 3) a pixel-level pre-training stage to capture mirror reflection pixel-wise by reconstructing corrupted mirror images based on the relationship between the inside and outside of mirrors. Extensive experiments show that our SSL pre-training framework significantly outperforms previous state-of-the-art CNN-based SSL pre-training frameworks and even outperforms supervised ImageNet pre-training when transferred to the mirror detection task.
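The abstract describes a progressive, three-stage pre-training curriculum (image-level, then patch-level, then pixel-level), where each stage refines the same backbone before fine-tuning on mirror detection. The following is a minimal illustrative sketch of such a staged schedule, not the authors' implementation: the stage names follow the abstract, while the stage bodies are placeholders standing in for the real contrastive and reconstruction objectives.

```python
# Hedged sketch of a progressive multi-stage SSL pre-training schedule.
# Each stage receives the encoder produced by the previous stage, so
# representations are built up from global to local to pixel-wise cues.

def pretrain(encoder, stages):
    """Run pre-training stages in order, carrying the encoder forward."""
    completed = []
    for name, train_fn in stages:
        encoder = train_fn(encoder)  # real stages would run an SSL objective here
        completed.append(name)
    return encoder, completed

# Placeholder stages mirroring the three levels described in the abstract.
# In the actual framework these would optimize, e.g., a global mirror-reflection
# objective, a patch-level simulated-reflection objective, and a pixel-level
# corrupted-image reconstruction objective.
stages = [
    ("image-level", lambda enc: enc + ["global reflection features"]),
    ("patch-level", lambda enc: enc + ["local reflection from patches"]),
    ("pixel-level", lambda enc: enc + ["inside/outside reconstruction"]),
]

encoder, completed = pretrain([], stages)
```

The key design point this sketch captures is the curriculum: stages are sequential and cumulative rather than independent, so later (finer-grained) stages start from features already shaped by the earlier (coarser) ones.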


@InProceedings{Lin_2023_ICCV,
    author    = {Lin, Jiaying and Lau, Rynson W.H.},
    title     = {Self-supervised Pre-training for Mirror Detection},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
}