Sample-efficient image segmentation through recurrence
- 3DV
There is a growing consensus in vision science that recurrent neural networks constitute better models of visual cortex than feedforward architectures. Yet, feedforward neural networks continue to dominate most popular computer vision challenges. We bridge this gap with the Gamma-net. Inspired by recurrent feedback loops prevalent in the mammalian visual cortex, Gamma-net introduces gated recurrent dynamics through feedforward, horizontal, and top-down connections into the popular U-Net architecture. We demonstrate that Gamma-net performs on par with or better than state-of-the-art architectures for dense prediction on both natural image and cell segmentation datasets. The re-entrant processing of Gamma-net leads to especially large performance gains over the state of the art on smaller datasets. We further show that Gamma-net reproduces a contextual bias in orientation estimation which is consistent with the tilt illusion in human psychophysics. The existence of this bias in Gamma-net -- which emerges from contour detection training on natural images -- supports the theory that this visual illusion is a byproduct of recurrent computational mechanisms underlying contour detection. Vision science theory suggests that recurrent processing underlies robust biological vision, and we demonstrate that similar principles can improve the data efficiency of computer vision systems.
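To make the idea of "gated recurrent dynamics" concrete, the following is a minimal sketch (not the paper's implementation) of one recurrent update step that mixes a feedforward drive, a top-down signal, and the previous hidden state through a learned gate. Scalar gains `Wg` and `Wc` stand in for the convolutional kernels a real model would use, and the function name is illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_recurrent_step(h, ff, td, Wg, Wc):
    """One hypothetical gated update for a recurrent feature map.

    h  : previous hidden state (H, W)
    ff : feedforward input from the layer below (H, W)
    td : top-down input from the layer above (H, W)
    Wg, Wc : scalar gains standing in for convolutional kernels
    """
    g = sigmoid(Wg * (ff + td + h))   # gate: how much to overwrite h
    c = np.tanh(Wc * (ff + td))      # candidate activity from inputs
    return g * c + (1.0 - g) * h     # convex mix of candidate and old state

# Unrolling the step over several timesteps lets contextual (horizontal
# and top-down) information iteratively refine the hidden state.
h = np.zeros((4, 4))
ff = np.random.default_rng(0).standard_normal((4, 4))
td = np.zeros((4, 4))
for _ in range(5):
    h = gated_recurrent_step(h, ff, td, Wg=1.0, Wc=1.0)
```

Because the update is a convex combination of a bounded candidate and the previous state, the hidden activity stays bounded as the recurrence is unrolled, which is one reason gated dynamics are a common choice for stable recurrent vision models.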