
Title |
|---|
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning. Computer Vision and Pattern Recognition (CVPR), 2022 |
Learning to Discretely Compose Reasoning Module Networks for Video Captioning. International Joint Conference on Artificial Intelligence (IJCAI), 2020 |