2007.14682
Enriching Video Captions With Contextual Text
29 July 2020
Philipp Rimle
Pelin Dogan
Markus Gross
Papers citing "Enriching Video Captions With Contextual Text"
3 / 3 papers shown
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Chia-Wen Kuo, Z. Kira
6 | 52 | 0
09 May 2022

Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions
Jianan Wang, Boyang Albert Li, Xiangyu Fan, Jing-Hua Lin, Yanwei Fu
23 | 2 | 0
15 Nov 2020

Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, Marcus Rohrbach
144 | 1,465 | 0
06 Jun 2016