2309.13430
Resolving References in Visually-Grounded Dialogue via Text Generation
23 September 2023
Bram Willemsen
Livia Qian
Gabriel Skantze
Papers citing "Resolving References in Visually-Grounded Dialogue via Text Generation"
5 / 5 papers shown
Title
Multimodal Coreference Resolution for Chinese Social Media Dialogues: Dataset and Benchmark Approach
Xingyu Li
Chen Gong
G. Fu
VGen
29
0
0
19 Apr 2025
Referring Expression Generation in Visually Grounded Dialogue with Discourse-aware Comprehension Guiding
Bram Willemsen
Gabriel Skantze
25
0
0
09 Sep 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,244
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
392
4,137
0
28 Jan 2022
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
298
3,700
0
11 Feb 2021