Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.16395
Cited By
Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks
31 July 2023
Kousik Rajesh
Mrigank Raman
M. A. Karim
Pranit Chawla
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks"
7 / 7 papers shown
Title
VideoCoT: A Video Chain-of-Thought Dataset with Active Annotation Tool
Yan Wang
Yawen Zeng
Jingsheng Zheng
Xiaofen Xing
Jin Xu
Xiangmin Xu
MLLM
LRM
VGen
27
1
0
07 Jul 2024
An Examination of the Compositionality of Large Generative Vision-Language Models
Teli Ma
Rong Li
Junwei Liang
CoGe
11
2
0
21 Aug 2023
Linearly Mapping from Image to Text Space
Jack Merullo
Louis Castricato
Carsten Eickhoff
Ellie Pavlick
VLM
153
104
0
30 Sep 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
7,337
0
11 Nov 2021
Visually Grounded Reasoning across Languages and Cultures
Fangyu Liu
Emanuele Bugliarello
E. Ponti
Siva Reddy
Nigel Collier
Desmond Elliott
VLM
LRM
87
134
0
28 Sep 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
845
0
17 Feb 2021
1