2212.10549
Cited By
Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment
20 December 2022
Rohan Pandey
Rulin Shao
Paul Pu Liang
Ruslan Salakhutdinov
Louis-Philippe Morency
Papers citing "Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment"
5 / 5 papers shown
Progressive Compositionality in Text-to-Image Generative Models
Xu Han
Linghao Jin
Xiaofeng Liu
Paul Pu Liang
CoGe
93
2
0
22 Oct 2024
Encoding and Controlling Global Semantics for Long-form Video Question Answering
Thong Nguyen
Zhiyuan Hu
Xiaobao Wu
Cong-Duy Nguyen
See-Kiong Ng
A. Luu
32
2
0
30 May 2024
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Yonatan Bitton
Hritik Bansal
Jack Hessel
Rulin Shao
Wanrong Zhu
Anas Awadalla
Josh Gardner
Rohan Taori
L. Schmidt
VLM
29
76
0
12 Aug 2023
What You See is What You Read? Improving Text-Image Alignment Evaluation
Michal Yarom
Yonatan Bitton
Soravit Changpinyo
Roee Aharoni
Jonathan Herzig
Oran Lang
E. Ofek
Idan Szpektor
EGVM
31
72
0
17 May 2023
Improving BERT with Syntax-aware Local Attention
Zhongli Li
Qingyu Zhou
Chao Li
Ke Xu
Yunbo Cao
56
44
0
30 Dec 2020