arXiv:1902.05660
Cycle-Consistency for Robust Visual Question Answering
15 February 2019
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
Papers citing
"Cycle-Consistency for Robust Visual Question Answering"
50 / 129 papers shown
HARMONY: Hidden Activation Representations and Model Output-Aware Uncertainty Estimation for Vision-Language Models
Erum Mushtaq
Zalan Fabian
Yavuz Faruk Bakman
Anil Ramakrishna
Mahdi Soltanolkotabi
Salman Avestimehr
121
2
0
25 Oct 2025
KBE-DME: Dynamic Multimodal Evaluation via Knowledge Enhanced Benchmark Evolution
Junzhe Zhang
Huixuan Zhang
Xiaojun Wan
61
0
0
24 Oct 2025
Explain Before You Answer: A Survey on Compositional Visual Reasoning
Fucai Ke
Joy Hsu
Zhixi Cai
Zixian Ma
Xin Zheng
...
P. D. Haghighi
Gholamreza Haffari
Ranjay Krishna
Jiajun Wu
H. Rezatofighi
ReLM
CoGe
LRM
344
8
0
24 Aug 2025
Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations
Yahsin Yeh
Yilun Wu
Bokai Ruan
Honghan Shuai
AAML
64
1
0
17 Aug 2025
Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance
Yuchu Jiang
Jian Zhao
Yuchen Yuan
Tianle Zhang
Yao Huang
...
Ya Zhang
Shuicheng Yan
Chi Zhang
Z. He
Xuelong Li
SILM
442
2
0
12 Aug 2025
LARGO: Low-Rank Regulated Gradient Projection for Robust Parameter Efficient Fine-Tuning
Haotian Zhang
Liu Liu
Baosheng Yu
Jiayan Qiu
Yanwei Ren
Xianglong Liu
186
0
0
14 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
394
7
0
02 Jun 2025
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering
Computer Vision and Pattern Recognition (CVPR), 2025
Chengyue Huang
Brisa Maneechotesuwan
Shivang Chopra
Z. Kira
AAML
260
4
0
27 May 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Computer Vision and Pattern Recognition (CVPR), 2025
Rui Zhao
Weijia Mao
Mike Zheng Shou
274
4
0
05 Mar 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Neural Information Processing Systems (NeurIPS), 2024
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
390
13
0
21 Feb 2025
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
International Conference on Learning Representations (ICLR), 2025
Chengyue Huang
Junjiao Tian
Brisa Maneechotesuwan
Shivang Chopra
Z. Kira
487
7
0
21 Feb 2025
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
378
8
0
18 Dec 2024
Consistency of Compositional Generalization across Multiple Levels
AAAI Conference on Artificial Intelligence (AAAI), 2024
Chuanhao Li
Zhen Li
Chenchen Jing
Xiaomeng Fan
Wenbo Ye
Yuwei Wu
Yunde Jia
CoGe
233
0
0
18 Dec 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
261
8
0
17 Nov 2024
Rethinking Weight Decay for Robust Fine-Tuning of Foundation Models
Neural Information Processing Systems (NeurIPS), 2024
Junjiao Tian
Chengyue Huang
Z. Kira
160
3
0
03 Nov 2024
Replace-then-Perturb: Targeted Adversarial Attacks With Visual Reasoning for Vision-Language Models
Jonggyu Jang
Hyeonsu Lyu
Jungyeon Koh
H. Yang
VLM
AAML
229
0
0
01 Nov 2024
Improving Generalization in Visual Reasoning via Self-Ensemble
Tien-Huy Nguyen
Quang-Khai Tran
Anh-Tuan Quang-Hoang
VLM
LRM
270
9
0
28 Oct 2024
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Songtao Jiang
Yan Zhang
Ruizhe Chen
Yeying Jin
Zuozhu Liu
Qinglin He
Yang Feng
Jian Wu
Zuozhu Liu
MoE
MLLM
295
18
0
20 Oct 2024
Efficient and Effective Universal Adversarial Attack against Vision-Language Pre-training Models
Fan Yang
Yihao Huang
Kaidi Wang
Ling Shi
G. Pu
Yang Liu
Jian Shu
AAML
VLM
249
2
0
15 Oct 2024
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
International Conference on Learning Representations (ICLR), 2024
Yue Yang
Shanghang Zhang
Wenqi Shao
Kaipeng Zhang
Yi Bin
Yu Wang
Ping Luo
368
16
0
11 Oct 2024
Revisiting Multi-Modal LLM Evaluation
Jian Lu
Shikhar Srivastava
Junyu Chen
Robik Shrestha
Manoj Acharya
Kushal Kafle
Christopher Kanan
137
5
0
09 Aug 2024
VideoQA in the Era of LLMs: An Empirical Study
International Journal of Computer Vision (IJCV), 2024
Junbin Xiao
Nanxin Huang
Hangyu Qin
Dongyang Li
Yicong Li
...
Zhulin Tao
Jianxing Yu
Liang Lin
Tat-Seng Chua
Angela Yao
335
23
0
08 Aug 2024
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion
Jian Ma
Wenguan Wang
Yi Yang
Feng Zheng
DiffM
233
1
0
15 Jul 2024
Ask Questions with Double Hints: Visual Question Generation with Answer-awareness and Region-reference
Kai Shen
Lingfei Wu
Siliang Tang
Fangli Xu
Bo Long
Yueting Zhuang
Jian Pei
197
0
0
06 Jul 2024
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production
Jian Ma
Wenguan Wang
Yi Yang
Feng Zheng
282
8
0
04 Jul 2024
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang
Jiawei Kong
Wenbo Yu
Bin Chen
Jiawei Li
Hao Wu
Ke Xu
Ke Xu
AAML
VLM
389
27
0
08 Jun 2024
Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering
Zaid Khan
Yun Fu
AAML
217
19
0
16 Apr 2024
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang
Hongye Fu
Wei Zou
Jinyuan Jia
AAML
358
4
0
28 Mar 2024
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
219
13
0
23 Dec 2023
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
European Conference on Computer Vision (ECCV), 2023
Rizhao Cai
Zirui Song
Dayan Guan
Zhenhao Chen
Xing Luo
Chenyu Yi
Alex C. Kot
MLLM
VLM
300
44
0
05 Dec 2023
Exploring Question Decomposition for Zero-Shot VQA
Neural Information Processing Systems (NeurIPS), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Manmohan Chandraker
Yun Fu
ReLM
186
18
0
25 Oct 2023
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models
Holy Lovenia
Wenliang Dai
Samuel Cahyawijaya
Ziwei Ji
Pascale Fung
MLLM
239
71
0
09 Oct 2023
Towards Answering Health-related Questions from Medical Videos: Datasets and Approaches
International Conference on Language Resources and Evaluation (LREC), 2023
Deepak Gupta
Kush Attal
Dina Demner-Fushman
LM&MA
142
4
0
21 Sep 2023
Nougat: Neural Optical Understanding for Academic Documents
International Conference on Learning Representations (ICLR), 2023
Lukas Blecher
Guillem Cucurull
Thomas Scialom
Robert Stojnic
ViT
195
171
0
25 Aug 2023
Story Visualization by Online Text Augmentation with Context Memory
IEEE International Conference on Computer Vision (ICCV), 2023
Daechul Ahn
Daneul Kim
Gwangmo Song
Seung Wook Kim
Honglak Lee
Luan Tuyen Chau
Jonghyun Choi
DiffM
224
9
0
15 Aug 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
299
44
0
21 Jul 2023
Generative Visual Question Answering
Ethan Shen
Scotty Singh
B. Kumar
OOD
133
1
0
18 Jul 2023
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Computer Vision and Pattern Recognition (CVPR), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Xiang Yu
Y. Fu
Manmohan Chandraker
VLM
MLLM
228
24
0
06 Jun 2023
Cycle Consistency Driven Object Discovery
International Conference on Learning Representations (ICLR), 2023
Aniket Didolkar
Anirudh Goyal
Yoshua Bengio
OCL
329
10
0
03 Jun 2023
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner
ACM Multimedia (ACM MM), 2023
Zikang Liu
Sihan Chen
Longteng Guo
Handong Li
Xingjian He
Qingbin Liu
188
3
0
19 May 2023
An Empirical Study on the Language Modal in Visual Question Answering
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Daowan Peng
Wei Wei
Xian-Ling Mao
Yuanyuan Fu
Dangyang Chen
204
5
0
17 May 2023
Iterative Adversarial Attack on Image-guided Story Ending Generation
IEEE Transactions on Multimedia (IEEE TMM), 2023
Youze Wang
Wenbo Hu
Richang Hong
209
8
0
16 May 2023
Adaptive loose optimization for robust question answering
Jie Ma
Pinghui Wang
Ze-you Wang
Dechen Kong
Min Hu
Tingxu Han
Jun Liu
OOD
381
4
0
06 May 2023
COLA: A Benchmark for Compositional Text-to-image Retrieval
Neural Information Processing Systems (NeurIPS), 2023
Arijit Ray
Filip Radenovic
Abhimanyu Dubey
Bryan A. Plummer
Ranjay Krishna
Kate Saenko
CoGe
VLM
412
55
0
05 May 2023
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models
Seulki Park
Daeho Um
Hajung Yoon
Sanghyuk Chun
Sangdoo Yun
Hawook Jeong
395
5
0
21 Apr 2023
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Zheyuan Liu
Weixuan Sun
Yicong Hong
Damien Teney
Stephen Gould
285
52
0
29 Mar 2023
Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
A. Maharana
Amita Kamath
Christopher Clark
Joey Tianyi Zhou
Aniruddha Kembhavi
227
3
0
28 Mar 2023
Logical Implications for Visual Question Answering Consistency
Computer Vision and Pattern Recognition (CVPR), 2023
Sergio Tascon-Morales
Pablo Márquez-Neila
Raphael Sznitman
235
9
0
16 Mar 2023
Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Elias Stengel-Eskin
Jimena Guallar-Blasco
Yi Zhou
Benjamin Van Durme
UQLM
147
14
0
14 Nov 2022
VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Sahithya Ravi
Aditya Chinchure
Leonid Sigal
Renjie Liao
Vered Shwartz
131
42
0
24 Oct 2022