1902.05660
Cited By
Cycle-Consistency for Robust Visual Question Answering
15 February 2019
Meet Shah
Xinlei Chen
Marcus Rohrbach
Devi Parikh
OOD
Papers citing
"Cycle-Consistency for Robust Visual Question Answering"
50 / 129 papers shown
HARMONY: Hidden Activation Representations and Model Output-Aware Uncertainty Estimation for Vision-Language Models
Erum Mushtaq
Zalan Fabian
Yavuz Faruk Bakman
Anil Ramakrishna
Mahdi Soltanolkotabi
Salman Avestimehr
164
3
0
25 Oct 2025
KBE-DME: Dynamic Multimodal Evaluation via Knowledge Enhanced Benchmark Evolution
Junzhe Zhang
Huixuan Zhang
Xiaojun Wan
99
0
0
24 Oct 2025
Explain Before You Answer: A Survey on Compositional Visual Reasoning
Fucai Ke
Joy Hsu
Zhixi Cai
Zixian Ma
Xin Zheng
...
P. D. Haghighi
Gholamreza Haffari
Ranjay Krishna
Jiajun Wu
H. Rezatofighi
ReLM
CoGe
LRM
364
10
0
24 Aug 2025
Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations
Yahsin Yeh
Yilun Wu
Bokai Ruan
Honghan Shuai
AAML
87
1
0
17 Aug 2025
Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance
Yuchu Jiang
Jian Zhao
Yuchen Yuan
Tianle Zhang
Yao Huang
...
Ya Zhang
Shuicheng Yan
Chi Zhang
Z. He
Xuelong Li
SILM
468
3
0
12 Aug 2025
LARGO: Low-Rank Regulated Gradient Projection for Robust Parameter Efficient Fine-Tuning
Haotian Zhang
Liu Liu
Baosheng Yu
Jiayan Qiu
Yanwei Ren
Xianglong Liu
222
0
0
14 Jun 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
Hyojin Bahng
Caroline Chan
F. Durand
Phillip Isola
EGVM
425
7
0
02 Jun 2025
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering
Computer Vision and Pattern Recognition (CVPR), 2025
Chengyue Huang
Brisa Maneechotesuwan
Shivang Chopra
Z. Kira
AAML
288
4
0
27 May 2025
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Computer Vision and Pattern Recognition (CVPR), 2025
Rui Zhao
Weijia Mao
Mike Zheng Shou
316
4
0
05 Mar 2025
LOVA3: Learning to Visual Question Answering, Asking and Assessment
Neural Information Processing Systems (NeurIPS), 2024
Henry Hengyuan Zhao
Pan Zhou
Difei Gao
Zechen Bai
Mike Zheng Shou
417
14
0
21 Feb 2025
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
International Conference on Learning Representations (ICLR), 2025
Chengyue Huang
Junjiao Tian
Brisa Maneechotesuwan
Shivang Chopra
Z. Kira
520
7
0
21 Feb 2025
What makes a good metric? Evaluating automatic metrics for text-to-image consistency
Candace Ross
Melissa Hall
Adriana Romero Soriano
Adina Williams
405
10
0
18 Dec 2024
Consistency of Compositional Generalization across Multiple Levels
AAAI Conference on Artificial Intelligence (AAAI), 2024
Chuanhao Li
Zhen Li
Chenchen Jing
Xiaomeng Fan
Wenbo Ye
Yuwei Wu
Yunde Jia
CoGe
253
0
0
18 Dec 2024
A Comprehensive Survey on Visual Question Answering Datasets and Algorithms
Raihan Kabir
Naznin Haque
Md. Saiful Islam
Marium-E. Jannat
CoGe
289
8
0
17 Nov 2024
Rethinking Weight Decay for Robust Fine-Tuning of Foundation Models
Neural Information Processing Systems (NeurIPS), 2024
Junjiao Tian
Chengyue Huang
Z. Kira
190
3
0
03 Nov 2024
Replace-then-Perturb: Targeted Adversarial Attacks With Visual Reasoning for Vision-Language Models
Jonggyu Jang
Hyeonsu Lyu
Jungyeon Koh
H. Yang
VLM
AAML
256
0
0
01 Nov 2024
Improving Generalization in Visual Reasoning via Self-Ensemble
Tien-Huy Nguyen
Quang-Khai Tran
Anh-Tuan Quang-Hoang
VLM
LRM
327
10
0
28 Oct 2024
Modality-Fair Preference Optimization for Trustworthy MLLM Alignment
International Joint Conference on Artificial Intelligence (IJCAI), 2024
Songtao Jiang
Yan Zhang
Ruizhe Chen
Yeying Jin
Zuozhu Liu
Qinglin He
Yang Feng
Jian Wu
MoE
MLLM
322
18
0
20 Oct 2024
Efficient and Effective Universal Adversarial Attack against Vision-Language Pre-training Models
Fan Yang
Yihao Huang
Kaidi Wang
Ling Shi
G. Pu
Yang Liu
Jian Shu
AAML
VLM
274
2
0
15 Oct 2024
Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping
International Conference on Learning Representations (ICLR), 2024
Yue Yang
Shanghang Zhang
Wenqi Shao
Kaipeng Zhang
Yi Bin
Yu Wang
Ping Luo
442
15
0
11 Oct 2024
Revisiting Multi-Modal LLM Evaluation
Jian Lu
Shikhar Srivastava
Junyu Chen
Robik Shrestha
Manoj Acharya
Kushal Kafle
Christopher Kanan
167
5
0
09 Aug 2024
VideoQA in the Era of LLMs: An Empirical Study
International Journal of Computer Vision (IJCV), 2024
Junbin Xiao
Nanxin Huang
Hangyu Qin
Dongyang Li
Yicong Li
...
Zhulin Tao
Jianxing Yu
Liang Lin
Tat-Seng Chua
Angela Yao
359
25
0
08 Aug 2024
Mutual Learning for Acoustic Matching and Dereverberation via Visual Scene-driven Diffusion
Jian Ma
Wenguan Wang
Yi Yang
Feng Zheng
DiffM
273
1
0
15 Jul 2024
Ask Questions with Double Hints: Visual Question Generation with Answer-awareness and Region-reference
Kai Shen
Lingfei Wu
Siliang Tang
Fangli Xu
Bo Long
Yueting Zhuang
Jian Pei
215
1
0
06 Jul 2024
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production
Jian Ma
Wenguan Wang
Yi Yang
Feng Zheng
300
8
0
04 Jul 2024
One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models
Hao Fang
Jiawei Kong
Wenbo Yu
Bin Chen
Jiawei Li
Hao Wu
Ke Xu
AAML
VLM
441
29
0
08 Jun 2024
Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering
Zaid Khan
Yun Fu
AAML
258
21
0
16 Apr 2024
MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
Yanting Wang
Hongye Fu
Wei Zou
Jinyuan Jia
AAML
382
5
0
28 Mar 2024
Cycle-Consistency Learning for Captioning and Grounding
Ning Wang
Jiajun Deng
Mingbo Jia
ObjD
236
13
0
23 Dec 2023
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
European Conference on Computer Vision (ECCV), 2023
Rizhao Cai
Zirui Song
Dayan Guan
Zhenhao Chen
Xing Luo
Chenyu Yi
Alex C. Kot
MLLM
VLM
331
45
0
05 Dec 2023
Exploring Question Decomposition for Zero-Shot VQA
Neural Information Processing Systems (NeurIPS), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Manmohan Chandraker
Yun Fu
ReLM
230
18
0
25 Oct 2023
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models
Holy Lovenia
Wenliang Dai
Samuel Cahyawijaya
Ziwei Ji
Pascale Fung
MLLM
253
73
0
09 Oct 2023
Towards Answering Health-related Questions from Medical Videos: Datasets and Approaches
International Conference on Language Resources and Evaluation (LREC), 2023
Deepak Gupta
Kush Attal
Dina Demner-Fushman
LM&MA
161
4
0
21 Sep 2023
Nougat: Neural Optical Understanding for Academic Documents
International Conference on Learning Representations (ICLR), 2023
Lukas Blecher
Guillem Cucurull
Thomas Scialom
Robert Stojnic
ViT
206
184
0
25 Aug 2023
Story Visualization by Online Text Augmentation with Context Memory
IEEE International Conference on Computer Vision (ICCV), 2023
Daechul Ahn
Daneul Kim
Gwangmo Song
Seung Wook Kim
Honglak Lee
Luan Tuyen Chau
Jonghyun Choi
DiffM
266
8
0
15 Aug 2023
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jie Ma
Pinghui Wang
Dechen Kong
Zewei Wang
Jun Liu
Hongbin Pei
Junzhou Zhao
OOD
334
45
0
21 Jul 2023
Generative Visual Question Answering
Ethan Shen
Scotty Singh
B. Kumar
OOD
156
1
0
18 Jul 2023
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Computer Vision and Pattern Recognition (CVPR), 2023
Zaid Khan
B. Vijaykumar
S. Schulter
Xiang Yu
Y. Fu
Manmohan Chandraker
VLM
MLLM
258
24
0
06 Jun 2023
Cycle Consistency Driven Object Discovery
International Conference on Learning Representations (ICLR), 2023
Aniket Didolkar
Anirudh Goyal
Yoshua Bengio
OCL
350
10
0
03 Jun 2023
Enhancing Vision-Language Pre-Training with Jointly Learned Questioner and Dense Captioner
ACM Multimedia (ACM MM), 2023
Zikang Liu
Sihan Chen
Longteng Guo
Handong Li
Xingjian He
Qingbin Liu
206
3
0
19 May 2023
An Empirical Study on the Language Modal in Visual Question Answering
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Daowan Peng
Wei Wei
Xian-Ling Mao
Yuanyuan Fu
Dangyang Chen
261
5
0
17 May 2023
Iterative Adversarial Attack on Image-guided Story Ending Generation
IEEE Transactions on Multimedia (IEEE TMM), 2023
Youze Wang
Wenbo Hu
Richang Hong
247
8
0
16 May 2023
Adaptive loose optimization for robust question answering
Jie Ma
Pinghui Wang
Ze-you Wang
Dechen Kong
Min Hu
Tingxu Han
Jun Liu
OOD
419
4
0
06 May 2023
COLA: A Benchmark for Compositional Text-to-image Retrieval
Neural Information Processing Systems (NeurIPS), 2023
Arijit Ray
Filip Radenovic
Abhimanyu Dubey
Bryan A. Plummer
Ranjay Krishna
Kate Saenko
CoGe
VLM
464
55
0
05 May 2023
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models
Seulki Park
Daeho Um
Hajung Yoon
Sanghyuk Chun
Sangdoo Yun
Hawook Jeong
399
5
0
21 Apr 2023
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Zheyuan Liu
Weixuan Sun
Yicong Hong
Damien Teney
Stephen Gould
311
58
0
29 Mar 2023
Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models
A. Maharana
Amita Kamath
Christopher Clark
Joey Tianyi Zhou
Aniruddha Kembhavi
254
4
0
28 Mar 2023
Logical Implications for Visual Question Answering Consistency
Computer Vision and Pattern Recognition (CVPR), 2023
Sergio Tascon-Morales
Pablo Márquez-Neila
Raphael Sznitman
252
9
0
16 Mar 2023
Why Did the Chicken Cross the Road? Rephrasing and Analyzing Ambiguous Questions in VQA
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Elias Stengel-Eskin
Jimena Guallar-Blasco
Yi Zhou
Benjamin Van Durme
UQLM
165
14
0
14 Nov 2022
VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Sahithya Ravi
Aditya Chinchure
Leonid Sigal
Renjie Liao
Vered Shwartz
152
45
0
24 Oct 2022