2406.09175
Cited By
ReMI: A Dataset for Reasoning with Multiple Images
13 June 2024
Mehran Kazemi
Nishanth Dikkala
Ankit Anand
Petar Dević
Ishita Dasgupta
Fangyu Liu
Bahare Fatemi
Pranjal Awasthi
Dee Guo
Sreenivas Gollapudi
Ahmed Qureshi
LRM
VLM
Papers citing "ReMI: A Dataset for Reasoning with Multiple Images"
13 / 13 papers shown
MM-OPERA: Benchmarking Open-ended Association Reasoning for Large Vision-Language Models
Zimeng Huang
Jinxin Ke
Xiaoxuan Fan
Yufeng Yang
Yang Liu
...
Junteng Dai
Haoyi Jiang
Y. Zhou
Keze Wang
Z. Chen
LRM
VLM
267
0
0
30 Oct 2025
Analyze-Prompt-Reason: A Collaborative Agent-Based Framework for Multi-Image Vision-Language Reasoning
Angelos Vlachos
Giorgos Filandrianos
Maria Lymperaiou
Nikolaos Spanos
Ilias Mitsouras
Vasileios Karampinis
Athanasios Voulodimos
LRM
100
0
0
01 Aug 2025
PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning
Yizhen Zhang
Yang Ding
Shuoshuo Zhang
Xinchen Zhang
Haoling Li
...
Jie Wu
Lei Ji
Haoran Pan
Y. Yang
Yeyun Gong
OffRL
VLM
LRM
149
4
0
17 Jun 2025
Evaluating MLLMs with Multimodal Multi-image Reasoning Benchmark
Ziming Cheng
Binrui Xu
Lisheng Gong
Zuhe Song
Tianshuo Zhou
...
Wei Chen
Zhiyuan Huang
Mingjie Zhan
Xiaojie Wang
Fangxiang Feng
VLM
LRM
128
8
0
04 Jun 2025
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
461
684
0
25 Mar 2025
Identifying and Mitigating Position Bias of Multi-image Vision-Language Models
Computer Vision and Pattern Recognition (CVPR), 2025
Xinyu Tian
Shu Zou
Zhaoyuan Yang
Jing Zhang
248
9
0
18 Mar 2025
BIG-Bench Extra Hard
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mehran Kazemi
Bahare Fatemi
Hritik Bansal
John Palowitch
Chrysovalantis Anastasiou
...
Kate Olszewska
Yi Tay
Vinh Q. Tran
Quoc V. Le
Orhan Firat
ELM
LRM
483
48
0
26 Feb 2025
Natural Language Generation from Visual Events: State-of-the-Art and Key Open Questions
Aditya K Surikuchi
Raquel Fernández
Sandro Pezzelle
EGVM
986
0
0
18 Feb 2025
MultiChartQA: Benchmarking Vision-Language Models on Multi-Chart Problems
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Zifeng Zhu
Mengzhao Jia
Zizhuo Zhang
Lang Li
Meng Jiang
LRM
301
17
0
18 Oct 2024
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
International Conference on Learning Representations (ICLR), 2025
Peng Xia
Siwei Han
Shi Qiu
Yiyang Zhou
Zhaoyang Wang
...
Chenhang Cui
Mingyu Ding
Linjie Li
Lijuan Wang
Huaxiu Yao
251
28
0
14 Oct 2024
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios
AAAI Conference on Artificial Intelligence (AAAI), 2025
Baichuan Zhou
Haote Yang
Dairong Chen
Junyan Ye
Tianyi Bai
Jinhua Yu
Songyang Zhang
Dahua Lin
Conghui He
Weijia Li
VLM
262
24
0
30 Aug 2024
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
International Conference on Learning Representations (ICLR), 2025
Hritik Bansal
Arian Hosseini
Rishabh Agarwal
Vinh Q. Tran
Mehran Kazemi
SyDa
OffRL
LRM
213
62
0
29 Aug 2024
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models
Feng Li
Renrui Zhang
Hao Zhang
Yuanhan Zhang
Bo Li
Wei Li
Zejun Ma
Chunyuan Li
MLLM
VLM
299
409
0
10 Jul 2024