2407.09413
Cited By
v1
v2
v3 (latest)
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
12 July 2024
Shraman Pramanick
Rama Chellappa
Subhashini Venugopalan
ArXiv (abs)
PDF
HTML
HuggingFace (11 upvotes)
Github (76★)
Papers citing
"SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers"
50 / 92 papers shown
VeriSciQA: An Auto-Verified Dataset for Scientific Visual Question Answering
Yuyi Li
Daoyuan Chen
Zhen Wang
Yutong Lu
Yaliang Li
229
0
0
25 Nov 2025
SciEGQA: A Dataset for Scientific Evidence-Grounded Question Answering and Reasoning
Wenhan Yu
Wang Chen
Guanqiang Qi
Weikang Li
Yang Li
Lei Sha
Deguo Xia
Jizhou Huang
219
4
0
19 Nov 2025
SciAgent: A Unified Multi-Agent System for Generalistic Scientific Reasoning
Mexican International Conference on Artificial Intelligence (MICAI), 2025
Xuchen Li
Ruitao Wu
Xuanbo Liu
Xukai Wang
Jinbo Hu
...
K. Huang
J. Xu
Haitao Mi
Wentao Zhang
Bin Dong
LLMAG
LM&Ro
LRM
AI4CE
870
1
0
11 Nov 2025
An MLCommons Scientific Benchmarks Ontology
B. Hawks
G. V. Laszewski
Matthew D. Sinclair
Marco Colombo
Shivaram Venkataraman
Rutwik Jain
Yiwei Jiang
Nhan Tran
Geoffrey C. Fox
127
1
0
06 Nov 2025
Expert Evaluation of LLM World Models: A High-$T_c$ Superconductivity Case Study
Haoyu Guo
Maria Tikhanovskaya
Paul Raccuglia
Alexey Vlaskin
Chris Co
...
T. Senthil
J. M. Tranquada
M. Brenner
Subhashini Venugopalan
Eun-Ah Kim
ELM
199
0
0
05 Nov 2025
PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies
Lukas Selch
Yufang Hou
Muhammad Jehanzeb Mirza
Sivan Doveh
James Glass
Rogerio Feris
Wei Lin
288
0
0
18 Oct 2025
Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding
Sensen Gao
Shanshan Zhao
Xu Jiang
Lunhao Duan
Yong Xien Chng
Qing-Guo Chen
Weihua Luo
Kaifu Zhang
Jia-Wang Bian
Mingming Gong
420
4
0
17 Oct 2025
A Survey on Parallel Reasoning
Z. Wang
Boye Niu
Zipeng Gao
Zhi Zheng
Tong Xu
...
Yilong Chen
Chen Zhu
Hua Wu
Haifeng Wang
Enhong Chen
ReLM
LRM
222
5
0
14 Oct 2025
CGBench: Benchmarking Language Model Scientific Reasoning for Clinical Genetics Research
Owen Queen
Harrison Zhang
James Zou
ELM
LM&MA
LRM
196
1
0
13 Oct 2025
PaperArena: An Evaluation Benchmark for Tool-Augmented Agentic Reasoning on Scientific Literature
Daoyu Wang
Mingyue Cheng
Qi Liu
Shuo Yu
Zirui Liu
Ze Guo
Qi Liu
LRM
408
5
0
13 Oct 2025
Table Question Answering in the Era of Large Language Models: A Comprehensive Survey of Tasks, Methods, and Evaluation
Wei Zhou
Bolei Ma
Annemarie Friedrich
Mohsen Mesgar
LMTD
ELM
235
3
0
08 Oct 2025
CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers
Haining Pan
James V. Roggeveen
Erez Berg
Juan Carrasquilla
Debanjan Chowdhury
...
Di Luo
Titus Neupert
Xiaoliang Qi
Michael P. Brenner
Eun-Ah Kim
AIMat
ALM
ELM
292
2
0
06 Oct 2025
DPDisc: From Factoid Questions to Data Product Requests for Open-World Data Product Discovery over Tables and Text
L. Zhang
Nandana Mihindukulasooriya
Niharika S. D'Souza
Sola S. Shirai
Sarthak Dash
Yao Ma
Horst Samulowitz
LMTD
325
2
0
30 Sep 2025
Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark
Minhui Zhu
Minyang Tian
Xiaocheng Yang
Tianci Zhou
Lifan Yuan
...
Ruixing Zhang
X. Wang
Ofir Press
Nicolas Chia
Eliu A. Huerta
LRM
ELM
192
5
0
30 Sep 2025
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
Sicong Leng
Jing Wang
Jiaxi Li
Hao Zhang
Zhiqiang Hu
...
Deli Zhao
Wei Lu
Yu Rong
Aixin Sun
Shijian Lu
OffRL
LRM
212
21
0
25 Sep 2025
CEMTM: Contextual Embedding-based Multimodal Topic Modeling
Amirhossein Abaskohi
Raymond Li
Chuyuan Li
Shafiq Joty
Giuseppe Carenini
186
2
0
14 Sep 2025
Retrieval Enhanced Feedback via In-context Neural Error-book
Jongyeop Hyun
Bumsoo Kim
LRM
347
0
0
22 Aug 2025
DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections
Jiwon Park
Seohyun Pyeon
Jinwoo Kim
Rina Carines Cabal
Yihao Ding
S. Han
LRM
124
2
0
20 Aug 2025
MAC: A Live Benchmark for Multimodal Large Language Models in Scientific Understanding
Mohan Jiang
Jin Gao
Jiahao Zhan
Dequan Wang
175
6
0
14 Aug 2025
Finding Needles in Images: Can Multimodal LLMs Locate Fine Details?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Parth Thakkar
Ankush Agarwal
Prasad Kasu
Pulkit Bansal
Chaitanya Devaguptapu
166
0
0
07 Aug 2025
Doc2SAR: A Synergistic Framework for High-Fidelity Extraction of Structure-Activity Relationships from Scientific Documents
Jiaxi Zhuang
Kangning Li
Jue Hou
Mingjun Xu
Zhifeng Gao
Hengxing Cai
173
1
0
24 Jun 2025
BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions
Saptarshi Sengupta
Shuhua Yang
Paul Kwong Yu
Fali Wang
Suhang Wang
229
2
0
06 Jun 2025
MuSciClaims: Multimodal Scientific Claim Verification
Yash Kumar Lal
Manikanta Bandham
Mohammad Saqib Hasan
Apoorva Kashi
Mahnaz Koupaee
Niranjan Balasubramanian
299
2
0
05 Jun 2025
Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?
Yang Yao
Lingyu Li
Jiaxin Song
Chiyu Chen
Zhenqi He
...
Xin Wang
Tianle Gu
Jie Li
Yan Teng
Yingchun Wang
LRM
409
1
0
03 Jun 2025
EAMET: Robust Massive Model Editing via Embedding Alignment Optimization
Yanbo Dai
Zhenlan Ji
Zongjie Li
Shuai Wang
KELM
326
0
0
17 May 2025
IRLBench: A Multi-modal, Culturally Grounded, Parallel Irish-English Benchmark for Open-Ended LLM Reasoning Evaluation
Khanh-Tung Tran
Barry O'Sullivan
Hoang D. Nguyen
ELM
LRM
482
4
0
16 May 2025
AutoP2C: An LLM-Based Agent Framework for Code Repository Generation from Multimodal Content in Academic Papers
Zijie Lin
Yiqing Shen
Qilin Cai
He Sun
Jinrui Zhou
Mingjun Xiao
394
16
0
28 Apr 2025
FEABench: Evaluating Language Models on Multiphysics Reasoning Ability
N. Mudur
Hao Cui
Subhashini Venugopalan
Paul Raccuglia
M. Brenner
Peter C. Norgaard
LLMAG
ELM
LRM
364
14
0
08 Apr 2025
DomainCQA: Crafting Knowledge-Intensive QA from Domain-Specific Charts
Ling Zhong
Yujing Lu
Jing Yang
Weiming Li
Peng Wei
Yongheng Wang
Manni Duan
Qing Zhang
581
2
0
25 Mar 2025
RoboDesign1M: A Large-scale Dataset for Robot Design Understanding
T. H. Le
T. H. Nguyen
Quang-Dieu Tran
Quang Minh Nguyen
Baoru Huang
Hoan Nguyen
M. Vu
Tung D. Ta
A. Nguyen
3DV
361
1
0
09 Mar 2025
PosterSum: A Multimodal Benchmark for Scientific Poster Summarization
Rohit Saxena
Pasquale Minervini
Frank Keller
VLM
283
8
0
24 Feb 2025
Towards Question Answering over Large Semi-structured Tables
Yuxiang Wang
Jianzhong Qi
Junhao Gan
LMTD
438
0
0
19 Feb 2025
SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
Xuanliang Zhang
Dingzirui Wang
Baoxin Wang
Longxu Dou
Xinyuan Lu
Keyan Xu
Dayong Wu
Qingfu Zhu
Wanxiang Che
LMTD
1.1K
7
0
16 Dec 2024
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Manan Suri
Puneet Mathur
Franck Dernoncourt
Kanika Goswami
Ryan Rossi
Dinesh Manocha
421
27
0
14 Dec 2024
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
Lin Li
Guikun Chen
Hanrong Shi
Jun Xiao
Long Chen
448
28
0
21 Sep 2024
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages
Bhawna Piryani
Jamshid Mozafari
Adam Jatowt
RALM
487
20
0
26 Mar 2024
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang
Dongzhi Jiang
Yichi Zhang
Haokun Lin
Ziyu Guo
...
Aojun Zhou
Pan Lu
Kai-Wei Chang
Shiyang Feng
Jiaming Song
403
567
0
21 Mar 2024
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
Lei Li
Yuqi Wang
Runxin Xu
Peiyi Wang
Xiachong Feng
Lingpeng Kong
Qi Liu
389
116
0
01 Mar 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Jiaming Song
Yu Qiao
Shiyang Feng
MLLM
609
146
0
08 Feb 2024
InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model
Xiao-wen Dong
Pan Zhang
Yuhang Zang
Yuhang Cao
Sijin Yu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Yuan Liu
VLM
MLLM
427
367
0
29 Jan 2024
Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Seongyun Lee
Seungone Kim
Sue Hyun Park
Geewook Kim
Minjoon Seo
MLLM
296
95
0
12 Jan 2024
CogVLM: Visual Expert for Pretrained Language Models
Neural Information Processing Systems (NeurIPS), 2023
Weihan Wang
Qingsong Lv
Wenmeng Yu
Wenyi Hong
Ji Qi
...
Bin Xu
Juanzi Li
Yuxiao Dong
Ming Ding
Jie Tang
VLM
MLLM
838
772
0
06 Nov 2023
Improved Baselines with Visual Instruction Tuning
Computer Vision and Pattern Recognition (CVPR), 2023
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLM
MLLM
746
4,820
0
05 Oct 2023
Improving Automatic VQA Evaluation Using Large Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Oscar Manas
Benno Krojer
Aishwarya Agrawal
388
56
0
04 Oct 2023
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
International Conference on Learning Representations (ICLR), 2023
Pan Lu
Hritik Bansal
Tony Xia
Hamish Ivison
Chun-yue Li
Hannaneh Hajishirzi
Hao Cheng
Kai-Wei Chang
Michel Galley
Jianfeng Gao
LRM
MLLM
728
1,381
0
03 Oct 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
International Conference on Learning Representations (ICLR), 2023
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
542
246
0
21 Sep 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
12.3K
16,448
0
18 Jul 2023
Lost in the Middle: How Language Models Use Long Contexts
Transactions of the Association for Computational Linguistics (TACL), 2023
Nelson F. Liu
Kevin Lin
John Hewitt
Ashwin Paranjape
Michele Bevilacqua
Fabio Petroni
Abigail Z. Jacobs
RALM
728
3,319
0
06 Jul 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Neural Information Processing Systems (NeurIPS), 2023
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
3.4K
7,658
0
09 Jun 2023
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ahmed Masry
P. Kavehzadeh
Do Xuan Long
Enamul Hoque
Shafiq Joty
LRM
435
183
0
24 May 2023