CRAG -- Comprehensive RAG Benchmark

Neural Information Processing Systems (NeurIPS), 2024
7 June 2024
Xiao Yang
Kai Sun
Hao Xin
Yushi Sun
Nikita Bhalla
Xiangsen Chen
Sajal Choudhary
Rongze Daniel Gui
Ziran Will Jiang
Ziyu Jiang
Lingkun Kong
Brian Moran
Jiaqi Wang
Yongjun Xu
An Yan
Chenyu Yang
Eting Yuan
Hanwen Zha
Nan Tang
Lei Chen
Nicolas Scheffer
Yue Liu
Nirav Shah
Rakesh Wanga
Anuj Kumar
Xin Luna Dong
ArXiv (abs) | PDF | HTML | HuggingFace (49 upvotes)

Papers citing "CRAG -- Comprehensive RAG Benchmark"

43 / 43 papers shown
HKRAG: Holistic Knowledge Retrieval-Augmented Generation Over Visually-Rich Documents
Anyang Tong
Xiang Niu
ZhiPing Liu
Chang Tian
Yanyan Wei
Zenglin Shi
Meng Wang
69
0
0
25 Nov 2025
Using MLIR Transform to Design Sliced Convolution Algorithm
Victor Ferrari
M. Pereira
Lucas Alvarenga
Gustavo Leite
Guido Araujo
48
0
0
22 Nov 2025
CRAG-MM: Multi-modal Multi-turn Comprehensive RAG Benchmark
Jiaqi Wang
X. J. Yang
Kai Sun
Parth Suresh
Sanat Sharma
...
Rakesh Wanga
Anuj Kumar
Rohit Patel
Wen-tau Yih
Xin Luna Dong
100
0
0
30 Oct 2025
A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
Yizhang Zhu
Liangwei Wang
Chenyu Yang
Xiaotian Lin
Boyan Li
...
Shaolei Zhang
Y. Zhang
Xuanhe Zhou
Guoliang Li
Yuyu Luo
AI4TS
132
0
0
27 Oct 2025
Interpretable Question Answering with Knowledge Graphs
Kartikeya Aneja
Manasvi Srivastava
Subhayan Das
Nagender Aneja
RALM
121
0
0
22 Oct 2025
ChronoPlay: A Framework for Modeling Dual Dynamics and Authenticity in Game RAG Benchmarks
Liyang He
Yuren Zhang
Ziwei Zhu
Zhenghui Li
Shiwei Tong
60
0
0
21 Oct 2025
Evaluating Retrieval-Augmented Generation Systems on Unanswerable, Uncheatable, Realistic, Multi-hop Queries
Gabrielle Kaili-May Liu
Bryan Li
Arman Cohan
William Walden
Eugene Yang
RALM
229
0
0
13 Oct 2025
From <Answer> to <Think>: Multidimensional Supervision of Reasoning Process for LLM Optimization
Beining Wang
Weihang Su
Hongtao Tian
Tao Yang
Yujia Zhou
Ting Yao
Qingyao Ai
Yiqun Liu
LRM
41
0
0
13 Oct 2025
AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval
Kai Zhang
Xinyuan Zhang
Ejaz Ahmed
Hongda Jiang
Caleb Kumar
...
Aaron Colak
Ahmed Aly
Anuj Kumar
Xiaozhong Liu
Xin Luna Dong
RALM
104
0
0
12 Oct 2025
FATHOMS-RAG: A Framework for the Assessment of Thinking and Observation in Multimodal Systems that use Retrieval Augmented Generation
Samuel Hildebrand
Curtis Taylor
Sean Oesch
James M Ghawaly Jr
Amir Sadovnik
Ryan Shivers
Brandon Schreiber
Kevin Kurian
3DV
104
0
0
10 Oct 2025
Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage
Siddhant Arora
Haidar Khan
Kai Sun
Xin Luna Dong
Sajal Choudhary
...
Anuj Kumar
Ahmed Aly
Yue Liu
Florian Metze
Zhaojiang Lin
80
1
0
02 Oct 2025
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning
Zhepei Wei
X. J. Yang
Kai Sun
Jiaqi Wang
Rulin Shao
...
Rakesh Wanga
Anuj Kumar
Yu Meng
Wen-tau Yih
Xin Luna Dong
HILM, LRM
143
2
0
30 Sep 2025
SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge
Lukas Haas
Gal Yona
Giovanni D'Antonio
Sasha Goldshtein
Dipanjan Das
HILM, ALM
41
4
0
09 Sep 2025
Knowledge-Augmented Vision Language Models for Underwater Bioacoustic Spectrogram Analysis
Ragib Amin Nihal
Benjamin Yen
Takeshi Ashizawa
Kazuhiro Nakadai
28
0
0
06 Sep 2025
Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap
Jun Wang
Ninglun Gu
Kailai Zhang
Zijiao Zhang
Yelun Bao
...
Liwei Liu
Yihuan Liu
Pengyong Li
Gary G. Yen
Junchi Yan
ALM, ELM
168
0
0
26 Aug 2025
A Question Answering Dataset for Temporal-Sensitive Retrieval-Augmented Generation
Ziyang Chen
Erxue Min
Xiang Zhao
Yunxin Li
Xin Jia
Jinzhi Liao
Jichao Li
Shuaiqiang Wang
Baotian Hu
D. Yin
RALM
181
0
0
17 Aug 2025
Learning Facts at Scale with Active Reading
Jessy Lin
Vincent-Pierre Berges
Xilun Chen
Anuj Kumar
Gargi Ghosh
Barlas Oğuz
RALM, KELM
120
2
0
13 Aug 2025
Dynamic Context Adaptation for Consistent Role-Playing Agents with Retrieval-Augmented Generations
Jeiyoon Park
Yongshin Han
Minseop Kim
Kisu Yang
106
1
0
04 Aug 2025
The SMeL Test: A simple benchmark for media literacy in language models
Gustaf Ahdritz
Anat Kleiman
173
0
0
04 Aug 2025
PrismRAG: Boosting RAG Factuality with Distractor Resilience and Strategized Reasoning
Mohammad Kachuee
Teja Gollapudi
Minseok Kim
Yin Huang
Kai Sun
...
Yue Liu
Aaron Colak
Anuj Kumar
Xin Luna Dong
LRM
251
1
0
25 Jul 2025
Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges
Jintao Liang
Gang Su
Huifeng Lin
You Wu
Rui Zhao
Ziyue Li
3DV, LRM
224
6
0
12 Jun 2025
BioMol-MQA: A Multi-Modal Question Answering Dataset For LLM Reasoning Over Bio-Molecular Interactions
Saptarshi Sengupta
Shuhua Yang
Paul Kwong Yu
Fali Wang
Suhang Wang
173
1
0
06 Jun 2025
RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems
Yixiao Zeng
Tianyu Cao
Danqing Wang
Xinran Zhao
Zimeng Qiu
Morteza Ziyadi
Tongshuang Wu
Lei Li
RALM
199
1
0
01 Jun 2025
InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation
Yunjia Xi
Jianghao Lin
Menghui Zhu
Yongzhao Xiao
Zhuoying Ou
...
Weiwen Liu
Yasheng Wang
Ruiming Tang
Weinan Zhang
Yong Yu
282
7
0
21 May 2025
Can LLMs Be Trusted for Evaluating RAG Systems? A Survey of Methods and Datasets
Swiss Conference on Data Science (SDS), 2025
Lorenz Brehme
Thomas Ströhle
Ruth Breu
429
6
0
28 Apr 2025
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey
Aoran Gan
Hao Yu
Kai Zhang
Qi Liu
Wenyu Yan
Zhenya Huang
Shiwei Tong
Guoping Hu
RALM, 3DV
243
9
0
21 Apr 2025
Retrieval-Augmented Generation with Conflicting Evidence
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Joey Tianyi Zhou
RALM
302
21
0
17 Apr 2025
The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation
Zhenru Zhang
Ning Li
Qi Liu
Rui Li
W. Gao
Qingyang Mao
Zhenya Huang
Baosheng Yu
Dacheng Tao
RALM
246
0
0
11 Apr 2025
RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving
International Symposium on Computer Architecture (ISCA), 2025
Wenqi Jiang
Suvinay Subramanian
Cat Graves
Gustavo Alonso
Amir Yazdanbakhsh
Vidushi Dadu
316
24
0
18 Mar 2025
The Amazon Nova Family of Models: Technical Report and Model Card
Amazon AGI
Aaron Langford
A. Shah
Abhanshu Gupta
Abhimanyu Bhatter
...
Benjamin Biggs
Benjamin Ott
Bhanu Vinzamuri
Bharath Venkatesh
Bhavana Ganesh
225
45
0
17 Mar 2025
A Pilot Empirical Study on When and How to Use Knowledge Graphs as Retrieval Augmented Generation
Xujie Yuan
Yongxu Liu
Hanmo Liu
Shiwen Wu
Libin Zheng
Rui Meng
Lei Chen
Xiaofang Zhou
Jian Yin
404
0
0
28 Feb 2025
Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs
Reham Omar
Omij Mangukiya
Essam Mansour
221
5
0
20 Jan 2025
MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems
Transactions of the Association for Computational Linguistics (TACL), 2025
Yannis Katsis
Sara Rosenthal
Kshitij P. Fadnis
Chulaka Gunasekara
Young-Suk Lee
Lucian Popa
Vraj Shah
Khoi-Nguyen Tran
Danish Contractor
Marina Danilevsky
RALM, LRM
185
26
0
08 Jan 2025
RAG-based Question Answering over Heterogeneous Data and Text
IEEE Data Engineering Bulletin (DEB), 2024
Philipp Christmann
Gerhard Weikum
LMTD, RALM
285
10
0
10 Dec 2024
Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
International Conference on Learning Representations (ICLR), 2024
Mufei Li
Siqi Miao
Pan Li
RALM
481
52
0
28 Oct 2024
MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Nandan Thakur
Suleman Kazi
Ge Luo
Jimmy J. Lin
Amin Ahmad
VLM, RALM
384
13
0
17 Oct 2024
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Fei Wang
Xingchen Wan
Ruoxi Sun
Jiefeng Chen
Sercan Ö. Arık
RALM
254
29
0
09 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
International Conference on Learning Representations (ICLR), 2024
H. Zhang
Jingyuan Huang
Kai Mei
Yifei Yao
Zhenting Wang
Chenlu Zhan
Hongwei Wang
Yongfeng Zhang
AAML, LLMAG, ELM
421
87
0
03 Oct 2024
MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants
Zeyu Zhang
Quanyu Dai
Luyu Chen
Zeren Jiang
Rui Li
Jieming Zhu
Xu Chen
Yi Xie
Zhenhua Dong
Ji-Rong Wen
LLMAG
139
11
0
30 Sep 2024
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines
Dongzhi Jiang
Renrui Zhang
Ziyu Guo
Yanmin Wu
Jiayi Lei
...
Guanglu Song
Peng Gao
Yu Liu
Chunyuan Li
Hongsheng Li
MLLM
255
39
0
19 Sep 2024
Evaluation of RAG Metrics for Question Answering in the Telecom Domain
Sujoy Roychowdhury
Sumit Soman
H. G. Ranjani
Neeraj Gunda
Vansh Chhabra
Sai Krishna Bala
202
27
0
15 Jul 2024
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
International Conference on Learning Representations (ICLR), 2024
Zhepei Wei
Wei-Lin Chen
Yu Meng
RALM
506
12
0
19 Jun 2024
LLM-based NLG Evaluation: Current Status and Challenges
Mingqi Gao
Xinyu Hu
Jie Ruan
Xiao Pu
Xiaojun Wan
ELM, LM&MA
535
80
0
02 Feb 2024