FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

30 September 2024

Senthil Purushwalkam

Xuan-Phi Nguyen

Papers citing "FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows""

Title | Authors | Topics | Counts | Date
Can LLMs Be Trusted for Evaluating RAG Systems? A Survey of Methods and Datasets | Lorenz Brehme, Thomas Ströhle, Ruth Breu | | 32 0 0 | 28 Apr 2025
HalluLens: LLM Hallucination Benchmark | Yejin Bang, Ziwei Ji, Alan Schelten, Anthony Hartshorn, Tara Fowler, Cheng Zhang, Nicola Cancedda, Pascale Fung | HILM | 44 1 0 | 24 Apr 2025
aiXamine: Simplified LLM Safety and Security | Fatih Deniz, Dorde Popovic, Yazan Boshmaf, Euisuh Jeong, M. Ahmad, Sanjay Chawla, Issa M. Khalil | ELM | 35 0 0 | 21 Apr 2025
Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation | Jiajun Shen, Tong Zhou, Yubo Chen, Delai Qiu, Shengping Liu, Kang-Jun Liu, Jun Zhao | HILM, RALM | 43 0 0 | 21 Apr 2025
Retrieval-Augmented Generation with Conflicting Evidence | Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal | RALM | 24 1 0 | 17 Apr 2025
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings | Austin Xu, Srijan Bansal, Yifei Ming, Semih Yavuz, Shafiq R. Joty | ELM | 52 2 0 | 19 Mar 2025
Command R7B Arabic: A Small, Enterprise Focused, Multilingual, and Culturally Aware Arabic LLM | Yazeed Alnumay, Alexandre Barbet, Anna Bialas, William Darling, Shaan Desai, ..., Stephanie Howe, Olivia Lasche, Justin Lee, Anirudh Shrinivason, Jennifer Tracey | | 48 0 0 | 18 Mar 2025
Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru | Dunant Cusipuma, David Ortega, Victor Flores-Benites, Arturo Deza | OOD | 55 0 0 | 10 Mar 2025
MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark | Shengkun Ma, Hao Peng, Lei Hou, Juanzi Li | ELM | 55 0 0 | 10 Mar 2025
Multi-Attribute Steering of Language Models via Targeted Intervention | Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal | LLMSV | 59 0 0 | 18 Feb 2025
Context-DPO: Aligning Language Models for Context-Faithfulness | Baolong Bi, Shaohan Huang, Y. Wang, Tianchi Yang, Zihan Zhang, ..., Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Shenghua Liu | | 68 2 0 | 18 Dec 2024
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Zhenmei Shi, Yifei Ming, Xuan-Phi Nguyen, Yingyu Liang, Shafiq Joty | | 38 2 0 | 25 Sep 2024
SFR-RAG: Towards Contextually Faithful LLMs | Xuan-Phi Nguyen, Shrey Pandit, Senthil Purushwalkam, Austin Xu, Hailin Chen, Yifei Ming, Zixuan Ke, Silvio Savarese, Caiming Xiong, Shafiq Joty | RALM | 43 1 0 | 16 Sep 2024