ResearchTrend.AI
arXiv: 2410.03727
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

30 September 2024
Yifei Ming
Senthil Purushwalkam
Shrey Pandit
Zixuan Ke
Xuan-Phi Nguyen
Caiming Xiong
Shafiq R. Joty
    HILM

Papers citing "FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows""

13 / 13 papers shown
Can LLMs Be Trusted for Evaluating RAG Systems? A Survey of Methods and Datasets
Lorenz Brehme
Thomas Ströhle
Ruth Breu
40
0
0
28 Apr 2025
HalluLens: LLM Hallucination Benchmark
Yejin Bang
Ziwei Ji
Alan Schelten
Anthony Hartshorn
Tara Fowler
Cheng Zhang
Nicola Cancedda
Pascale Fung
HILM
60
1
0
24 Apr 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
49
0
0
21 Apr 2025
Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation
Jiajun Shen
Tong Zhou
Yubo Chen
Delai Qiu
Shengping Liu
Kang-Jun Liu
Jun Zhao
HILM
RALM
61
0
0
21 Apr 2025
Retrieval-Augmented Generation with Conflicting Evidence
Han Wang
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
RALM
35
1
0
17 Apr 2025
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings
Austin Xu
Srijan Bansal
Yifei Ming
Semih Yavuz
Shafiq R. Joty
ELM
65
2
0
19 Mar 2025
Command R7B Arabic: A Small, Enterprise Focused, Multilingual, and Culturally Aware Arabic LLM
Yazeed Alnumay
Alexandre Barbet
Anna Bialas
William Darling
Shaan Desai
...
Stephanie Howe
Olivia Lasche
Justin Lee
Anirudh Shrinivason
Jennifer Tracey
64
0
0
18 Mar 2025
Robusto-1 Dataset: Comparing Humans and VLMs on real out-of-distribution Autonomous Driving VQA from Peru
Dunant Cusipuma
David Ortega
Victor Flores-Benites
Arturo Deza
OOD
74
0
0
10 Mar 2025
MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark
Shengkun Ma
Hao Peng
Lei Hou
Juanzi Li
ELM
74
0
0
10 Mar 2025
Multi-Attribute Steering of Language Models via Targeted Intervention
Duy Nguyen
Archiki Prasad
Elias Stengel-Eskin
Mohit Bansal
LLMSV
76
0
0
18 Feb 2025
Context-DPO: Aligning Language Models for Context-Faithfulness
Baolong Bi
Shaohan Huang
Y. Wang
Tianchi Yang
Zihan Zhang
...
Furu Wei
Weiwei Deng
Feng Sun
Qi Zhang
Shenghua Liu
76
2
0
18 Dec 2024
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction
Zhenmei Shi
Yifei Ming
Xuan-Phi Nguyen
Yingyu Liang
Shafiq Joty
49
2
0
25 Sep 2024
SFR-RAG: Towards Contextually Faithful LLMs
Xuan-Phi Nguyen
Shrey Pandit
Senthil Purushwalkam
Austin Xu
Hailin Chen
Yifei Ming
Zixuan Ke
Silvio Savarese
Caiming Xiong
Shafiq Joty
RALM
54
1
0
16 Sep 2024