2407.01370
Cited By
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
1 July 2024
Philippe Laban
Alexander R. Fabbri
Caiming Xiong
Chien-Sheng Wu
RALM
HuggingFace (90 upvotes)
Papers citing "Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems"
50 / 59 papers shown
NAMeGEn: Creative Name Generation via A Novel Agent-based Multiple Personalized Goal Enhancement Framework
Shanlin Zhou
Xinpeng Wang
Jianxun Lian
Zhenghao Liu
L. Lakshmanan
Xiaoyuan Yi
Yongtao Hao
LLMAG
334
0
0
19 Nov 2025
Stress Testing Factual Consistency Metrics for Long-Document Summarization
Zain Muhammad Mujahid
Dustin Wright
Isabelle Augenstein
HILM
173
0
0
10 Nov 2025
From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering
Lei Li
Xiao Zhou
Y. Zhang
X. Wu
RALM
MedIm
155
0
0
21 Oct 2025
Glyph: Scaling Context Windows via Visual-Text Compression
Jiale Cheng
Y. Liu
X. Zhang
Yulin Fei
Wenyi Hong
...
Xiao-Yang Liu
Yushi Bai
Jie Tang
Hongning Wang
Shiyu Huang
VLM
110
5
0
20 Oct 2025
PRISM: Agentic Retrieval with LLMs for Multi-Hop Question Answering
Md Mahadi Hasan Nahid
Davood Rafiei
RALM
143
0
0
16 Oct 2025
Rethinking Schema Linking: A Context-Aware Bidirectional Retrieval Approach for Text-to-SQL
Md Mahadi Hasan Nahid
Davood Rafiei
Weiwei Zhang
Yong Zhang
LRM
113
1
0
16 Oct 2025
Document Intelligence in the Era of Large Language Models: A Survey
Weishi Wang
Hengchang Hu
Zhijie Zhang
Zhaochen Li
Hongxin Shao
Daniel Dahlmeier
AI4TS
168
0
0
15 Oct 2025
Attribution Gradients: Incrementally Unfolding Citations for Critical Examination of Attributed AI Answers
Hita Kambhamettu
Alyssa Hwang
Philippe Laban
Andrew Head
136
0
0
01 Oct 2025
ClaimIQ at CheckThat! 2025: Comparing Prompted and Fine-Tuned Language Models for Verifying Numerical Claims
Anirban Saha Anik
Md Fahimul Kabir Chowdhury
Andrew Wyckoff
Sagnik Ray Choudhury
108
1
0
15 Sep 2025
Topic-Guided Reinforcement Learning with LLMs for Enhancing Multi-Document Summarization
Chuyuan Li
Austin Xu
Shafiq Joty
Giuseppe Carenini
BDL
148
0
0
11 Sep 2025
EviNote-RAG: Enhancing RAG Models via Answer-Supportive Evidence Notes
Yuqin Dai
Guoqing Wang
Yuan Wang
Kairan Dou
Kaichen Zhou
...
Can Yi
Changhua Meng
Yuchen Zhou
Yongliang Shen
Shuai Lu
RALM
226
3
0
31 Aug 2025
Memory Limitations of Prompt Tuning in Transformers
Maxime Meyer
Mario Michelessa
C. Chaux
Vincent Y. F. Tan
VLM
124
0
0
30 Aug 2025
OpinioRAG: Towards Generating User-Centric Opinion Highlights from Large-scale Online Reviews
Mir Tafseer Nayeem
Davood Rafiei
140
0
0
30 Aug 2025
The Rarity Blind Spot: A Framework for Evaluating Statistical Reasoning in LLMs
Seiji Maekawa
Hayate Iso
Nikita Bhutani
113
0
0
29 Aug 2025
LLM Chatbot-Creation Approaches
Hemil Mehta
Tanvi Raut
Kohav Yadav
Edward F. Gehringer
112
0
0
28 Aug 2025
Towards a Holistic and Automated Evaluation Framework for Multi-Level Comprehension of LLMs in Book-Length Contexts
Jiaqi Deng
Yuho Lee
Nicole Hee-Yeon Kim
Hyangsuk Min
Taewon Yun
Minjeong Ban
Kim Yul
Hwanjun Song
70
1
0
27 Aug 2025
Attribution, Citation, and Quotation: A Survey of Evidence-based Text Generation with Large Language Models
Tobias Schreieder
Tim Schopf
Michael Färber
HILM
120
1
0
21 Aug 2025
BEE-RAG: Balanced Entropy Engineering for Retrieval-Augmented Generation
Yuhao Wang
Ruiyang Ren
Yucheng Wang
Jing Liu
Wayne Xin Zhao
Hua Wu
Haifeng Wang
154
0
0
07 Aug 2025
NeedleChain: Measuring Intact Context Comprehension Capability of Large Language Models
Hyeonseok Moon
Heuiseok Lim
LLMAG
RALM
LRM
189
0
0
30 Jul 2025
Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
J. Wu
Gefei Gu
Yanan Zheng
Dit-Yan Yeung
Arman Cohan
LLMAG
ELM
194
3
0
13 Jul 2025
GenerationPrograms: Fine-grained Attribution with Executable Programs
David Wan
Eran Hirsch
Elias Stengel-Eskin
Ido Dagan
Mohit Bansal
247
0
0
17 Jun 2025
Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
Wuwei Zhang
Fangcong Yin
Howard Yen
Danqi Chen
Xi Ye
LRM
273
4
0
11 Jun 2025
Team Anotheroption at SemEval-2025 Task 8: Bridging the Gap Between Open-Source and Proprietary LLMs in Table QA
Nikolas Evkarpidi
Elena Tutubalina
LMTD
290
1
0
11 Jun 2025
GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ionut Teodor Sorodoc
Leonardo F. R. Ribeiro
Rexhina Blloshmi
Christopher Davis
Adria de Gispert
129
3
0
09 Jun 2025
Diagnosing and Resolving Cloud Platform Instability with Multi-modal RAG LLMs
Yifan Wang
Kenneth P. Birman
322
1
0
27 May 2025
MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Zhongzhan Huang
Guoming Ling
Shanshan Zhong
Hefeng Wu
Liang Lin
272
0
0
26 May 2025
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
344
100
0
09 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Qi Zhang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
520
22
0
26 Apr 2025
Estimating Optimal Context Length for Hybrid Retrieval-augmented Multi-document Summarization
Adithya Pratapa
Teruko Mitamura
RALM
208
0
0
17 Apr 2025
ML For Hardware Design Interpretability: Challenges and Opportunities
Raymond Baartmans
Andrew Ensinger
Victor Agostinelli
Lizhong Chen
171
1
0
11 Apr 2025
Reasoning Beyond Limits: Advances and Open Problems for LLMs
ICT Express, 2025
M. Ferrag
Norbert Tihanyi
Merouane Debbah
OffRL
LRM
AI4CE
ELM
788
17
0
26 Mar 2025
Extract, Match, and Score: An Evaluation Paradigm for Long Question-context-answer Triplets in Financial Analysis
Bo Hu
Han Yuan
Vlad Pandelea
Wuqiong Luo
Yingzhu Zhao
Zheng Ma
221
2
0
20 Mar 2025
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Austin Xu
Srijan Bansal
Yifei Ming
Semih Yavuz
Shafiq Joty
ELM
364
13
0
19 Mar 2025
RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration
Hong Qing Yu
Frank McQuade
260
7
0
14 Mar 2025
Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation
Junhao Zhang
Richong Zhang
Fanshuang Kong
Ziyang Miao
Yanhan Ye
Yaowei Zheng
SyDa
117
2
0
10 Mar 2025
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
Zhibin Lan
Liqiang Niu
Fandong Meng
Jie Zhou
Jinsong Su
VLM
279
1
0
04 Mar 2025
U-NIAH: Unified RAG and LLM Evaluation for Long Context Needle-In-A-Haystack
Yunfan Gao
Yun Xiong
Wenlong Wu
Zijing Huang
Bohan Li
Haoyu Wang
269
10
0
01 Mar 2025
Do Retrieval-Augmented Language Models Adapt to Varying User Needs?
Peilin Wu
Xinlu Zhang
Wenhao Yu
Xingyu Liu
Xinya Du
Zhiyu Zoey Chen
RALM
393
1
0
27 Feb 2025
Evaluating the Effect of Retrieval Augmentation on Social Biases
Tianhui Zhang
Yi Zhou
Danushka Bollegala
285
1
0
24 Feb 2025
Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Adithya Pratapa
Teruko Mitamura
275
1
0
10 Feb 2025
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
Gopi Krishnan Rajbahadur
G. Oliva
Dayi Lin
Ahmed E. Hassan
296
3
0
28 Jan 2025
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
International Conference on Learning Representations (ICLR), 2024
Jonathan Roberts
Kai Han
Samuel Albanie
LLMAG
1.0K
7
0
07 Nov 2024
Long Context RAG Performance of Large Language Models
Quinn Leng
Jacob P. Portes
Sam Havens
Matei A. Zaharia
Michael Carbin
AIFin
RALM
3DV
252
24
0
05 Nov 2024
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Kung-Hsiang Huang
Akshara Prabhakar
Sidharth Dhawan
Yixin Mao
Huan Wang
Silvio Savarese
Caiming Xiong
Philippe Laban
Chien-Sheng Wu
446
28
0
04 Nov 2024
On Positional Bias of Faithfulness for Long-form Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
David Wan
Jesse Vig
Joey Tianyi Zhou
Shafiq Joty
HILM
244
16
0
31 Oct 2024
Understanding Synthetic Context Extension via Retrieval Heads
Xinyu Zhao
Fangcong Yin
Greg Durrett
563
4
0
29 Oct 2024
Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Kaige Xie
Philippe Laban
Prafulla Kumar Choubey
Caiming Xiong
Chien-Sheng Wu
153
3
0
20 Oct 2024
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Catarina G. Belem
Pouya Pezeshkpour
Hayate Iso
Seiji Maekawa
Nikita Bhutani
Estevam R. Hruschka
HILM
333
11
0
17 Oct 2024
Enhancing LLM Trading Performance with Fact-Subjectivity Aware Reasoning
Qian Wang
Yuchen Gao
Zhenheng Tang
B. Luo
Bingsheng He
LRM
207
0
0
16 Oct 2024
Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses
Pranav Narayanan Venkit
Philippe Laban
Yilun Zhou
Yixin Mao
Chien-Sheng Wu
ELM
214
15
0
15 Oct 2024