2407.16695
Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack
23 July 2024
Xiaoyue Xu
Qinyuan Ye
Xiang Ren
Papers citing "Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack"
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale
Bowen Jiang
Zhuoqun Hao
Y. Cho
B. Li
Yuan Yuan
Sihao Chen
Lyle Ungar
Camillo J. Taylor
Dan Roth
32
0
0
19 Apr 2025
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
50
24
0
03 Oct 2024
Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks
Zi Yang
28
0
0
10 Sep 2024
WildChat: 1M ChatGPT Interaction Logs in the Wild
Wenting Zhao
Xiang Ren
Jack Hessel
Claire Cardie
Yejin Choi
Yuntian Deng
40
171
0
02 May 2024
Make Your LLM Fully Utilize the Context
Shengnan An
Zexiong Ma
Zeqi Lin
Nanning Zheng
Jian-Guang Lou
SyDa
44
52
0
25 Apr 2024
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
134
116
0
24 May 2022
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
212
140
0
18 Apr 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018