ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.16264
  4. Cited By
One Thousand and One Pairs: A "novel" challenge for long-context
  language models

One Thousand and One Pairs: A "novel" challenge for long-context language models

24 June 2024
Marzena Karpinska
Katherine Thai
Kyle Lo
Tanya Goyal
Mohit Iyyer
    LRM
ArXivPDFHTML

Papers citing "One Thousand and One Pairs: A "novel" challenge for long-context language models"

29 / 29 papers shown
Title
TRAIL: Trace Reasoning and Agentic Issue Localization
TRAIL: Trace Reasoning and Agentic Issue Localization
Darshan Deshpande
Varun Gangal
Hersh Mehta
Jitin Krishnan
Anand Kannappan
Rebecca Qian
7
0
0
13 May 2025
LLMs Get Lost In Multi-Turn Conversation
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
23
0
0
09 May 2025
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs
Piotr Nawrot
Robert Li
Renjie Huang
Sebastian Ruder
Kelly Marchisio
E. Ponti
25
0
0
24 Apr 2025
Estimating Optimal Context Length for Hybrid Retrieval-augmented Multi-document Summarization
Estimating Optimal Context Length for Hybrid Retrieval-augmented Multi-document Summarization
Adithya Pratapa
Teruko Mitamura
RALM
28
0
0
17 Apr 2025
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks
Amey Hengle
Prasoon Bajpai
Soham Dan
Tanmoy Chakraborty
LRM
21
0
0
17 Apr 2025
Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models
Harnessing the Unseen: The Hidden Influence of Intrinsic Knowledge in Long-Context Language Models
Yu Fu
Haz Sameen Shahgir
Hui Liu
Xianfeng Tang
Qi He
Yue Dong
KELM
41
0
0
11 Apr 2025
Reasoning on Multiple Needles In A Haystack
Reasoning on Multiple Needles In A Haystack
Yidong Wang
LRM
28
0
0
05 Apr 2025
Focus Directions Make Your Language Models Pay More Attention to Relevant Contexts
Focus Directions Make Your Language Models Pay More Attention to Relevant Contexts
Youxiang Zhu
Ruochen Li
Danqing Wang
Daniel Haehn
Xiaohui Liang
LRM
53
0
0
30 Mar 2025
WritingBench: A Comprehensive Benchmark for Generative Writing
WritingBench: A Comprehensive Benchmark for Generative Writing
Yuning Wu
Jiahao Mei
M. Yan
Chenliang Li
Shaopeng Lai
...
Zijia Wang
J. Zhang
Mengyue Wu
Qin Jin
Fei Huang
69
1
0
07 Mar 2025
Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription
Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription
Benjamin Gutteridge
Matthew Thomas Jackson
Toni Kukurin
Xiaowen Dong
29
0
0
27 Feb 2025
DocPuzzle: A Process-Aware Benchmark for Evaluating Realistic Long-Context Reasoning Capabilities
DocPuzzle: A Process-Aware Benchmark for Evaluating Realistic Long-Context Reasoning Capabilities
Tianyi Zhuang
Chuqiao Kuang
Xiaoguang Li
Yihua Teng
Jihao Wu
Y. Wang
Lifeng Shang
RALM
ELM
LRM
63
0
0
25 Feb 2025
Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning
Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning
Wenhao Zhu
Pinzhen Chen
Hanxu Hu
Shujian Huang
Fei Yuan
Jiajun Chen
Alexandra Birch
SyDa
51
0
0
24 Feb 2025
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale
Jiaxi Li
Xingxing Zhang
Xun Wang
Xiaolong Huang
Li Dong
Liang Wang
Si-Qing Chen
Wei Lu
Furu Wei
SyDa
60
0
0
23 Feb 2025
CLIPPER: Compression enables long-context synthetic data generation
CLIPPER: Compression enables long-context synthetic data generation
Chau Minh Pham
Yapei Chang
Mohit Iyyer
SyDa
72
1
0
21 Feb 2025
Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches
Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches
Adithya Pratapa
Teruko Mitamura
83
1
0
10 Feb 2025
The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input
Alon Jacovi
Andrew Wang
Chris Alberti
Connie Tao
Jon Lipovetz
...
Rachana Fellinger
Rui Wang
Zizhao Zhang
Sasha Goldshtein
Dipanjan Das
HILM
ALM
77
11
0
06 Jan 2025
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long
  Context Extension for Large Language Models
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models
Haoran Lian
Junmin Chen
Wei Huang
Yizhe Xiong
Wenping Hu
...
Hui Chen
Jianwei Niu
Zijia Lin
Fuzheng Zhang
Di Zhang
76
0
0
10 Dec 2024
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context
  Training
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training
Haonan Wang
Qian Liu
Chao Du
Tongyao Zhu
Cunxiao Du
Kenji Kawaguchi
Tianyu Pang
82
5
0
20 Nov 2024
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts
Kai Han
Samuel Albanie
LLMAG
70
0
0
07 Nov 2024
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Seiji Maekawa
Hayate Iso
Nikita Bhutani
RALM
82
1
0
15 Oct 2024
ALR$^2$: A Retrieve-then-Reason Framework for Long-context Question
  Answering
ALR2^22: A Retrieve-then-Reason Framework for Long-context Question Answering
Huayang Li
Pat Verga
Priyanka Sen
Bowen Yang
Vijay Viswanathan
Patrick Lewis
Taro Watanabe
Yixuan Su
RALM
LRM
35
0
0
04 Oct 2024
L-CiteEval: Do Long-Context Models Truly Leverage Context for
  Responding?
L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?
Zecheng Tang
Keyan Zhou
Juntao Li
Baibei Ji
Jianye Hou
Min Zhang
31
1
0
03 Oct 2024
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
46
24
0
03 Oct 2024
How to Train Long-Context Language Models (Effectively)
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
62
36
0
03 Oct 2024
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
90
79
0
18 Sep 2024
Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long
  Context Evaluation Tasks
Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks
Zi Yang
25
0
0
10 Sep 2024
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended
  Multi-Doc QA
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
Minzheng Wang
Longze Chen
Cheng Fu
Shengyi Liao
Xinghua Zhang
...
Run Luo
Yunshui Li
Min Yang
Fei Huang
Yongbin Li
RALM
22
41
0
25 Jun 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu
Jiahao Xu
Yulin Yuan
Gholamreza Haffari
Longyue Wang
Weihua Luo
Kaifu Zhang
LLMAG
111
22
0
20 May 2024
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4
Kent K. Chang
Mackenzie Cramer
Sandeep Soni
David Bamman
RALM
138
109
0
28 Apr 2023
1