ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding

23 May 2023
Uri Shaham
Maor Ivgi
Avia Efrat
Jonathan Berant
Omer Levy
    VLM

Papers citing "ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding"

50 / 109 papers shown
Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks
Amey Hengle
Prasoon Bajpai
Soham Dan
Tanmoy Chakraborty
LRM
26
0
0
17 Apr 2025
Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts
Yifei Yu
Qian Zhang
Lingfeng Qiao
Di Yin
Fang Li
Jie Wang
Z. Chen
Suncong Zheng
Xiaolong Liang
X. Sun
34
0
0
07 Apr 2025
Reasoning on Multiple Needles In A Haystack
Yidong Wang
LRM
31
0
0
05 Apr 2025
M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?
Haolong Yan
Kaijun Tan
Yeqing Shen
Xin Huang
Zheng Ge
Xiangyu Zhang
Si Li
Daxin Jiang
VLM
37
0
0
27 Mar 2025
Extract, Match, and Score: An Evaluation Paradigm for Long Question-context-answer Triplets in Financial Analysis
Bo Hu
Han Yuan
Vlad Pandelea
Wuqiong Luo
Yingzhu Zhao
Zheng Ma
53
0
0
20 Mar 2025
A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu
Jinzheng Yu
Yang Xu
Zhongyang Li
Qingfu Zhu
LLMAG
66
0
0
17 Mar 2025
CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning
Hao Cui
Zahra Shamsi
Gowoon Cheon
Xuejian Ma
Shutong Li
...
Eun-Ah Kim
M. Brenner
Viren Jain
Sameera Ponda
Subhashini Venugopalan
ELM
LRM
52
0
0
14 Mar 2025
EFPC: Towards Efficient and Flexible Prompt Compression
Yun-Hao Cao
Yangsong Wang
Shuzheng Hao
Zhenxing Li
Chengjun Zhan
Sichao Liu
Yi-Qi Hu
53
0
0
11 Mar 2025
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
Emily Xiao
Chin-Jou Li
Yilin Zhang
Graham Neubig
Amanda Bertsch
BDL
70
0
0
11 Mar 2025
Layer-Specific Scaling of Positional Encodings for Superior Long-Context Modeling
Zhenghua Wang
Yiran Ding
Changze Lv
Zhibo Xu
Tianlong Li
Tianyuan Shi
Xiaoqing Zheng
Xuanjing Huang
35
0
0
06 Mar 2025
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai
Jianqiao Lu
Yao Luo
Yiyuan Ma
Xun Zhou
63
5
0
28 Feb 2025
TRIX: A More Expressive Model for Zero-shot Domain Transfer in Knowledge Graphs
Yucheng Zhang
Beatrice Bevilacqua
Mikhail Galkin
Bruno Ribeiro
61
1
0
26 Feb 2025
Generalizing From Short to Long: Effective Data Synthesis for Long-Context Instruction Tuning
Wenhao Zhu
Pinzhen Chen
Hanxu Hu
Shujian Huang
Fei Yuan
Jiajun Chen
Alexandra Birch
SyDa
54
0
0
24 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
95
3
0
04 Feb 2025
Context-Aware Hierarchical Merging for Long Document Summarization
Litu Ou
Mirella Lapata
MoMe
113
1
0
03 Feb 2025
LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion
Zhan Ling
Kang Liu
Kai Yan
Y. Yang
Weijian Lin
Ting-Han Fan
Lingfeng Shen
Zhengyin Du
Jiecao Chen
ReLM
ELM
LRM
42
2
0
25 Jan 2025
Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
WeiZhi Fei
Xueyan Niu
Guoqing Xie
Yingqing Liu
Bo Bai
Wei Han
28
1
0
22 Jan 2025
Systematic Evaluation of Long-Context LLMs on Financial Concepts
Lavanya Gupta
Saket Sharma
Yiyun Zhao
66
2
0
19 Dec 2024
Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown
Lifu Tu
Rui Meng
Shafiq R. Joty
Yingbo Zhou
Semih Yavuz
HILM
67
0
0
24 Nov 2024
LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios
Xiaodong Wu
Minhao Wang
Yichen Liu
Xiaoming Shi
He Yan
Xiangju Lu
Junmin Zhu
Wei Zhang
85
3
0
11 Nov 2024
LongSafety: Enhance Safety for Long-Context LLMs
Mianqiu Huang
Xiaoran Liu
Shaojun Zhou
Mozhi Zhang
Chenkun Tan
...
Zhikai Lei
Linlin Li
Q. Liu
Yaqian Zhou
Xipeng Qiu
ELM
ALM
40
2
0
11 Nov 2024
Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?
Jonathan Roberts
Kai Han
Samuel Albanie
LLMAG
85
0
0
07 Nov 2024
What is Wrong with Perplexity for Long-context Language Modeling?
Lizhe Fang
Yifei Wang
Zhaoyang Liu
Chenheng Zhang
Stefanie Jegelka
Jinyang Gao
Bolin Ding
Yisen Wang
58
4
0
31 Oct 2024
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu
Bowen Shi
Avi Caciularu
Idan Szpektor
Arman Cohan
58
3
0
30 Oct 2024
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Taewhoo Lee
Chanwoong Yoon
Kyochul Jang
Donghyeon Lee
Minju Song
Hyunjae Kim
Jaewoo Kang
ELM
30
1
0
22 Oct 2024
MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks
M. Bueno
R. Lotufo
Rodrigo Nogueira
LRM
26
0
0
08 Oct 2024
Forgetting Curve: A Reliable Method for Evaluating Memorization Capability for Long-context Models
Xinyu Liu
Runsong Zhao
Pengcheng Huang
Chunyang Xiao
Bei Li
Jingang Wang
Tong Xiao
Jingbo Zhu
21
0
0
07 Oct 2024
MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs
Lei Wang
Shan Dong
Yuhui Xu
Hanze Dong
Yalu Wang
Amrita Saha
Ee-Peng Lim
Caiming Xiong
Doyen Sahoo
LRM
40
1
0
07 Oct 2024
LongGenBench: Long-context Generation Benchmark
Xiang Liu
Peijie Dong
Xuming Hu
Xiaowen Chu
RALM
43
8
0
05 Oct 2024
L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?
Zecheng Tang
Keyan Zhou
Juntao Li
Baibei Ji
Jianye Hou
Min Zhang
39
2
0
03 Oct 2024
How to Train Long-Context Language Models (Effectively)
Tianyu Gao
Alexander Wettig
Howard Yen
Danqi Chen
RALM
69
37
0
03 Oct 2024
HELMET: How to Evaluate Long-Context Language Models Effectively and Thoroughly
Howard Yen
Tianyu Gao
Minmin Hou
Ke Ding
Daniel Fleischer
Peter Izsak
Moshe Wasserblat
Danqi Chen
ALM
ELM
56
25
0
03 Oct 2024
On The Adaptation of Unlimiformer for Decoder-Only Transformers
Kian Ahrabian
Alon Benhaim
Barun Patra
Jay Pujara
Saksham Singhal
Xia Song
32
0
0
02 Oct 2024
GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation
B. Rappazzo
Yingheng Wang
Aaron Ferber
Carla P. Gomes
VLM
18
0
0
23 Sep 2024
E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
Zihan Liao
Jun Wang
Hang Yu
Lingxiao Wei
Jianguo Li
Jun Wang
Wei Zhang
19
2
0
10 Sep 2024
Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks
Zi Yang
28
0
0
10 Sep 2024
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
Yuhao Wu
Ming Shan Hee
Zhiqing Hu
Roy Ka-Wei Lee
RALM
30
0
0
03 Sep 2024
Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference
Barys Liskavets
Maxim Ushakov
Shuvendu Roy
Mark Klibanov
Ali Etemad
Shane Luke
25
6
0
02 Sep 2024
LanguaShrink: Reducing Token Overhead with Psycholinguistics
Xuechen Liang
Meiling Tao
Yinghui Xia
Tianyu Shi
Jun Wang
JingSong Yang
18
1
0
01 Sep 2024
Writing in the Margins: Better Inference Pattern for Long Context Retrieval
M. Russak
Umar Jamil
Christopher Bryant
Kiran Kamble
Axel Magnuson
Mateusz Russak
Waseem Alshikh
19
2
0
27 Aug 2024
Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models
Amey Hengle
Prasoon Bajpai
Soham Dan
Tanmoy Chakraborty
LRM
21
2
0
19 Aug 2024
Long Input Benchmark for Russian Analysis
I. Churin
Murat Apishev
Maria Tikhonova
Denis Shevelev
Aydar Bulatov
Yuri Kuratov
Sergej Averkiev
Alena Fenogenova
38
0
0
05 Aug 2024
Keep the Cost Down: A Review on Methods to Optimize LLM's KV-Cache Consumption
Shi Luohe
Hongyi Zhang
Yao Yao
Z. Li
Zhao Hai
31
31
0
25 Jul 2024
Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack
Xiaoyue Xu
Qinyuan Ye
Xiang Ren
38
6
0
23 Jul 2024
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks
Zheng Wang
Boxiao Jin
Zhongzhi Yu
Minjia Zhang
MoMe
37
23
0
11 Jul 2024
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Philippe Laban
Alexander R. Fabbri
Caiming Xiong
Chien-Sheng Wu
RALM
33
41
0
01 Jul 2024
Is It Really Long Context if All You Need Is Retrieval? Towards Genuinely Difficult Long Context NLP
Omer Goldman
Alon Jacovi
Aviv Slobodkin
Aviya Maimon
Ido Dagan
Reut Tsarfaty
58
10
0
29 Jun 2024
LongIns: A Challenging Long-context Instruction-based Exam for LLMs
Shawn Gavin
Tuney Zheng
Jiaheng Liu
Quehry Que
Noah Wang
Jian Yang
Chenchen Zhang
Wenhao Huang
Wenhu Chen
Ge Zhang
RALM
LRM
29
3
0
25 Jun 2024
One Thousand and One Pairs: A "novel" challenge for long-context language models
Marzena Karpinska
Katherine Thai
Kyle Lo
Tanya Goyal
Mohit Iyyer
LRM
36
40
0
24 Jun 2024
SEAM: A Stochastic Benchmark for Multi-Document Tasks
Gili Lior
Avi Caciularu
Arie Cattan
Shahar Levy
Ori Shapira
Gabriel Stanovsky
RALM
33
4
0
23 Jun 2024