The NarrativeQA Reading Comprehension Challenge

Transactions of the Association for Computational Linguistics (TACL), 2017

19 December 2017

Tomáš Kočiský

Jonathan Richard Schwarz

Papers citing "The NarrativeQA Reading Comprehension Challenge"

Reasoning Models are Test Exploiters: Rethinking Multiple-Choice

214

21 Jul 2025

FlexOlmo: Open Language Models for Flexible Data Use

...

399

09 Jul 2025

RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025

262

06 Jul 2025

Language Models Might Not Understand You: Evaluating Theory of Mind via Story Prompting

Nathaniel Getachew

Abulhair Saparov

LRM

163

23 Jun 2025

Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?

178

20 Jun 2025

EvolvTrip: Enhancing Literary Character Understanding with Temporal Theory-of-Mind Graphs

158

16 Jun 2025

AbsenceBench: Language Models Can't Tell What's Missing

215

13 Jun 2025

Brevity is the soul of sustainability: Characterizing LLM response lengths

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

244

10 Jun 2025

Flow Matching Meets PDEs: A Unified Framework for Physics-Constrained Generation

205

10 Jun 2025

Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

200

09 Jun 2025

Advancing Question Generation with Joint Narrative and Difficulty Control

Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2025

Bernardo Leite

Henrique Lopes Cardoso

142

07 Jun 2025

Evolutionary Perspectives on the Evaluation of LLM-Based AI Agents: A Comprehensive Survey

...

297

06 Jun 2025

Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models

Alex Laitenberger

Christopher D. Manning

Nelson F. Liu

RALM

226

04 Jun 2025

TracLLM: A Generic Framework for Attributing Long Context LLMs

514

04 Jun 2025

Adaptive Two Sided Laplace Transforms: A Learnable, Interpretable, and Scalable Replacement for Self-Attention

Andrew Kiruluta

154

01 Jun 2025

Dynamic Chunking and Selection for Reading Comprehension of Ultra-Long Context in Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

234

01 Jun 2025

Context is Gold to find the Gold Passage: Evaluating and Training Contextual Document Embeddings

289

30 May 2025

What Has Been Lost with Synthetic Evaluation?

Alexander Gill

Abhilasha Ravichander

Ana Marasović

ELM

362

28 May 2025

Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration

367

27 May 2025

ReadBench: Measuring the Dense Text Visual Reading Ability of Vision-Language Models

Benjamin Clavié

Florian Brand

VLM CoGe

233

25 May 2025

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

425

23 May 2025

PaTH Attention: Position Encoding via Accumulating Householder Transformations

887

22 May 2025

NovelHopQA: Diagnosing Multi-Hop Reasoning Failures in Long Narrative Contexts

313

20 May 2025

Tianyi: A Traditional Chinese Medicine all-rounder language model and its Real-World Clinical Practice

Information Fusion (Inf. Fusion), 2025

...

107

19 May 2025

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

...

586

16 May 2025

Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM

284

09 May 2025

HalluMix: A Task-Agnostic, Multi-Domain Benchmark for Real-World Hallucination Detection

325

01 May 2025

Rethinking Memory in LLM based Agents: Representations, Operations, and Emerging Topics

671

01 May 2025

EnronQA: Towards Personalized RAG over Private Documents

353

01 May 2025

LiveLongBench: Tackling Long-Context Understanding for Spoken Texts from Live Streams

373

24 Apr 2025

Long Context In-Context Compression by Getting to the Gist of Gisting

315

11 Apr 2025

Resona: Improving Context Copying in Linear Recurrence Models with Retrieval

Prasanna Parthasarathi

441

28 Mar 2025

Survey on Evaluation of LLM-based Agents

Michal Shmueli-Scheuer

LLMAG ELM

509

20 Mar 2025

DocVideoQA: Towards Comprehensive Understanding of Document-Centric Videos through Question Answering

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

Han Wang

Kai Hu

Liangcai Gao

635

20 Mar 2025

Tuning LLMs by RAG Principles: Towards LLM-native Memory

241

20 Mar 2025

GPU-Accelerated Motion Planning of an Underactuated Forestry Crane in Cluttered Environments

280

18 Mar 2025

A Survey on Transformer Context Extension: Approaches and Evaluation

520

17 Mar 2025

OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs

378

14 Mar 2025

CURIE: Evaluating LLMs On Multitask Scientific Long Context Understanding and Reasoning

International Conference on Learning Representations (ICLR), 2025

...

Subhashini Venugopalan

ELM LRM

478

14 Mar 2025

A Survey on Knowledge-Oriented Retrieval-Augmented Generation

...

374

11 Mar 2025

Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents

365

11 Mar 2025

Training Plug-n-Play Knowledge Modules with Deep Context Distillation

1.1K

11 Mar 2025

DeFine: A Decomposed and Fine-Grained Annotated Dataset for Long-form Article Generation

...

315

10 Mar 2025

MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark

249

10 Mar 2025

LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

359

04 Mar 2025

EAIRA: Establishing a Methodology for Evaluating AI Models as Scientific Research Assistants

...

352

27 Feb 2025

Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

470

26 Feb 2025

Towards Threshold-Free KV Cache Pruning

352

24 Feb 2025

Self-Taught Agentic Long Context Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

340

21 Feb 2025

Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation

951

21 Feb 2025