Towards Ecologically Valid Research on Language User Interfaces

28 July 2020

H. D. Vries

Dzmitry Bahdanau

Christopher D. Manning

Papers citing "Towards Ecologically Valid Research on Language User Interfaces"

The Collaboration Gap

155

04 Nov 2025

Towards Understanding Visual Grounding in Visual Language Models

Georgios Pantazopoulos

Eda B. Özyiğit

ObjD

505

12 Sep 2025

MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents

348

15 Aug 2025

From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered

237

09 Jun 2025

Societal Impacts Research Requires Benchmarks for Creative Composition Tasks

Judy Hanwen Shen

Carlos Guestrin

729

09 Apr 2025

Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

397

24 Mar 2025

Toward an Evaluation Science for Generative AI Systems

453

07 Mar 2025

Do Text-to-Vis Benchmarks Test Real Use of Visualisations?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Jonathan K. Kummerfeld

263

29 Jul 2024

Benchmarks as Microscopes: A Call for Model Metrology

Michael Stephen Saxon

Ari Holtzman

Peter West

William Y. Wang

Naomi Saphra

362

22 Jul 2024

Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

241

08 Mar 2024

Selective "Selective Prediction": Reducing Unnecessary Abstention in Vision-Language Reasoning

Yejin Choi

457

23 Feb 2024

KTO: Model Alignment as Prospect Theoretic Optimization

Kawin Ethayarajh

Winnie Xu

Niklas Muennighoff

Dan Jurafsky

Douwe Kiela

1.2K

933

02 Feb 2024

Do Androids Know They're Only Dreaming of Electric Sheep?

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

314

28 Dec 2023

FinanceBench: A New Benchmark for Financial Question Answering

Douwe Kiela

398

181

20 Nov 2023

Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Georgios Pantazopoulos

Malvina Nikandrou

259

07 Nov 2023

On Degrees of Freedom in Defining and Testing Natural Language Understanding

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Saku Sugawara

S. Tsugita

ELM

379

24 May 2023

Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

346

14 May 2023

The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

Xing Han Lù

Siva Reddy

H. D. Vries

LMTD

284

03 Apr 2023

JamPatoisNLI: A Jamaican Patois Natural Language Inference Dataset

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ruth-Ann Armstrong

John Hewitt

Christopher D. Manning

334

07 Dec 2022

Can In-context Learners Learn a Reasoning Concept from Demonstrations?

Michal Štefánik

Marek Kadlčík

LRM

409

03 Dec 2022

Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Daniel Fried

290

15 Nov 2022

Going for GOAL: A Resource for Grounded Football Commentaries

Malvina Nikandrou

185

08 Nov 2022

Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks

Findings (Findings), 2022

274

10 Oct 2022

Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

419

10 Oct 2022

Evaluation Gaps in Machine Learning Practice

Conference on Fairness, Accountability and Transparency (FAccT), 2022

Vinodkumar Prabhakaran

ELM

409

11 May 2022

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Ruiqi Zhong

...

Lingpeng Kong

Luke Zettlemoyer

436

352

16 Jan 2022

Deep Transfer Learning & Beyond: Transformer Language Models in Information Systems Research

Ross Gruetzemacher

D. Paradice

360

18 Oct 2021

KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

329

141

22 Jun 2021

Targeted Data Acquisition for Evolving Negotiation Agents

International Conference on Machine Learning (ICML), 2021

Minae Kwon

Siddharth Karamcheti

Mariano-Florentino Cuéllar

Dorsa Sadigh

365

14 Jun 2021

Maintaining Common Ground in Dynamic Environments

Transactions of the Association for Computational Linguistics (TACL), 2021

Takuma Udagawa

Akiko Aizawa

191

29 May 2021

Conversational AI Systems for Social Good: Opportunities and Challenges

288

13 May 2021

Dynabench: Rethinking Benchmarking in NLP

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Douwe Kiela

...

Robin Jia

444

501

07 Apr 2021

DynaSent: A Dynamic Benchmark for Sentiment Analysis

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

Christopher Potts

Zhengxuan Wu

Atticus Geiger

Douwe Kiela

573

30 Dec 2020

Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL

232

23 Oct 2020

STAR: A Schema-Guided Dialog Dataset for Transfer Learning

Johannes E. M. Mosig

Shikib Mehri

Thomas Kober

331

22 Oct 2020

Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI

768

591

15 Oct 2020

Deploying Lifelong Open-Domain Dialogue Learning

Jason Weston

264

18 Aug 2020

Experience Grounds Language

...

668

422

21 Apr 2020

The Transformative Potential of Artificial Intelligence

Ross Gruetzemacher

Jess Whittlestone

315

171

27 Nov 2019