Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.14399
Cited By
We're Afraid Language Models Aren't Modeling Ambiguity
27 April 2023
Alisa Liu
Zhaofeng Wu
Julian Michael
Alane Suhr
Peter West
Alexander Koller
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
Re-assign community
ArXiv
PDF
HTML
Papers citing
"We're Afraid Language Models Aren't Modeling Ambiguity"
12 / 12 papers shown
Title
FalseReject: A Resource for Improving Contextual Safety and Mitigating Over-Refusals in LLMs via Structured Reasoning
Zhehao Zhang
Weijie Xu
Fanyou Wu
Chandan K. Reddy
14
0
0
12 May 2025
LLMs Get Lost In Multi-Turn Conversation
Philippe Laban
Hiroaki Hayashi
Yingbo Zhou
Jennifer Neville
25
0
0
09 May 2025
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling
Hang Zheng
Hongshen Xu
Yuncong Liu
Lu Chen
Pascale Fung
Kai Yu
76
2
0
04 Mar 2025
A linguistically-motivated evaluation methodology for unraveling model's abilities in reading comprehension tasks
Elie Antoine
Frédéric Béchet
Géraldine Damnati
Philippe Langlais
49
1
0
29 Jan 2025
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael J.Q. Zhang
W. Bradley Knox
Eunsol Choi
41
3
0
17 Oct 2024
Evaluating the Factuality of Large Language Models using Large-Scale Knowledge Graphs
Xiaoze Liu
Feijie Wu
Tianyang Xu
Zhuo Chen
Yichi Zhang
Xiaoqian Wang
Jing Gao
HILM
33
8
0
01 Apr 2024
How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities
Lingbo Mo
Boshi Wang
Muhao Chen
Huan Sun
9
25
0
15 Nov 2023
DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning
Hengli Li
Songchun Zhu
Zilong Zheng
11
8
0
15 Jun 2023
SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables
Xinyuan Lu
Liangming Pan
Qian Liu
Preslav Nakov
Min-Yen Kan
LMTD
23
24
0
22 May 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
204
607
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
1