1808.05326
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
16 August 2018
Rowan Zellers
Yonatan Bisk
Roy Schwartz
Yejin Choi
Papers citing
"SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference"
50 / 475 papers shown
Title
A New Benchmark Dataset and Mixture-of-Experts Language Models for Adversarial Natural Language Inference in Vietnamese
Tin Van Huynh
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
260
2
0
25 Jun 2024
UBench: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions
Xunzhi Wang
Zhuowei Zhang
Qiongyu Li
Gaonan Chen
Mengting Hu
Zhixin Han
Bitong Luo
Zhiyu Li
Hang Gao
ELM
324
3
0
18 Jun 2024
Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox
Yijun Liu
Yuan Meng
Fang Wu
Shenhao Peng
Hang Yao
Chaoyu Guan
Chen Tang
Cheng Wang
Zhi Wang
Wenwu Zhu
MQ
255
9
0
15 Jun 2024
BlockPruner: Fine-grained Pruning for Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Longguang Zhong
Fanqi Wan
Ruijun Chen
Xiaojun Quan
Liangzhi Li
257
15
0
15 Jun 2024
mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Yusuke Sakai
Hidetaka Kamigaito
Taro Watanabe
LRM
188
6
0
06 Jun 2024
Every Answer Matters: Evaluating Commonsense with Probabilistic Measures
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Qi Cheng
Michael Boratko
Pranay Kumar Yelugam
T. O’Gorman
Nalini Singh
Andrew McCallum
X. Li
ELM
LRM
214
6
0
06 Jun 2024
Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions
ACM Multimedia (MM), 2024
Junzhang Liu
Zhecan Wang
Hammad A. Ayyubi
Haoxuan You
Chris Thomas
Rui Sun
Shih-Fu Chang
Kai-Wei Chang
488
0
0
18 May 2024
AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning
International Workshop on Semantic Evaluation (SemEval), 2024
Mina Ghashami
Soumya Smruti Mishra
LRM
212
1
0
16 May 2024
SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge
Computer Vision and Pattern Recognition (CVPR), 2024
Andong Wang
Bo Wu
Sunli Chen
Zhenfang Chen
Haotian Guan
Wei-Ning Lee
Li Erran Li
Chuang Gan
LRM
RALM
222
29
0
15 May 2024
Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models
Emre Onal
Klemens Flöge
Emma Caldwell
A. Sheverdin
Vincent Fortuin
UQCV
BDL
252
13
0
06 May 2024
Semi-supervised Text-based Person Search
Daming Gao
Yang Bai
Min Cao
Hao Dou
Mang Ye
Min Zhang
174
2
0
28 Apr 2024
How often are errors in natural language reasoning due to paraphrastic variability?
Neha Srikanth
Marine Carpuat
Rachel Rudinger
LRM
191
4
0
17 Apr 2024
Improving Language Model Reasoning with Self-motivated Learning
International Conference on Language Resources and Evaluation (LREC), 2024
Yunlong Feng
Yang Xu
Libo Qin
Yasheng Wang
Wanxiang Che
LRM
ReLM
184
8
0
10 Apr 2024
uTeBC-NLP at SemEval-2024 Task 9: Can LLMs be Lateral Thinkers?
International Workshop on Semantic Evaluation (SemEval), 2024
Pouya Sadeghi
Amirhossein Abaskohi
Yadollah Yaghoobzadeh
LRM
ReLM
173
2
0
03 Apr 2024
Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset
Janis Goldzycher
Paul Röttger
Gerold Schneider
AAML
168
15
0
28 Mar 2024
Bridging the Sim-to-Real Gap with Bayesian Inference
Jonas Rothfuss
Bhavya Sukhija
Lenart Treven
Florian Dorfler
Stelian Coros
Andreas Krause
AI4CE
278
10
0
25 Mar 2024
PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset
Arda Uzunoğlu
Abdalfatah Rashid Safa
Gözde Gül Sahin
LRM
160
3
0
05 Mar 2024
Unsupervised multiple choices question answering via universal corpus
Qin Zhang
Hao Ge
Xiaojun Chen
Menglu Fang
OffRL
198
2
0
27 Feb 2024
Cleaner Pretraining Corpus Curation with Neural Web Scraping
Zhipeng Xu
Zhenghao Liu
Shi Yu
Zhiyuan Liu
Ge Yu
Chenyan Xiong
CLIP
OnRL
226
9
0
22 Feb 2024
Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
Ning Bian
Xianpei Han
Hongyu Lin
Yaojie Lu
Le Sun
217
2
0
22 Feb 2024
EvoGrad: A Dynamic Take on the Winograd Schema Challenge with Human Adversaries
Jing Han Sun
Ali Emami
224
6
0
20 Feb 2024
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
Runlong Zhou
Simon S. Du
Beibin Li
OffRL
195
9
0
20 Feb 2024
TEXT2AFFORD: Probing Object Affordance Prediction abilities of Language Models solely from Text
Sayantan Adak
Daivik Agrawal
Animesh Mukherjee
Somak Aditya
267
5
0
20 Feb 2024
Beyond the Answers: Reviewing the Rationality of Multiple Choice Question Answering for the Evaluation of Large Language Models
Hao Wang
Sendong Zhao
Zewen Qiang
Nuwa Xi
Bing Qin
Ting Liu
LRM
ELM
62
7
0
02 Feb 2024
Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Erik Arakelyan
Zhaoqi Liu
Isabelle Augenstein
AAML
262
14
0
25 Jan 2024
From Text to Multimodal: A Comprehensive Survey of Adversarial Example Generation in Question Answering Systems
Gulsum Yigit
M. Amasyalı
AAML
137
0
0
26 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
312
32
0
01 Dec 2023
Explanatory Argument Extraction of Correct Answers in Resident Medical Exams
Iakes Goenaga
Aitziber Atutxa
Koldo Gojenola
Maite Oronoz
Rodrigo Agerri
ELM
191
9
0
01 Dec 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
352
44
0
24 Nov 2023
MacGyver: Are Large Language Models Creative Problem Solvers?
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yufei Tian
Abhilasha Ravichander
Lianhui Qin
Ronan Le Bras
Raja Marjieh
Nanyun Peng
Yejin Choi
Thomas Griffiths
Faeze Brahman
AI4CE
LLMAG
346
26
0
16 Nov 2023
Measuring Adversarial Datasets
Yuanchen Bai
Raoyi Huang
Vijay Viswanathan
Tzu-Sheng Kuo
Tongshuang Wu
222
1
0
06 Nov 2023
Learning to Play Chess from Textbooks (LEAP): a Corpus for Evaluating Chess Moves based on Sentiment Analysis
Haifa Alrdahi
Riza Batista-Navarro
147
2
0
31 Oct 2023
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks
Aradhana Sinha
Ananth Balashankar
Ahmad Beirami
Thi Avrahami
Jilin Chen
Alex Beutel
AAML
172
6
0
25 Oct 2023
CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Mete Ismayilzada
Debjit Paul
Syrielle Montariol
Mor Geva
Antoine Bosselut
LRM
184
7
0
23 Oct 2023
TeleQnA: A Benchmark Dataset to Assess Large Language Models Telecommunications Knowledge
Ali Maatouk
Fadhel Ayed
Nicola Piovesan
Antonio De Domenico
Merouane Debbah
Zhi-Quan Luo
158
69
0
23 Oct 2023
QADYNAMICS: Training Dynamics-Driven Synthetic QA Diagnostic for Zero-Shot Commonsense Question Answering
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Haochen Shi
Weiqi Wang
Tianqing Fang
Baixuan Xu
Wenxuan Ding
Xin Liu
Yangqiu Song
196
7
0
17 Oct 2023
Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Gyuseong Lee
Wooseok Jang
Jin Hyeon Kim
Jaewoo Jung
Seungryong Kim
MoE
OOD
160
9
0
17 Oct 2023
Data Contamination Through the Lens of Time
Manley Roberts
Himanshu Thakur
Christine Herlihy
Colin White
Samuel Dooley
240
37
0
16 Oct 2023
PHALM: Building a Knowledge Graph from Scratch by Prompting Humans and a Language Model
Tatsuya Ide
Eiki Murata
Daisuke Kawahara
T. Yamazaki
Shengzhe Li
K. Shinzato
Toshinori Sato
LRM
217
2
0
11 Oct 2023
NEWTON: Are Large Language Models Capable of Physical Reasoning?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yi Ru Wang
Jiafei Duan
Dieter Fox
S. Srinivasa
ELM
LRM
AIMat
ReLM
228
50
0
10 Oct 2023
Empower Nested Boolean Logic via Self-Supervised Curriculum Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hongqiu Wu
Linfeng Liu
Haizhen Zhao
Min Zhang
LRM
AI4CE
NAI
ELM
216
8
0
09 Oct 2023
Retrieval-Generation Synergy Augmented Large Language Models
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zhangyin Feng
Xiaocheng Feng
Dezhi Zhao
Maojin Yang
Bing Qin
LRM
RALM
161
43
0
08 Oct 2023
Crystal: Introspective Reasoners Reinforced with Self-Feedback
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hamish Ivison
Ramakanth Pasunuru
Hannaneh Hajishirzi
Yejin Choi
Asli Celikyilmaz
LRM
ReLM
176
29
0
07 Oct 2023
Inferring Capabilities from Task Performance with Bayesian Triangulation
John Burden
Konstantinos Voudouris
Ryan Burnell
Danaja Rutar
Lucy G. Cheke
José Hernández-Orallo
124
10
0
21 Sep 2023
Mitigating Shortcuts in Language Models with Soft Label Encoding
International Conference on Language Resources and Evaluation (LREC), 2023
Zirui He
Huiqi Deng
Haiyan Zhao
Ninghao Liu
Jundong Li
120
2
0
17 Sep 2023
Benchmarking Procedural Language Understanding for Low-Resource Languages: A Case Study on Turkish
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Arda Uzunoğlu
Gözde Gül Sahin
162
6
0
13 Sep 2023
AGent: A Novel Pipeline for Automatically Creating Unanswerable Questions
Son Quoc Tran
Gia-Huy Do
Phong Nguyen-Thuan Do
Matt Kretchmar
Xinya Du
234
0
0
10 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
264
223
0
31 Aug 2023
A Survey on Out-of-Distribution Evaluation of Neural NLP Models
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Xinzhe Li
Ming Liu
Shang Gao
Wray Buntine
169
24
0
27 Jun 2023
SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality
Neural Information Processing Systems (NeurIPS), 2023
Cheng-Yu Hsieh
Jieyu Zhang
Zixian Ma
Aniruddha Kembhavi
Ranjay Krishna
CoGe
245
181
0
26 Jun 2023