Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1910.14599
Cited By
Adversarial NLI: A New Benchmark for Natural Language Understanding
31 October 2019
Yixin Nie
Adina Williams
Emily Dinan
Mohit Bansal
Jason Weston
Douwe Kiela
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Adversarial NLI: A New Benchmark for Natural Language Understanding"
50 / 182 papers shown
Title
IM-BERT: Enhancing Robustness of BERT through the Implicit Euler Method
Mihyeon Kim
Juhyoung Park
Youngbin Kim
24
0
0
11 May 2025
Always Tell Me The Odds: Fine-grained Conditional Probability Estimation
Liaoyaqi Wang
Zhengping Jiang
Anqi Liu
Benjamin Van Durme
57
0
0
02 May 2025
Pushing the boundary on Natural Language Inference
Pablo Miralles-González
Javier Huertas-Tato
Alejandro Martín
David Camacho
LRM
39
0
0
25 Apr 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
75
0
0
21 Apr 2025
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt
Aaron Mueller
Leshem Choshen
E. Wilcox
Chengxu Zhuang
...
Rafael Mosquera
Bhargavi Paranjape
Adina Williams
Tal Linzen
Ryan Cotterell
38
106
0
10 Apr 2025
Unveiling Reasoning Thresholds in Language Models: Scaling, Fine-Tuning, and Interpretability through Attention Maps
Yen-Che Hsiao
Abhishek Dutta
LRM
ReLM
ELM
54
0
0
24 Feb 2025
Diversity-Oriented Data Augmentation with Large Language Models
Zaitian Wang
Jinghan Zhang
Xinhao Zhang
Kunpeng Liu
Pengfei Wang
Yuanchun Zhou
78
1
0
17 Feb 2025
SuperMerge: An Approach For Gradient-Based Model Merging
Haoyu Yang
Zheng Zhang
Saket Sathe
MoMe
125
0
0
17 Feb 2025
MIH-TCCT: Mitigating Inconsistent Hallucinations in LLMs via Event-Driven Text-Code Cyclic Training
Xinxin You
Xien Liu
Qixin Sun
Huan Zhang
Kaiyin Zhou
Shaohui Liu
Guoping Hu
Shijin Wang
Si Liu
Ji Wu
83
0
0
13 Feb 2025
ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification
Y. Meena
Vaibhav Singh
Ayush Maheshwari
Amrith Krishna
Ganesh Ramakrishnan
AI4TS
67
0
0
09 Feb 2025
Zero-shot and Few-shot Learning with Instruction-following LLMs for Claim Matching in Automated Fact-checking
Dina Pisarevskaya
Arkaitz Zubiaga
48
0
0
18 Jan 2025
Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets
Vatsal Gupta
Pranshu Pandya
Tushar Kataria
Vivek Gupta
Dan Roth
AAML
53
1
0
03 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
88
11
0
31 Dec 2024
Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization
Yue Zhang
Liqiang Jing
Vibhav Gogate
116
2
0
19 Dec 2024
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie
Yangsibo Huang
Chiyuan Zhang
Da Yu
Xinyun Chen
Bill Yuchen Lin
Bo Li
Badih Ghazi
Ravi Kumar
LRM
45
20
0
30 Oct 2024
Improving Model Evaluation using SMART Filtering of Benchmark Datasets
Vipul Gupta
Candace Ross
David Pantoja
R. Passonneau
Megan Ung
Adina Williams
55
1
0
26 Oct 2024
NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples
Baiqi Li
Zhiqiu Lin
Wenxuan Peng
Jean de Dieu Nyandwi
Daniel Jiang
Zixian Ma
Simran Khanuja
Ranjay Krishna
Graham Neubig
Deva Ramanan
AAML
CoGe
VLM
61
20
0
18 Oct 2024
Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
15
2
0
19 Sep 2024
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Jin Jiang
Yuchen Yan
Yang Liu
Yonggang Jin
Shuai Peng
M. Zhang
Xunliang Cai
Yixin Cao
Liangcai Gao
Zhi Tang
LRM
40
3
0
19 Sep 2024
Speech-Guided Sequential Planning for Autonomous Navigation using Large Language Model Meta AI 3 (Llama3)
Alkesh K. Srivastava
Philip Dames
LLMAG
LM&Ro
40
1
0
13 Jul 2024
Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay
Gonçalo Hora de Carvalho
Oscar Knap
R. Pollice
ReLM
ELM
LRM
29
1
0
12 Jul 2024
Prompting Techniques for Secure Code Generation: A Systematic Investigation
Catherine Tony
Nicolás E. Díaz Ferreyra
Markus Mutas
Salem Dhiff
Riccardo Scandariato
SILM
64
9
0
09 Jul 2024
MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation
Jiakuan Xie
Pengfei Cao
Yuheng Chen
Yubo Chen
Kang Liu
Jun Zhao
KELM
32
3
0
17 Jun 2024
MoE-RBench
\texttt{MoE-RBench}
MoE-RBench
: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen
Xinyu Zhao
Tianlong Chen
Yu Cheng
MoE
62
5
0
17 Jun 2024
IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
David Ifeoluwa Adelani
Jessica Ojo
Israel Abebe Azime
Jian Yun Zhuang
Jesujoba Oluwadara Alabi
...
Salomey Osei
Sokhar Samb
Tadesse Kebede Guge
Pontus Stenetorp
Pontus Stenetorp
ELM
50
7
0
05 Jun 2024
Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
Siyu Lou
Yuntian Chen
Xiaodan Liang
Liang Lin
Quanshi Zhang
24
2
0
20 May 2024
Lifelong Knowledge Editing for LLMs with Retrieval-Augmented Continuous Prompt Learning
Qizhou Chen
Taolin Zhang
Xiaofeng He
Dongyang Li
Chengyu Wang
Longtao Huang
Hui Xue
CLL
KELM
41
10
0
06 May 2024
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks
Melissa Ailem
Katerina Marazopoulou
Charlotte Siska
James Bono
51
13
0
25 Apr 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
41
42
0
15 Apr 2024
MSciNLI: A Diverse Benchmark for Scientific Natural Language Inference
Mobashir Sadat
Cornelia Caragea
32
4
0
11 Apr 2024
Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study
Myrthe Reuver
Suzan Verberne
Antske Fokkens
29
1
0
05 Apr 2024
Epistemology of Language Models: Do Language Models Have Holistic Knowledge?
Minsu Kim
James Thorne
22
6
0
19 Mar 2024
A Closer Look at Claim Decomposition
Miriam Wanner
Seth Ebner
Zhengping Jiang
Mark Dredze
Benjamin Van Durme
39
18
0
18 Mar 2024
Learning to Maximize Mutual Information for Chain-of-Thought Distillation
Xin Chen
Hanxian Huang
Yanjun Gao
Yi Wang
Jishen Zhao
Ke Ding
35
11
0
05 Mar 2024
WinoViz: Probing Visual Properties of Objects Under Different States
Woojeong Jin
Tejas Srinivasan
Jesse Thomason
Xiang Ren
23
1
0
21 Feb 2024
Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models
Erik Arakelyan
Zhaoqi Liu
Isabelle Augenstein
AAML
37
9
0
25 Jan 2024
The Critique of Critique
Shichao Sun
Junlong Li
Weizhe Yuan
Ruifeng Yuan
Wenjie Li
Pengfei Liu
ELM
32
0
0
09 Jan 2024
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
Liesbeth Allein
Maria Mihaela Trucscva
Marie-Francine Moens
18
1
0
27 Nov 2023
More Samples or More Prompts? Exploring Effective In-Context Sampling for LLM Few-Shot Prompt Engineering
Bingsheng Yao
Guiming Hardy Chen
Ruishi Zou
Yuxuan Lu
Jiachen Li
Shao Zhang
Yisi Sang
Sijia Liu
James A. Hendler
Dakuo Wang
35
13
0
16 Nov 2023
Mirror: A Universal Framework for Various Information Extraction Tasks
Tong Zhu
Junfei Ren
Zijian Yu
Mengsong Wu
Guoliang Zhang
Xiaoye Qu
Wenliang Chen
Zhefeng Wang
Baoxing Huai
Min Zhang
24
14
0
09 Nov 2023
Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning
Lucas Weber
Elia Bruni
Dieuwke Hupkes
28
24
0
20 Oct 2023
"Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters
Yixin Wan
George Pu
Jiao Sun
Aparna Garimella
Kai-Wei Chang
Nanyun Peng
27
159
0
13 Oct 2023
Calibrating Likelihoods towards Consistency in Summarization Models
Polina Zablotskaia
Misha Khalman
Rishabh Joshi
Livio Baldini Soares
Shoshana Jakobovits
Joshua Maynez
Shashi Narayan
26
3
0
12 Oct 2023
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics
Yupei Du
Albert Gatt
Dong Nguyen
19
1
0
10 Oct 2023
Are Large Language Models Really Robust to Word-Level Perturbations?
Haoyu Wang
Guozheng Ma
Cong Yu
Ning Gui
Linrui Zhang
...
Sen Zhang
Li Shen
Xueqian Wang
Peilin Zhao
Dacheng Tao
KELM
21
22
0
20 Sep 2023
GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models' Over-Reliance on Superficial Clue
Yanrui Du
Sendong Zhao
Yuhan Chen
Rai Bai
Jing Liu
Huaqin Wu
Haifeng Wang
Bing Qin
22
2
0
08 Sep 2023
Which Spurious Correlations Impact Reasoning in NLI Models? A Visual Interactive Diagnosis through Data-Constrained Counterfactuals
Robin Shing Moon Chan
Afra Amini
Mennatallah El-Assady
LRM
AAML
24
2
0
21 Jun 2023
No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference
Animesh Nighojkar
Antonio Laverghetta
John Licato
23
4
0
16 Jun 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan
Yangyi Chen
Ganqu Cui
Hongcheng Gao
Fangyuan Zou
Xingyi Cheng
Heng Ji
Zhiyuan Liu
Maosong Sun
32
72
0
07 Jun 2023
What does the Failure to Reason with "Respectively" in Zero/Few-Shot Settings Tell Us about Language Models?
Ruixiang Cui
Seolhwa Lee
Daniel Hershcovich
Anders Søgaard
25
2
0
31 May 2023
1
2
3
4
Next