arXiv:2309.11166
Are Large Language Models Really Robust to Word-Level Perturbations?
20 September 2023
Haoyu Wang
Guozheng Ma
Cong Yu
Ning Gui
Linrui Zhang
Zhiqi Huang
Suwei Ma
Yongzhe Chang
Sen Zhang
Li Shen
Xueqian Wang
Peilin Zhao
Dacheng Tao
KELM
Papers citing
"Are Large Language Models Really Robust to Word-Level Perturbations?"
19 / 19 papers shown
Title
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
80
1
0
21 Feb 2025
UniGuardian: A Unified Defense for Detecting Prompt Injection, Backdoor Attacks and Adversarial Attacks in Large Language Models
Huawei Lin
Yingjie Lao
Tong Geng
Tan Yu
Weijie Zhao
AAML
SILM
79
2
0
18 Feb 2025
Interpreting token compositionality in LLMs: A robustness analysis
Nura Aljaafari
Danilo S. Carvalho
André Freitas
11
0
0
16 Oct 2024
Not All LLM Reasoners Are Created Equal
Arian Hosseini
Alessandro Sordoni
Daniel Toyama
Aaron C. Courville
Rishabh Agarwal
LRM
36
11
0
02 Oct 2024
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning
Yilun Kong
Hangyu Mao
Qi Zhao
Bin Zhang
Jingqing Ruan
Li Shen
Yongzhe Chang
Xueqian Wang
Rui Zhao
Dacheng Tao
OffRL
29
1
0
20 Aug 2024
Robustness of LLMs to Perturbations in Text
Ayush Singh
Navpreet Singh
Shubham Vatsal
21
5
0
12 Jul 2024
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Zihao Zhou
Shudong Liu
Maizhen Ning
Wei Liu
Jindong Wang
Derek F. Wong
Xiaowei Huang
Qiufeng Wang
Kaizhu Huang
ELM
LRM
61
23
0
11 Jul 2024
Harmonic LLMs are Trustworthy
Nicholas S. Kersting
Mohammad Rahman
Suchismitha Vedala
Yang Wang
38
0
0
30 Apr 2024
From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency
Xenia Ohmer
Elia Bruni
Dieuwke Hupkes
AI4CE
28
6
0
18 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Jiayang Song
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
35
6
0
12 Apr 2024
Fairness in Large Language Models: A Taxonomic Survey
Zhibo Chu
Zichong Wang
Wenbin Zhang
AILaw
39
31
0
31 Mar 2024
PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion
Zekai Zhang
Yiduo Guo
Yaobo Liang
Dongyan Zhao
Nan Duan
33
1
0
06 Mar 2024
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
Qintong Li
Leyang Cui
Xueliang Zhao
Lingpeng Kong
Wei Bi
LRM
35
46
0
29 Feb 2024
Securing Reliability: A Brief Overview on Enhancing In-Context Learning for Foundation Models
Yunpeng Huang
Yaonan Gu
Jingwei Xu
Zhihong Zhu
Zhaorun Chen
Xiaoxing Ma
33
3
0
27 Feb 2024
Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
Haoyu Wang
Guozheng Ma
Ziqiao Meng
Zeyu Qin
Li Shen
...
Liu Liu
Yatao Bian
Tingyang Xu
Xueqian Wang
Peilin Zhao
55
12
0
12 Feb 2024
Measurement in the Age of LLMs: An Application to Ideological Scaling
Sean O'Hagan
Aaron Schein
40
8
0
14 Dec 2023
Hijacking Large Language Models via Adversarial In-Context Learning
Yao Qiang
Xiangyu Zhou
Dongxiao Zhu
30
32
0
16 Nov 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018