
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

27 July 2019
Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits
SILM, AAML
arXiv (abs) · PDF · HTML · GitHub (511★)
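
The paper's attack, TextFooler, ranks words by how much removing them changes the target model's prediction, then greedily swaps the top-ranked words for nearest-neighbor synonyms under part-of-speech and sentence-similarity constraints until the label flips. A minimal sketch of running such an attack, assuming the third-party TextAttack library's reimplementation of the recipe and a public fine-tuned BERT checkpoint (both assumptions, not the authors' original repo):

```python
# Sketch only: TextFooler-style black-box attack via the TextAttack library
# (pip install textattack); the victim checkpoint below is an assumed example.
import transformers
from textattack import Attacker, AttackArgs
from textattack.attack_recipes import TextFoolerJin2019
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

name = "textattack/bert-base-uncased-imdb"  # hypothetical victim model
model = transformers.AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = transformers.AutoTokenizer.from_pretrained(name)
victim = HuggingFaceModelWrapper(model, tokenizer)

# The recipe bundles word-importance ranking, synonym substitution, and
# POS/semantic-similarity constraints, mirroring the paper's pipeline.
attack = TextFoolerJin2019.build(victim)
dataset = HuggingFaceDataset("imdb", split="test")
Attacker(attack, dataset, AttackArgs(num_examples=10)).attack_dataset()
```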

Papers citing "Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment"

50 of 567 citing papers shown

To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now
Yimeng Zhang, Jinghan Jia, Xin Chen, Aochuan Chen, Yihua Zhang, Jiancheng Liu, Ke Ding, Sijia Liu
DiffM
177 · 101 · 0
18 Oct 2023

BufferSearch: Generating Black-Box Adversarial Texts With Lower Queries
Wenjie Lv, Zhen Wang, Yitao Zheng, Zhehua Zhong, Qi Xuan, Tianyi Chen
AAML
83 · 1 · 0
14 Oct 2023

Effects of Human Adversarial and Affable Samples on BERT Generalization
Aparna Elangovan, Jiayuan He, Yuan Li, Karin Verspoor
69 · 3 · 0
12 Oct 2023

Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
Zeming Wei, Yifei Wang, Ang Li, Yichuan Mo, Yisen Wang
117 · 279 · 0
10 Oct 2023

VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models
Ziyi Yin, Muchao Ye, Tianrong Zhang, Tianyu Du, Jinguo Zhu, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma
AAML, VLM, CoGe
89 · 44 · 0
07 Oct 2023

Fooling the Textual Fooler via Randomizing Latent Representations
Duy C. Hoang, Quang H. Nguyen, Saurav Manchanda, MinLong Peng, Kok-Seng Wong, Khoa D. Doan
SILM, AAML
64 · 0 · 0
02 Oct 2023

DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
Kaijie Zhu, Jiaao Chen, Jindong Wang, Neil Zhenqiang Gong, Diyi Yang, Xing Xie
ELM, LRM
154 · 54 · 0
29 Sep 2023

The Trickle-down Impact of Reward (In-)consistency on RLHF
Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu
93 · 23 · 0
28 Sep 2023

SurrogatePrompt: Bypassing the Safety Filter of Text-To-Image Models via Substitution
Zhongjie Ba, Jieming Zhong, Jiachen Lei, Pengyu Cheng, Qinglong Wang, Zhan Qin, Peng Kuang, Kui Ren
73 · 22 · 0
25 Sep 2023

On the Relationship between Skill Neurons and Robustness in Prompt Tuning
Leon Ackermann, Xenia Ohmer
AAML
42 · 0 · 0
21 Sep 2023

Are Large Language Models Really Robust to Word-Level Perturbations?
Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, ..., Sen Zhang, Li Shen, Xueqian Wang, Peilin Zhao, Dacheng Tao
KELM
109 · 24 · 0
20 Sep 2023

What Learned Representations and Influence Functions Can Tell Us About Adversarial Examples
Shakila Mahjabin Tonni, Mark Dras
TDI, AAML, GAN
60 · 0 · 0
19 Sep 2023

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM
Bochuan Cao, Yu Cao, Lu Lin, Jinghui Chen
AAML
86 · 152 · 0
18 Sep 2023

How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study
Andreas Waldis, Yufang Hou, Iryna Gurevych
62 · 4 · 0
15 Sep 2023

MathAttack: Attacking Large Language Models Towards Math Solving Ability
Zihao Zhou, Qiufeng Wang, Mingyu Jin, Jie Yao, Jianan Ye, Wei Liu, Wei Wang, Xiaowei Huang, Kaizhu Huang
AAML
115 · 29 · 0
04 Sep 2023

Open Sesame! Universal Black Box Jailbreaking of Large Language Models
Raz Lapid, Ron Langberg, Moshe Sipper
AAML
135 · 112 · 0
04 Sep 2023

Explainability for Large Language Models: A Survey
Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Jundong Li
LRM
106 · 469 · 0
02 Sep 2023

A Classification-Guided Approach for Adversarial Attacks against Neural Machine Translation
Sahar Sadrizadeh, Ljiljana Dolamic, P. Frossard
AAML, SILM
80 · 2 · 0
29 Aug 2023

Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities
Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin
87 · 87 · 0
24 Aug 2023

LEAP: Efficient and Automated Test Method for NLP Software
Ming-Ming Xiao, Yan Xiao, Hai Dong, Shunhui Ji, Pengcheng Zhang
AAML
62 · 8 · 0
22 Aug 2023

An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software
Wenxuan Wang, Jingyuan Huang, Jen-tse Huang, Chang Chen, Jiazhen Gu, Pinjia He, Michael R. Lyu
VLM
56 · 6 · 0
18 Aug 2023

Robustness Over Time: Understanding Adversarial Examples' Effectiveness on Longitudinal Versions of Large Language Models
Yugeng Liu, Tianshuo Cong, Zhengyu Zhao, Michael Backes, Yun Shen, Yang Zhang
AAML
90 · 8 · 0
15 Aug 2023

Robust Infidelity: When Faithfulness Measures on Masked Language Models Are Misleading
Evan Crothers, H. Viktor, Nathalie Japkowicz
AAML
68 · 1 · 0
13 Aug 2023

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak
  Prompts on Large Language Models
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen
Zhenpeng Chen
Michael Backes
Yun Shen
Yang Zhang
SILM
163
302
0
07 Aug 2023
Mondrian: Prompt Abstraction Attack Against Large Language Models for Cheaper API Pricing
Waiman Si, Michael Backes, Yang Zhang
41 · 6 · 0
07 Aug 2023

LimeAttack: Local Explainable Method for Textual Hard-Label Adversarial Attack
HaiXiang Zhu, Zhaoqing Yang, Weiwei Shang, Yuren Wu
AAML, FAtt
80 · 3 · 0
01 Aug 2023

Adversarially Robust Neural Legal Judgement Systems
R. Raj, V. Devi
AILaw, ELM, AAML
38 · 0 · 0
31 Jul 2023

On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook
Mingyuan Fan, Chengyu Wang, Cen Chen, Yang Liu, Jun Huang
HILM
74 · 3 · 0
31 Jul 2023

Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks
Xinyu Zhang, Hanbin Hong, Yuan Hong, Peng Huang, Binghui Wang, Zhongjie Ba, Kui Ren
SILM
129 · 25 · 0
31 Jul 2023

When Measures are Unreliable: Imperceptible Adversarial Perturbations toward Top-$k$ Multi-Label Learning
Yuchen Sun, Qianqian Xu, Zitai Wang, Qingming Huang
AAML
107 · 1 · 0
27 Jul 2023

Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
Dong Lu, Zhiqiang Wang, Teng Wang, Weili Guan, Hongchang Gao, Feng Zheng
AAML
118 · 76 · 0
26 Jul 2023

Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation
Neel Bhandari, Pin-Yu Chen
AAML, SILM
84 · 3 · 0
24 Jul 2023

Gradient-Based Word Substitution for Obstinate Adversarial Examples Generation in Language Models
Yimu Wang, Peng Shi, Hongyang Zhang
SILM
23 · 3 · 0
24 Jul 2023

NatLogAttack: A Framework for Attacking Natural Language Inference Models with Natural Logic
Zi'ou Zheng, Xiao-Dan Zhu
AAML, LRM
86 · 6 · 0
06 Jul 2023

SCAT: Robust Self-supervised Contrastive Learning via Adversarial Training for Text Classification
J. Wu, Dit-Yan Yeung
SILM
72 · 0 · 0
04 Jul 2023

Interpretability and Transparency-Driven Detection and Transformation of Textual Adversarial Examples (IT-DT)
Bushra Sabir, Muhammad Ali Babar, Sharif Abuadbba
SILM
74 · 10 · 0
03 Jul 2023

MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning
Zhehua Zhong, Tianyi Chen, Zhen Wang
AAML
37 · 3 · 0
27 Jun 2023

A Survey on Out-of-Distribution Evaluation of Neural NLP Models
Xinzhe Li, Ming Liu, Shang Gao, Wray Buntine
74 · 20 · 0
27 Jun 2023

On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection
Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Yingchun Shan
AAML
55 · 3 · 0
27 Jun 2023

DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization
Songyang Gao, Shihan Dou, Yan Liu, Xiao Wang, Qi Zhang, Zhongyu Wei, Jin Ma, Yingchun Shan
OOD
60 · 4 · 0
27 Jun 2023

Towards Understanding What Code Language Models Learned
Toufique Ahmed, Dian Yu, Chen Huang, Cathy Wang, Prem Devanbu, Kenji Sagae
ELM
77 · 5 · 0
20 Jun 2023

Exploring New Frontiers in Agricultural NLP: Investigating the Potential of Large Language Models for Food Applications
Saed Rezayi, Zheng Liu, Zihao Wu, Chandra Dhakal, Bao Ge, ..., Gengchen Mai, Ninghao Liu, Chen Zhen, Tianming Liu, Sheng Li
73 · 33 · 0
20 Jun 2023

Investigating Masking-based Data Generation in Language Models
Edward Ma
61 · 0 · 0
16 Jun 2023

A Relaxed Optimization Approach for Adversarial Attacks against Neural Machine Translation Models
Sahar Sadrizadeh, C. Barbier, Ljiljana Dolamic, P. Frossard
AAML
32 · 0 · 0
14 Jun 2023

Can Large Language Models Infer Causation from Correlation?
Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona T. Diab, Bernhard Schölkopf
LRM
80 · 129 · 0
09 Jun 2023

Enhancing Robustness of AI Offensive Code Generators via Data Augmentation
Cristina Improta, Pietro Liguori, R. Natella, B. Cukic, Domenico Cotroneo
AAML
86 · 4 · 0
08 Jun 2023

PromptAttack: Probing Dialogue State Trackers with Adversarial Prompts
Xiangjue Dong, Yun He, Ziwei Zhu, James Caverlee
AAML
57 · 7 · 0
07 Jun 2023

VoteTRANS: Detecting Adversarial Text without Training by Voting on Hard Labels of Transformations
Hoang-Quoc Nguyen-Son, Seira Hidano, Kazuhide Fukushima, S. Kiyomoto, Isao Echizen
57 · 0 · 0
02 Jun 2023

AMR4NLI: Interpretable and robust NLI measures from semantic graphs
Juri Opitz, Shira Wein, Julius Steen, Anette Frank, Nathan Schneider
80 · 0 · 0
01 Jun 2023

Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon, Naveh Porat, Eyal Ben-David, Alexander Chapanin, Zorik Gekhman, Nadav Oved, Vitaly Shalumov, Roi Reichart
121 · 8 · 0
31 May 2023