arXiv: 2103.10013
Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Xuanli He, Lingjuan Lyu, Xingliang Yuan, Lichao Sun
18 March 2021
Papers citing "Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!" (50 of 52 shown)

Adversarial Defence without Adversarial Defence: Enhancing Language Model Robustness via Instance-level Principal Component Removal
Yang Wang, Chenghao Xiao, Yi Zhou, Stuart E. Middleton, Noura Al Moubayed, C. D. Lin
29 Jul 2025

Coordinated Robustness Evaluation Framework for Vision-Language Models
Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Sahand Ghorbanpour, Avisek Naug, Antonio Guillen, Ricardo Luna Gutierrez, Soumyendu Sarkar
05 Jun 2025

Model Stealing for Any Low-Rank Language Model
Symposium on the Theory of Computing (STOC), 2024
Allen Liu, Ankur Moitra
12 Nov 2024

A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality
Hanbo Huang, Yihan Li, Bowen Jiang, Bo Jiang, Lin Liu, Tian Ding, Zhuotao Liu, Shiyu Liang
15 Oct 2024

Privacy Evaluation Benchmarks for NLP Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Wei Huang, Yinggui Wang, Cen Chen
24 Sep 2024

WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Anudeex Shetty, Xingliang Yuan, Jey Han Lau
29 Aug 2024

VidModEx: Interpretable and Efficient Black Box Model Extraction for High-Dimensional Spaces
Somnath Sendhil Kumar, Yuvaraj Govindarajulu, Pavan Kulkarni, Manojkumar Somabhai Parmar
04 Aug 2024

Risks, Causes, and Mitigations of Widespread Deployments of Large Language Models (LLMs): A Survey
Md. Nazmus Sakib, Md Athikul Islam, Royal Pathak, Md Mashrur Arifin
01 Aug 2024

Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma, Satyapriya Krishna, Sebastian Gehrmann, Madhavan Seshadri, Anu Pradhan, Tom Ault, Leslie Barrett, David Rabinowitz, John Doucette, Nhathai Phan
20 Jul 2024

Image-to-Text Logic Jailbreak: Your Imagination can Help You Do Anything
Xiaotian Zou, Ke Li, Yongkang Chen
01 Jul 2024

IDT: Dual-Task Adversarial Attacks for Privacy Protection
Pedro Faustini, Shakila Mahjabin Tonni, Annabelle McIver, Xingliang Yuan, Mark Dras
28 Jun 2024

Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model Queries
Yu-Hsiang Huang, Yuche Tsai, Hsiang Hsiao, Hong-Yi Lin, Shou-De Lin
12 Jun 2024

The Impact of Quantization on the Robustness of Transformer-based Text Classifiers
Seyed Parsa Neshaei, Yasaman Boreshban, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel
08 Mar 2024

WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection
Anudeex Shetty, Yue Teng, Ke He, Xingliang Yuan
03 Mar 2024

Amplifying Training Data Exposure through Fine-Tuning with Pseudo-Labeled Memberships
Myung Gyo Oh, Hong Eun Ahn, L. Park, T.-H. Kwon
19 Feb 2024

PAL: Proxy-Guided Black-Box Attack on Large Language Models
Chawin Sitawarin, Norman Mu, David Wagner, Alexandre Araujo
15 Feb 2024

Revealing Vulnerabilities in Stable Diffusion via Targeted Attacks
Chenyu Zhang, Yiwen Ma, Anan Liu
16 Jan 2024

Punctuation Matters! Stealthy Backdoor Attack for Language Models
Xuan Sheng, Zhicheng Li, Zhaoyang Han, Xiangmao Chang, Piji Li
26 Dec 2023

SenTest: Evaluating Robustness of Sentence Encoders
Tanmay Chavan, Shantanu Patankar, Aditya Kane, Omkar Gokhale, Geetanjali Kale, Raviraj Joshi
29 Nov 2023

Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration
Neural Information Processing Systems (NeurIPS), 2023
Wenjie Fu, Huandong Wang, Chen Gao, Guanghua Liu, Yong Li, Tao Jiang
10 Nov 2023

Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection
Akshit Jindal, Vikram Goyal, Saket Anand, Chetan Arora
08 Nov 2023

A Survey on Transferability of Adversarial Examples across Deep Neural Networks
Jindong Gu, Yang Liu, Pau de Jorge, Wenqain Yu, Xinwei Liu, ..., Anjun Hu, Ashkan Khakzar, Zhijiang Li, Simeng Qin, Juil Sock
26 Oct 2023

BufferSearch: Generating Black-Box Adversarial Texts With Lower Queries
Wenjie Lv, Zhen Wang, Yitao Zheng, Zhehua Zhong, Qi Xuan, Tianyi Chen
14 Oct 2023

The Trickle-down Impact of Reward (In-)consistency on RLHF
Lingfeng Shen, Sihao Chen, Linfeng Song, Lifeng Jin, Baolin Peng, Haitao Mi, Daniel Khashabi, Dong Yu
28 Sep 2023

Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks
Hongcheng Gao, Hao Zhang, Yinpeng Dong, Zhijie Deng
16 Jun 2023

Revealing the Blind Spot of Sentence Encoder Evaluation by HEROS
Workshop on Representation Learning for NLP (RepL4NLP), 2023
Cheng-Han Chiang, Yung-Sung Chuang, James R. Glass, Hung-yi Lee
08 Jun 2023

MAWSEO: Adversarial Wiki Search Poisoning for Illicit Online Promotion
IEEE Symposium on Security and Privacy (IEEE S&P), 2023
Zilong Lin, Zhengyi Li, Xiaojing Liao, Luyi Xing, Xiaozhong Liu
22 Apr 2023

Stealing the Decoding Algorithms of Language Models
Conference on Computer and Communications Security (CCS), 2023
A. Naseh, Kalpesh Krishna, Mohit Iyyer, Amir Houmansadr
08 Mar 2023

Training-free Lexical Backdoor Attacks on Language Models
The Web Conference (WWW), 2023
Yujin Huang, Terry Yue Zhuo, Xingliang Yuan, Han Hu, Lizhen Qu, Chunyang Chen
08 Feb 2023

Protecting Language Generation Models via Invisible Watermarking
International Conference on Machine Learning (ICML), 2023
Xuandong Zhao, Yu-Xiang Wang, Lei Li
06 Feb 2023

TextShield: Beyond Successfully Detecting Adversarial Sentences in Text Classification
International Conference on Learning Representations (ICLR), 2023
Lingfeng Shen, Ze Zhang, Haiyun Jiang, Ying-Cong Chen
03 Feb 2023

Model Extraction Attack against Self-supervised Speech Models
Tsung-Yuan Hsu, Chen-An Li, Tung-Yu Wu, Hung-yi Lee
29 Nov 2022

UPTON: Preventing Authorship Leakage from Public Text Release via Data Poisoning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ziyao Wang, Thai Le, Dongwon Lee
17 Nov 2022

Preserving Semantics in Textual Adversarial Attacks
European Conference on Artificial Intelligence (ECAI), 2022
David Herel, Hugo Cisneros, Tomas Mikolov
08 Nov 2022

Extracted BERT Model Leaks More Information than You Think!
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xuanli He, Chen Chen, Lingjuan Lyu, Xingliang Yuan
21 Oct 2022

Distillation-Resistant Watermarking for Model Protection in NLP
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xuandong Zhao, Lei Li, Yu-Xiang Wang
07 Oct 2022

CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks
Neural Information Processing Systems (NeurIPS), 2022
Xuanli He, Xingliang Yuan, Yi Zeng, Lingjuan Lyu, Fangzhao Wu, Jiwei Li, R. Jia
19 Sep 2022

I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences
ACM Computing Surveys (ACM CSUR), 2022
Daryna Oliynyk, Rudolf Mayer, Andreas Rauber
16 Jun 2022

Edge Security: Challenges and Issues
Xin Jin, Charalampos Katsis, Fan Sang, Jiahao Sun, A. Kundu, Ramana Rao Kompella
14 Jun 2022

A Word is Worth A Thousand Dollars: Adversarial Attack on Tweets Fools Stock Predictions
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Yong Xie, Dakuo Wang, Pin-Yu Chen, Jinjun Xiong, Sijia Liu, Oluwasanmi Koyejo
01 May 2022

A Girl Has A Name, And It's ... Adversarial Authorship Attribution for Deobfuscation
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Wanyue Zhai, Jonathan Rusert, Zubair Shafiq, P. Srinivasan
22 Mar 2022

On Robust Prefix-Tuning for Text Classification
International Conference on Learning Representations (ICLR), 2022
Zonghan Yang, Yang Liu
19 Mar 2022

A Survey of Adversarial Defences and Robustness in NLP
Shreyansh Goyal, Sumanth Doddapaneni, Mitesh M. Khapra, B. Ravindran
12 Mar 2022

Threats to Pre-trained Language Models: Survey and Taxonomy
Shangwei Guo, Chunlong Xie, Jiwei Li, Lingjuan Lyu, Tianwei Zhang
14 Feb 2022

Fooling MOSS Detection with Pretrained Language Models
International Conference on Information and Knowledge Management (CIKM), 2022
Stella Biderman, Edward Raff
19 Jan 2022

Protecting Intellectual Property of Language Generation APIs with Lexical Watermark
AAAI Conference on Artificial Intelligence (AAAI), 2021
Xuanli He, Xingliang Yuan, Lingjuan Lyu, Fangzhao Wu, Chenguang Wang
05 Dec 2021

Virtual Data Augmentation: A Robust and General Framework for Fine-tuning Pre-trained Models
Kun Zhou, Wayne Xin Zhao, Sirui Wang, Fuzheng Zhang, Wei Wu, Ji-Rong Wen
13 Sep 2021

Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs
International Conference on Computational Linguistics (COLING), 2021
Xingliang Yuan, Xuanli He, Lingjuan Lyu, Zhuang Li, Gholamreza Haffari
29 Aug 2021

Killing One Bird with Two Stones: Model Extraction and Attribute Inference Attacks against BERT-based APIs
Chen Chen, Xuanli He, Lingjuan Lyu, Fangzhao Wu
23 May 2021

Membership Inference Attacks on Knowledge Graphs
Yu Wang, Lifu Huang, Philip S. Yu, Lichao Sun
16 Apr 2021