2502.17537
Rethinking the Vulnerability of Concept Erasure and a New Method
24 February 2025
Alex D. Richardson
Lucas Beerens
Dongdong Chen
DiffM
Github (2157★)
Papers citing
"Rethinking the Vulnerability of Concept Erasure and a New Method"
43 / 43 papers shown
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
291
41
0
17 Apr 2025
ACE: Attentional Concept Erasure in Diffusion Models
Finn Carter
DiffM
273
3
0
16 Apr 2025
A Comprehensive Survey on Concept Erasure in Text-to-Image Diffusion Models
Changhoon Kim
Yanjun Qi
DiffM
387
5
0
17 Feb 2025
One-Step is Enough: Sparse Autoencoders for Text-to-Image Diffusion Models
Viacheslav Surkov
Chris Wendler
Antonio Mari
Mikhail Terekhov
Justin Deschenaux
Robert West
Çağlar Gülçehre
David Bau
VLM
460
14
0
28 Oct 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
International Conference on Learning Representations (ICLR), 2024
Jaehong Yoon
Shoubin Yu
Vaidehi Patil
Huaxiu Yao
Joey Tianyi Zhou
599
52
0
16 Oct 2024
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models
Chao Gong
Kai-xiang Chen
Zhipeng Wei
Yue Yu
Yulong Jiang
DiffM
294
63
0
17 Jul 2024
R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model
Changhoon Kim
Kyle Min
Yezhou Yang
264
42
0
25 May 2024
Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models
Yimeng Zhang
Xin Chen
Jinghan Jia
Yihua Zhang
Chongyu Fan
Jiancheng Liu
Mingyi Hong
Ke Ding
Sijia Liu
DiffM
421
108
0
24 May 2024
Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective
Xiaoxuan Han
Songlin Yang
Wei Wang
Yang Li
Jing Dong
DiffM
AAML
191
9
0
30 Apr 2024
EraseDiff: Erasing Data Influence in Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2024
Jing Wu
Trung Le
Munawar Hayat
Mehrtash Harandi
DiffM
341
13
0
11 Jan 2024
Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
European Conference on Computer Vision (ECCV), 2024
Jing Wu
Mehrtash Harandi
308
34
0
11 Jan 2024
One-Dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
Mengyao Lyu
Yuhong Yang
Haiwen Hong
Hui Chen
Xuan Jin
Yuan He
Hui Xue
Jungong Han
Guiguang Ding
DiffM
295
106
0
26 Dec 2023
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
International Conference on Learning Representations (ICLR), 2023
Chongyu Fan
Jiancheng Liu
Yihua Zhang
Eric Wong
Dennis Wei
Sijia Liu
MU
457
251
0
19 Oct 2023
To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now
European Conference on Computer Vision (ECCV), 2023
Yimeng Zhang
Jinghan Jia
Xin Chen
Chenyi Zi
Yihua Zhang
Jiancheng Liu
Ke Ding
Sijia Liu
DiffM
621
157
0
18 Oct 2023
Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models?
Yu-Lin Tsai
Chia-Yi Hsu
Chulin Xie
Chih-Hsun Lin
Jia-You Chen
Yue Liu
Pin-Yu Chen
Chia-Mu Yu
Chun-ying Huang
DiffM
249
147
0
16 Oct 2023
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
International Conference on Machine Learning (ICML), 2023
Zhi-Yi Chin
Chieh-Ming Jiang
Ching-Chun Huang
Pin-Yu Chen
Wei-Chen Chiu
DiffM
267
115
0
12 Sep 2023
Unified Concept Editing in Diffusion Models
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Rohit Gandikota
Hadas Orgad
Yonatan Belinkov
Joanna Materzyńska
David Bau
DiffM
343
285
0
25 Aug 2023
Circumventing Concept Erasure Methods For Text-to-Image Generative Models
International Conference on Learning Representations (ICLR), 2023
Minh Pham
Kelly O. Marshall
Niv Cohen
Govind Mittal
Chinmay Hegde
DiffM
194
63
0
03 Aug 2023
Understanding the Latent Space of Diffusion Models through the Lens of Riemannian Geometry
Neural Information Processing Systems (NeurIPS), 2023
Yong-Hyun Park
Mingi Kwon
J. Choi
Junghyo Jo
Youngjung Uh
DiffM
333
106
0
24 Jul 2023
Are aligned neural networks adversarially aligned?
Neural Information Processing Systems (NeurIPS), 2023
Nicholas Carlini
Milad Nasr
Christopher A. Choquette-Choo
Matthew Jagielski
Irena Gao
...
Pang Wei Koh
Daphne Ippolito
Katherine Lee
Florian Tramèr
Ludwig Schmidt
AAML
244
311
0
26 Jun 2023
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
Eric Zhang
Kai Wang
Xingqian Xu
Zinan Lin
Humphrey Shi
DiffM
317
251
0
30 Mar 2023
Your Diffusion Model is Secretly a Zero-Shot Classifier
IEEE International Conference on Computer Vision (ICCV), 2023
Alexander C. Li
Mihir Prabhudesai
Shivam Duggal
Ellis L Brown
Deepak Pathak
DiffM
VLM
647
303
0
28 Mar 2023
Text-to-Image Diffusion Models are Zero-Shot Classifiers
Neural Information Processing Systems (NeurIPS), 2023
Kevin Clark
P. Jaini
DiffM
VLM
352
147
0
27 Mar 2023
Ablating Concepts in Text-to-Image Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Nupur Kumari
Bin Zhang
Sheng-Yu Wang
Eli Shechtman
Richard Y. Zhang
Jun-Yan Zhu
VLM
390
272
0
23 Mar 2023
Erasing Concepts from Diffusion Models
IEEE International Conference on Computer Vision (ICCV), 2023
Rohit Gandikota
Joanna Materzyńska
Jaden Fiotto-Kaufman
David Bau
DiffM
417
419
0
13 Mar 2023
Automatically Auditing Large Language Models via Discrete Optimization
International Conference on Machine Learning (ICML), 2023
Erik Jones
Anca Dragan
Aditi Raghunathan
Jacob Steinhardt
227
210
0
08 Mar 2023
Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery
Neural Information Processing Systems (NeurIPS), 2023
Yuxin Wen
Neel Jain
John Kirchenbauer
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
DiffM
281
353
1
07 Feb 2023
Toward Human Readable Prompt Tuning: Kubrick's The Shining is a good movie, and a good prompt too?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Weijia Shi
Xiaochuang Han
Hila Gonen
Ari Holtzman
Yulia Tsvetkov
Luke Zettlemoyer
189
56
0
20 Dec 2022
Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Gowthami Somepalli
Vasu Singla
Micah Goldblum
Jonas Geiping
Tom Goldstein
354
415
0
07 Dec 2022
Safe Latent Diffusion: Mitigating Inappropriate Degeneration in Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
P. Schramowski
Manuel Brack
Bjorn Deiseroth
Kristian Kersting
461
436
0
09 Nov 2022
Diffusion Models: A Comprehensive Survey of Methods and Applications
ACM Computing Surveys (ACM CSUR), 2022
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Tengjiao Wang
Ming-Hsuan Yang
DiffM
MedIm
1.3K
1,839
0
02 Sep 2022
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
International Conference on Learning Representations (ICLR), 2022
Rinon Gal
Yuval Alaluf
Yuval Atzmon
Or Patashnik
Amit H. Bermano
Gal Chechik
Daniel Cohen-Or
445
2,397
0
02 Aug 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
International Conference on Machine Learning (ICML), 2022
Junnan Li
Dongxu Li
Caiming Xiong
Guosheng Lin
MLLM
BDL
VLM
CLIP
1.3K
5,628
0
28 Jan 2022
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
1.7K
20,624
0
20 Dec 2021
Gradient-based Adversarial Attacks against Text Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Chuan Guo
Alexandre Sablayrolles
Edouard Grave
Douwe Kiela
SILM
236
284
0
15 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
International Conference on Machine Learning (ICML), 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
2.0K
40,340
0
26 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
1.3K
53,970
0
22 Oct 2020
Denoising Diffusion Implicit Models
International Conference on Learning Representations (ICLR), 2020
Jiaming Song
Chenlin Meng
Stefano Ermon
VLM
DiffM
1.3K
9,995
0
06 Oct 2020
Universal Adversarial Triggers for Attacking and Analyzing NLP
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Eric Wallace
Shi Feng
Nikhil Kandpal
Matt Gardner
Sameer Singh
AAML
SILM
443
979
0
20 Aug 2019
YOLOv3: An Incremental Improvement
Joseph Redmon
Ali Farhadi
ObjD
775
23,900
0
08 Apr 2018
Categorical Reparameterization with Gumbel-Softmax
Eric Jang
S. Gu
Ben Poole
BDL
1.0K
5,883
0
03 Nov 2016
Adversarial examples in the physical world
International Conference on Learning Representations (ICLR), 2016
Alexey Kurakin
Ian Goodfellow
Samy Bengio
SILM
AAML
1.3K
6,397
0
08 Jul 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
3.0K
88,298
0
18 May 2015