ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Planting Undetectable Backdoors in Machine Learning Models
14 April 2022 · arXiv:2204.06974
Shafi Goldwasser, Michael P. Kim, Vinod Vaikuntanathan, Or Zamir
AAML

Papers citing "Planting Undetectable Backdoors in Machine Learning Models"
42 / 42 papers shown
LLM vs. SAST: A Technical Analysis on Detecting Coding Bugs of GPT4-Advanced Data Analysis
Madjid G Tehrani, Eldar Sultanow, William J. Buchanan, Mahkame Houmani, Christel H. Djaha Fodja
ELM · 18 Jun 2025
A Cryptographic Perspective on Mitigation vs. Detection in Machine Learning
Greg Gluch, Shafi Goldwasser
AAML · 28 Apr 2025
Large Language Models Can Verbatim Reproduce Long Malicious Sequences
Sharon Lin, Krishnamurthy Dvijotham, Jamie Hayes, Chongyang Shi, Ilia Shumailov, Shuang Song
AAML · 21 Mar 2025
Data Free Backdoor Attacks
Bochuan Cao, Jinyuan Jia, Chuxuan Hu, Wenbo Guo, Zhen Xiang, Jinghui Chen, Yue Liu, Dawn Song
AAML · 09 Dec 2024
Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors
Satoru Koda, I. Morikawa
AAML · 20 Nov 2024
The poison of dimensionality
Lê-Nguyên Hoang
25 Sep 2024
OATH: Efficient and Flexible Zero-Knowledge Proofs of End-to-End ML Fairness
Olive Franzese, Ali Shahin Shamsabadi, Hamed Haddadi
17 Sep 2024
Backdoor defense, learnability and obfuscation
Paul Christiano, Jacob Hilton, Victor Lecomte, Mark Xu
AAML · 04 Sep 2024
Rethinking Backdoor Detection Evaluation for Language Models
Jun Yan, Wenjie Jacky Mo, Xiang Ren, Robin Jia
ELM · 31 Aug 2024
Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits
Andis Draguns, Andrew Gritsevskiy, S. Motwani, Charlie Rogers-Smith, Jeffrey Ladish, Christian Schroeder de Witt
03 Jun 2024
Interactive Simulations of Backdoors in Neural Networks
Peter Bajcsy, Maxime Bros
21 May 2024
Trustless Audits without Revealing Data or Models
Suppakit Waiwitlikhit, Ion Stoica, Yi Sun, Tatsunori Hashimoto, Daniel Kang
MLAU · 06 Apr 2024
Cryptographic Hardness of Score Estimation
Min Jae Song
04 Apr 2024
Understanding polysemanticity in neural networks through coding theory
Simon C. Marshall, Jan H. Kirchner
FAtt, MILM, AAML · 31 Jan 2024
Performance-lossless Black-box Model Watermarking
Na Zhao, Kejiang Chen, Weiming Zhang, Neng H. Yu
11 Dec 2023
Label Poisoning is All You Need
Rishi Jha, J. Hayase, Sewoong Oh
AAML · 29 Oct 2023
Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning
Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Eugene Y. Vasserman, Pascal Hitzler
FAtt · 08 Aug 2023
Robots as AI Double Agents: Privacy in Motion Planning
Rahul Shome, Zachary Kingston, Lydia E. Kavraki
07 Aug 2023
Backdoor Attacks for In-Context Learning with Language Models
Nikhil Kandpal, Matthew Jagielski, Florian Tramèr, Nicholas Carlini
SILM, AAML · 27 Jul 2023
Rethinking Backdoor Attacks
Alaa Khaddaj, Guillaume Leclerc, Aleksandar Makelov, Kristian Georgiev, Hadi Salman, Andrew Ilyas, Aleksander Madry
SILM · 19 Jul 2023
Tools for Verifying Neural Models' Training Data
Dami Choi, Yonadav Shavit, David Duvenaud
MIALM · 02 Jul 2023
Undetectable Watermarks for Language Models
Miranda Christ, Sam Gunn, Or Zamir
WaLM · 25 May 2023
Pick your Poison: Undetectability versus Robustness in Data Poisoning Attacks
Nils Lukas, Florian Kerschbaum
07 May 2023
Single Image Backdoor Inversion via Robust Smoothed Classifiers
Mingjie Sun, Zico Kolter
AAML · 01 Mar 2023
Analyzing And Editing Inner Mechanisms Of Backdoored Language Models
Max Lamparth, Anka Reuel
KELM · 24 Feb 2023
On Feasibility of Server-side Backdoor Attacks on Split Learning
Behrad Tajalli, Oguzhan Ersoy, S. Picek
FedML, SILM · 19 Feb 2023
Threats, Vulnerabilities, and Controls of Machine Learning Based Systems: A Survey and Taxonomy
Yusuke Kawamoto, Kazumasa Miyake, K. Konishi, Y. Oiwa
18 Jan 2023
Circumventing interpretability: How to defeat mind-readers
Lee D. Sharkey
21 Dec 2022
AI Model Utilization Measurements For Finding Class Encoding Patterns
P. Bajcsy, Antonio Cardone, Chenyi Ling, Philippe Dessauw, Michael Majurski, Timothy Blattner, D. Juba, Walid Keyrouz
12 Dec 2022
Benchmarking Adversarially Robust Quantum Machine Learning at Scale
Maxwell T. West, S. Erfani, C. Leckie, M. Sevior, Lloyd C. L. Hollenberg, Muhammad Usman
AAML, OOD · 23 Nov 2022
Understanding Impacts of Task Similarity on Backdoor Attack and Detection
Di Tang, Rui Zhu, Xiaofeng Wang, Haixu Tang, Yi Chen
AAML · 12 Oct 2022
Few-shot Backdoor Attacks via Neural Tangent Kernels
J. Hayase, Sewoong Oh
12 Oct 2022
ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks
Eleanor Clifford, Ilia Shumailov, Yiren Zhao, Ross J. Anderson, Robert D. Mullins
30 Sep 2022
On the Impossible Safety of Large AI Models
El-Mahdi El-Mhamdi, Sadegh Farhadkhani, R. Guerraoui, Nirupam Gupta, L. Hoang, Rafael Pinot, Sébastien Rouault, John Stephan
30 Sep 2022
The Alignment Problem from a Deep Learning Perspective
Richard Ngo, Lawrence Chan, Sören Mindermann
30 Aug 2022
Architectural Backdoors in Neural Networks
Mikel Bober-Irizar, Ilia Shumailov, Yiren Zhao, Robert D. Mullins, Nicolas Papernot
AAML · 15 Jun 2022
Rashomon Capacity: A Metric for Predictive Multiplicity in Classification
Hsiang Hsu, Flavio du Pin Calmon
02 Jun 2022
SafeNet: The Unreasonable Effectiveness of Ensembles in Private Collaborative Learning
Harsh Chaudhari, Matthew Jagielski, Alina Oprea
20 May 2022
Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A. Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, Fabio Roli
AAML · 04 May 2022
Continuous LWE is as Hard as LWE & Applications to Learning Gaussian Mixtures
A. Gupte, Neekon Vafa, Vinod Vaikuntanathan
06 Apr 2022
A Non-Expert's Introduction to Data Ethics for Mathematicians
M. A. Porter
FaML · 18 Jan 2022
Detecting Trojaned DNNs Using Counterfactual Attributions
Karan Sikka, Indranil Sur, Susmit Jha, Anirban Roy, Ajay Divakaran
AAML · 03 Dec 2020