On the Trustworthiness Landscape of State-of-the-art Generative Models:
A Survey and Outlook

On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook

31 July 2023

Chengyu Wang

Cen Chen

Papers citing "On the Trustworthiness Landscape of State-of-the-art Generative Models: A Survey and Outlook"

11 / 11 papers shown

Title
ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger Jiazhao Li Yijin Yang Zhuofeng Wu V. Vydiswaran Chaowei Xiao SILM 41 41 0 27 Apr 2023
The Internal State of an LLM Knows When It's Lying A. Azaria Tom Michael Mitchell HILM 210 297 0 26 Apr 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models Xuezhi Wang Jason W. Wei Dale Schuurmans Quoc Le Ed H. Chi Sharan Narang Aakanksha Chowdhery Denny Zhou ReLM BDL LRM AI4CE 297 3,163 0 21 Mar 2022
Palette: Image-to-Image Diffusion Models Chitwan Saharia William Chan Huiwen Chang Chris A. Lee Jonathan Ho Tim Salimans David J. Fleet Mohammad Norouzi DiffM VLM 320 1,570 0 10 Nov 2021
Gradient-based Adversarial Attacks against Text Transformers Chuan Guo Alexandre Sablayrolles Hervé Jégou Douwe Kiela SILM 95 225 0 15 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP Timo Schick Sahana Udupa Hinrich Schütze 254 374 0 28 Feb 2021
Zero-Shot Text-to-Image Generation Aditya A. Ramesh Mikhail Pavlov Gabriel Goh Scott Gray Chelsea Voss Alec Radford Mark Chen Ilya Sutskever VLM 253 4,735 0 24 Feb 2021
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models Alex Tamkin Miles Brundage Jack Clark Deep Ganguli AILaw ELM 192 248 0 04 Feb 2021
Extracting Training Data from Large Language Models Nicholas Carlini Florian Tramèr Eric Wallace Matthew Jagielski Ariel Herbert-Voss ... Tom B. Brown D. Song Ulfar Erlingsson Alina Oprea Colin Raffel MLAU SILM 264 1,798 0 14 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference Timo Schick Hinrich Schütze 248 1,382 0 21 Jan 2020
Generating Natural Language Adversarial Examples M. Alzantot Yash Sharma Ahmed Elgohary Bo-Jhang Ho Mani B. Srivastava Kai-Wei Chang AAML 230 909 0 21 Apr 2018