Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.12999
Cited By
Securing the Future of GenAI: Policy and Technology
21 May 2024
Mihai Christodorescu
Craven
S. Feizi
Neil Zhenqiang Gong
Mia Hoffmann
Somesh Jha
Zhengyuan Jiang
Mehrdad Saberi
John C. Mitchell
Jessica Newman
Emelia Probasco
Yanjun Qi
Khawaja Shams
Turek
SILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Securing the Future of GenAI: Policy and Technology"
8 / 8 papers shown
Title
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
David Dalrymple
Joar Skalse
Yoshua Bengio
Stuart J. Russell
Max Tegmark
...
Clark Barrett
Ding Zhao
Zhi-Xuan Tan
Jeannette Wing
Joshua Tenenbaum
44
51
0
10 May 2024
Watermark-based Detection and Attribution of AI-Generated Content
Zhengyuan Jiang
Moyang Guo
Yuepeng Hu
Neil Zhenqiang Gong
16
5
0
05 Apr 2024
Publicly-Detectable Watermarking for Language Models
Jaiden Fairoze
Sanjam Garg
Somesh Jha
Saeed Mahloujifar
Mohammad Mahmoody
Mingyuan Wang
WaLM
139
45
0
27 Oct 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
144
139
0
16 Oct 2023
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
117
314
0
21 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
218
441
0
23 Aug 2022
Diffusion Models for Adversarial Purification
Weili Nie
Brandon Guo
Yujia Huang
Chaowei Xiao
Arash Vahdat
Anima Anandkumar
WIGM
184
410
0
16 May 2022
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI
FaML
225
3,658
0
28 Feb 2017
1