ResearchTrend.AI
arXiv: 2407.12999
Securing the Future of GenAI: Policy and Technology

21 May 2024
Mihai Christodorescu
Craven
S. Feizi
Neil Zhenqiang Gong
Mia Hoffmann
Somesh Jha
Zhengyuan Jiang
Mehrdad Saberi
John C. Mitchell
Jessica Newman
Emelia Probasco
Yanjun Qi
Khawaja Shams
Turek
    SILM

Papers citing "Securing the Future of GenAI: Policy and Technology"

8 / 8 papers shown
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
David Dalrymple
Joar Skalse
Yoshua Bengio
Stuart J. Russell
Max Tegmark
...
Clark Barrett
Ding Zhao
Zhi-Xuan Tan
Jeannette Wing
Joshua Tenenbaum
44
51
0
10 May 2024
Watermark-based Detection and Attribution of AI-Generated Content
Zhengyuan Jiang
Moyang Guo
Yuepeng Hu
Neil Zhenqiang Gong
14
5
0
05 Apr 2024
Publicly-Detectable Watermarking for Language Models
Jaiden Fairoze
Sanjam Garg
Somesh Jha
Saeed Mahloujifar
Mohammad Mahmoody
Mingyuan Wang
WaLM
139
45
0
27 Oct 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
138
139
0
16 Oct 2023
Toy Models of Superposition
Nelson Elhage
Tristan Hume
Catherine Olsson
Nicholas Schiefer
T. Henighan
...
Sam McCandlish
Jared Kaplan
Dario Amodei
Martin Wattenberg
C. Olah
AAML
MILM
117
314
0
21 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
216
327
0
23 Aug 2022
Diffusion Models for Adversarial Purification
Weili Nie
Brandon Guo
Yujia Huang
Chaowei Xiao
Arash Vahdat
Anima Anandkumar
WIGM
184
410
0
16 May 2022
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI
FaML
219
3,658
0
28 Feb 2017