Security practices in AI development

AI & Society (AS), 2025
17 May 2025
Petr Spelda
Vit Stritecky

Papers citing "Security practices in AI development"

3 / 3 papers shown
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
Mrinank Sharma
Meg Tong
Jesse Mu
Jerry Wei
Jorrit Kruthoff
...
Ruiqi Zhong
Giulio Zhou
Jan Leike
Jared Kaplan
Ethan Perez
31 Jan 2025
Open Problems in Machine Unlearning for AI Safety
Fazl Barez
Tingchen Fu
Christian Schroeder de Witt
Stephen Casper
Amartya Sanyal
...
David M. Krueger
Sören Mindermann
José Hernandez-Orallo
Mor Geva
Y. Gal
MU
10 Jan 2025
Tamper-Resistant Safeguards for Open-Weight LLMs
International Conference on Learning Representations (ICLR), 2024
Rishub Tamirisa
Bhrugu Bharathi
Long Phan
Andy Zhou
Alice Gatti
...
Andy Zou
Dawn Song
Bo Li
Dan Hendrycks
Mantas Mazeika
AAML, MU
01 Aug 2024