ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.14595
  4. Cited By
Adversaries Can Misuse Combinations of Safe Models

Adversaries Can Misuse Combinations of Safe Models

20 June 2024
Erik Jones
Anca Dragan
Jacob Steinhardt
ArXivPDFHTML

Papers citing "Adversaries Can Misuse Combinations of Safe Models"

4 / 4 papers shown
Title
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Christian Schroeder de Witt
AAML
AI4CE
46
0
0
04 May 2025
Superintelligence Strategy: Expert Version
Superintelligence Strategy: Expert Version
Dan Hendrycks
Eric Schmidt
Alexandr Wang
57
1
0
07 Mar 2025
Feedback Loops With Language Models Drive In-Context Reward Hacking
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan
Erik Jones
Meena Jagadeesan
Jacob Steinhardt
KELM
42
25
0
09 Feb 2024
Generative Agents: Interactive Simulacra of Human Behavior
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
206
1,701
0
07 Apr 2023
1