arXiv: 2402.07510
Secret Collusion among AI Agents: Multi-Agent Deception via Steganography

Neural Information Processing Systems (NeurIPS), 2024
12 February 2024
S. Motwani
Mikhail Baranchuk
Martin Strohmeier
Vijay Bolina
Juil Sock
Lewis Hammond
Christian Schroeder de Witt

Papers citing "Secret Collusion among AI Agents: Multi-Agent Deception via Steganography"

11 / 11 papers shown
TAMAS: Benchmarking Adversarial Risks in Multi-Agent LLM Systems
Ishan Kavathekar
Hemang Jain
Ameya Rathod
Ponnurangam Kumaraguru
Tanuja Ganu
LLMAG, AAML
442
1
0
07 Nov 2025
An Economy of AI Agents
Gillian K. Hadfield
Andrew Koh
293
13
0
01 Sep 2025
SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation
ACM Asia Conference on Computer and Communications Security (AsiaCCS), 2025
Yashothara Shanmugarasa
Ming Ding
M. Chamikara
Thierry Rakotoarivelo
PILM, AILaw
555
21
0
15 Jun 2025
Large language models can learn and generalize steganographic chain-of-thought under process supervision
Joey Skaf
Luis Ibanez-Lissen
Robert McCarthy
Connor Watts
Vasil Georgiv
...
Lorena Gonzalez-Manzano
David Lindner
Cameron Tice
Edward James Young
Puria Radmard
LRM
213
18
0
02 Jun 2025
TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent
Dominik Meier
Jan Philip Wahle
Paul Röttger
Terry Ruas
Bela Gipp
PILM
423
2
0
26 May 2025
The Problem of Algorithmic Collisions: Mitigating Unforeseen Risks in a Connected World
Maurice Chiodo
Dennis Müller
191
2
0
26 May 2025
Deceptive Automated Interpretability: Language Models Coordinating to Fool Oversight Systems
Simon Lermen
Mateusz Dziemian
Natalia Pérez-Campanero Antolín
449
4
0
10 Apr 2025
Exploiting Fine-Grained Skip Behaviors for Micro-Video Recommendation
AAAI Conference on Artificial Intelligence (AAAI), 2025
Sanghyuck Lee
Sangkeun Park
Jaesung Lee
380
3
0
04 Apr 2025
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking
Sebastian Farquhar
Vikrant Varma
David Lindner
David Elson
Caleb Biddulph
Ian Goodfellow
Rohin Shah
526
14
0
22 Jan 2025
Hidden in Plain Text: Emergence & Mitigation of Steganographic Collusion in LLMs
Yohan Mathew
Ollie Matthews
Robert McCarthy
Joan Velja
Christian Schroeder de Witt
Dylan R. Cope
Nandi Schoots
377
30
0
02 Oct 2024
Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits
Andis Draguns
Andrew Gritsevskiy
S. Motwani
Charlie Rogers-Smith
Jeffrey Ladish
Christian Schroeder de Witt
507
5
0
03 Jun 2024