Adversarial Policies Beat Superhuman Go AIs

International Conference on Machine Learning (ICML), 2023

1 November 2022

Adam Gleave

Kellin Pelrine

Papers citing "Adversarial Policies Beat Superhuman Go AIs"

Outbidding and Outbluffing Elite Humans: Mastering Liar's Poker via Self-Play and Reinforcement Learning

163

05 Nov 2025

Look-ahead Reasoning with a Learned Model in Imperfect Information Games

Ondřej Kubíček

Viliam Lisý

LRM

104

06 Oct 2025

Relevance-Zone Reduction in Game Solving

01 Oct 2025

Virtual Agent Economies

William A. Cunningham

Iason Gabriel

Simon Osindero

180

12 Sep 2025

LLM world models are mental: Output layer evidence of brittle world model use in LLM mechanical reasoning

Cole Robertson

Philip Wolff

21 Jul 2025

TrojanTO: Action-Level Backdoor Attacks against Trajectory Optimization Models

234

15 Jun 2025

The Structural Safety Generalization Problem

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

396

13 Apr 2025

UNIDOOR: A Universal Framework for Action-Level Backdoor Attacks in Deep Reinforcement Learning

339

28 Jan 2025

Demystifying MuZero Planning: Interpreting the Learned Model

IEEE Transactions on Artificial Intelligence (IEEE TAI), 2024

283

07 Nov 2024

Bridging Local and Global Knowledge via Transformer in Board Games

International Joint Conference on Artificial Intelligence (IJCAI), 2024

250

07 Oct 2024

Scaling Trends in Language Model Robustness

647

25 Jul 2024

Games of Knightian Uncertainty as AGI testbeds

Spyridon Samothrakis

Dennis J. N. J. Soemers

Damian Machlanski

284

26 Jun 2024

Open-Endedness is Essential for Artificial Superhuman Intelligence

International Conference on Machine Learning (ICML), 2024

Edward Hughes

Michael Dennis

Jack Parker-Holder

Feryal M. P. Behbahani

338

06 Jun 2024

SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems

Conference on Computer and Communications Security (CCS), 2024

Yuwen Pu

281

06 Feb 2024

Black-Box Access is Insufficient for Rigorous AI Audits

Conference on Fairness, Accountability and Transparency (FAccT), 2024

...

Dylan Hadfield-Menell

AAML

560

133

25 Jan 2024

Multi-Agent Diagnostics for Robustness via Illuminated Diversity

Adaptive Agents and Multi-Agent Systems (AAMAS), 2024

330

24 Jan 2024

Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play

Adaptive Agents and Multi-Agent Systems (AAMAS), 2023

202

28 Nov 2023

Managing extreme AI risks amid rapid progress

...

351

26 Oct 2023

On existence, uniqueness and scalability of adversarial robustness measures for AI classifiers

I. Horenko

AAML

204

19 Oct 2023

Combining Deep Reinforcement Learning and Search with Generative Models for Game-Theoretic Opponent Modeling

International Joint Conference on Artificial Intelligence (IJCAI), 2023

311

01 Feb 2023

Impartial Games: A Challenge for Reinforcement Learning

Bei Zhou

Søren Riis

367

25 May 2022