ResearchTrend.AI
arXiv: 2402.07221
The Reasons that Agents Act: Intention and Instrumental Goals

11 February 2024
Francis Rhys Ward
Matt MacDermott
Francesco Belardinelli
Francesca Toni
Tom Everitt
    AI4CE

Papers citing "The Reasons that Agents Act: Intention and Instrumental Goals"

11 / 11 papers shown
OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation
Yichen Wu
Xudong Pan
Geng Hong
Min Yang
LLMAG
29
0
0
18 Apr 2025
Higher-Order Belief in Incomplete Information MAIDs
Jack Foxabbott
Rohan Subramani
Francis Rhys Ward
36
0
0
08 Mar 2025
Measuring Goal-Directedness
Matt MacDermott
James Fox
Francesco Belardinelli
Tom Everitt
78
1
0
06 Dec 2024
From Imitation to Introspection: Probing Self-Consciousness in Language Models
Sirui Chen
Shu Yu
Shengjie Zhao
Chaochao Lu
MILM
LRM
30
1
0
24 Oct 2024
Evaluating Language Model Character Traits
Francis Rhys Ward
Zejia Yang
Alex Jackson
Randy Brown
Chandler Smith
Grace Colverd
Louis Thomson
Raymond Douglas
Patrik Bartak
Andrew Rowan
32
0
0
05 Oct 2024
Possible principles for aligned structure learning agents
Lancelot Da Costa
Tomáš Gavenčiak
David Hyland
Mandana Samiei
Cristian Dragos-Manta
Candice Pattisapu
Adeel Razi
Karl J. Friston
16
1
0
30 Sep 2024
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij
Felix Hofstätter
Ollie Jaffe
Samuel F. Brown
Francis Rhys Ward
ELM
30
22
0
11 Jun 2024
Robust agents learn causal world models
Jonathan G. Richens
Tom Everitt
OOD
111
34
0
16 Feb 2024
Honesty Is the Best Policy: Defining and Mitigating AI Deception
Francis Rhys Ward
Francesco Belardinelli
Francesca Toni
Tom Everitt
110
27
0
03 Dec 2023
In-context Learning and Induction Heads
Catherine Olsson
Nelson Elhage
Neel Nanda
Nicholas Joseph
Nova DasSarma
...
Tom B. Brown
Jack Clark
Jared Kaplan
Sam McCandlish
C. Olah
240
453
0
24 Sep 2022
User Tampering in Reinforcement Learning Recommender Systems
Charles Evans
Atoosa Kasirzadeh
OffRL
AAML
79
39
0
09 Sep 2021