ResearchTrend.AI
Definitions of intent suitable for algorithms


8 June 2021
Hal Ashton
arXiv: 2106.04235

Papers citing "Definitions of intent suitable for algorithms"

9 / 9 papers shown
Evaluating Language Model Character Traits
Francis Rhys Ward
Zejia Yang
Alex Jackson
Randy Brown
Chandler Smith
Grace Colverd
Louis Thomson
Raymond Douglas
Patrik Bartak
Andrew Rowan
47
0
0
05 Oct 2024
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij
Felix Hofstätter
Ollie Jaffe
Samuel F. Brown
Francis Rhys Ward
ELM
52
22
0
11 Jun 2024
The Reasons that Agents Act: Intention and Instrumental Goals
Francis Rhys Ward
Matt MacDermott
Francesco Belardinelli
Francesca Toni
Tom Everitt
AI4CE
37
12
0
11 Feb 2024
Honesty Is the Best Policy: Defining and Mitigating AI Deception
Francis Rhys Ward
Francesco Belardinelli
Francesca Toni
Tom Everitt
115
27
0
03 Dec 2023
SHAPE: A Framework for Evaluating the Ethicality of Influence
Elfia Bezou-Vrakatseli
Benedikt Brückner
Luke Thorburn
TDI
40
3
0
08 Sep 2023
Experiments with Detecting and Mitigating AI Deception
Ismail Sahbane
Francis Rhys Ward
Henrik Åslund
28
1
0
26 Jun 2023
Human Control: Definitions and Algorithms
Ryan Carey
Tom Everitt
40
6
0
31 May 2023
What is Proxy Discrimination?
Michael Carl Tschantz
22
18
0
11 May 2022
Extending counterfactual accounts of intent to include oblique intent
Hal Ashton
21
3
0
07 Jun 2021