Measuring AI Ability to Complete Long Tasks

18 March 2025
Thomas Kwa
Ben West
Joel Becker
Amy Deng
Katharyn Garcia
Max Hasin
Sami Jawhar
Megan Kinniment
Nate Rush
Sydney Von Arx
Ryan Bloom
Thomas Broadley
Haoxing Du
Brian Goodrich
Nikola Jurkovic
Luke Harold Miles
Seraphina Nix
Tao R. Lin
Neev Parikh
David Rein
Lucas Jun Koba Sato
H. Wijk
Daniel M. Ziegler
Elizabeth Barnes
Lawrence Chan
    ELM
arXiv: 2503.14499

Papers citing "Measuring AI Ability to Complete Long Tasks"

4 / 4 papers shown
HiBayES: A Hierarchical Bayesian Modeling Framework for AI Evaluation Statistics
Lennart Luettgau
Harry Coppock
Magda Dubois
Christopher Summerfield
Cozmin Ududec
21
0
0
08 May 2025
Characterizing AI Agents for Alignment and Governance
Atoosa Kasirzadeh
Iason Gabriel
44
0
0
30 Apr 2025
Unraveling Human-AI Teaming: A Review and Outlook
Bowen Lou
Tian Lu
T. S. Raghu
Yingjie Zhang
26
0
0
08 Apr 2025
How to evaluate control measures for LLM agents? A trajectory from today to superintelligence
Tomek Korbak
Mikita Balesni
Buck Shlegeris
Geoffrey Irving
ELM
27
1
0
07 Apr 2025