WebGPT: Browser-assisted question-answering with human feedback

17 December 2021

Tyna Eloundou

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 1,123 papers shown

Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

315

19 May 2025

Pairwise Calibrated Rewards for Pluralistic Alignment

220

17 May 2025

Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Reward Design

...

349

17 May 2025

PeerGuard: Defending Multi-Agent Systems Against Backdoor Attacks Through Mutual Reasoning

IEEE International Conference on Information Reuse and Integration (IRI), 2025

Falong Fan

Xi Li

LLMAG AAML

336

16 May 2025

LLM Agents Are Hypersensitive to Nudges

Manuel Cherep

Pattie Maes

Nikhil Singh

292

16 May 2025

Demystifying AI Agents: The Final Generation of Intelligence

Kevin J McNamara

Rhea Pritham Marpu

188

15 May 2025

Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging

Hongjin Qian

Zhengyang Liang

RALM LRM

514

14 May 2025

Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach

Ethics: An International Journal of Social, Political, and Legal Philosophy (Ethics), 2025

Shannon Lodoen

Alexi Orchard

243

14 May 2025

HealthBench: Evaluating Large Language Models Towards Improved Human Health

Joaquin Quiñonero Candela

...

296

125

13 May 2025

Large Language Models for Computer-Aided Design: A Survey

391

13 May 2025

ToolACE-DEV: Self-Improving Tool Learning via Decomposition and EVolution

...

377

12 May 2025

Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving

528

12 May 2025

Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence

401

11 May 2025

Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes

342

08 May 2025

Advancing and Benchmarking Personalized Tool Invocation for LLMs

248

07 May 2025

Assessing and Enhancing the Robustness of LLM-based Multi-Agent Systems Through Chaos Engineering

Joshua Owotogbe

LLMAG

278

06 May 2025

Soft Best-of-n Sampling for Model Alignment

861

06 May 2025

RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval-Augmented Generation

Tiantian Gan

Qiyao Sun

06 May 2025

A Survey on Progress in LLM Alignment from the Perspective of Reward Design

373

05 May 2025

Sailing by the Stars: A Survey on Reward Models and Learning Strategies for Learning from Rewards

Xiaobao Wu

LRM

581

05 May 2025

Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents

Christian Schroeder de Witt

AAML AI4CE

1.1K

04 May 2025

Visual Test-time Scaling for GUI Agent Grounding

375

01 May 2025

Coral Protocol: Open Infrastructure Connecting The Internet of Agents

402

30 Apr 2025

A Domain-Agnostic Scalable AI Safety Ensuring Framework

630

29 Apr 2025

BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese

...

356

27 Apr 2025

AndroidGen: Building an Android Language Agent under Data Scarcity

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

320

27 Apr 2025

Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning

511

25 Apr 2025

TTRL: Test-Time Reinforcement Learning

...

1.3K

117

22 Apr 2025

Establishing Reliability Metrics for Reward Models in Large Language Models

287

21 Apr 2025

a1: Steep Test-time Scaling Law via Environment Augmented Generation

295

20 Apr 2025

Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo

International Conference on Learning Representations (ICLR), 2025

...

544

17 Apr 2025

Memorization vs. Reasoning: Updating LLMs with New Knowledge

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Aochong Oliver Li

Tanya Goyal

KELM

351

16 Apr 2025

BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents

274

206

16 Apr 2025

Offline Learning and Forgetting for Reasoning with Large Language Models

1.2K

15 Apr 2025

Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data

Shuai Zhao

Linchao Zhu

Yi Yang

456

14 Apr 2025

DeepTrans: Deep Reasoning Translation via Reinforcement Learning

459

14 Apr 2025

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Xing Han Lù

Amirhossein Kazemnejad

360

11 Apr 2025

TALE: A Tool-Augmented Framework for Reference-Free Evaluation of Large Language Models

373

10 Apr 2025

Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations

359

07 Apr 2025

Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling

...

373

07 Apr 2025

Building LLM Agents by Incorporating Insights from Computer Systems

363

06 Apr 2025

The Use of Gaze-Derived Confidence of Inferred Operator Intent in Adjusting Safety-Conscious Haptic Assistance

322

04 Apr 2025

On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows

Souradip Chakraborty

Mohammadreza Pourreza

...

546

02 Apr 2025

HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents

361

01 Apr 2025

Exploring the Roles of Large Language Models in Reshaping Transportation Systems: A Survey, Framework, and Roadmap

Tong Nie

Jian Sun

Wei Ma

564

27 Mar 2025

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment

International Conference on Learning Representations (ICLR), 2025

Souradip Chakraborty

Sujay Bhatt

Udari Madhushani Sehwag

368

27 Mar 2025

3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models

600

27 Mar 2025

debug-gym: A Text-Based Environment for Interactive Debugging

...

283

27 Mar 2025

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

352

26 Mar 2025

OmniNova: A General Multimodal Agent Framework

Pengfei Du

LLMAG

213

25 Mar 2025