WebGPT: Browser-assisted question-answering with human feedback

17 December 2021

Tyna Eloundou

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 1,123 papers shown

OmniNova: A General Multimodal Agent Framework

Pengfei Du

LLMAG

213

25 Mar 2025

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

334

24 Mar 2025

Video-T1: Test-Time Scaling for Video Generation

450

24 Mar 2025

ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses

Esmail Gumaan

MoE

314

23 Mar 2025

A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

769

21 Mar 2025

Parameters vs. Context: Fine-Grained Control of Knowledge Reliance in Language Models

296

20 Mar 2025

Are AI Agents interacting with Online Ads?

Andreas Stöckl

Joel Nitu

472

20 Mar 2025

Survey on Evaluation of LLM-based Agents

Michal Shmueli-Scheuer

LLMAG ELM

506

20 Mar 2025

MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

279

19 Mar 2025

A Review on Large Language Models for Visual Analytics

Navya Sonal Agarwal

Sanjay Kumar Sonbhadra

367

19 Mar 2025

MP-GUI: Modality Perception with MLLMs for GUI Understanding

Computer Vision and Pattern Recognition (CVPR), 2025

338

18 Mar 2025

Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation

442

129

14 Mar 2025

Agentic AI for Scientific Discovery: A Survey of Progress, Challenges, and Future Directions

Mourad Gridach

Jay Nanavati

Khaldoun Zine El Abidine

Lenon Mendes

Christina Mack

371

12 Mar 2025

A Survey on Knowledge-Oriented Retrieval-Augmented Generation

...

374

11 Mar 2025

Robust Multi-Objective Controlled Decoding of Large Language Models

353

11 Mar 2025

Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning

International Conference on Learning Representations (ICLR), 2025

Hyungkyu Kang

Min-hwan Oh

OffRL

337

07 Mar 2025

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

539

06 Mar 2025

ValuePilot: A Two-Phase Framework for Value-Driven Decision-Making

313

06 Mar 2025

Human Implicit Preference-Based Policy Fine-tuning for Multi-Agent Reinforcement Learning in USV Swarm

425

05 Mar 2025

AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification

449

03 Mar 2025

Dynamic Search for Inference-Time Alignment in Diffusion Models

420

03 Mar 2025

Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding

500

03 Mar 2025

Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

407

27 Feb 2025

Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

528

26 Feb 2025

Conversational Planning for Personal Plans

Konstantina Christakopoulou

240

26 Feb 2025

Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization

987

26 Feb 2025

Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

Shiven Sinha

Shashwat Goel

Ponnurangam Kumaraguru

Jonas Geiping

Matthias Bethge

Christian Schroeder de Witt

ReLM ELM LRM

484

26 Feb 2025

Larger or Smaller Reward Margins to Select Preferences for Alignment?

213

25 Feb 2025

PiCO: Peer Review in LLMs based on the Consistency Optimization

505

24 Feb 2025

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

International Conference on Learning Representations (ICLR), 2024

Youssef Attia El Hili

OffRL

502

24 Feb 2025

Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs

Giulio Zizzo

Giandomenico Cornacchia

364

24 Feb 2025

On Synthesizing Data for Context Attribution in Question Answering

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

Carolin (Haas) Lawrence

246

21 Feb 2025

A Survey of Model Architectures in Information Retrieval

585

20 Feb 2025

STaR-SQL: Self-Taught Reasoner for Text-to-SQL

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

200

20 Feb 2025

Faster WIND: Accelerating Iterative Best-of-N Distillation for LLM Alignment

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

371

20 Feb 2025

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

422

18 Feb 2025

Solving the Cold Start Problem on One's Own as an End User via Preference Transfer

Ryoma Sato

259

18 Feb 2025

Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models

428

17 Feb 2025

AgentStudio: A Toolkit for Building General Virtual Agents

International Conference on Learning Representations (ICLR), 2024

448

17 Feb 2025

CMCTS: A Constrained Monte Carlo Tree Search Framework for Mathematical Reasoning in Large Language Model

345

16 Feb 2025

CiteCheck: Towards Accurate Citation Faithfulness Detection

181

15 Feb 2025

CoT-Valve: Length-Compressible Chain-of-Thought Tuning

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

501

120

13 Feb 2025

C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Generation

452

10 Feb 2025

Optimizing Knowledge Integration in Retrieval-Augmented Generation with Self-Selection

397

10 Feb 2025

The Order Effect: Investigating Prompt Sensitivity to Input Order in LLMs

425

06 Feb 2025

Context-Aware Hierarchical Merging for Long Document Summarization

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Litu Ou

Mirella Lapata

MoMe

1.1K

03 Feb 2025

SyntheT2C: Generating Synthetic Data for Fine-Tuning Large Language Models on the Text2Cypher Task

325

28 Jan 2025

Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models

Knowledge Discovery and Data Mining (KDD), 2023

452

154

28 Jan 2025

Chain-of-Retrieval Augmented Generation

435

24 Jan 2025

Episodic memory in AI agents poses risks that should be studied and mitigated

Chad DeChant

462

20 Jan 2025