WebGPT: Browser-assisted question-answering with human feedback

17 December 2021

Tyna Eloundou

Papers citing "WebGPT: Browser-assisted question-answering with human feedback"

50 / 1,123 papers shown

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

380

25 Oct 2024

Infogent: An Agent-Based Framework for Web Information Aggregation

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Zhenhailong Wang

259

24 Oct 2024

Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies

547

24 Oct 2024

IPL: Leveraging Multimodal Large Language Models for Intelligent Product Listing

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

136

22 Oct 2024

Beyond Retrieval: Generating Narratives in Conversational Recommender Systems

The Web Conference (WWW), 2024

292

22 Oct 2024

AdvAgent: Controllable Blackbox Red-teaming on Web Agents

176

22 Oct 2024

RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

International Conference on Learning Representations (ICLR), 2024

Juanzi Li

345

21 Oct 2024

ComPO: Community Preferences for Language Model Personalization

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

263

21 Oct 2024

A Survey of Conversational Search

552

21 Oct 2024

Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Kaige Xie

Philippe Laban

Prafulla Kumar Choubey

Caiming Xiong

Chien-Sheng Wu

171

20 Oct 2024

Personalized Adaptation via In-Context Preference Learning

108

17 Oct 2024

ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise

168

17 Oct 2024

RescueADI: Adaptive Disaster Interpretation in Remote Sensing Images with Autonomous Agents

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024

Zhuoran Liu

Danpei Zhao

Bo Yuan

322

17 Oct 2024

Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching

Yanyong Zhang

316

17 Oct 2024

Divide-Verify-Refine: Can LLMs Self-Align with Complex Instructions?

Annual Meeting of the Association for Computational Linguistics (ACL), 2024

Hui Liu

Qi He

Suhang Wang

281

16 Oct 2024

On the Capacity of Citation Generation by Large Language Models

China Conference on Information Retrieval (CIR), 2024

211

15 Oct 2024

Ada-K Routing: Boosting the Efficiency of MoE-based LLMs

292

14 Oct 2024

MisinfoEval: Generative AI in the Era of "Alternative Facts"

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

271

13 Oct 2024

LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

203

12 Oct 2024

Retrieving Contextual Information for Long-Form Question Answering using Weak Supervision

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

204

11 Oct 2024

Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents

...

244

11 Oct 2024

Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both

1.1K

11 Oct 2024

Understanding the Interplay between Parametric and Contextual Knowledge for Large Language Models

Sitao Cheng

Liangming Pan

Xunjian Yin

Xinyi Wang

William Yang Wang

KELM

242

10 Oct 2024

Agents Thinking Fast and Slow: A Talker-Reasoner Architecture

Konstantina Christakopoulou

Shibl Mourad

Maja Matarić

LLMAG

226

10 Oct 2024

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

International Conference on Learning Representations (ICLR), 2024

Amrith Rajagopal Setlur

Rishabh Agarwal

Aviral Kumar

396

161

10 Oct 2024

Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction

International Conference on Learning Representations (ICLR), 2024

...

Michael Bronstein

Avishek Joey Bose

283

10 Oct 2024

AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

207

10 Oct 2024

From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions

International Conference on Learning Representations (ICLR), 2024

Shuaiqiang Wang

Jun Xu

Ji-Rong Wen

355

10 Oct 2024

Self-Boosting Large Language Models with Synthetic Preference Data

International Conference on Learning Representations (ICLR), 2024

Qingxiu Dong

Zhifang Sui

248

09 Oct 2024

ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents

328

09 Oct 2024

Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond

Shanshan Han

607

09 Oct 2024

Uncovering Factor Level Preferences to Improve Human-Model Alignment

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

380

09 Oct 2024

TinyClick: Single-Turn Agent for Empowering GUI Automation

412

09 Oct 2024

ToolBridge: An Open-Source Dataset to Equip LLMs with External Tool Capabilities

129

08 Oct 2024

Integrating Planning into Single-Turn Long-Form Text Generation

Yi Liang

You Wu

Honglei Zhuang

Li Chen

Jiaming Shen

...

Zhen Qin

239

08 Oct 2024

Retrieving, Rethinking and Revising: The Chain-of-Verification Can Improve Retrieval Augmented Generation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

150

08 Oct 2024

AgentSquare: Automatic LLM Agent Search in Modular Design Space

International Conference on Learning Representations (ICLR), 2024

Yu Li

Yong Li

532

08 Oct 2024

Driving with Regulation: Trustworthy and Interpretable Decision-Making for Autonomous Driving with Retrieval-Augmented Reasoning

418

07 Oct 2024

LRHP: Learning Representations for Human Preferences via Preference Pairs

Jingbo Zhu

316

06 Oct 2024

Identification des paramètres d'un modèle logistique en dynamique des populations avec sortie affine

Messaoud Souilah

Imene Sabira Soualah

104

06 Oct 2024

Aligning LLMs with Individual Preferences via Interaction

International Conference on Computational Linguistics (COLING), 2024

May Fung

Heng Ji

345

04 Oct 2024

CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning

314

03 Oct 2024

MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

International Conference on Learning Representations (ICLR), 2024

984

03 Oct 2024

Evaluating Robustness of Reward Models for Mathematical Reasoning

Sunghwan Kim

Jinyoung Yeo

199

02 Oct 2024

HelpSteer2-Preference: Complementing Ratings with Preferences

International Conference on Learning Representations (ICLR), 2024

Zhilin Wang

Yi Dong

460

103

02 Oct 2024

HybridFlow: A Flexible and Efficient RLHF Framework

European Conference on Computer Systems (EuroSys), 2024

Wang Zhang

Haibin Lin

659

1,008

28 Sep 2024

Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation

...

Juncheng Li

Hao Jiang

Haoyuan Li

Yueting Zhuang

MLLM ALM

108

27 Sep 2024

Open-World Evaluation for Retrieving Diverse Perspectives

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Hung-Ting Chen

Eunsol Choi

363

26 Sep 2024

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

International Conference on Learning Representations (ICLR), 2024

Qining Zhang

Lei Ying

OffRL

483

25 Sep 2024

Analyzing Probabilistic Methods for Evaluating Agent Capabilities

309

24 Sep 2024