2410.13852
Cited By
v1
v2 (latest)
Retrospective Learning from Interactions
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
17 October 2024
Zizhao Chen
Mustafa Omer Gul
Yiwei Chen
Gloria Geng
Anne Wu
Yoav Artzi
LRM
HuggingFace (9 upvotes)
Papers citing "Retrospective Learning from Interactions"
38 / 38 papers shown
Context informs pragmatic interpretation in vision-language models
A. W. M. Tan
Ben Prystawski
Veronica Boyce
Michael C. Frank
ReLM
LRM
163
0
0
05 Nov 2025
The Era of Real-World Human Interaction: RL from User Conversations
Chuanyang Jin
Jing Xu
Bo Liu
Leitian Tao
O. Yu. Golovneva
Tianmin Shu
Wenting Zhao
Xian Li
Jason Weston
OffRL
72
1
0
29 Sep 2025
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst
Davide Mazzaccara
Antonia Schmidt
Michael Sullivan
Filippo Momentè
...
Alexander Koller
Oliver Lemon
David Schlangen
Mario Giulianelli
Alessandro Suglia
OffRL
378
0
0
11 Apr 2025
CoGen: Learning from Feedback with Coupled Comprehension and Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Mustafa Omer Gul
Yoav Artzi
195
10
0
28 Aug 2024
Prover-Verifier Games improve legibility of LLM outputs
Jan Hendrik Kirchner
Yining Chen
Harri Edwards
Jan Leike
Nat McAleese
Yuri Burda
LRM
AAML
214
50
0
18 Jul 2024
What matters when building vision-language models?
Neural Information Processing Systems (NeurIPS), 2024
Hugo Laurençon
Léo Tronchon
Matthieu Cord
Victor Sanh
VLM
232
266
0
03 May 2024
Aligning LLM Agents by Learning Latent Preference from User Edits
Ge Gao
Alexey Taymanov
Eduardo Salinas
Paul Mineiro
Dipendra Kumar Misra
LLMAG
239
47
0
23 Apr 2024
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
E. Zelikman
Georges Harik
Yijia Shao
Varuna Jayasiri
Nick Haber
Noah D. Goodman
LLMAG
ReLM
LRM
536
198
0
14 Mar 2024
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Ming Li
Lichang Chen
Jiuhai Chen
Shwai He
Jiuxiang Gu
Wanrong Zhu
367
76
0
15 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh
Winnie Xu
Niklas Muennighoff
Dan Jurafsky
Douwe Kiela
694
790
0
02 Feb 2024
Self-Rewarding Language Models
Weizhe Yuan
Richard Yuanzhe Pang
Kyunghyun Cho
Xian Li
Sainbayar Sukhbaatar
Jing Xu
Jason Weston
ReLM
SyDa
ALM
LRM
782
440
0
18 Jan 2024
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
International Conference on Machine Learning (ICML), 2023
Collin Burns
Pavel Izmailov
Jan Hendrik Kirchner
Bowen Baker
Leo Gao
...
Adrien Ecoffet
Manas Joglekar
Jan Leike
Ilya Sutskever
Jeff Wu
ELM
284
374
0
14 Dec 2023
Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend Existing Ones?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dominic Petrak
N. Moosavi
Ye Tian
Nikolai Rozanov
Iryna Gurevych
158
8
0
24 Oct 2023
Benchmarking and Improving Generator-Validator Consistency of Language Models
Xiang Lisa Li
Vaishnavi Shrivastava
Siyan Li
Tatsunori Hashimoto
Abigail Z. Jacobs
200
40
0
03 Oct 2023
Leveraging Implicit Feedback from Deployment Data in Dialogue
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Richard Yuanzhe Pang
Stephen Roller
Dong Wang
He He
Jason Weston
238
12
0
26 Jul 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Neural Information Processing Systems (NeurIPS), 2023
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
727
6,327
0
29 May 2023
Continually Improving Extractive QA via Human Feedback
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ge Gao
Hung-Ting Chen
Yoav Artzi
Eunsol Choi
209
14
0
21 May 2023
Self-Refine: Iterative Refinement with Self-Feedback
Neural Information Processing Systems (NeurIPS), 2023
Aman Madaan
Niket Tandon
Prakhar Gupta
Skyler Hallinan
Luyu Gao
...
Bodhisattwa Prasad Majumder
Katherine Hermann
Sean Welleck
Amir Yazdanbakhsh
Peter Clark
ReLM
LRM
DiffM
652
2,461
0
30 Mar 2023
Training Language Models with Language Feedback at Scale
Jérémy Scheurer
Jon Ander Campos
Tomasz Korbak
Jun Shern Chan
Angelica Chen
Dong Wang
Ethan Perez
ALM
245
120
0
28 Mar 2023
Continual Learning for Instruction Following from Realtime Feedback
Neural Information Processing Systems (NeurIPS), 2022
Alane Suhr
Yoav Artzi
232
20
0
19 Dec 2022
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai
Saurav Kadavath
Sandipan Kundu
Amanda Askell
John Kernion
...
Dario Amodei
Nicholas Joseph
Sam McCandlish
Tom B. Brown
Jared Kaplan
SyDa
MoMe
594
2,189
0
15 Dec 2022
Abstract Visual Reasoning with Tangram Shapes
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Anya Ji
Noriyuki Kojima
N. Rush
Alane Suhr
Wai Keen Vong
Robert D. Hawkins
Yoav Artzi
LRM
165
50
0
29 Nov 2022
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning
Neural Information Processing Systems (NeurIPS), 2022
Haokun Liu
Derek Tam
Mohammed Muqeeth
Jay Mohta
Tenghao Huang
Joey Tianyi Zhou
Colin Raffel
373
1,126
0
11 May 2022
Training language models to follow instructions with human feedback
Neural Information Processing Systems (NeurIPS), 2022
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
1.9K
16,867
0
04 Mar 2022
Analysis of Language Change in Collaborative Instruction Following
Anna Effenberger
Eva Yan
Rhia Singh
Alane Suhr
Yoav Artzi
131
13
0
09 Sep 2021
Learning to Give Checkable Answers with Prover-Verifier Games
Cem Anil
Guodong Zhang
Yuhuai Wu
Roger C. Grosse
194
19
0
27 Aug 2021
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Transactions of the Association for Computational Linguistics (TACL), 2021
Noriyuki Kojima
Alane Suhr
Yoav Artzi
155
28
0
10 Aug 2021
LoRA: Low-Rank Adaptation of Large Language Models
International Conference on Learning Representations (ICLR), 2021
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
1.5K
14,676
0
17 Jun 2021
Learning Rewards from Linguistic Feedback
T. Sumers
Mark K. Ho
Robert D. Hawkins
Karthik Narasimhan
Thomas Griffiths
276
59
0
30 Sep 2020
Empirica: a virtual lab for high-throughput macro-level experiments
Abdullah Almaatouq
Joshua P. Becker
J. Houghton
Nicolas Paton
Duncan J. Watts
Mark E. Whiting
203
50
0
19 Jun 2020
Characterizing the dynamics of learning in repeated reference games
Cognitive Sciences (CS), 2019
Robert D. Hawkins
Michael C. Frank
Noah D. Goodman
144
60
0
16 Dec 2019
Continual adaptation for efficient machine communication
Conference on Computational Natural Language Learning (CoNLL), 2019
Robert D. Hawkins
Minae Kwon
Dorsa Sadigh
Noah D. Goodman
CLL
178
37
0
22 Nov 2019
When Does Label Smoothing Help?
Neural Information Processing Systems (NeurIPS), 2019
Rafael Müller
Simon Kornblith
Geoffrey E. Hinton
UQCV
666
2,180
0
06 Jun 2019
Learning from Dialogue after Deployment: Feed Yourself, Chatbot!
Braden Hancock
Antoine Bordes
Pierre-Emmanuel Mazaré
Jason Weston
430
210
0
16 Jan 2019
Ray: A Distributed Framework for Emerging AI Applications
Philipp Moritz
Robert Nishihara
Stephanie Wang
Alexey Tumanov
Richard Liaw
...
Melih Elibol
Zongheng Yang
William Paul
Sai Li
Ion Stoica
GNN
389
1,450
0
16 Dec 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
1.1K
23,432
0
20 Jul 2017
Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
Dipendra Kumar Misra
John Langford
Yoav Artzi
218
249
0
28 Apr 2017
Dialogue Learning With Human-In-The-Loop
Jiwei Li
Alexander H. Miller
S. Chopra
Marc'Aurelio Ranzato
Jason Weston
OffRL
488
140
0
29 Nov 2016