2408.16751
Cited By
A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models
29 August 2024
Yi-Lin Tuan
William Yang Wang
Papers citing "A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models"
4 / 4 papers shown
Title
Teaching Large Language Models to Reason through Learning and Forgetting
Tianwei Ni
Allen Nie
Sapana Chaudhary
Yao Liu
Huzefa Rangwala
Rasool Fakoor
ReLM
CLL
LRM
130
0
0
15 Apr 2025
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
311
11,915
0
04 Mar 2022
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
280
1,587
0
18 Sep 2019
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
230
31,253
0
16 Jan 2013