ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.12621
  4. Cited By
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs

Reflect-RL: Two-Player Online RL Fine-Tuning for LMs

20 February 2024
Runlong Zhou
Simon S. Du
Beibin Li
    OffRL
ArXivPDFHTML

Papers citing "Reflect-RL: Two-Player Online RL Fine-Tuning for LMs"

6 / 6 papers shown
Title
An Interactive Agent Foundation Model
An Interactive Agent Foundation Model
Zane Durante
Bidipta Sarkar
Ran Gong
Rohan Taori
Yusuke Noda
...
Katsushi Ikeuchi
Fei-Fei Li
Jianfeng Gao
Naoki Wake
Qiuyuan Huang
LM&Ro
AI4CE
LLMAG
86
14
0
08 Feb 2024
True Knowledge Comes from Practice: Aligning LLMs with Embodied
  Environments via Reinforcement Learning
True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
Weihao Tan
Wentao Zhang
Shanqi Liu
Longtao Zheng
Xinrun Wang
Bo An
OffRL
31
16
0
25 Jan 2024
Self-Rewarding Language Models
Self-Rewarding Language Models
Weizhe Yuan
Richard Yuanzhe Pang
Kyunghyun Cho
Xian Li
Sainbayar Sukhbaatar
Jing Xu
Jason Weston
ReLM
SyDa
ALM
LRM
218
291
0
18 Jan 2024
Generative Agents: Interactive Simulacra of Human Behavior
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
215
1,701
0
07 Apr 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Fine-Tuning Language Models from Human Preferences
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
275
1,561
0
18 Sep 2019
1