Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks

4 November 2020
Julia Kreutzer
Stefan Riezler
Carolin (Haas) Lawrence
RALM, OffRL

Papers citing "Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks"

11 / 11 papers shown
PARL: Prompt-based Agents for Reinforcement Learning
Yarik Menchaca Resendiz
Roman Klinger
LLMAG, LRM
228
0
0
24 Oct 2025
Conversational User-AI Intervention: A Study on Prompt Rewriting for Improved LLM Response Generation
Rupak Sarkar
Bahareh Sarrafzadeh
N. Chandrasekaran
Nagu Rangan
Philip Resnik
Longqi Yang
S. Jauhar
393
7
0
21 Mar 2025
TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation
Yicheng Lin
David Soto
Roberto Santana
223
6
0
02 Aug 2024
Participation in the age of foundation models
Harini Suresh
Emily Tseng
Meg Young
Mary L. Gray
Emma Pierson
Karen Levy
435
60
0
29 May 2024
Multi-User Chat Assistant (MUCA): a Framework Using LLMs to Facilitate Group Conversations
Manqing Mao
Paishun Ting
Yijian Xiang
Mingyang Xu
Julia Chen
Jianzhe Lin
LLMAG
476
14
0
10 Jan 2024
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
588
27
0
28 Aug 2023
Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Emmy Liu
S. Suri
Tong Mu
Allan Zhou
Chelsea Finn
LLMAG, LM&Ro
284
4
0
14 Jun 2023
SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shumin Deng
Shengyu Mao
Ningyu Zhang
Bryan Hooi
285
6
0
23 May 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
684
286
0
03 Oct 2022
Learning Non-Autoregressive Models from Search for Unsupervised Sentence Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Puyuan Liu
Chenyang Huang
Lili Mou
237
20
0
28 May 2022
A Survey of Human-in-the-loop for Machine Learning
Xingjiao Wu
Luwei Xiao
Yixuan Sun
Junhang Zhang
Tianlong Ma
Liangbo He
SyDa
655
717
0
02 Aug 2021