The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
arXiv: 2405.11226

18 May 2024
Ruitao Chen
Liwei Wang

Papers citing "The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback"

7 / 7 papers shown
Annotation-Efficient Preference Optimization for Language Model Alignment
Yuu Jinnai
Ukyo Honda
23
0
0
22 May 2024
Offline Multi-task Transfer RL with Representational Penalization
Avinandan Bose
S. S. Du
Maryam Fazel
OffRL
35
12
0
19 Feb 2024
Sharing Knowledge in Multi-Task Deep Reinforcement Learning
Carlo D'Eramo
Davide Tateo
Andrea Bonarini
Marcello Restelli
Jan Peters
48
121
0
17 Jan 2024
Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation
Xiaoyu Chen
Han Zhong
Zhuoran Yang
Zhaoran Wang
Liwei Wang
113
59
0
23 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Active Multi-Task Representation Learning
Yifang Chen
S. Du
Kevin G. Jamieson
20
10
0
02 Feb 2022
Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits
Marc Abeille
Louis Faury
Clément Calauzènes
94
32
0
23 Oct 2020