ResearchTrend.AI
COPR: Continual Human Preference Learning via Optimal Policy Regularization

22 February 2024
Han Zhang
Lin Gui
Yu Lei
Yuanzhao Zhai
Yehong Zhang
Yulan He
Hui Wang
Yue Yu
Kam-Fai Wong
Bin Liang
Ruifeng Xu
    CLL

Papers citing "COPR: Continual Human Preference Learning via Optimal Policy Regularization"

4 / 4 papers shown
Towards LifeSpan Cognitive Systems
Yu Wang
Chi Han
Tongtong Wu
Xiaoxin He
Wangchunshu Zhou
...
Zexue He
Wei Wang
Gholamreza Haffari
Heng Ji
Julian McAuley
KELM
CLL
71
1
0
20 Sep 2024
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM
CLL
20
14
0
10 Jun 2024
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
45
34
0
01 Jan 2023
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information
Kawin Ethayarajh
Yejin Choi
Swabha Swayamdipta
154
157
0
16 Oct 2021