arXiv:2409.13948
Aligning Language Models Using Follow-up Likelihood as Reward Signal

20 September 2024
Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li
Topic: ALM

Papers citing "Aligning Language Models Using Follow-up Likelihood as Reward Signal"

2 / 2 papers shown
OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models
Junda Wu, Xintong Li, Ruoyu Wang, Yu Xia, Yuxin Xiong, ..., Xiang Chen, B. Kveton, Lina Yao, Jingbo Shang, Julian McAuley
Topics: OffRL, LRM
31 Oct 2024
TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models
Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li
30 May 2024