Title |
---|
I-SHEEP: Self-Alignment of LLM from Scratch through an Iterative Self-Enhancement Paradigm. Yiming Liang, Ge Zhang, Xingwei Qu, Tianyu Zheng, Jiawei Guo, ..., Jiaheng Liu, Chenghua Lin, Lei Ma, Wenhao Huang, Jiajun Zhang |
RewardBench: Evaluating Reward Models for Language Modeling. Nathan Lambert, Valentina Pyatkin, Jacob Morrison, Lester James Validad Miranda, Bill Yuchen Lin, ..., Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hanna Hajishirzi |