Understanding the Effects of RLHF on LLM Generalisation and Diversity. International Conference on Learning Representations (ICLR), 2023.
Llama 2: Open Foundation and Fine-Tuned Chat Models. Hugo Touvron, Louis Martin, Kevin R. Stone, Peter Albert, Amjad Almahairi, ..., Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom.
Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Neural Information Processing Systems (NeurIPS), 2023.
Measuring Massive Multitask Language Understanding. International Conference on Learning Representations (ICLR), 2020.
Energy and Policy Considerations for Deep Learning in NLP. Annual Meeting of the Association for Computational Linguistics (ACL), 2019.