Title |
---|
Tool-Planner: Task Planning with Clusters across Multiple Tools. Yanming Liu, Xinyue Peng, Jiannan Cao, Xuhong Zhang, Sheng Cheng, Xun Wang, Jianwei Yin, Tianyu Du |
Transfer Q Star: Principled Decoding for LLM Alignment. Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang |
Offline Regularised Reinforcement Learning for Large Language Models Alignment. Pierre Harvey Richemond, Yunhao Tang, Daniel Guo, Daniele Calandriello, M. G. Azar, ..., Gil Shamir, Rishabh Joshi, Tianqi Liu, Rémi Munos, Bilal Piot |