Title |
---|
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning. Yilun Kong, Hangyu Mao, Qi Zhao, Bin Zhang, Jingqing Ruan, Li Shen, Yongzhe Chang, Xueqian Wang, Rui Zhao, Dacheng Tao |
Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts. Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu |