Title |
---|
Flames: Benchmarking Value Alignment of LLMs in Chinese. Kexin Huang, Xiangyang Liu, Qianyu Guo, Tianxiang Sun, Jiawei Sun, ... Yixu Wang, Yan Teng, Xipeng Qiu, Yingchun Wang, Dahua Lin |
Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions. Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio, Paul Röttger, Dan Jurafsky, Tatsunori Hashimoto, James Y. Zou |