Title |
---|
![]() AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs Xiaogeng Liu Peiran Li Edward Suh Yevgeniy Vorobeychik Zhuoqing Mao Somesh Jha Patrick McDaniel Huan Sun Bo Li Chaowei Xiao |
![]() Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in
Red Teaming GenAI Ambrish Rawat Stefan Schoepf Giulio Zizzo Giandomenico Cornacchia Muhammad Zaid Hameed ...Elizabeth M. Daly Mark Purcell P. Sattigeri Pin-Yu Chen Kush R. Varshney |
![]() Towards a Unified View of Preference Learning for Large Language Models:
A Survey Bofei Gao Feifan Song Yibo Miao Zefan Cai Z. Yang ...Houfeng Wang Zhifang Sui Peiyi Wang Baobao Chang Baobao Chang |