2403.00829
Cited By
TroubleLLM: Align to Red Team Expert
28 February 2024
Zhuoer Xu, Jianping Zhang, Shiwen Cui, Changhua Meng, Weiqiang Wang
Papers citing "TroubleLLM: Align to Red Team Expert"
3 / 3 papers shown
Title
Purple-teaming LLMs with Adversarial Defender Training
Jingyan Zhou, Kun Li, Junan Li, Jiawen Kang, Minda Hu, Xixin Wu, Helen Meng
AAML
34 | 1 | 0
01 Jul 2024

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
311 | 11,915 | 0
04 Mar 2022

The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
208 | 616 | 0
03 Sep 2019