2409.16913
Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
25 September 2024
Wenhao Liu
Siyu An
Junru Lu
Muling Wu
Tianlong Li
Xiaohua Wang
Changze Lv
Xiaoqing Zheng
Di Yin
Xing Sun
Xuanjing Huang
Papers citing "Tell Me What You Don't Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing"
2 / 2 papers shown
From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses
Xiangtao Meng
Tianshuo Cong
Li Wang
Wenyu Chen
Zheng Li
Shanqing Guo
Xiaoyun Wang
AAML
09 Oct 2025
RECAST: Expanding the Boundaries of LLMs' Complex Instruction Following with Multi-Constraint Data
Wenhao Liu
Mingchen Xie
Jingwen Xu
Zisu Huang
...
Changze Lv
He-Da Wang
Qi Zhang
Xiaoqing Zheng
Xuanjing Huang
25 May 2025