Advancing LLM Safe Alignment with Safety Representation Ranking
arXiv: 2505.15710 · 21 May 2025
Tianqi Du, Zeming Wei, Quan Chen, Chenheng Zhang, Yisen Wang
ALM
Papers citing "Advancing LLM Safe Alignment with Safety Representation Ranking" (6 / 6 papers shown)
Language Ranker: A Lightweight Ranking framework for LLM Decoding
Chenheng Zhang, Tianqi Du, Jizhe Zhang, Mingqing Xiao, Yifei Wang, Yisen Wang, Zhouchen Lin
ALM · 190 · 0 · 0 · 23 Oct 2025
AdaptiveGuard: Towards Adaptive Runtime Safety for LLM-Powered Software
Rui Yang, Michael Fu, Chakkrit Tantithamthavorn, Chetan Arora, Gunel Gulmammadova, Joey Chua
137 · 0 · 0 · 21 Sep 2025
ReGA: Representation-Guided Abstraction for Model-based Safeguarding of LLMs
Zeming Wei, Chengcan Wu, Meng Sun
215 · 3 · 0 · 02 Jun 2025
LiPO: Listwise Preference Optimization through Learning-to-Rank
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Tianqi Liu, Zhen Qin, Junru Wu, Jiaming Shen, Misha Khalman, ..., Mohammad Saleh, Simon Baumgartner, Jialu Liu, Peter J. Liu, Xuanhui Wang
601 · 84 · 0 · 28 Jan 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown, Jordan Juravsky, Ryan Ehrlich, Ronald Clark, Quoc V. Le, Christopher Ré, Azalia Mirhoseini
ALM · LRM · 928 · 571 · 0 · 03 Jan 2025
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
Tinghao Xie, Xiangyu Qi, Yi Zeng, Yangsibo Huang, Udari Madhushani Sehwag, ..., Bo Li, Kai Li, Danqi Chen, Peter Henderson, Prateek Mittal
ALM · ELM · 423 · 135 · 0 · 20 Jun 2024