Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling | Annual Meeting of the Association for Computational Linguistics (ACL), 2025 |
Personalize Your LLM: Fake it then Align it | North American Chapter of the Association for Computational Linguistics (NAACL), 2025 |
Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs | International Conference on Learning Representations (ICLR), 2025 |
Mixture of Attentions For Speculative Decoding | International Conference on Learning Representations (ICLR), 2024 |
Programming Refusal with Conditional Activation Steering | International Conference on Learning Representations (ICLR), 2024 |