Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
One-for-All Pruning: A Universal Model for Customized Compression of Large Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Accurate KV Cache Quantization with Outlier Tokens Tracing. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference. International Conference on Learning Representations (ICLR), 2025