Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Energy Considerations of Large Language Model Inference and Efficiency Optimizations. Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference. International Conference on Learning Representations (ICLR), 2024