DReSD: Dense Retrieval for Speculative Decoding. Annual Meeting of the Association for Computational Linguistics (ACL), 2025.
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding. Annual Meeting of the Association for Computational Linguistics (ACL), 2024.
Billion-scale similarity search with GPUs. IEEE Transactions on Big Data (TBD), 2017.