LATTE: Low-Precision Approximate Attention with Head-wise Trainable Threshold for Efficient Transformer

11 April 2024 · arXiv:2404.07519
Jiing-Ping Wang
Ming-Guang Lin
An-Yeu Wu

Papers citing "LATTE: Low-Precision Approximate Attention with Head-wise Trainable Threshold for Efficient Transformer"

Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou
Junling Liu
Zhenyu Gu
Guangyu Sun
18 Oct 2021