DAPE V2: Process Attention Score as Feature Map for Length Extrapolation. Annual Meeting of the Association for Computational Linguistics (ACL), 2024. Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, ... Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li |
CAPE: Context-Adaptive Positional Encoding for Length Extrapolation. Neural Information Processing Systems (NeurIPS), 2024 |