Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention

13 November 2023 (arXiv:2311.07102)
Ziwei He, Jian Yuan, Le Zhou, Jingwen Leng, Bo Jiang

Papers citing "Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention"

5 of 5 citing papers shown.

1. LSG Attention: Extrapolation of pretrained Transformers to long sequences
   Charles Condevaux, S. Harispe
   13 Oct 2022

2. Adapting Pretrained Text-to-Text Models for Long Text Sequences
   Wenhan Xiong, Anchit Gupta, Shubham Toshniwal, Yashar Mehdad, Wen-tau Yih
   Tags: RALM, VLM
   21 Sep 2022

3. Transkimmer: Transformer Learns to Layer-wise Skim
   Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo
   15 May 2022

4. PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization
   Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan
   Tags: CVBM
   16 Oct 2021

5. Efficient Content-Based Sparse Attention with Routing Transformers
   Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier
   Tags: MoE
   12 Mar 2020