Fastformer: Additive Attention Can Be All You Need
arXiv:2108.09084, 20 August 2021
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie

Papers citing "Fastformer: Additive Attention Can Be All You Need" (16 / 16 papers shown)

Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models
Andrew DiGiugno, Ausif Mahmood (24 Feb 2025)

SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet, Rogier van Dalen, Shucong Zhang, S. Bhattacharya (12 Jul 2023)

ONCE: Boosting Content-based Recommendation with Both Open- and Closed-source Large Language Models
Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-Ming Wu (11 May 2023)

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review [VLM]
Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao (07 Apr 2023)

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications [ViT]
Abdelrahman M. Shaker, Muhammad Maaz, H. Rasheed, Salman Khan, Ming Yang, F. Khan (27 Mar 2023)

OAMatcher: An Overlapping Areas-based Network for Accurate Local Feature Matching
Kun Dai, Tao Xie, K. Wang, Zhiqiang Jiang, Ruifeng Li, Lijun Zhao (12 Feb 2023)

Efficient Joint Learning for Clinical Named Entity Recognition and Relation Extraction Using Fourier Networks: A Use Case in Adverse Drug Events
A. Yazdani, D. Proios, H. Rouhizadeh, Douglas Teodoro (08 Feb 2023)

Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence [3DV]
Sunghwan Hong, Seokju Cho, Seung Wook Kim, Stephen Lin (19 Sep 2022)

User Recommendation System Based on MIND Dataset
Niran A. Abdulhussein, Ahmed J. Obaid (06 Sep 2022)

Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition
A. Andrusenko, R. Nasretdinov, A. Romanenko (16 Aug 2022)

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe (06 Jul 2022)

Fair Comparison between Efficient Attentions
Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung (01 Jun 2022)

CATs++: Boosting Cost Aggregation with Convolutions and Transformers [ViT]
Seokju Cho, Sunghwan Hong, Seung Wook Kim (14 Feb 2022)

Boosting Robustness of Image Matting with Context Assembling and Strong Data Augmentation
Yutong Dai, Brian L. Price, He Zhang, Chunhua Shen (18 Jan 2022)

Big Bird: Transformers for Longer Sequences [VLM]
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed (28 Jul 2020)

A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit (06 Jun 2016)