Fastformer: Additive Attention Can Be All You Need (arXiv:2108.09084)

20 August 2021
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
arXiv · PDF · HTML
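
For context on the paper's core idea: Fastformer replaces pairwise self-attention with additive attention, pooling all queries into a single global query and all query-mixed keys into a single global key, which reduces the cost from quadratic to linear in sequence length. Below is a minimal single-head PyTorch sketch of that mechanism; the class and parameter names (FastAdditiveAttention, w_q, w_k) are illustrative rather than the authors' code, and details such as multi-head splitting and weight sharing are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FastAdditiveAttention(nn.Module):
    """Single-head sketch of Fastformer-style additive attention."""

    def __init__(self, dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.w_q = nn.Linear(dim, 1, bias=False)  # scores each query vector
        self.w_k = nn.Linear(dim, 1, bias=False)  # scores each mixed key vector
        self.out = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)

        # Pool all queries into one global query via additive attention:
        # one score per position, so O(N) rather than O(N^2).
        alpha = F.softmax(self.w_q(q) * self.scale, dim=1)   # (B, N, 1)
        q_global = (alpha * q).sum(dim=1, keepdim=True)      # (B, 1, D)

        # Mix the global query into every key element-wise, then pool
        # the mixed keys into one global key the same way.
        p = q_global * k                                     # (B, N, D)
        beta = F.softmax(self.w_k(p) * self.scale, dim=1)    # (B, N, 1)
        k_global = (beta * p).sum(dim=1, keepdim=True)       # (B, 1, D)

        # Modulate each value with the global key; the paper adds the
        # query back as a residual connection.
        u = k_global * v                                     # (B, N, D)
        return self.out(u) + q

# Shape check: 2 sequences of length 128, model width 64.
layer = FastAdditiveAttention(dim=64)
y = layer(torch.randn(2, 128, 64))
assert y.shape == (2, 128, 64)

The element-wise products stand in for the query-key interactions of full attention; since each softmax is over a single score per position, every step runs in O(N·d).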

Papers citing "Fastformer: Additive Attention Can Be All You Need"

16 of 16 citing papers shown.
Neural Attention: A Novel Mechanism for Enhanced Expressive Power in Transformer Models
Andrew DiGiugno, Ausif Mahmood · 24 Feb 2025

SummaryMixing: A Linear-Complexity Alternative to Self-Attention for Speech Recognition and Understanding
Titouan Parcollet, Rogier van Dalen, Shucong Zhang, S. Bhattacharya · 12 Jul 2023

ONCE: Boosting Content-based Recommendation with Both Open- and Closed-source Large Language Models
Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-Ming Wu · 11 May 2023

On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen, Yan Sun, Zhiyuan Yu, Liang Ding, Xinmei Tian, Dacheng Tao · 07 Apr 2023 · VLM

SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker, Muhammad Maaz, H. Rasheed, Salman Khan, Ming Yang, F. Khan · 27 Mar 2023 · ViT

OAMatcher: An Overlapping Areas-based Network for Accurate Local Feature Matching
Kun Dai, Tao Xie, K. Wang, Zhiqiang Jiang, Ruifeng Li, Lijun Zhao · 12 Feb 2023

Efficient Joint Learning for Clinical Named Entity Recognition and Relation Extraction Using Fourier Networks: A Use Case in Adverse Drug Events
A. Yazdani, D. Proios, H. Rouhizadeh, Douglas Teodoro · 08 Feb 2023

Integrative Feature and Cost Aggregation with Transformers for Dense Correspondence
Sunghwan Hong, Seokju Cho, Seung Wook Kim, Stephen Lin · 19 Sep 2022 · 3DV

User recommendation system based on MIND dataset
Niran A. Abdulhussein, Ahmed J. Obaid · 06 Sep 2022

Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition
A. Andrusenko, R. Nasretdinov, A. Romanenko · 16 Aug 2022

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe · 06 Jul 2022

Fair Comparison between Efficient Attentions
Jiuk Hong, Chaehyeon Lee, Soyoun Bang, Heechul Jung · 01 Jun 2022

CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho, Sunghwan Hong, Seung Wook Kim · 14 Feb 2022 · ViT

Boosting Robustness of Image Matting with Context Assembling and Strong Data Augmentation
Yutong Dai, Brian L. Price, He Zhang, Chunhua Shen · 18 Jan 2022

Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed · 28 Jul 2020 · VLM

A Decomposable Attention Model for Natural Language Inference
Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit · 06 Jun 2016