ResearchTrend.AI

Not all parameters are born equal: Attention is mostly what you need
arXiv:2010.11859 · 22 October 2020
Nikolay Bogoychev
Topic: MoE
Papers citing "Not all parameters are born equal: Attention is mostly what you need"

3 / 3 papers shown
  1. A Fast Transformer-based General-Purpose Lossless Compressor
     Yushun Mao, Yufei Cui, Tei-Wei Kuo, C. Xue
     Topics: ViT, AI4CE · 30 Mar 2022
  2. Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation
     Mozhdeh Gheini, Xiang Ren, Jonathan May
     Topic: LRM · 18 Apr 2021
  3. The Loss Surfaces of Multilayer Networks
     A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun
     Topic: ODL · 30 Nov 2014