ResearchTrend.AI
A Mixture of $h-1$ Heads is Better than $h$ Heads


13 May 2020
Hao Peng
Roy Schwartz
Dianqi Li
Noah A. Smith
    MoE

Papers citing "A Mixture of $h-1$ Heads is Better than $h$ Heads"

7 / 7 papers shown
RouterKT: Mixture-of-Experts for Knowledge Tracing
Han Liao
Shuaishuai Zu
38
0
0
11 Apr 2025
A Call for Clarity in Beam Search: How It Works and When It Stops
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
26
6
0
11 Apr 2022
Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy
Shaolei Zhang
Yang Feng
MoE
20
39
0
11 Sep 2021
Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives
Ben Saunders
Necati Cihan Camgöz
Richard Bowden
SLR
25
50
0
23 Jul 2021
Multi-head or Single-head? An Empirical Comparison for Transformer Training
Liyuan Liu
Jialu Liu
Jiawei Han
21
32
0
17 Jun 2021
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data
Jonathan Pilault
Amine Elhattami
C. Pal
CLL
MoE
19
89
0
19 Sep 2020
Classical Structured Prediction Losses for Sequence to Sequence Learning
Sergey Edunov
Myle Ott
Michael Auli
David Grangier
Marc'Aurelio Ranzato
AIMat
48
185
0
14 Nov 2017