ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.10321
  4. Cited By
DropDim: A Regularization Method for Transformer Networks

DropDim: A Regularization Method for Transformer Networks

20 April 2023
Hao Zhang
Dan Qu
Kejia Shao
Xu Yang
ArXivPDFHTML

Papers citing "DropDim: A Regularization Method for Transformer Networks"

6 / 6 papers shown
Title
AttentionDrop: A Novel Regularization Method for Transformer Models
AttentionDrop: A Novel Regularization Method for Transformer Models
Mirza Samad Ahmed Baig
Syeda Anshrah Gillani
Abdul Akbar Khan
Shahid Munir Shah
28
0
0
16 Apr 2025
Reasoning Bias of Next Token Prediction Training
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
80
1
0
21 Feb 2025
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive
  Learning
Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Weiqiang Zhang
25
9
0
20 Apr 2023
Decouple Non-parametric Knowledge Distillation For End-to-end Speech
  Translation
Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation
Hao Zhang
Nianwen Si
Yaqi Chen
Wenlin Zhang
Xukui Yang
Dan Qu
Zhen Li
16
3
0
20 Apr 2023
M3ST: Mix at Three Levels for Speech Translation
M3ST: Mix at Three Levels for Speech Translation
Xuxin Cheng
Qianqian Dong
Fengpeng Yue
Tom Ko
Mingxuan Wang
Yuexian Zou
13
40
0
07 Dec 2022
Searchable Hidden Intermediates for End-to-End Models of Decomposable
  Sequence Tasks
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks
Siddharth Dalmia
Brian Yan
Vikas Raunak
Florian Metze
Shinji Watanabe
27
30
0
02 May 2021
1