Highway Transformer: Self-Gating Enhanced Self-Attentive Networks

Annual Meeting of the Association for Computational Linguistics (ACL), 2020
17 April 2020
Yekun Chai
Jin Shuo
Xinwen Hou
arXiv: 2004.08178

Papers citing "Highway Transformer: Self-Gating Enhanced Self-Attentive Networks"

10 / 10 papers shown
EG-MLA: Embedding-Gated Multi-head Latent Attention for Scalable and Efficient LLMs
Zhengge Cai
Haowen Hou
102
0
0
20 Sep 2025
IgCraft: A versatile sequence generation framework for antibody discovery and engineering
Matthew Greenig
Haowen Zhao
Vladimir Radenkovic
Aubin Ramon
Pietro Sormanni
423
5
0
25 Mar 2025
On the Design Space Between Transformers and Recursive Neural Nets
Jishnu Ray Chowdhury
Cornelia Caragea
350
0
0
03 Sep 2024
Tokenization Falling Short: The Curse of Tokenization
Yekun Chai
Yewei Fang
Qiwei Peng
Xuhong Li
259
0
0
17 Jun 2024
Can Transformers Predict Vibrations?
Fusataka Kuniyoshi
Yoshihide Sawada
171
1
0
16 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
618
3
0
01 Feb 2024
ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Yekun Chai
Shuohuan Wang
Chao Pang
Yu Sun
Hao Tian
Hua Wu
308
43
0
13 Dec 2022
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zhenhailong Wang
Xiaoman Pan
Dian Yu
Dong Yu
Jianshu Chen
Heng Ji
VLM
323
11
0
01 Oct 2022
Leveraging Local Temporal Information for Multimodal Scene Classification
Saurabh Sahu
Palash Goyal
ViT
120
0
0
26 Oct 2021
Rewiring the Transformer with Depth-Wise LSTMs
International Conference on Language Resources and Evaluation (LREC), 2020
Hongfei Xu
Yang Song
Qiuhui Liu
Josef van Genabith
Deyi Xiong
333
7
0
13 Jul 2020