Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
arXiv:2406.06366 · 10 June 2024
Martin Courtois, Malte Ostendorff, Leonhard Hennig, Georg Rehm
Papers citing "Symmetric Dot-Product Attention for Efficient Training of BERT Language Models" (4 of 4 papers shown):
Does Self-Attention Need Separate Weights in Transformers?
Md. Kowsher, Nusrat Jahan Prottasha, Chun-Nam Yu, O. Garibay, Niloofar Yousefi
0 citations · 30 Nov 2024
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Haoyi Zhou, Shanghang Zhang, J. Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wan Zhang
AI4TS · 3,799 citations · 14 Dec 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 6,927 citations · 20 Apr 2018
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong, Hieu Pham, Christopher D. Manning
7,687 citations · 17 Aug 2015