Guiding Attention for Self-Supervised Learning with Transformers
Ameet Deshpande, Karthik Narasimhan
arXiv:2010.02399 · 6 October 2020

Papers citing "Guiding Attention for Self-Supervised Learning with Transformers" (10 papers)
Enhancing Retrosynthesis with Conformer: A Template-Free Method
  Jiaxi Zhuang, Qian Zhang, Ying Qian (21 Jan 2025)

Sneaking Syntax into Transformer Language Models with Tree Regularization
  Ananjan Nandi, Christopher D. Manning, Shikhar Murty (28 Nov 2024)

Beyond Self-learned Attention: Mitigating Attention Bias in Transformer-based Models Using Attention Guidance
  Jiri Gesi, Iftekhar Ahmed (26 Feb 2024)

MUX-PLMs: Data Multiplexing for High-throughput Language Models
  Vishvak Murahari, Ameet Deshpande, Carlos E. Jimenez, Izhak Shafran, Mingqiu Wang, Yuan Cao, Karthik Narasimhan (24 Feb 2023)

Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
  Junghun Kim, Yoojin An, Jihie Kim (21 Aug 2022)

Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding
  Shanshan Wang, Zhumin Chen, Z. Ren, Huasheng Liang, Qiang Yan, Pengjie Ren (06 Apr 2022)

A Survey of Transformers
  Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu (08 Jun 2021)

Scaling Laws for Neural Language Models
  Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei (23 Jan 2020)

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
  M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro (17 Sep 2019)

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
  Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)