How Does Selective Mechanism Improve Self-Attention Networks?

3 May 2020

Papers citing "How Does Selective Mechanism Improve Self-Attention Networks?"


Title | Authors | Tag | Counts | Date
Not All Parameters Matter: Masking Diffusion Models for Enhancing Generation Ability | L. Wang, Senmao Li, Fei Yang, Jianye Wang, Ziheng Zhang, Y. Liu, Y. Wang, Jian Yang | DiffM | 52 0 0 | 06 May 2025
Enhancing Job Salary Prediction with Disentangled Composition Effect Modeling: A Neural Prototyping Approach | Yang Ji, Ying Sun, Hengshu Zhu | | 46 0 0 | 17 Mar 2025
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts | Xi Victoria Lin, Akshat Shrivastava, Liang Luo, Srinivasan Iyer, Mike Lewis, Gargi Ghosh, Luke Zettlemoyer, Armen Aghajanyan | MoE | 28 20 0 | 31 Jul 2024
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms | Junghun Kim, Yoojin An, Jihie Kim | | 14 13 0 | 21 Aug 2022
On the Sub-Layer Functionalities of Transformer Decoder | Yilin Yang, Longyue Wang, Shuming Shi, Prasad Tadepalli, Stefan Lee, Zhaopeng Tu | | 17 27 0 | 06 Oct 2020
What you can cram into a single vector: Probing sentence embeddings for linguistic properties | Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni | | 199 876 0 | 03 May 2018
Effective Approaches to Attention-based Neural Machine Translation | Thang Luong, Hieu H. Pham, Christopher D. Manning | | 214 7,687 0 | 17 Aug 2015