Attention-Guided Answer Distillation for Machine Reading Comprehension
arXiv:1808.07644, 23 August 2018
Minghao Hu, Yuxing Peng, Furu Wei, Zhen Huang, Dongsheng Li, Nan Yang, Ming Zhou

Papers citing "Attention-Guided Answer Distillation for Machine Reading Comprehension" (10 papers)

Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
  Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu (19 Aug 2024)

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
  Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier (20 Oct 2022)

Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?
  Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Yunqing Zhao, Ngai-Man Cheung (29 Jun 2022)

Knowledge Distillation as Semiparametric Inference
  Tri Dao, G. Kamath, Vasilis Syrgkanis, Lester W. Mackey (20 Apr 2021)

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
  Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei (31 Dec 2020)

Knowledge Distillation: A Survey
  Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao (09 Jun 2020)

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
  Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou (25 Feb 2020)

A Survey on Machine Reading Comprehension Systems
  Razieh Baradaran, Razieh Ghiasi, Hossein Amirkhani (06 Jan 2020)

Hint-Based Training for Non-Autoregressive Machine Translation
  Zhuohan Li, Zi Lin, Di He, Fei Tian, Tao Qin, Liwei Wang, Tie-Yan Liu (15 Sep 2019)

Efficient Video Classification Using Fewer Frames
  S. Bhardwaj, Mukundhan Srinivasan, Mitesh M. Khapra (27 Feb 2019)