Attention-Guided Answer Distillation for Machine Reading Comprehension

23 August 2018
Minghao Hu, Yuxing Peng, Furu Wei, Zhen Huang, Dongsheng Li, Nan Yang, M. Zhou
FaML

Papers citing "Attention-Guided Answer Distillation for Machine Reading Comprehension"

10 / 10 papers shown
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu
Mamba · 19 Aug 2024

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
VLM · MoE · LRM · 20 Oct 2022

Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?
Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Yunqing Zhao, Ngai-man Cheung
29 Jun 2022

Knowledge Distillation as Semiparametric Inference
Tri Dao, G. Kamath, Vasilis Syrgkanis, Lester W. Mackey
20 Apr 2021

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
MQ · 31 Dec 2020

Knowledge Distillation: A Survey
Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao
VLM · 09 Jun 2020

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou
VLM · 25 Feb 2020

A Survey on Machine Reading Comprehension Systems
Razieh Baradaran, Razieh Ghiasi, Hossein Amirkhani
FaML · 06 Jan 2020

Hint-Based Training for Non-Autoregressive Machine Translation
Zhuohan Li, Zi Lin, Di He, Fei Tian, Tao Qin, Liwei Wang, Tie-Yan Liu
15 Sep 2019

Efficient Video Classification Using Fewer Frames
S. Bhardwaj, Mukundhan Srinivasan, Mitesh M. Khapra
27 Feb 2019