WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
arXiv: 1912.06638
13 December 2019
J. Tian, A. Kreuzer, Pai-Hung Chen, Hans-Martin Will
Papers citing "WaLDORf: Wasteless Language-model Distillation On Reading-comprehension" (4 papers)
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019
Knowledge Enhanced Contextual Word Representations
Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith
09 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong, Hieu H. Pham, Christopher D. Manning
17 Aug 2015