Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of Machine Learning Research (JMLR), 2019
23 October 2019
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Showing 50 of 12,032 citing papers.
A Primer in BERTology: What we know about how BERT works
Transactions of the Association for Computational Linguistics (TACL), 2020
Anna Rogers, Olga Kovaleva, Anna Rumshisky
OffRL · 474 · 1,717 · 0 · 27 Feb 2020

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Transactions of the Association for Computational Linguistics (TACL), 2020
Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yifan Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett
AI4CE · 437 · 213 · 0 · 27 Feb 2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez
293 · 152 · 0 · 26 Feb 2020

On Feature Normalization and Data Augmentation
Computer Vision and Pattern Recognition (CVPR), 2020
Boyi Li, Felix Wu, Ser-Nam Lim, Serge J. Belongie, Kilian Q. Weinberger
230 · 156 · 0 · 25 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Neural Information Processing Systems (NeurIPS), 2020
Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou
VLM · 1.3K · 1,757 · 0 · 25 Feb 2020

Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
Journal of Computer Science and Technology (JCST), 2020
Yige Xu, Xipeng Qiu, L. Zhou, Xuanjing Huang
150 · 73 · 0 · 24 Feb 2020

Training Question Answering Models From Synthetic Data
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Raul Puri, Ryan Spring, M. Patwary, Mohammad Shoeybi, Bryan Catanzaro
ELM · 195 · 169 · 0 · 22 Feb 2020

Modelling Latent Skills for Multitask Language Generation
Kris Cao, Dani Yogatama
140 · 3 · 0 · 21 Feb 2020

Fast local linear regression with anchor regularization
Mathis Petrovich, M. Yamada
OffRL · 157 · 3 · 0 · 21 Feb 2020

A Road Map to Strong Intelligence
Philip Paquette
AI4TS · 45 · 0 · 0 · 20 Feb 2020
CodeBERT: A Pre-Trained Model for Programming and Natural Languages
Findings, 2020
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, ..., Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou
1.2K · 3,386 · 0 · 19 Feb 2020

LAMBERT: Layout-Aware (Language) Modeling for information extraction
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2020
Lukasz Garncarek, Rafal Powalski, Tomasz Stanislawek, Bartosz Topolski, Piotr Halama, M. Turski, Filip Graliński
335 · 95 · 0 · 19 Feb 2020

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, ..., Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao
AI4CE · 182 · 62 · 0 · 19 Feb 2020

Controlling Computation versus Quality for Neural Sequence Models
Ankur Bapna, N. Arivazhagan, Orhan Firat
219 · 34 · 0 · 17 Feb 2020
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo, Lei Ji, Ding Wang, Haoyang Huang, Nan Duan, Tianrui Li, Jason Li, Xilin Chen, Ming Zhou
VLM · 375 · 417 · 0 · 15 Feb 2020

Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge, Gabriel Ilharco, Roy Schwartz, Ali Farhadi, Hannaneh Hajishirzi, Noah A. Smith
282 · 676 · 0 · 15 Feb 2020

TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval
Wenhao Lu, Jian Jiao, Ruofei Zhang
188 · 53 · 0 · 14 Feb 2020

Transformer on a Diet
Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alex Smola
220 · 9 · 0 · 14 Feb 2020

CBAG: Conditional Biomedical Abstract Generation
PLOS ONE, 2020
Justin Sybrandt, Ilya Safro
MedIm, AI4CE · 146 · 10 · 0 · 13 Feb 2020
GLU Variants Improve Transformer
Noam M. Shazeer
585 · 1,477 · 0 · 12 Feb 2020

How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Adam Roberts, Colin Raffel, Noam M. Shazeer
KELM · 573 · 993 · 0 · 10 Feb 2020

REALM: Retrieval-Augmented Language Model Pre-Training
International Conference on Machine Learning (ICML), 2020
Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
RALM · 1.2K · 2,611 · 0 · 10 Feb 2020

Semi-Supervised Class Discovery
Jeremy Nixon, J. Liu, David Berthelot
266 · 2 · 0 · 10 Feb 2020

Momentum Improves Normalized SGD
International Conference on Machine Learning (ICML), 2020
Ashok Cutkosky, Harsh Mehta
ODL · 450 · 159 · 0 · 09 Feb 2020

Segmented Graph-Bert for Graph Instance Modeling
Jiawei Zhang
SSeg · 129 · 6 · 0 · 09 Feb 2020
Description Based Text Classification with Reinforcement Learning
International Conference on Machine Learning (ICML), 2020
Duo Chai, Wei Wu, Qinghong Han, Leilei Gan, Jiwei Li
VLM · 389 · 70 · 0 · 08 Feb 2020

Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang Pierse, Jing Lu
AI4CE · 99 · 2 · 0 · 05 Feb 2020

K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
Findings, 2020
Ruize Wang, Duyu Tang, Nan Duan, Zhongyu Wei, Xuanjing Huang, Jianshu Ji, Guihong Cao, Daxin Jiang, Ming Zhou
KELM · 577 · 595 · 0 · 05 Feb 2020

DUMA: Reading Comprehension with Transposition Thinking
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2020
Q. Hu, Hai Zhao, Xiaoguang Li
AI4CE · 408 · 37 · 0 · 26 Jan 2020

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang
215 · 133 · 0 · 26 Jan 2020
Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
1.8K · 6,691 · 0 · 23 Jan 2020

Multilingual Denoising Pre-training for Neural Machine Translation
Transactions of the Association for Computational Linguistics (TACL), 2020
Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, M. Lewis, Luke Zettlemoyer
AI4CE, AIMat · 899 · 1,982 · 0 · 22 Jan 2020

Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu, Yujia Zhai, Zizhong Chen
133 · 0 · 0 · 22 Jan 2020

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
Neural Information Processing Systems (NeurIPS), 2020
Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, E. D. Cubuk, Alexey Kurakin, Han Zhang, Colin Raffel
AAML · 451 · 4,275 · 0 · 21 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2020
Timo Schick, Hinrich Schütze
1.1K · 1,758 · 0 · 21 Jan 2020

Length-controllable Abstractive Summarization by Guiding with Summary Prototype
Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, J. Tomita, Hiroyuki Shindo, Yuji Matsumoto
248 · 37 · 0 · 21 Jan 2020

A multimodal deep learning approach for named entity recognition from social media
M. Asgari-Chenaghlu, M. Feizi-Derakhshi, Leili Farzinvash, M. Balafar, C. Motamed
272 · 36 · 0 · 19 Jan 2020

RobBERT: a Dutch RoBERTa-based Language Model
Findings, 2020
Pieter Delobelle, Thomas Winters, Bettina Berendt
198 · 262 · 0 · 17 Jan 2020

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Findings, 2020
Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou
AI4TS · 367 · 472 · 0 · 13 Jan 2020

Learning Accurate Integer Transformer Machine-Translation Models
SN Computer Science, 2020
Ephrem Wu
97 · 4 · 0 · 03 Jan 2020
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge
Transactions of the Association for Computational Linguistics (TACL), 2019
Kyle Richardson, Ashish Sabharwal
243 · 47 · 0 · 31 Dec 2019

All-in-One Image-Grounded Conversational Agents
Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston
LLMAG · 147 · 9 · 0 · 28 Dec 2019

Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation
International Conference on Language Resources and Evaluation (LREC), 2019
Israfel Salazar, Mary Dabre, Atsushi Fujita, Sadao Kurohashi
210 · 6 · 0 · 26 Dec 2019

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
International Conference on Machine Learning (ICML), 2019
Jingqing Zhang, Yao-Min Zhao, Mohammad Saleh, Peter J. Liu
RALM, 3DGS · 842 · 2,310 · 0 · 18 Dec 2019

Multilingual is not enough: BERT for Finnish
Antti Virtanen, Jenna Kanerva, Rami Ilo, Jouni Luoma, Juhani Luotolahti, T. Salakoski, Filip Ginter, S. Pyysalo
250 · 300 · 0 · 15 Dec 2019
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
J. Tian, A. Kreuzer, Pai-Hung Chen, Hans-Martin Will
VLM · 167 · 3 · 0 · 13 Dec 2019

Extending Machine Language Models toward Human-Level Language Understanding
James L. McClelland, Felix Hill, Maja R. Rudolph, Jason Baldridge, Hinrich Schütze
LRM · 156 · 36 · 0 · 12 Dec 2019

FlauBERT: Unsupervised Language Model Pre-training for French
International Conference on Language Resources and Evaluation (LREC), 2019
Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, A. Allauzen, Benoît Crabbé, Laurent Besacier, D. Schwab
AI4CE · 340 · 431 · 0 · 11 Dec 2019

Zero-shot Text Classification With Generative Language Models
Raul Puri, Bryan Catanzaro
VLM · 166 · 116 · 0 · 10 Dec 2019

Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
European Conference on Computer Vision (ECCV), 2019
Vishvak Murahari, Dhruv Batra, Devi Parikh, Abhishek Das
VLM · 349 · 120 · 0 · 05 Dec 2019