Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Journal of Machine Learning Research (JMLR), 2019
23 October 2019
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
    AIMat

Papers citing "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

38 / 12,038 papers shown
Multilingual is not enough: BERT for Finnish
Antti Virtanen
Jenna Kanerva
Rami Ilo
Jouni Luoma
Juhani Luotolahti
T. Salakoski
Filip Ginter
S. Pyysalo
252
300
0
15 Dec 2019
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
J. Tian
A. Kreuzer
Pai-Hung Chen
Hans-Martin Will
VLM
169
3
0
13 Dec 2019
Extending Machine Language Models toward Human-Level Language Understanding
James L. McClelland
Felix Hill
Maja R. Rudolph
Jason Baldridge
Hinrich Schütze
LRM
159
36
0
12 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French
International Conference on Language Resources and Evaluation (LREC), 2019
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
350
431
0
11 Dec 2019
Zero-shot Text Classification With Generative Language Models
Raul Puri
Bryan Catanzaro
VLM
166
116
0
10 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
European Conference on Computer Vision (ECCV), 2019
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
359
120
0
05 Dec 2019
12-in-1: Multi-Task Vision and Language Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2019
Jiasen Lu
Vedanuj Goswami
Marcus Rohrbach
Devi Parikh
Stefan Lee
VLM, ObjD
315
499
0
05 Dec 2019
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
Transactions of the Association for Computational Linguistics (TACL), 2019
Alex Warstadt
Alicia Parrish
Haokun Liu
Anhad Mohananey
Wei Peng
Sheng-Fu Wang
Samuel R. Bowman
477
619
0
02 Dec 2019
What's Hidden in a Randomly Weighted Neural Network?
Computer Vision and Pattern Recognition (CVPR), 2019
Vivek Ramanujan
Mitchell Wortsman
Aniruddha Kembhavi
Ali Farhadi
Mohammad Rastegari
256
393
0
29 Nov 2019
Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA
Computer Vision and Pattern Recognition (CVPR), 2019
Ronghang Hu
Amanpreet Singh
Trevor Darrell
Marcus Rohrbach
361
224
0
14 Nov 2019
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
Transactions of the Association for Computational Linguistics (TACL), 2019
Xiaozhi Wang
Tianyu Gao
Zhaocheng Zhu
Zhengyan Zhang
Zhiyuan Liu
Juan-Zi Li
Jian Tang
386
771
0
13 Nov 2019
CamemBERT: a Tasty French Language Model
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
540
1,056
0
10 Nov 2019
INSET: Sentence Infilling with INter-SEntential Transformer
Yichen Huang
Yizhe Zhang
Oussama Elachqar
Yu Cheng
248
1
0
10 Nov 2019
Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks
International Conference on Computational Linguistics (COLING), 2019
Trapit Bansal
Rishikesh Jha
Andrew McCallum
SSL
273
126
0
10 Nov 2019
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Kurt Shuster
Da Ju
Stephen Roller
Emily Dinan
Y-Lan Boureau
Jason Weston
264
84
0
09 Nov 2019
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Nina Poerner
Ulli Waltinger
Hinrich Schütze
AI4TS
474
21
0
09 Nov 2019
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
T. Zhao
643
590
0
08 Nov 2019
Contrastive Multi-document Question Generation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2019
W. Cho
Yizhe Zhang
Sudha Rao
Asli Celikyilmaz
Chenyan Xiong
Jianfeng Gao
Mengdi Wang
Bill Dolan
SyDa
362
31
0
08 Nov 2019
BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2019
R. Thomas McCoy
Junghyun Min
Tal Linzen
402
156
0
07 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
492
7,725
0
05 Nov 2019
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Yizhe Zhang
Siqi Sun
Michel Galley
Yen-Chun Chen
Chris Brockett
Xiang Gao
Jianfeng Gao
Jingjing Liu
W. Dolan
VLM
650
1,658
0
01 Nov 2019
CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
International Conference on Language Resources and Evaluation (LREC), 2019
Guillaume Wenzek
Marie-Anne Lachaux
Alexis Conneau
Vishrav Chaudhary
Francisco Guzmán
Armand Joulin
Edouard Grave
470
756
0
01 Nov 2019
Multi-Stage Document Ranking with BERT
Rodrigo Nogueira
Wei Yang
Dong Wang
Jimmy J. Lin
317
461
0
31 Oct 2019
Discourse-Aware Neural Extractive Text Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Jiacheng Xu
Zhe Gan
Yu Cheng
Jingjing Liu
BDL
334
292
0
30 Oct 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat, VLM
834
12,121
0
29 Oct 2019
ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2019
Samyam Rajbhandari
Jeff Rasley
Olatunji Ruwase
Yuxiong He
ALM, AI4CE
434
1,424
0
04 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
International Conference on Learning Representations (ICLR), 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat
1.2K
7,141
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
International Conference on Learning Representations (ICLR), 2019
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
686
490
0
25 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
266
280
0
23 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Findings (Findings), 2019
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
604
2,161
0
23 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
1.3K
2,442
0
17 Sep 2019
I-MAD: Interpretable Malware Detector Using Galaxy Transformer
Computers & Security (Comput. Secur.), 2019
Miles Q. Li
Benjamin C. M. Fung
P. Charland
Steven H. H. Ding
299
39
0
15 Sep 2019
Conditional Text Generation for Harmonious Human-Machine Interaction
Bin Guo
Hao Wang
Yasan Ding
Wei Wu
Shaoyang Hao
Yueqi Sun
Zhiwen Yu
185
4
0
08 Sep 2019
Taming Momentum in a Distributed Asynchronous Environment
Ido Hakimi
Saar Barkai
Moshe Gabel
Assaf Schuster
303
24
0
26 Jul 2019
Contextual Word Representations: A Contextual Introduction
Noah A. Smith
239
35
0
15 Feb 2019
Are All Layers Created Equal?
Chiyuan Zhang
Samy Bengio
Y. Singer
316
157
0
06 Feb 2019
Neural Abstractive Text Summarization with Sequence-to-Sequence Models
Tian Shi
Yaser Keneshloo
Naren Ramakrishnan
Chandan K. Reddy
420
252
0
05 Dec 2018
Deep Learning for Genomics: A Concise Overview
Tianwei Yue
Yuanxin Wang
Longxiang Zhang
Chunming Gu
Haohan Wang
Wenping Wang
Qi Lyu
Yujie Dun
AILaw, VLM, BDL
289
97
0
02 Feb 2018