1804.07461
Cited By
v1
v2
v3 (latest)
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
20 April 2018
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
Papers citing
"GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding"
50 / 4,447 papers shown
Title
Exploring Multilingual Syntactic Sentence Representations
Chen Cecilia Liu
Anderson de Andrade
Muhammad Osama
27
4
0
25 Oct 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi
Hamid Palangi
M. Lam
P. Smolensky
Jianfeng Gao
141
16
0
25 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
593
20,418
0
23 Oct 2019
Diversify Your Datasets: Analyzing Generalization via Controlled Variance in Adversarial Datasets
Ohad Rozen
Vered Shwartz
Roee Aharoni
Ido Dagan
AAML
90
38
0
21 Oct 2019
Discovering the Compositional Structure of Vector Representations with Role Learning Networks
Paul Soulos
R. Thomas McCoy
Tal Linzen
P. Smolensky
CoGe
129
44
0
21 Oct 2019
Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System
Ze Yang
Linjun Shou
Ming Gong
Wutao Lin
Daxin Jiang
67
94
0
18 Oct 2019
A Mutual Information Maximization Perspective of Language Representation Learning
Lingpeng Kong
Cyprien de Masson d'Autume
Wang Ling
Lei Yu
Zihang Dai
Dani Yogatama
SSL
284
167
0
18 Oct 2019
Question Classification with Deep Contextualized Transformer
Haozheng Luo
Ningwei Liu
Charles Feng
86
2
0
17 Oct 2019
Injecting Hierarchy with U-Net Transformers
David Donahue
Vladislav Lialin
Anna Rumshisky
AI4CE
26
1
0
16 Oct 2019
Generating Challenge Datasets for Task-Oriented Conversational Agents through Self-Play
S. Majumdar
Serra Sinem Tekiroğlu
Marco Guerini
35
3
0
16 Oct 2019
Q8BERT: Quantized 8Bit BERT
Ofir Zafrir
Guy Boudoukh
Peter Izsak
Moshe Wasserblat
MQ
107
507
0
14 Oct 2019
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models
Benjamin Hoover
Hendrik Strobelt
Sebastian Gehrmann
40
86
0
11 Oct 2019
Demon: Improved Neural Network Training with Momentum Decay
John Chen
Cameron R. Wolfe
Zhaoqi Li
Anastasios Kyrillidis
ODL
106
15
0
11 Oct 2019
Structured Pruning of Large Language Models
Ziheng Wang
Jeremy Wohlwend
Tao Lei
94
293
0
10 Oct 2019
Multilingual Question Answering from Formatted Text applied to Conversational Agents
W. Siblini
Charlotte Pasqual
Axel Lavielle
Mohamed Challal
Cyril Cauchois
60
19
0
10 Oct 2019
Universal Adversarial Perturbation for Text Classification
Hang Gao
Tim Oates
AAML
108
15
0
10 Oct 2019
HuggingFace's Transformers: State-of-the-art Natural Language Processing
Thomas Wolf
Lysandre Debut
Victor Sanh
Julien Chaumond
Clement Delangue
...
Teven Le Scao
Sylvain Gugger
Mariama Drame
Quentin Lhoest
Alexander M. Rush
AI4CE
107
1,960
0
09 Oct 2019
Knowledge Distillation from Internal Representations
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
106
181
0
08 Oct 2019
SesameBERT: Attention for Anywhere
Ta-Chun Su
Hsiang-Chih Cheng
53
7
0
08 Oct 2019
Classification As Decoder: Trading Flexibility For Control In Neural Dialogue
Sam Shleifer
Manish Chablani
Namit Katariya
Anitha Kannan
X. Amatriain
30
0
0
04 Oct 2019
Towards Understanding of Medical Randomized Controlled Trials by Conclusion Generation
Max Landauer
Yung-Sung Chuang
Florian Skopik
Yun-Nung Chen
FaML
LM&MA
MedIm
22
8
0
03 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
305
7,575
0
02 Oct 2019
A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
Xiaohua Zhai
J. Puigcerver
Alexander Kolesnikov
P. Ruyssen
C. Riquelme
...
Michael Tschannen
Marcin Michalski
Olivier Bousquet
Sylvain Gelly
N. Houlsby
SSL
97
448
0
01 Oct 2019
Dialogue Transformers
Vladimir Vlasov
Johannes E. M. Mosig
Alan Nichol
108
58
0
01 Oct 2019
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension
Di Jin
Shuyang Gao
Jiun-Yu Kao
Tagyoung Chung
Dilek Z. Hakkani-Tür
71
69
0
01 Oct 2019
On the use of BERT for Neural Machine Translation
Stéphane Clinchant
K. Jung
Vassilina Nikoulina
84
90
0
27 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
415
6,479
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
296
443
0
25 Sep 2019
Extremely Small BERT Models from Mixed-Vocabulary Training
Sanqiang Zhao
Raghav Gupta
Yang Song
Denny Zhou
VLM
66
53
0
25 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
130
597
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
291
209
0
25 Sep 2019
Situating Sentence Embedders with Nearest Neighbor Overlap
Lucy H. Lin
Noah A. Smith
34
3
0
24 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
126
1,881
0
23 Sep 2019
Improving Natural Language Inference with a Pretrained Parser
D. Pang
Lucy H. Lin
Noah A. Smith
74
15
0
18 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
363
1,925
0
17 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
313
796
0
17 Sep 2019
Probing Natural Language Inference Models through Semantic Fragments
Kyle Richardson
Hai Hu
L. Moss
Ashish Sabharwal
88
149
0
16 Sep 2019
Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices
V. Chen
Sen Wu
Zhenzhen Weng
Alexander Ratner
Christopher Ré
96
56
0
13 Sep 2019
End-to-End Bias Mitigation by Modelling Biases in Corpora
Rabeeh Karimi Mahabadi
Yonatan Belinkov
James Henderson
154
181
0
13 Sep 2019
UER: An Open-Source Toolkit for Pre-training Models
Zhe Zhao
Hui Chen
Jinbin Zhang
Xin Zhao
Tao Liu
Wei Lu
Xi Chen
Haotang Deng
Qi Ju
Xiaoyong Du
77
115
0
12 Sep 2019
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
Lav Varshney
Caiming Xiong
R. Socher
AI4CE
151
1,254
0
11 Sep 2019
Span Selection Pre-training for Question Answering
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
94
73
0
09 Sep 2019
Pretrained AI Models: Performativity, Mobility, and Change
Lav Varshney
N. Keskar
R. Socher
68
20
0
07 Sep 2019
MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims
Isabelle Augenstein
Christina Lioma
Dongsheng Wang
Lucas Chaves Lima
Casper Hansen
Christian B. Hansen
J. Simonsen
HILM
146
252
0
07 Sep 2019
Overton: A Data System for Monitoring and Improving Machine-Learned Products
Christopher Ré
Feng Niu
Pallavi Gudipati
Charles Srisuwananukorn
AI4TS
72
47
0
07 Sep 2019
Abductive Reasoning as Self-Supervision for Common Sense Question Answering
Sathyanarayanan N. Aakur
Sudeep Sarkar
LRM
SSL
OOD
48
4
0
06 Sep 2019
Show Your Work: Improved Reporting of Experimental Results
Jesse Dodge
Suchin Gururangan
Dallas Card
Roy Schwartz
Noah A. Smith
78
255
0
06 Sep 2019
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Wei Zhao
Maxime Peyrard
Fei Liu
Yang Gao
Christian M. Meyer
Steffen Eger
228
602
0
05 Sep 2019
Investigating BERT's Knowledge of Language: Five Analysis Methods with NPIs
Alex Warstadt
Yuning Cao
Ioana Grosu
Wei Peng
Hagen Blix
...
Jason Phang
Anhad Mohananey
Phu Mon Htut
Paloma Jeretic
Samuel R. Bowman
71
123
0
05 Sep 2019
Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity
Anne Lauscher
Ivan Vulić
Edoardo Ponti
Anna Korhonen
Goran Glavaš
SSL
85
58
0
05 Sep 2019