1804.07461
Cited By
v1
v2
v3 (latest)
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
20 April 2018
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
Papers citing
"GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding"
50 / 4,447 papers shown
Title
A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection
Daniele Bonadiman
Alessandro Moschitti
RALM
68
10
0
04 Mar 2020
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
Yada Pruksachatkun
Philip Yeres
Haokun Liu
Jason Phang
Phu Mon Htut
Alex Wang
Ian Tenney
Samuel R. Bowman
SSeg
41
94
0
04 Mar 2020
CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model
Liang Xu
Xuanwei Zhang
Qianqian Dong
SSL
66
71
0
03 Mar 2020
Long Short-Term Sample Distillation
Liang Jiang
Zujie Wen
Zhongping Liang
Yafang Wang
Gerard de Melo
Zhe Li
Liangzhuang Ma
Jiaxing Zhang
Xiaolong Li
Yuan Qi
27
7
0
02 Mar 2020
Style Example-Guided Text Generation using Generative Adversarial Transformers
Kuo-Hao Zeng
Mohammad Shoeybi
Ming-Yu Liu
GAN
93
18
0
02 Mar 2020
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao
Li Dong
Furu Wei
Wenhui Wang
Nan Yang
...
Yu Wang
Songhao Piao
Jianfeng Gao
Ming Zhou
H. Hon
AI4CE
88
397
0
28 Feb 2020
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
Ziqing Yang
Yiming Cui
Zhipeng Chen
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
VLM
75
48
0
28 Feb 2020
On Biased Compression for Distributed Learning
Aleksandr Beznosikov
Samuel Horváth
Peter Richtárik
M. Safaryan
78
189
0
27 Feb 2020
A Primer in BERTology: What we know about how BERT works
Anna Rogers
Olga Kovaleva
Anna Rumshisky
OffRL
143
1,511
0
27 Feb 2020
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Prakhar Ganesh
Yao Chen
Xin Lou
Mohammad Ali Khan
Yifan Yang
Hassan Sajjad
Preslav Nakov
Deming Chen
Marianne Winslett
AI4CE
136
201
0
27 Feb 2020
Using a thousand optimization tasks to learn hyperparameter search strategies
Luke Metz
Niru Maheswaranathan
Ruoxi Sun
C. Freeman
Ben Poole
Jascha Narain Sohl-Dickstein
117
46
0
27 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
138
151
0
26 Feb 2020
Towards Learning a Universal Non-Semantic Representation of Speech
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
SSL
151
160
0
25 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
236
1,285
0
25 Feb 2020
Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
Yige Xu
Xipeng Qiu
L. Zhou
Xuanjing Huang
83
67
0
24 Feb 2020
Modelling Latent Skills for Multitask Language Generation
Kris Cao
Dani Yogatama
31
3
0
21 Feb 2020
Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Mitchell A. Gordon
Kevin Duh
Nicholas Andrews
VLM
81
343
0
19 Feb 2020
LAMBERT: Layout-Aware (Language) Modeling for information extraction
Lukasz Garncarek
Rafal Powalski
Tomasz Stanislawek
Bartosz Topolski
Piotr Halama
M. Turski
Filip Graliński
84
88
0
19 Feb 2020
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
Xiaodong Liu
Yu Wang
Jianshu Ji
Hao Cheng
Xueyun Zhu
...
Pengcheng He
Weizhu Chen
Hoifung Poon
Guihong Cao
Jianfeng Gao
AI4CE
77
61
0
19 Feb 2020
Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims
Kevin Meng
Damian Jimenez
Fatma Arslan
J. Devasier
Daniel Obembe
Chengkai Li
63
16
0
18 Feb 2020
Controlling Computation versus Quality for Neural Sequence Models
Ankur Bapna
N. Arivazhagan
Orhan Firat
85
30
0
17 Feb 2020
SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models
Bin Wang
C.-C. Jay Kuo
50
156
0
16 Feb 2020
Text-based Question Answering from Information Retrieval and Deep Neural Network Perspectives: A Survey
Zahra Abbasiyantaeb
S. Momtazi
RALM
85
74
0
16 Feb 2020
PDDLGym: Gym Environments from PDDL Problems
Tom Silver
Rohan Chitnis
AI4CE
95
57
0
15 Feb 2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge
Gabriel Ilharco
Roy Schwartz
Ali Farhadi
Hannaneh Hajishirzi
Noah A. Smith
105
598
0
15 Feb 2020
Transformer on a Diet
Chenguang Wang
Zihao Ye
Aston Zhang
Zheng Zhang
Alex Smola
80
8
0
14 Feb 2020
HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing
Xiyou Zhou
Zhiyu Zoey Chen
Xiaoyong Jin
Wenjie Wang
78
34
0
14 Feb 2020
Training Large Neural Networks with Constant Memory using a New Execution Algorithm
B. Pudipeddi
Maral Mesmakhosroshahi
Jinwen Xi
S. Bharadwaj
85
58
0
13 Feb 2020
CBAG: Conditional Biomedical Abstract Generation
Justin Sybrandt
Ilya Safro
MedIm
AI4CE
53
8
0
13 Feb 2020
GLU Variants Improve Transformer
Noam M. Shazeer
183
1,026
0
12 Feb 2020
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Weihao Yu
Zihang Jiang
Yanfei Dong
Jiashi Feng
LRM
167
255
0
11 Feb 2020
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
149
223
0
10 Feb 2020
Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning
Neha Singh
Nirmalya Roy
A. Gangopadhyay
81
6
0
10 Feb 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
348
201
0
07 Feb 2020
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv
Avi Caciularu
Tomer Raviv
Jacob Goldberger
Yair Be’ery
63
8
0
06 Feb 2020
Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang Pierse
Jing Lu
AI4CE
35
2
0
05 Feb 2020
Adversarial Training for Aspect-Based Sentiment Analysis with BERT
Akbar Karimi
L. Rossi
Andrea Prati
312
103
0
30 Jan 2020
Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
Taeuk Kim
Jihun Choi
Daniel Edmiston
Sang-goo Lee
70
90
0
30 Jan 2020
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction
Weiran Wang
Qingming Tang
Karen Livescu
SSL
71
98
0
28 Jan 2020
BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT
Wei-Tsung Kao
Tsung-Han Wu
Po-Han Chi
Chun-Cheng Hsieh
Hung-yi Lee
SSL
44
5
0
25 Jan 2020
Generation-Distillation for Efficient Natural Language Understanding in Low-Data Settings
Luke Melas-Kyriazi
George Han
Celine Liang
47
12
0
25 Jan 2020
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
Saurabh Goyal
Anamitra R. Choudhury
Saurabh Manish Raje
Venkatesan T. Chakaravarthy
Yogish Sabharwal
Ashish Verma
96
18
0
24 Jan 2020
Length-controllable Abstractive Summarization by Guiding with Summary Prototype
Itsumi Saito
Kyosuke Nishida
Kosuke Nishida
Atsushi Otsuka
Hisako Asano
J. Tomita
Hiroyuki Shindo
Yuji Matsumoto
116
33
0
21 Jan 2020
Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching
Shuohang Wang
Yunshi Lan
Yi Tay
Jing Jiang
Jingjing Liu
ViT
67
7
0
20 Jan 2020
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search
Daoyuan Chen
Yaliang Li
Minghui Qiu
Zhen Wang
Bofang Li
Bolin Ding
Hongbo Deng
Jun Huang
Wei Lin
Jingren Zhou
MQ
97
104
0
13 Jan 2020
Stance Detection Benchmark: How Robust Is Your Stance Detection?
Benjamin Schiller
Johannes Daxenberger
Iryna Gurevych
94
98
0
06 Jan 2020
Stacked DeBERT: All Attention in Incomplete Data for Text Classification
Gwenaelle Cunha Sergio
Minho Lee
43
30
0
01 Jan 2020
ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension
Dheeru Dua
Ananth Gottumukkala
Alon Talmor
Sameer Singh
Matt Gardner
62
10
0
29 Dec 2019
Siamese Networks for Large-Scale Author Identification
Chakaveh Saedi
Mark Dras
85
38
0
23 Dec 2019
A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks
Angelos Filos
Sebastian Farquhar
Aidan Gomez
Tim G. J. Rudner
Zachary Kenton
Lewis Smith
Milad Alizadeh
A. D. Kroon
Y. Gal
BDL
AAML
OOD
UQCV
89
109
0
22 Dec 2019