1804.07461
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
20 April 2018
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
Papers citing
"GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding"
50 / 4,447 papers shown
HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections
Yi Tay
Zhe Zhao
Dara Bahri
Donald Metzler
Da-Cheng Juan
66
9
0
12 Jul 2020
Deep or Simple Models for Semantic Tagging? It Depends on your Data [Experiments]
Jinfeng Li
Yuliang Li
Xiaolan Wang
W. Tan
VLM
43
9
0
11 Jul 2020
Fast Transformers with Clustered Attention
Apoorv Vyas
Angelos Katharopoulos
François Fleuret
71
155
0
09 Jul 2020
The curious case of developmental BERTology: On sparsity, transfer learning, generalization and the brain
Xin Wang
34
1
0
07 Jul 2020
Cross-lingual Inductive Transfer to Detect Offensive Language
Kartikey Pant
Tanvi Dadu
30
5
0
07 Jul 2020
Targeting the Benchmark: On Methodology in Current Natural Language Processing Research
David Schlangen
69
58
0
07 Jul 2020
Deep Contextual Embeddings for Address Classification in E-commerce
Shreyas Mangalgi
Lakshya Kumar
Ravindra Babu Tallamraju
27
8
0
06 Jul 2020
CORD19STS: COVID-19 Semantic Textual Similarity Dataset
Xiao Guo
H. Mirzaalian
Ekraam Sabir
Ayush Jaiswal
Wael AbdAlmageed
56
33
0
05 Jul 2020
El Departamento de Nosotros: How Machine Translated Corpora Affects Language Models in MRC Tasks
Maria Khvalchik
Mikhail Galkin
68
1
0
03 Jul 2020
On-The-Fly Information Retrieval Augmentation for Language Models
Hai Wang
David A. McAllester
KELM
RALM
63
6
0
03 Jul 2020
IIE-NLP-NUT at SemEval-2020 Task 4: Guiding PLM with Prompt Template Reconstruction Strategy for ComVE
Luxi Xing
Yuqiang Xie
Yue Hu
Wei Peng
76
7
0
02 Jul 2020
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong
Chenyan Xiong
Ye Li
Kwok-Fung Tang
Jialin Liu
Paul N. Bennett
Junaid Ahmed
Arnold Overwijk
147
1,238
0
01 Jul 2020
A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
H. H. Mao
BDL
SSL
72
50
0
01 Jul 2020
Transferability of Natural Language Inference to Biomedical Question Answering
Minbyul Jeong
Mujeen Sung
Gangwoo Kim
Donghyeon Kim
Wonjin Yoon
J. Yoo
Jaewoo Kang
80
40
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
142
135
0
30 Jun 2020
Multi-Head Attention: Collaborate Instead of Concatenate
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
82
115
0
29 Jun 2020
Progressive Generation of Long Text with Pretrained Language Models
Bowen Tan
Zichao Yang
Maruan Al-Shedivat
Eric Xing
Zhiting Hu
67
23
0
28 Jun 2020
Rethinking Positional Encoding in Language Pre-training
Guolin Ke
Di He
Tie-Yan Liu
138
299
0
28 Jun 2020
Uncovering the Connections Between Adversarial Transferability and Knowledge Transferability
Kaizhao Liang
Jacky Y. Zhang
Wei Ping
Zhuolin Yang
Oluwasanmi Koyejo
Yangqiu Song
AAML
140
26
0
25 Jun 2020
Towards Differentially Private Text Representations
Lingjuan Lyu
Yitong Li
Xuanli He
Tong Xiao
72
39
0
25 Jun 2020
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention
Grégoire Mialon
Dexiong Chen
Alexandre d’Aspremont
Julien Mairal
OT
44
0
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
109
2
0
21 Jun 2020
We Should at Least Be Able to Design Molecules That Dock Well
Tobiasz Ciepliński
Tomasz Danel
Sabina Podlewska
Stanislaw Jastrzebski
89
31
0
20 Jun 2020
Memory Transformer
Andrey Kravchenko
Yuri Kuratov
Anton Peganov
Grigory V. Sapunov
RALM
78
72
0
20 Jun 2020
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
F. Iandola
Albert Eaton Shaw
Ravi Krishna
Kurt Keutzer
VLM
90
128
0
19 Jun 2020
Multi-branch Attentive Transformer
Yang Fan
Shufang Xie
Yingce Xia
Lijun Wu
Tao Qin
Xiang-Yang Li
Tie-Yan Liu
65
17
0
18 Jun 2020
GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training
J. Qiu
Qibin Chen
Yuxiao Dong
Jing Zhang
Hongxia Yang
Ming Ding
Kuansan Wang
Jie Tang
SSL
251
960
0
17 Jun 2020
Neural Anisotropy Directions
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
105
16
0
17 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
105
219
0
16 Jun 2020
How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation
Steffen Eger
Johannes Daxenberger
Iryna Gurevych
67
11
0
16 Jun 2020
The SPPD System for Schema Guided Dialogue State Tracking Challenge
Miao Li
Haoqi Xiong
Yunbo Cao
44
10
0
16 Jun 2020
Hindsight Logging for Model Training
Rolando Garcia
Eric Liu
Vikram Sreekanti
Bobby Yan
Anusha Dandamudi
Joseph E. Gonzalez
J. M. Hellerstein
Koushik Sen
VLM
77
10
0
12 Jun 2020
Evaluation of Neural Architectures Trained with Square Loss vs Cross-Entropy in Classification Tasks
Like Hui
M. Belkin
UQCV
AAML
VLM
62
172
0
12 Jun 2020
Ensemble Distillation for Robust Model Fusion in Federated Learning
Tao R. Lin
Lingjing Kong
Sebastian U. Stich
Martin Jaggi
FedML
141
1,060
0
12 Jun 2020
Video Understanding as Machine Translation
Bruno Korbar
Fabio Petroni
Rohit Girdhar
Lorenzo Torresani
SSL
88
29
0
12 Jun 2020
NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing
Nikita Klyuchnikov
I. Trofimov
Ekaterina Artemova
Mikhail Salnikov
M. Fedorov
Evgeny Burnaev
VLM
117
105
0
12 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
180
446
0
10 Jun 2020
MC-BERT: Efficient Language Pre-Training via a Meta Controller
Zhenhui Xu
Linyuan Gong
Guolin Ke
Di He
Shuxin Zheng
Liwei Wang
Jiang Bian
Tie-Yan Liu
BDL
65
18
0
10 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
187
363
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
244
1,720
0
08 Jun 2020
ColdGANs: Taming Language GANs with Cautious Sampling Strategies
Thomas Scialom
Paul-Alexis Dray
Sylvain Lamprier
Benjamin Piwowarski
Jacopo Staiano
GAN
SyDa
68
18
0
08 Jun 2020
Probing Neural Dialog Models for Conversational Understanding
Abdelrhman Saleh
Tovly Deutsch
Stephen Casper
Yonatan Belinkov
Stuart M. Shieber
65
13
0
07 Jun 2020
BERT Loses Patience: Fast and Robust Inference with Early Exit
Wangchunshu Zhou
Canwen Xu
Tao Ge
Julian McAuley
Ke Xu
Furu Wei
79
343
0
07 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
160
99
0
05 Jun 2020
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
John Giorgi
Osvald Nitski
Bo Wang
Gary D. Bader
SSL
151
499
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
189
2,770
0
05 Jun 2020
Sponge Examples: Energy-Latency Attacks on Neural Networks
Ilia Shumailov
Yiren Zhao
Daniel Bates
Nicolas Papernot
Robert D. Mullins
Ross J. Anderson
SILM
81
138
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
109
236
0
05 Jun 2020
CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning
Alessandro Suglia
Ioannis Konstas
Andrea Vanzo
E. Bastianelli
Desmond Elliott
Stella Frank
Oliver Lemon
62
16
0
03 Jun 2020
Interpretable Meta-Measure for Model Performance
Alicja Gosiewska
Katarzyna Woźnica
P. Biecek
44
5
0
02 Jun 2020