1907.11692
Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach
26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
Papers citing
"RoBERTa: A Robustly Optimized BERT Pretraining Approach"
50 / 2,937 papers shown
Title
Residual Energy-Based Models for Text Generation
Yuntian Deng
A. Bakhtin
Myle Ott
Arthur Szlam
Marc'Aurelio Ranzato
20
125
0
22 Apr 2020
The Ivory Tower Lost: How College Students Respond Differently than the General Public to the COVID-19 Pandemic
Viet-An Duong
Phu Pham
Tongyu Yang
Yu Wang
Jiebo Luo
AI4CE
11
89
0
21 Apr 2020
Train No Evil: Selective Masking for Task-Guided Pre-Training
Yuxian Gu
Zhengyan Zhang
Xiaozhi Wang
Zhiyuan Liu
Maosong Sun
19
59
0
21 Apr 2020
StereoSet: Measuring stereotypical bias in pretrained language models
Moin Nadeem
Anna Bethke
Siva Reddy
11
952
0
20 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
Guanming Xiong
21
0
0
20 Apr 2020
SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet
Philipp Dufter
François Yvon
Hinrich Schütze
23
226
0
18 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu
Hai Hu
Xuanwei Zhang
Lu Li
Chenjie Cao
...
Cong Yue
Xinrui Zhang
Zhen-Yi Yang
Kyle Richardson
Zhenzhong Lan
ELM
31
377
0
13 Apr 2020
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap
Shuyang Gao
Sanchit Agarwal
Tagyoung Chung
Di Jin
Dilek Z. Hakkani-Tür
18
71
0
13 Apr 2020
Unsupervised Commonsense Question Answering with Self-Talk
Vered Shwartz
Peter West
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
ReLM
SSL
AI4MH
LRM
14
257
0
11 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
28
3,904
0
10 Apr 2020
Translation Artifacts in Cross-lingual Transfer Learning
Mikel Artetxe
Gorka Labaka
Eneko Agirre
19
114
0
09 Apr 2020
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
46
1,438
0
09 Apr 2020
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
Zhaojiang Lin
Andrea Madotto
Pascale Fung
26
155
0
08 Apr 2020
Downstream Model Design of Pre-trained Language Model for Relation Extraction Task
Cheng-rong Li
Ye Tian
11
36
0
08 Apr 2020
DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement
Tianda Li
Jia-Chen Gu
Xiao-Dan Zhu
Quan Liu
Zhenhua Ling
Zhiming Su
Si Wei
21
27
0
08 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom
Greg Durrett
14
199
0
07 Apr 2020
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
Donghyun Choi
M. Shin
EungGyun Kim
Dong Ryeol Shin
23
123
0
07 Apr 2020
TAPAS: Weakly Supervised Table Parsing via Pre-training
Jonathan Herzig
Pawel Krzysztof Nowak
Thomas Müller
Francesco Piccinno
Julian Martin Eisenschlos
LMTD
RALM
19
630
0
05 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
29
353
0
05 Apr 2020
Unsupervised Domain Clusters in Pretrained Language Models
Roee Aharoni
Yoav Goldberg
13
243
0
05 Apr 2020
Enhancing Factual Consistency of Abstractive Summarization
Chenguang Zhu
William Fu-Hinthorn
Ruochen Xu
Qingkai Zeng
Michael Zeng
Xuedong Huang
Meng-Long Jiang
HILM
KELM
185
39
0
19 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
241
1,450
0
18 Mar 2020
Transformer Networks for Trajectory Forecasting
Francesco Giuliari
Irtiza Hasan
Marco Cristani
Fabio Galasso
111
371
0
18 Mar 2020
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Zhiyuan Fang
Tejas Gokhale
Pratyay Banerjee
Chitta Baral
Yezhou Yang
10
60
0
11 Mar 2020
Sensitive Data Detection and Classification in Spanish Clinical Text: Experiments with BERT
Aitor García-Pablos
Naiara Pérez
Montse Cuadros
21
34
0
06 Mar 2020
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
32
23
0
05 Mar 2020
Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout
Filip Graliński
Tomasz Stanisławek
Anna Wróblewska
Dawid Lipiński
Agnieszka Kaliska
Paulina Rosalska
Bartosz Topolski
P. Biecek
23
40
0
04 Mar 2020
Learning Representations by Predicting Bags of Visual Words
Spyros Gidaris
Andrei Bursuc
N. Komodakis
P. Pérez
Matthieu Cord
SSL
6
117
0
27 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
43
1,198
0
25 Feb 2020
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
M. Shoeybi
Bryan Catanzaro
ELM
24
158
0
22 Feb 2020
From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
22
47
0
18 Feb 2020
Robustness Verification for Transformers
Zhouxing Shi
Huan Zhang
Kai-Wei Chang
Minlie Huang
Cho-Jui Hsieh
AAML
19
104
0
16 Feb 2020
Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
Carlos Aspillaga
Andrés Carvallo
Vladimir Araujo
ELM
31
31
0
14 Feb 2020
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
19
98
0
14 Feb 2020
Feature Importance Estimation with Self-Attention Networks
Blaž Škrlj
S. Džeroski
Nada Lavrač
Matej Petković
FAtt
MILM
26
51
0
11 Feb 2020
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Weihao Yu
Zihang Jiang
Yanfei Dong
Jiashi Feng
LRM
8
239
0
11 Feb 2020
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
29
220
0
10 Feb 2020
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
13
1,987
0
10 Feb 2020
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng Chang
Felix X. Yu
Yin-Wen Chang
Yiming Yang
Sanjiv Kumar
RALM
11
301
0
10 Feb 2020
Segmented Graph-Bert for Graph Instance Modeling
Jiawei Zhang
SSeg
20
6
0
09 Feb 2020
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv
Avi Caciularu
Tomer Raviv
Jacob Goldberger
Yair Be’ery
13
8
0
06 Feb 2020
Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
Taeuk Kim
Jihun Choi
Daniel Edmiston
Sang-goo Lee
12
90
0
30 Jan 2020
Retrospective Reader for Machine Reading Comprehension
Zhuosheng Zhang
Junjie Yang
Hai Zhao
RALM
23
226
0
27 Jan 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
17
1,768
0
22 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
29
258
0
22 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,586
0
21 Jan 2020
RobBERT: a Dutch RoBERTa-based Language Model
Pieter Delobelle
Thomas Winters
Bettina Berendt
10
232
0
17 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
25
48
0
13 Jan 2020
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
17
300
0
31 Dec 2019
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
Wenhan Xiong
Jingfei Du
William Yang Wang
Veselin Stoyanov
SSL
KELM
24
201
0
20 Dec 2019