ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1907.11692
  4. Cited By
RoBERTa: A Robustly Optimized BERT Pretraining Approach

RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat
ArXivPDFHTML

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 2,766 papers shown
Title
SimAlign: High Quality Word Alignments without Parallel Training Data
  using Static and Contextualized Embeddings
SimAlign: High Quality Word Alignments without Parallel Training Data using Static and Contextualized Embeddings
Masoud Jalili Sabet
Philipp Dufter
François Yvon
Hinrich Schütze
6
224
0
18 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark
CLUE: A Chinese Language Understanding Evaluation Benchmark
Liang Xu
Hai Hu
Xuanwei Zhang
Lu Li
Chenjie Cao
...
Cong Yue
Xinrui Zhang
Zhen-Yi Yang
Kyle Richardson
Zhenzhong Lan
ELM
26
377
0
13 Apr 2020
From Machine Reading Comprehension to Dialogue State Tracking: Bridging
  the Gap
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap
Shuyang Gao
Sanchit Agarwal
Tagyoung Chung
Di Jin
Dilek Z. Hakkani-Tür
18
71
0
13 Apr 2020
Unsupervised Commonsense Question Answering with Self-Talk
Unsupervised Commonsense Question Answering with Self-Talk
Vered Shwartz
Peter West
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
ReLM
SSL
AI4MH
LRM
14
257
0
11 Apr 2020
Longformer: The Long-Document Transformer
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
28
3,904
0
10 Apr 2020
Translation Artifacts in Cross-lingual Transfer Learning
Translation Artifacts in Cross-lingual Transfer Learning
Mikel Artetxe
Gorka Labaka
Eneko Agirre
8
114
0
09 Apr 2020
BLEURT: Learning Robust Metrics for Text Generation
BLEURT: Learning Robust Metrics for Text Generation
Thibault Sellam
Dipanjan Das
Ankur P. Parikh
41
1,438
0
09 Apr 2020
Exploring Versatile Generative Language Model Via Parameter-Efficient
  Transfer Learning
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
Zhaojiang Lin
Andrea Madotto
Pascale Fung
21
155
0
08 Apr 2020
Downstream Model Design of Pre-trained Language Model for Relation
  Extraction Task
Downstream Model Design of Pre-trained Language Model for Relation Extraction Task
Cheng-rong Li
Ye Tian
11
36
0
08 Apr 2020
DialBERT: A Hierarchical Pre-Trained Model for Conversation
  Disentanglement
DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement
Tianda Li
Jia-Chen Gu
Xiao-Dan Zhu
Quan Liu
Zhenhua Ling
Zhiming Su
Si Wei
21
26
0
08 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Byte Pair Encoding is Suboptimal for Language Model Pretraining
Kaj Bostrom
Greg Durrett
14
198
0
07 Apr 2020
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex
  Text-to-SQL in Cross-Domain Databases
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
Donghyun Choi
M. Shin
EungGyun Kim
Dong Ryeol Shin
17
123
0
07 Apr 2020
TAPAS: Weakly Supervised Table Parsing via Pre-training
TAPAS: Weakly Supervised Table Parsing via Pre-training
Jonathan Herzig
Pawel Krzysztof Nowak
Thomas Müller
Francesco Piccinno
Julian Martin Eisenschlos
LMTD
RALM
19
629
0
05 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
29
353
0
05 Apr 2020
Unsupervised Domain Clusters in Pretrained Language Models
Unsupervised Domain Clusters in Pretrained Language Models
Roee Aharoni
Yoav Goldberg
13
243
0
05 Apr 2020
Pre-trained Models for Natural Language Processing: A Survey
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
241
1,444
0
18 Mar 2020
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in
  Natural Language Inference
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
32
22
0
05 Mar 2020
Kleister: A novel task for Information Extraction involving Long
  Documents with Complex Layout
Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout
Filip Graliñski
Tomasz Stanislawek
Anna Wróblewska
Dawid Lipiñski
Agnieszka Kaliska
Paulina Rosalska
Bartosz Topolski
P. Biecek
23
40
0
04 Mar 2020
Training Question Answering Models From Synthetic Data
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
M. Shoeybi
Bryan Catanzaro
ELM
24
157
0
22 Feb 2020
From English To Foreign Languages: Transferring Pre-trained Language
  Models
From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
19
47
0
18 Feb 2020
Robustness Verification for Transformers
Robustness Verification for Transformers
Zhouxing Shi
Huan Zhang
Kai-Wei Chang
Minlie Huang
Cho-Jui Hsieh
AAML
19
103
0
16 Feb 2020
FQuAD: French Question Answering Dataset
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
19
98
0
14 Feb 2020
Feature Importance Estimation with Self-Attention Networks
Feature Importance Estimation with Self-Attention Networks
Blaž Škrlj
S. Džeroski
Nada Lavrac
Matej Petković
FAtt
MILM
26
50
0
11 Feb 2020
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Weihao Yu
Zihang Jiang
Yanfei Dong
Jiashi Feng
LRM
8
238
0
11 Feb 2020
Adversarial Filters of Dataset Biases
Adversarial Filters of Dataset Biases
Ronan Le Bras
Swabha Swayamdipta
Chandra Bhagavatula
Rowan Zellers
Matthew E. Peters
Ashish Sabharwal
Yejin Choi
29
220
0
10 Feb 2020
REALM: Retrieval-Augmented Language Model Pre-Training
REALM: Retrieval-Augmented Language Model Pre-Training
Kelvin Guu
Kenton Lee
Zora Tung
Panupong Pasupat
Ming-Wei Chang
RALM
13
1,981
0
10 Feb 2020
Pre-training Tasks for Embedding-based Large-scale Retrieval
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng Chang
Felix X. Yu
Yin-Wen Chang
Yiming Yang
Sanjiv Kumar
RALM
6
301
0
10 Feb 2020
Segmented Graph-Bert for Graph Instance Modeling
Segmented Graph-Bert for Graph Instance Modeling
Jiawei Zhang
SSeg
20
5
0
09 Feb 2020
perm2vec: Graph Permutation Selection for Decoding of Error Correction
  Codes using Self-Attention
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv
Avi Caciularu
Tomer Raviv
Jacob Goldberger
Yair Be’ery
13
8
0
06 Feb 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
17
1,766
0
22 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural
  Language Inference
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,584
0
21 Jan 2020
RobBERT: a Dutch RoBERTa-based Language Model
RobBERT: a Dutch RoBERTa-based Language Model
Pieter Delobelle
Thomas Winters
Bettina Berendt
10
232
0
17 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark
  for Chinese
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
25
48
0
13 Jan 2020
oLMpics -- On what Language Model Pre-training Captures
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
12
300
0
31 Dec 2019
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language
  Model
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
Wenhan Xiong
Jingfei Du
William Yang Wang
Veselin Stoyanov
SSL
KELM
17
201
0
20 Dec 2019
BERTje: A Dutch BERT Model
BERTje: A Dutch BERT Model
Wietse de Vries
Andreas van Cranenburgh
Arianna Bisazza
Tommaso Caselli
Gertjan van Noord
Malvina Nissim
VLM
SSeg
11
291
0
19 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French
FlauBERT: Unsupervised Language Model Pre-training for French
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
18
395
0
11 Dec 2019
Neural Duplicate Question Detection without Labeled Training Data
Neural Duplicate Question Detection without Labeled Training Data
Andreas Rucklé
N. Moosavi
Iryna Gurevych
OOD
AAML
6
11
0
13 Nov 2019
Attending to Entities for Better Text Understanding
Attending to Entities for Better Text Understanding
Pengxiang Cheng
K. Erk
LRM
11
36
0
11 Nov 2019
How Decoding Strategies Affect the Verifiability of Generated Text
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktaschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
21
50
0
09 Nov 2019
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
What Would Elsa Do? Freezing Layers During Transformer Fine-Tuning
Jaejun Lee
Raphael Tang
Jimmy J. Lin
18
121
0
08 Nov 2019
S2ORC: The Semantic Scholar Open Research Corpus
S2ORC: The Semantic Scholar Open Research Corpus
Kyle Lo
Lucy Lu Wang
Mark Neumann
Rodney Michael Kinney
Daniel S. Weld
OffRL
AI4CE
21
10
0
07 Nov 2019
Infusing Knowledge into the Textual Entailment Task Using Graph
  Convolutional Networks
Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks
Pavan Kapanipathi
Veronika Thost
S. Patel
Spencer Whitehead
Ibrahim Abdelaziz
...
R. Chulaka Gunasekara
B. Makni
Nicholas Mattei
Kartik Talamadupula
Achille Fokoue
34
45
0
05 Nov 2019
Generalization through Memorization: Nearest Neighbor Language Models
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
13
806
0
01 Nov 2019
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language
  Generation, Translation, and Comprehension
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
M. Lewis
Yinhan Liu
Naman Goyal
Marjan Ghazvininejad
Abdel-rahman Mohamed
Omer Levy
Veselin Stoyanov
Luke Zettlemoyer
AIMat
VLM
26
10,567
0
29 Oct 2019
SpeechBERT: An Audio-and-text Jointly Learned Language Model for
  End-to-end Spoken Question Answering
SpeechBERT: An Audio-and-text Jointly Learned Language Model for End-to-end Spoken Question Answering
Yung-Sung Chuang
Chi-Liang Liu
Hung-yi Lee
Lin-shan Lee
AuLLM
19
39
0
25 Oct 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi
Hamid Palangi
M. Lam
P. Smolensky
Jianfeng Gao
21
16
0
25 Oct 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
47
19,391
0
23 Oct 2019
On the adequacy of untuned warmup for adaptive optimization
On the adequacy of untuned warmup for adaptive optimization
Jerry Ma
Denis Yarats
44
70
0
09 Oct 2019
BERT for Evidence Retrieval and Claim Verification
BERT for Evidence Retrieval and Claim Verification
Shrishti Saha Shetu
Christof Monz
E. Mabande
RALM
15
119
0
07 Oct 2019
Previous
123...545556
Next