RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

21 / 3,071 papers shown
Title
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
17
584
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
235
205
0
25 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
17
250
0
23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
11
174
0
23 Sep 2019
Language models and Automated Essay Scoring
Pedro Uría Rodríguez
Amir Jafari
C. Ormerod
22
82
0
18 Sep 2019
Span-based Joint Entity and Relation Extraction with Transformer Pre-training
Markus Eberts
A. Ulges
LRM
ViT
164
380
0
17 Sep 2019
Span Selection Pre-training for Question Answering
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
27
70
0
09 Sep 2019
Pretrained Language Models for Sequential Sentence Classification
Arman Cohan
Iz Beltagy
Daniel King
Bhavana Dalvi
Daniel S. Weld
24
128
0
09 Sep 2019
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
Bill Yuchen Lin
Xinyue Chen
Jamin Chen
Xiang Ren
13
459
0
04 Sep 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
40
11,663
0
27 Aug 2019
Patient Knowledge Distillation for BERT Model Compression
S. Sun
Yu Cheng
Zhe Gan
Jingjing Liu
17
826
0
25 Aug 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
29
1,643
0
22 Aug 2019
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
Zhiquan Ye
Qian Chen
Wen Wang
Zhenhua Ling
17
68
0
19 Aug 2019
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSL
VLM
MLLM
27
895
0
16 Aug 2019
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
10
186
0
12 Aug 2019
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
S. Rothe
Shashi Narayan
Aliaksei Severyn
SILM
57
433
0
29 Jul 2019
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition
Shaoshi Ling
Julian Salazar
Yuzong Liu
Katrin Kirchhoff
SSL
13
27
0
30 Jun 2019
Taming Pretrained Transformers for Extreme Multi-label Text Classification
Wei-Cheng Chang
Hsiang-Fu Yu
Kai Zhong
Yiming Yang
Inderjit Dhillon
10
20
0
07 May 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
13
128
0
02 Apr 2019
Sentence transition matrix: An efficient approach that preserves sentence semantics
Myeongjun Jang
Pilsung Kang
11
2
0
16 Jan 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018