RoBERTa: A Robustly Optimized BERT Pretraining Approach

arXiv:1907.11692

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 2,875 papers shown
AraGPT2: Pre-Trained Transformer for Arabic Language Generation
Wissam Antoun
Fady Baly
Hazem M. Hajj
VLM
6
103
0
31 Dec 2020
AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
9
102
0
31 Dec 2020
Towards Zero-Shot Knowledge Distillation for Natural Language Processing
Ahmad Rashid
Vasileios Lioutas
Abbas Ghaddar
Mehdi Rezagholizadeh
13
27
0
31 Dec 2020
CLEAR: Contrastive Learning for Sentence Representation
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Madian Khabsa
Fei Sun
Hao Ma
SSL
8
319
0
31 Dec 2020
Optimizing Deeper Transformers on Small Datasets
Peng-Tao Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
10
68
0
30 Dec 2020
Improving BERT with Syntax-aware Local Attention
Zhongli Li
Qingyu Zhou
Chao Li
Ke Xu
Yunbo Cao
56
44
0
30 Dec 2020
Reservoir Transformers
Sheng Shen
Alexei Baevski
Ari S. Morcos
Kurt Keutzer
Michael Auli
Douwe Kiela
20
17
0
30 Dec 2020
ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning
Yujia Qin
Yankai Lin
Ryuichi Takanobu
Zhiyuan Liu
Peng Li
Heng Ji
Minlie Huang
Maosong Sun
Jie Zhou
38
125
0
30 Dec 2020
Code Summarization with Structure-induced Transformer
Hongqiu Wu
Hai Zhao
Min Zhang
20
84
0
29 Dec 2020
CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
Lei Li
Yankai Lin
Deli Chen
Shuhuai Ren
Peng Li
Jie Zhou
Xu Sun
24
51
0
29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked Language Model
Ziyi Yang
Yinfei Yang
Daniel Matthew Cer
Jax Law
Eric F. Darve
SSL
11
57
0
28 Dec 2020
Multi-Head Self-Attention with Role-Guided Masks
Dongsheng Wang
Casper Hansen
Lucas Chaves Lima
Christian B. Hansen
Maria Maistro
J. Simonsen
Christina Lioma
16
1
0
22 Dec 2020
Recognizing Emotion Cause in Conversations
Soujanya Poria
Navonil Majumder
Devamanyu Hazarika
Deepanway Ghosal
Rishabh Bhardwaj
...
Romila Ghosh
Abhinaba Roy
Niyati Chhaya
Alexander Gelbukh
Rada Mihalcea
32
122
0
22 Dec 2020
Mention Extraction and Linking for SQL Query Generation
Jianqiang Ma
Zeyu Yan
Shuai Pang
Yang Zhang
Jianping Shen
24
29
0
18 Dec 2020
MELINDA: A Multimodal Dataset for Biomedical Experiment Method Classification
Te-Lin Wu
Shikhar Singh
S. Paul
Gully A. Burns
Nanyun Peng
17
18
0
16 Dec 2020
LIREx: Augmenting Language Inference with Relevant Explanation
Xinyan Zhao
V. Vydiswaran
LRM
15
37
0
16 Dec 2020
Morphology Matters: A Multilingual Language Modeling Analysis
Hyunji Hayley Park
Katherine J. Zhang
Coleman Haley
K. Steimel
Han Liu
Lane Schwartz
39
47
0
11 Dec 2020
Discourse Parsing of Contentious, Non-Convergent Online Discussions
S. Zakharov
Omri Hadar
Tovit Hakak
Dina Grossman
Y. Kolikant
Oren Tsur
8
8
0
08 Dec 2020
Benchmarking Commercial Intent Detection Services with Practice-Driven Evaluations
Haode Qi
Lin Pan
Atin Sood
Abhishek Shah
L. Kunc
Mo Yu
Saloni Potdar
VLM
15
16
0
07 Dec 2020
People Still Care About Facts: Twitter Users Engage More with Factual Discourse than Misinformation--A Comparison Between COVID and General Narratives on Twitter
Mirela Silva
Fabrício Ceschin
P. Shrestha
Christopher Brant
Shlok Gilda
Juliana Fernandes
Catia S. Silva
André Grégio
Daniela Oliveira
Luiz H. F. Giovanini
13
1
0
03 Dec 2020
A Novel Sentiment Analysis Engine for Preliminary Depression Status Estimation on Social Media
S. Suman
H. Shalu
Lakshya A Agrawal
Archit Agrawal
Juned Kadiwala
16
6
0
29 Nov 2020
Progressively Stacking 2.0: A Multi-stage Layerwise Training Method for BERT Training Speedup
Cheng Yang
Shengnan Wang
Chao Yang
Yuechuan Li
Ru He
Jingqiao Zhang
22
25
0
27 Nov 2020
GLGE: A New General Language Generation Evaluation Benchmark
Dayiheng Liu
Yu Yan
Yeyun Gong
Weizhen Qi
Hang Zhang
...
Jiancheng Lv
Ruofei Zhang
Winnie Wu
Ming Zhou
Nan Duan
ELM
28
66
0
24 Nov 2020
Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference
Ondrej Dusek
Zdeněk Kasner
14
64
0
21 Nov 2020
WikiAsp: A Dataset for Multi-domain Aspect-based Summarization
Hiroaki Hayashi
Prashant Budania
Peng Wang
Chris Ackerson
Raj Neervannan
Graham Neubig
18
64
0
16 Nov 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Mohit Bansal
OffRL
25
6
0
15 Nov 2020
When Do You Need Billions of Words of Pretraining Data?
Yian Zhang
Alex Warstadt
Haau-Sing Li
Samuel R. Bowman
8
136
0
10 Nov 2020
Multi-document Summarization via Deep Learning Techniques: A Survey
Congbo Ma
W. Zhang
Mingyu Guo
Hu Wang
Quan Z. Sheng
13
125
0
10 Nov 2020
Towards Domain-Agnostic Contrastive Learning
Vikas Verma
Minh-Thang Luong
Kenji Kawaguchi
Hieu H. Pham
Quoc V. Le
SSL
13
115
0
09 Nov 2020
Positional Artefacts Propagate Through Masked Language Model Embeddings
Ziyang Luo
Artur Kulmizev
Xiaoxi Mao
14
41
0
09 Nov 2020
Underspecification Presents Challenges for Credibility in Modern Machine Learning
Alexander D'Amour
Katherine A. Heller
D. Moldovan
Ben Adlam
B. Alipanahi
...
Kellie Webster
Steve Yadlowsky
T. Yun
Xiaohua Zhai
D. Sculley
OffRL
19
669
0
06 Nov 2020
Unleashing the Power of Neural Discourse Parsers -- A Context and Structure Aware Approach Using Large Scale Pretraining
Grigorii Guz
Patrick Huber
Giuseppe Carenini
6
11
0
06 Nov 2020
EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering
Momchil Hardalov
Todor Mihaylov
Dimitrina Zlatkova
Yoan Dinkov
Ivan Koychev
Preslav Nakov
AI4Ed
ELM
23
50
0
05 Nov 2020
Detecting Hallucinated Content in Conditional Neural Sequence Generation
Chunting Zhou
Graham Neubig
Jiatao Gu
Mona T. Diab
P. Guzmán
Luke Zettlemoyer
Marjan Ghazvininejad
HILM
31
194
0
05 Nov 2020
The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks
Brihi Joshi
Neil Shah
Francesco Barbieri
Leonardo Neves
29
5
0
02 Nov 2020
On the Sentence Embeddings from Pre-trained Language Models
Bohan Li
Hao Zhou
Junxian He
Mingxuan Wang
Yiming Yang
Lei Li
6
213
0
02 Nov 2020
Emergent Communication Pretraining for Few-Shot Machine Translation
Yaoyiran Li
E. Ponti
Ivan Vulić
Anna Korhonen
18
19
0
02 Nov 2020
Probing Task-Oriented Dialogue Representation from Language Models
Chien-Sheng Wu
Caiming Xiong
8
20
0
26 Oct 2020
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
Jianguo Zhang
Kazuma Hashimoto
Wenhao Liu
Chien-Sheng Wu
Yao Wan
Philip S. Yu
R. Socher
Caiming Xiong
14
92
0
25 Oct 2020
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation
Mrigank Raman
Aaron Chan
Siddhant Agarwal
Peifeng Wang
Hansen Wang
Sungchul Kim
Ryan Rossi
Handong Zhao
Nedim Lipka
Xiang Ren
28
19
0
24 Oct 2020
Rethinking embedding coupling in pre-trained language models
Hyung Won Chung
Thibault Févry
Henry Tsai
Melvin Johnson
Sebastian Ruder
93
142
0
24 Oct 2020
Structure-Grounded Pretraining for Text-to-SQL
Xiang Deng
Ahmed Hassan Awadallah
Christopher Meek
Oleksandr Polozov
Huan Sun
Matthew Richardson
LMTD
8
150
0
24 Oct 2020
Adding Chit-Chat to Enhance Task-Oriented Dialogues
Kai Sun
Seungwhan Moon
Paul A. Crook
Stephen Roller
Becka Silvert
Bing-Quan Liu
Zhiguang Wang
Honglei Liu
Eunjoon Cho
Claire Cardie
62
66
0
24 Oct 2020
Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality
Gustavo Aguilar
Bryan McCann
Tong Niu
Nazneen Rajani
N. Keskar
Thamar Solorio
39
12
0
24 Oct 2020
ANLIzing the Adversarial Natural Language Inference Dataset
Adina Williams
Tristan Thrush
Douwe Kiela
AAML
166
45
0
24 Oct 2020
A Differentiable Relaxation of Graph Segmentation and Alignment for AMR Parsing
Chunchuan Lyu
Shay B. Cohen
Ivan Titov
24
11
0
23 Oct 2020
Concealed Data Poisoning Attacks on NLP Models
Eric Wallace
Tony Zhao
Shi Feng
Sameer Singh
SILM
11
18
0
23 Oct 2020
On the Transformer Growth for Progressive BERT Training
Xiaotao Gu
Liyuan Liu
Hongkun Yu
Jing Li
C. L. P. Chen
Jiawei Han
VLM
61
51
0
23 Oct 2020
Generating Plausible Counterfactual Explanations for Deep Transformers in Financial Text Classification
Linyi Yang
Eoin M. Kenny
T. L. J. Ng
Yi Yang
Barry Smyth
Ruihai Dong
13
70
0
23 Oct 2020
Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures
N. Moosavi
M. Boer
Prasetya Ajie Utama
Iryna Gurevych
6
13
0
23 Oct 2020