ResearchTrend.AI
RoBERTa: A Robustly Optimized BERT Pretraining Approach


26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 3,476 papers shown
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
33
1,659
0
24 Mar 2021
FastMoE: A Fast Mixture-of-Expert Training System
Jiaao He
J. Qiu
Aohan Zeng
Zhilin Yang
Jidong Zhai
Jie Tang
ALM
MoE
22
94
0
24 Mar 2021
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor Betz
Kyle Richardson
Christian Voigt
ReLM
LRM
16
29
0
24 Mar 2021
Czert -- Czech BERT-like Model for Language Representation
Jakub Sido
O. Pražák
P. Pribán
Jan Pasek
Michal Seják
Miloslav Konopík
13
43
0
24 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
22
137
0
24 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh
A. Mahmood
AI4TS
55
92
0
23 Mar 2021
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Jan Philip Wahle
Terry Ruas
Norman Meuschke
Bela Gipp
25
34
0
23 Mar 2021
Instance-level Image Retrieval using Reranking Transformers
Fuwen Tan
Jiangbo Yuan
Vicente Ordonez
ViT
21
89
0
22 Mar 2021
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
VLM
17
194
0
22 Mar 2021
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval
Gregor Geigle
Jonas Pfeiffer
Nils Reimers
Ivan Vulić
Iryna Gurevych
27
59
0
22 Mar 2021
Identifying Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Tomáš Foltýnek
Norman Meuschke
Bela Gipp
11
30
0
22 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
42
510
0
22 Mar 2021
Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach
Rui Xie
Wei Ye
Jinan Sun
Shikun Zhang
20
26
0
21 Mar 2021
API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations
Ramin Shahbazi
Rishab Sharma
Fatemeh H. Fard
19
25
0
19 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
43
1,145
0
18 Mar 2021
On the Role of Images for Analyzing Claims in Social Media
Gullal Singh Cheema
Sherzod Hakimov
Eric Müller-Budack
Ralph Ewerth
16
10
0
17 Mar 2021
Towards Few-Shot Fact-Checking via Perplexity
Nayeon Lee
Yejin Bang
Andrea Madotto
Madian Khabsa
Pascale Fung
AAML
13
90
0
17 Mar 2021
Investigating Monolingual and Multilingual BERTModels for Vietnamese Aspect Category Detection
D. Thin
Lac Si Le
V. Hoang
N. Nguyen
31
10
0
17 Mar 2021
Structural Adapters in Pretrained Language Models for AMR-to-text Generation
Leonardo F. R. Ribeiro
Yue Zhang
Iryna Gurevych
33
69
0
16 Mar 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
49
296
0
15 Mar 2021
Deep Discourse Analysis for Generating Personalized Feedback in Intelligent Tutor Systems
Matt Grenander
Robert Belfer
E. Kochmar
Iulian Serban
François St-Hilaire
Jackie C.K. Cheung
AI4Ed
22
17
0
13 Mar 2021
Cooperative Self-training of Machine Reading Comprehension
Hongyin Luo
Shang-Wen Li
Ming Gao
Seunghak Yu
James R. Glass
SyDa
RALM
15
11
0
12 Mar 2021
Are NLP Models really able to Solve Simple Math Word Problems?
Arkil Patel
S. Bhattamishra
Navin Goyal
ReLM
LRM
27
763
0
12 Mar 2021
Inductive Relation Prediction by BERT
H. Zha
Zhiyu Zoey Chen
Xifeng Yan
21
54
0
12 Mar 2021
The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models
Go Inoue
Bashar Alhafni
Nurpeiis Baimukan
Houda Bouamor
Nizar Habash
30
223
0
11 Mar 2021
ReportAGE: Automatically extracting the exact age of Twitter users based on self-reports in tweets
A. Klein
A. Magge
G. Gonzalez-Hernandez
12
20
0
10 Mar 2021
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks
Collin Burns
Anya Chen
Spencer Ball
ELM
AILaw
18
179
0
10 Mar 2021
BERTese: Learning to Speak to BERT
Adi Haviv
Jonathan Berant
Amir Globerson
19
122
0
09 Mar 2021
Text Simplification by Tagging
Kostiantyn Omelianchuk
Vipul Raheja
Oleksandr Skurzhanskyi
8
45
0
08 Mar 2021
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
Yoshitomo Matsubara
Marco Levorato
Francesco Restuccia
22
199
0
08 Mar 2021
Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees
Jiangang Bai
Yujing Wang
Yiren Chen
Yaming Yang
Jing Bai
J. Yu
Yunhai Tong
37
104
0
07 Mar 2021
MalBERT: Using Transformers for Cybersecurity and Malicious Software Detection
Abir Rahali
M. Akhloufi
19
30
0
05 Mar 2021
Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin
Eduard A. Gorbunov
Vsevolod Plokhotnyuk
Gennady Pekhimenko
24
31
0
04 Mar 2021
Natural Language Understanding for Argumentative Dialogue Systems in the Opinion Building Domain
W. A. Abro
Annalena Aicher
Niklas Rach
Stefan Ultes
Wolfgang Minker
Guilin Qi
23
32
0
03 Mar 2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina
Maxat Tezekbayev
Nuradil Kozhakhmet
Madina Babazhanova
Matthias Gallé
Z. Assylbekov
29
8
0
02 Mar 2021
Disentangling Syntax and Semantics in the Brain with Deep Networks
Charlotte Caucheteux
Alexandre Gramfort
J. King
31
69
0
02 Mar 2021
Contrastive Explanations for Model Interpretability
Alon Jacovi
Swabha Swayamdipta
Shauli Ravfogel
Yanai Elazar
Yejin Choi
Yoav Goldberg
33
95
0
02 Mar 2021
A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives
Nils Rethmeier
Isabelle Augenstein
SSL
VLM
85
90
0
25 Feb 2021
ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language Model for Reading Comprehension of Abstract Meaning
Xin Xie
Xiangnan Chen
Xiang Chen
Yong Wang
Ningyu Zhang
Shumin Deng
Huajun Chen
34
2
0
25 Feb 2021
Generating Human Readable Transcript for Automatic Speech Recognition with Pre-trained Language Model
Junwei Liao
Yu Shi
Ming Gong
Linjun Shou
Sefik Emre Eskimez
Liyang Lu
Hong Qu
Michael Zeng
17
9
0
22 Feb 2021
LogME: Practical Assessment of Pre-trained Models for Transfer Learning
Kaichao You
Yong Liu
Jianmin Wang
Mingsheng Long
16
178
0
22 Feb 2021
Better Call the Plumber: Orchestrating Dynamic Information Extraction Pipelines
M. Y. Jaradeh
Kuldeep Singh
M. Stocker
A. Both
Sören Auer
9
7
0
22 Feb 2021
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu
Alessandro Moschitti
14
5
0
20 Feb 2021
Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
Benfeng Xu
Quan Wang
Yajuan Lyu
Yong Zhu
Zhendong Mao
22
166
0
20 Feb 2021
Scaling up DNA digital data storage by efficiently predicting DNA hybridisation using deep learning
David Buterez
14
7
0
19 Feb 2021
Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach
Anshul Wadhawan
Akshita Aggarwal
14
29
0
19 Feb 2021
MUDES: Multilingual Detection of Offensive Spans
Tharindu Ranasinghe
Marcos Zampieri
19
41
0
18 Feb 2021
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao Xiao
Zheng Liu
Yingxia Shao
Tao Di
Xing Xie
VLM
AIFin
119
41
0
18 Feb 2021
Open-Retrieval Conversational Machine Reading
Yifan Gao
Jingjing Li
Chien-Sheng Wu
M. Lyu
Irwin King
40
17
0
17 Feb 2021
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering
Abhilasha Ravichander
Siddharth Dalmia
Maria Ryskina
Florian Metze
Eduard H. Hovy
A. Black
ELM
21
32
0
16 Feb 2021