RoBERTa: A Robustly Optimized BERT Pretraining Approach

26 July 2019
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
    AIMat

Papers citing "RoBERTa: A Robustly Optimized BERT Pretraining Approach"

50 / 3,476 papers shown
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance
Gagan Bansal
Tongshuang Wu
Joyce Zhou
Raymond Fok
Besmira Nushi
Ece Kamar
Marco Tulio Ribeiro
Daniel S. Weld
23
577
0
26 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
13
45
0
22 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
28
72
0
16 Jun 2020
On the Computational Power of Transformers and its Implications in Sequence Modeling
S. Bhattamishra
Arkil Patel
Navin Goyal
25
63
0
16 Jun 2020
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
Eyal Ben-David
Carmel Rabinovitz
Roi Reichart
SSL
50
61
0
16 Jun 2020
Minimum Width for Universal Approximation
Sejun Park
Chulhee Yun
Jaeho Lee
Jinwoo Shin
25
121
0
16 Jun 2020
To Pretrain or Not to Pretrain: Examining the Benefits of Pretraining on Resource Rich Tasks
Sinong Wang
Madian Khabsa
Hao Ma
16
26
0
15 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
31
1,586
0
15 Jun 2020
SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020)
Marcos Zampieri
Preslav Nakov
Sara Rosenthal
Pepa Atanasova
Georgi Karadzhov
Hamdy Mubarak
Leon Derczynski
Zeses Pitenis
Çağrı Çöltekin
19
481
0
12 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
19
432
0
11 Jun 2020
Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge
Alon Talmor
Oyvind Tafjord
Peter Clark
Yoav Goldberg
Jonathan Berant
ReLM
LRM
17
39
0
11 Jun 2020
CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP
Libo Qin
Minheng Ni
Yue Zhang
Wanxiang Che
30
149
0
11 Jun 2020
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
Pedro Ortiz Suarez
Laurent Romary
Benoît Sagot
15
227
0
11 Jun 2020
Report from the NSF Future Directions Workshop, Toward User-Oriented Agents: Research Directions and Challenges
M. Eskénazi
Tiancheng Zhao
LLMAG
AI4TS
AI4CE
36
9
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
30
441
0
10 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
58
1,645
0
08 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
40
98
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
62
2,614
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
26
229
0
05 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
22
72
0
31 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
15
39,958
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
31
33
0
27 May 2020
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Kyle Swanson
L. Yu
Tao Lei
OT
24
37
0
27 May 2020
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk
Vitaliy Atrasevych
Artem Chernodub
Oleksandr Skurzhanskyi
6
304
0
26 May 2020
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Saif M. Mohammad
17
312
0
25 May 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
29
79
0
24 May 2020
BERTweet: A pre-trained language model for English Tweets
Dat Quoc Nguyen
Thanh Vu
A. Nguyen
VLM
9
900
0
20 May 2020
SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search
Tom Hope
Jason Portenoy
Kishore Vasan
Jon Borchardt
Eric Horvitz
Daniel S. Weld
Marti A. Hearst
Jevin D. West
FedML
13
58
0
20 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
25
30
0
20 May 2020
Human Instruction-Following with Deep Reinforcement Learning via Transfer-Learning from Text
Felix Hill
Soňa Mokrá
Nathaniel Wong
Tim Harley
LM&Ro
11
81
0
19 May 2020
Table Search Using a Deep Contextualized Language Model
Zhiyu Zoey Chen
M. Trabelsi
J. Heflin
Yinan Xu
Brian D. Davison
LMTD
18
56
0
19 May 2020
Quantifying the Uncertainty of Precision Estimates for Rule based Text Classifiers
J. Nutaro
Özgür Özmen
16
0
0
19 May 2020
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
8
316
0
18 May 2020
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
Sam Coope
Tyler Farghly
D. Gerz
Ivan Vulić
Matthew Henderson
19
62
0
18 May 2020
T-VSE: Transformer-Based Visual Semantic Embedding
M. Bastan
Arnau Ramisa
Mehmet Tek
ViT
16
7
0
17 May 2020
ApplicaAI at SemEval-2020 Task 11: On RoBERTa-CRF, Span CLS and Whether Self-Training Helps Them
Dawid Jurkiewicz
Łukasz Borchmann
Izabela Kosmala
Filip Graliński
6
39
0
16 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Victor Sanh
Thomas Wolf
Alexander M. Rush
13
466
0
15 May 2020
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter
Martin Müller
M. Salathé
P. Kummervold
VLM
MedIm
AI4MH
24
355
0
15 May 2020
That is a Known Lie: Detecting Previously Fact-Checked Claims
Shaden Shaar
Giovanni Da San Martino
Nikolay Babulkov
Preslav Nakov
HILM
44
152
0
12 May 2020
A Report on the 2020 Sarcasm Detection Shared Task
Debanjan Ghosh
Avijit Vajpayee
Smaranda Muresan
16
59
0
12 May 2020
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge
Hongming Zhang
Xinran Zhao
Yangqiu Song
16
54
0
12 May 2020
On the Robustness of Language Encoders against Grammatical Errors
Fan Yin
Quanyu Long
Tao Meng
Kai-Wei Chang
31
33
0
12 May 2020
SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching
Baolin Peng
Chunyuan Li
Jinchao Li
Shahin Shayandeh
Lars Liden
Jianfeng Gao
25
125
0
11 May 2020
Commonsense Evidence Generation and Injection in Reading Comprehension
Ye Liu
Tao Yang
Zeyu You
Wei Fan
Philip S. Yu
25
14
0
11 May 2020
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar S. Karnin
23
30
0
09 May 2020
Temporal Common Sense Acquisition with Minimal Supervision
Ben Zhou
Qiang Ning
Daniel Khashabi
Dan Roth
19
92
0
08 May 2020
Evidence Inference 2.0: More Data, Better Models
Jay DeYoung
Eric P. Lehman
Benjamin E. Nye
Iain J. Marshall
Byron C. Wallace
9
68
0
08 May 2020
GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference
Ali Hadi Zadeh
Isak Edo
Omar Mohamed Awad
Andreas Moshovos
MQ
19
183
0
08 May 2020
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
Yang Gao
Wei-Ye Zhao
Steffen Eger
ELM
6
124
0
07 May 2020
To Test Machine Comprehension, Start by Defining Comprehension
Jesse Dunietz
Greg Burnham
Akash Bharadwaj
Owen Rambow
Jennifer Chu-Carroll
D. Ferrucci
FaML
52
64
0
04 May 2020