BERT: Pre-training of Deep Bidirectional Transformers for Language
  Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 33,017 papers shown
Title
How Much Does Attention Actually Attend? Questioning the Importance of
  Attention in Pretrained Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
201
31
0
07 Nov 2022
Exploiting Transformer-based Multitask Learning for the Detection of
  Media Bias in News Articles
iConference (iConference), 2022
Timo Spinde
Jan-David Krieger
Terry Ruas
Jelena Mitrović
Franz Götz-Hahn
Akiko Aizawa
Bela Gipp
177
29
0
07 Nov 2022
Predictive Coding beyond Gaussian Distributions
Neural Information Processing Systems (NeurIPS), 2022
Luca Pinchetti
Tommaso Salvatori
Yordan Yordanov
Beren Millidge
Yuhang Song
Thomas Lukasiewicz
UQCV, BDL
210
17
0
07 Nov 2022
Generative Transformers for Design Concept Generation
Journal of Computing and Information Science in Engineering (JCISE), 2022
Qihao Zhu
Jianxi Luo
AI4CE
165
64
0
07 Nov 2022
Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC
Ze Chen
Kangxu Wang
Zijian Cai
Jiewen Zheng
Jiarong He
Max Gao
Jason Zhang
MoE
131
3
0
07 Nov 2022
NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual
  Question Answering
Natural Language Processing and Chinese Computing (NLPCC), 2022
Tengxun Zhang
Hongfei Xu
Josef van Genabith
Deyi Xiong
Hongying Zan
AIMat, LRM
231
8
0
07 Nov 2022
FIXED: Frustratingly Easy Domain Generalization with Mixup
Wang Lu
Yongfeng Zhang
Han Yu
Lei Huang
Xiang Zhang
Yiqiang Chen
Xingxu Xie
251
12
0
07 Nov 2022
Contrastive Learning with Prompt-derived Virtual Semantic Prototypes for
  Unsupervised Sentence Embedding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jiali Zeng
Yongjing Yin
Yu Jiang
Shuangzhi Wu
Yunbo Cao
SSL
120
13
0
07 Nov 2022
Contrastive Learning enhanced Author-Style Headline Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Hui Liu
Weidong Guo
Yige Chen
Xiangyang Li
132
5
0
07 Nov 2022
AD-BERT: Using Pre-trained contextualized embeddings to Predict the
  Progression from Mild Cognitive Impairment to Alzheimer's Disease
Chengsheng Mao
Jie Xu
Luke Rasmussen
Yikuan Li
P. Adekkanattu
...
R. Vassar
Guoqian Jiang
Fei Wang
Jyotishman Pathak
Yuan Luo
154
6
0
07 Nov 2022
Complex Reading Comprehension Through Question Decomposition
Australasian Language Technology Association Workshop (ALTA), 2022
Xiao-Yu Guo
Yuan-Fang Li
Gholamreza Haffari
ReLM
123
11
0
07 Nov 2022
Reconciliation of Pre-trained Models and Prototypical Neural Networks in
  Few-shot Named Entity Recognition
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Youcheng Huang
Wenqiang Lei
Jie Fu
Jiancheng Lv
137
3
0
07 Nov 2022
Prompter: Utilizing Large Language Model Prompting for a Data Efficient
  Embodied Instruction Following
Y. Inoue
Hiroki Ohashi
LM&Ro
150
50
0
07 Nov 2022
AfroLM: A Self-Active Learning-based Multilingual Pretrained Language
  Model for 23 African Languages
Bonaventure F. P. Dossou
A. Tonja
Oreen Yousuf
Salomey Osei
Abigail Oppong
Iyanuoluwa Shode
Oluwabusayo Olufunke Awoyomi
Chris C. Emezue
165
62
0
07 Nov 2022
Zero-Shot Classification by Logical Reasoning on Natural Language
  Explanations
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Chi Han
Hengzhi Pei
Xinya Du
Heng Ji
NAI
186
4
0
07 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
International Conference on Learning Representations (ICLR), 2022
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
245
115
0
07 Nov 2022
Computing and Exploiting Document Structure to Improve Unsupervised
  Extractive Summarization of Legal Case Decisions
Yang Zhong
Diane Litman
210
10
0
06 Nov 2022
On the Domain Adaptation and Generalization of Pretrained Language
  Models: A Survey
Xu Guo
Han Yu
LM&MA, VLM
291
33
0
06 Nov 2022
Noisy Channel for Automatic Text Simplification
Oscar M. Cumbicus-Pineda
Iker Gutiérrez-Fandino
Itziar Gonzalez-Dios
Aitor Soroa Etxabe
148
0
0
06 Nov 2022
A Survey on Influence Maximization: From an ML-Based Combinatorial
  Optimization
ACM Transactions on Knowledge Discovery from Data (TKDD), 2022
Yandi Li
Haobo Gao
Yunxuan Gao
Jianxiong Guo
Weili Wu
264
62
0
06 Nov 2022
Improved Target-specific Stance Detection on Social Media Platforms by
  Delving into Conversation Threads
IEEE Transactions on Computational Social Systems (IEEE TCSS), 2022
Yupeng Li
Haorui He
Shaonan Wang
F. Lau
Yunya Song
165
22
0
06 Nov 2022
Suffix Retrieval-Augmented Language Modeling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zecheng Wang
Yik-Cheung Tam
RALM
177
2
0
06 Nov 2022
Wall Street Tree Search: Risk-Aware Planning for Offline Reinforcement
  Learning
D. Elbaz
Gal Novik
Oren Salzman
OffRL
273
0
0
06 Nov 2022
Knowledge is Power: Understanding Causality Makes Legal judgment
  Prediction Models More Generalizable and Robust
Haotian Chen
Lingwei Zhang
Yiran Liu
Fanchao Chen
Yang Yu
AILaw, ELM
189
6
0
06 Nov 2022
Tuning Language Models as Training Data Generators for
  Augmentation-Enhanced Few-Shot Learning
International Conference on Machine Learning (ICML), 2022
Yu Meng
Martin Michalski
Jiaxin Huang
Yu Zhang
Tarek Abdelzaher
Jiawei Han
VLM
216
54
0
06 Nov 2022
Calibration Meets Explanation: A Simple and Effective Approach for Model
  Confidence Estimates
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Dongfang Li
Baotian Hu
Qingcai Chen
119
10
0
06 Nov 2022
Prompt-based Text Entailment for Low-Resource Named Entity Recognition
International Conference on Computational Linguistics (COLING), 2022
Dongfang Li
Baotian Hu
Qingcai Chen
165
7
0
06 Nov 2022
Bridging Speech and Textual Pre-trained Models with Unsupervised ASR
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jiatong Shi
Chan-Jan Hsu
Ho-Lam Chung
Dongji Gao
Leibny Paola García-Perera
Shinji Watanabe
Ann Lee
Hung-yi Lee
153
13
0
06 Nov 2022
Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source
  Localization
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Dennis Fedorishin
D. Mohan
Bhavin Jawade
S. Setlur
V. Govindaraju
VGen
167
14
0
06 Nov 2022
Robust Lottery Tickets for Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Rui Zheng
Rong Bao
Yuhao Zhou
Di Liang
Sirui Wang
Wei Wu
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
201
20
0
06 Nov 2022
Event and Entity Extraction from Generated Video Captions
International Cross-Domain Conference on Machine Learning and Knowledge Extraction (CD-MAKE), 2022
Johannes Scherer
A. Scherp
Deepayan Bhowmik
186
0
0
05 Nov 2022
Learning to Infer from Unlabeled Data: A Semi-supervised Learning
  Approach for Robust Natural Language Inference
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Mobashir Sadat
Cornelia Caragea
125
3
0
05 Nov 2022
Privacy-Preserving Models for Legal Natural Language Processing
Ying Yin
Ivan Habernal
PILM, AILaw
160
8
0
05 Nov 2022
The Legal Argument Reasoning Task in Civil Procedure
Leonard Bongard
Lena Held
Ivan Habernal
AILaw, ELM
130
22
0
05 Nov 2022
Tri-Attention: Explicit Context-Aware Attention Mechanism for Natural
  Language Processing
Rui Yu
Yifeng Li
Sibo Wei
LongBing Cao
124
1
0
05 Nov 2022
HERB: Measuring Hierarchical Regional Bias in Pre-trained Language
  Models
Yi Zhou
Ge Zhang
Bohao Yang
Chenghua Lin
Shi Wang
Anton Ragni
Jie Fu
128
10
0
05 Nov 2022
Textual Manifold-based Defense Against Natural Language Adversarial
  Examples
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
D. M. Nguyen
Anh Tuan Luu
AAML
249
24
0
05 Nov 2022
BEKG: A Built Environment Knowledge Graph
Xiaojun Yang
Haoyu Zhong
Penglin Du
Keyi Zhou
Xingjin Lai
Zhengdong Wang
Yik Lun Lau
Yangqiu Song
Liyaning Tang
133
3
0
05 Nov 2022
Inductive Graph Transformer for Delivery Time Estimation
Web Search and Data Mining (WSDM), 2022
Xin Zhou
Jinglong Wang
Yong Liu
Xin Wu
Zhiqi Shen
Cyril Leung
87
20
0
05 Nov 2022
Coarse-to-fine Knowledge Graph Domain Adaptation based on
  Distantly-supervised Iterative Training
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022
Homgmin Cai
Wenxiong Liao
Zheng Liu
Yiyang Zhang
Xiaoke Huang
...
Lingfei Wu
Ninghao Liu
Shijie Zhao
Tianming Liu
Xiang Li
146
22
0
05 Nov 2022
EventEA: Benchmarking Entity Alignment for Event-centric Knowledge
  Graphs
Xiaobin Tian
Zequn Sun
Guang-pu Li
Wei Hu
146
1
0
05 Nov 2022
PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze
  Pre-training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zihui Gu
Ju Fan
Nan Tang
Preslav Nakov
Xiaoman Zhao
Xiaoyong Du
LMTD
188
54
0
05 Nov 2022
Evaluation of Automated Speech Recognition Systems for Conversational
  Speech: A Linguistic Perspective
H. Pasandi
Haniyeh B. Pasandi
174
1
0
05 Nov 2022
Hierarchical Multi-Label Classification of Scientific Documents
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Mobashir Sadat
Cornelia Caragea
125
23
0
05 Nov 2022
Forecasting User Interests Through Topic Tag Predictions in Online
  Health Communities
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2022
A. Adishesha
Lily Jakielaszek
Fariha Azhar
Peixuan Zhang
Vasant Honavar
Fenglong Ma
C. Belani
P. Mitra
Sharon X. Huang
42
4
0
05 Nov 2022
KGLM: Integrating Knowledge Graph Structure in Language Models for Link
  Prediction
Jason Youn
I. Tagkopoulos
KELM
245
31
0
04 Nov 2022
Intriguing Properties of Compression on Multilingual Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kelechi Ogueji
Orevaoghene Ahia
Gbemileke Onilude
Sebastian Gehrmann
Sara Hooker
Julia Kreutzer
285
15
0
04 Nov 2022
GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior
  Modeling Generalization
Neural Information Processing Systems (NeurIPS), 2022
Xuhai Xu
Han Zhang
Yasaman S. Sefidgar
Yiyi Ren
Xin Liu
...
Tim Althoff
Margaret E. Morris
E. Riskin
Jennifer Mankoff
A. Dey
153
51
0
04 Nov 2022
1Cademy @ Causal News Corpus 2022: Leveraging Self-Training in Causality
  Classification of Socio-Political Event Data
CASE (CASE), 2022
A. Nik
Ge Zhang
Xingran Chen
Mingyu Li
Jie Fu
128
4
0
04 Nov 2022
Resource-Efficient Transfer Learning From Speech Foundation Model Using
  Hierarchical Feature Fusion
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Zhouyuan Huo
K. Sim
Yue Liu
DongSeon Hwang
Tara N. Sainath
Trevor Strohman
143
8
0
04 Nov 2022