BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
    VLM
    SSL
    SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 15,000 papers shown
Plato Dialogue System: A Flexible Conversational AI Research Platform
Alexandros Papangelis
Mahdi Namazifar
Chandra Khatri
Yi-Chia Wang
Piero Molino
Gökhan Tür
LLMAG
21
23
0
17 Jan 2020
A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings
G. R. Claramunt
Rodrigo Agerri
German Rigau
27
7
0
17 Jan 2020
RobBERT: a Dutch RoBERTa-based Language Model
Pieter Delobelle
Thomas Winters
Bettina Berendt
10
232
0
17 Jan 2020
Graph-Bert: Only Attention is Needed for Learning Graph Representations
Jiawei Zhang
Haopeng Zhang
Congying Xia
Li Sun
23
297
0
15 Jan 2020
"Why is 'Chicago' deceptive?" Towards Building Model-Driven Tutorials for Humans
Vivian Lai
Han Liu
Chenhao Tan
24
138
0
14 Jan 2020
Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits
Han Guo
Ramakanth Pasunuru
Mohit Bansal
22
114
0
13 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
30
48
0
13 Jan 2020
Residual Attention Net for Superior Cross-Domain Time Sequence Modeling
Seth H. Huang
Lingjie Xu
Congwei Jiang
AI4TS
26
10
0
13 Jan 2020
Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and On-Device Inference
Changhao Chen
Peijun Zhao
Chris Xiaoxuan Lu
Wei Wang
Andrew Markham
A. Trigoni
21
112
0
13 Jan 2020
A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs
S. Fresca
Luca Dede'
Andrea Manzoni
AI4CE
17
258
0
12 Jan 2020
Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study
Jinlan Fu
Pengfei Liu
Qi Zhang
Xuanjing Huang
AI4CE
25
73
0
12 Jan 2020
Linking Social Media Posts to News with Siamese Transformers
Jacob Danovitch
18
2
0
10 Jan 2020
Multiplex Word Embeddings for Selectional Preference Acquisition
Hongming Zhang
Jiaxin Bai
Yan Song
Kun Xu
Changlong Yu
Yangqiu Song
Wilfred Ng
Dong Yu
11
17
0
09 Jan 2020
On Interpretability of Artificial Neural Networks: A Survey
Fenglei Fan
Jinjun Xiong
Mengzhou Li
Ge Wang
AAML
AI4CE
38
300
0
08 Jan 2020
An Exploration of Embodied Visual Exploration
Santhosh Kumar Ramakrishnan
Dinesh Jayaraman
Kristen Grauman
LM&Ro
27
98
0
07 Jan 2020
Attention over Parameters for Dialogue Systems
Andrea Madotto
Zhaojiang Lin
Chien-Sheng Wu
Jamin Shin
Pascale Fung
22
20
0
07 Jan 2020
Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data
E. Steinberg
Kenneth Jung
Jason Alan Fries
Conor K. Corbin
Stephen R. Pfohl
N. Shah
18
103
0
06 Jan 2020
Social Media Attributions in the Context of Water Crisis
Rupak Sarkar
Hirak Sarkar
Sayantan Mahinder
Ashiqur R. KhudaBukhsh
11
10
0
06 Jan 2020
A Survey on Machine Reading Comprehension Systems
Razieh Baradaran
Razieh Ghiasi
Hossein Amirkhani
FaML
11
85
0
06 Jan 2020
Stance Detection Benchmark: How Robust Is Your Stance Detection?
Benjamin Schiller
Johannes Daxenberger
Iryna Gurevych
11
95
0
06 Jan 2020
Computationally Efficient NER Taggers with Combined Embeddings and Constrained Decoding
Brian Lester
Daniel Pressel
Amy Hemmeter
Sagnik Ray Choudhury
14
3
0
05 Jan 2020
Empirical Studies on the Properties of Linear Regions in Deep Neural Networks
Xiao Zhang
Dongrui Wu
8
38
0
04 Jan 2020
Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation
Goran Glavas
Swapna Somasundaran
VLM
16
55
0
03 Jan 2020
TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising
Ziyi Yang
Chenguang Zhu
R. Gmyr
Michael Zeng
Xuedong Huang
Eric Darve
15
61
0
03 Jan 2020
A Deep Learning Approach to Diagnosing Multiple Sclerosis from Smartphone Data
Patrick Schwab
W. Karlen
28
24
0
02 Jan 2020
Stacked DeBERT: All Attention in Incomplete Data for Text Classification
Gwenaelle Cunha Sergio
Minho Lee
19
30
0
01 Jan 2020
Deep Attentive Ranking Networks for Learning to Order Sentences
Pawan Kumar
Dhanajit Brahma
H. Karnick
Piyush Rai
8
45
0
31 Dec 2019
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
17
300
0
31 Dec 2019
AraNet: A Deep Learning Toolkit for Arabic Social Media
Muhammad Abdul-Mageed
Chiyu Zhang
A. Hashemi
El Moatez Billah Nagoudi
GNN
8
32
0
30 Dec 2019
Semi-Supervised Learning with Normalizing Flows
Pavel Izmailov
Polina Kirichenko
Marc Finzi
A. Wilson
DRL
BDL
25
111
0
30 Dec 2019
AutoDiscern: Rating the Quality of Online Health Information with Hierarchical Encoder Attention-based Neural Networks
Laura Kinkead
Ahmed Allam
Michael Krauthammer
17
19
0
30 Dec 2019
Machine Learning from a Continuous Viewpoint
Weinan E
Chao Ma
Lei Wu
23
102
0
30 Dec 2019
Towards Deep Federated Defenses Against Malware in Cloud Ecosystems
Josh Payne
A. Kundu
FedML
15
10
0
27 Dec 2019
Encoding word order in complex embeddings
Benyou Wang
Donghao Zhao
Christina Lioma
Qiuchi Li
Peng Zhang
J. Simonsen
11
111
0
27 Dec 2019
Multi-Graph Transformer for Free-Hand Sketch Recognition
Peng-Tao Xu
Chaitanya K. Joshi
Xavier Bresson
ViT
17
85
0
24 Dec 2019
Probing the phonetic and phonological knowledge of tones in Mandarin TTS models
Jian Zhu
16
8
0
23 Dec 2019
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
A. Magassouba
K. Sugiura
Hisashi Kawai
38
9
0
23 Dec 2019
Harnessing Evolution of Multi-Turn Conversations for Effective Answer Retrieval
Mohammad Aliannejadi
Manajit Chakraborty
E. A. Ríssola
Fabio Crestani
14
48
0
22 Dec 2019
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
Wenhan Xiong
Jingfei Du
William Yang Wang
Veselin Stoyanov
SSL
KELM
36
201
0
20 Dec 2019
BERTje: A Dutch BERT Model
Wietse de Vries
Andreas van Cranenburgh
Arianna Bisazza
Tommaso Caselli
Gertjan van Noord
Malvina Nissim
VLM
SSeg
11
291
0
19 Dec 2019
Fashion Outfit Complementary Item Retrieval
Yen-Liang Lin
Son N. Tran
Larry S. Davis
8
84
0
19 Dec 2019
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Jingqing Zhang
Yao-Min Zhao
Mohammad Saleh
Peter J. Liu
RALM
3DGS
43
2,011
0
18 Dec 2019
Transfer learning in hybrid classical-quantum neural networks
A. Mari
T. Bromley
J. Izaac
Maria Schuld
N. Killoran
19
282
0
17 Dec 2019
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
14
868
0
17 Dec 2019
A Multi-task Learning Model for Chinese-oriented Aspect Polarity Classification and Aspect Term Extraction
Heng Yang
Biqing Zeng
Jianhao Yang
Youwei Song
Ruyang Xu
32
132
0
17 Dec 2019
Cross-Lingual Ability of Multilingual BERT: An Empirical Study
Karthikeyan K
Zihan Wang
Stephen D. Mayhew
Dan Roth
LRM
25
334
0
17 Dec 2019
Multilingual is not enough: BERT for Finnish
Antti Virtanen
Jenna Kanerva
Rami Ilo
Jouni Luoma
Juhani Luotolahti
T. Salakoski
Filip Ginter
S. Pyysalo
25
277
0
15 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
35
395
0
11 Dec 2019
CoSimLex: A Resource for Evaluating Graded Word Similarity in Context
C. S. Armendariz
Matthew Purver
Matej Ulčar
Senja Pollak
Nikola Ljubesic
Marko Robnik-Šikonja
Mark Granroth-Wilding
Kristiina Vaik
14
35
0
11 Dec 2019
Improving Neural Protein-Protein Interaction Extraction with Knowledge Selection
Huiwei Zhou
Xuefei Li
Weihong Yao
Zhuang Liu
Shixian Ning
Chengkun Lang
Lei Du
17
7
0
11 Dec 2019