ResearchTrend.AI

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

11 October 2018
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg

Papers citing "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"

50 / 33,018 papers shown
Understanding Chat Messages for Sticker Recommendation in Messaging Apps
Abhishek Laddha
Mohamed Hanoosh
Debdoot Mukherjee
Parth Patwa
Ankur Narang
158
18
0
07 Feb 2019
Compression of Recurrent Neural Networks for Efficient Language Modeling
Artem M. Grachev
D. Ignatov
Andrey V. Savchenko
151
42
0
06 Feb 2019
End-to-End Open-Domain Question Answering with BERTserini
Wei Yang
Yuqing Xie
Aileen Lin
Xingyu Li
Luchen Tan
Kun Xiong
Ming Li
Jimmy J. Lin
RALM
258
509
0
05 Feb 2019
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution
Fei Liu
Luke Zettlemoyer
Jacob Eisenstein
203
18
0
05 Feb 2019
An Argument-Marker Model for Syntax-Agnostic Proto-Role Labeling
Juri Opitz
Anette Frank
143
7
0
04 Feb 2019
Attention in Natural Language Processing
Andrea Galassi
Marco Lippi
Paolo Torroni
GNN
364
548
0
04 Feb 2019
Exploring Temporal Dependencies in Multimodal Referring Expressions with Mixed Reality
E. Sibirtseva
Ali Ghadirzadeh
Iolanda Leite
Mårten Björkman
Danica Kragic
107
4
0
04 Feb 2019
A Comprehensive Exploration on WikiSQL with Table-Aware Word Contextualization
Wonseok Hwang
Ji-Yoon Yim
Seunghyun Park
Minjoon Seo
271
259
0
04 Feb 2019
Extracting Multiple-Relations in One-Pass with Pre-Trained Transformers
Haoyu Wang
Ming Tan
Mo Yu
Shiyu Chang
Dakuo Wang
Kun Xu
Xiaoxiao Guo
Saloni Potdar
ViT
187
102
0
04 Feb 2019
Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis
Katsuhiko Ishiguro
S. Maeda
Masanori Koyama
GNN
175
33
0
04 Feb 2019
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
R. Thomas McCoy
Ellie Pavlick
Tal Linzen
685
1,323
0
04 Feb 2019
Improving Question Answering with External Knowledge
Xiaoman Pan
Kai Sun
Dian Yu
Jianshu Chen
Heng Ji
Claire Cardie
Dong Yu
KELM
197
66
0
03 Feb 2019
Incremental Learning with Maximum Entropy Regularization: Rethinking Forgetting and Intransigence
Dahyun Kim
Jihwan Bae
Yeonsik Jo
Jonghyun Choi
OOD, CLL
111
22
0
03 Feb 2019
Review Conversational Reading Comprehension
Hu Xu
Bing-Quan Liu
Lei Shu
Philip S. Yu
169
18
0
03 Feb 2019
Parameter-Efficient Transfer Learning for NLP
International Conference on Machine Learning (ICML), 2019
N. Houlsby
A. Giurgiu
Stanislaw Jastrzebski
Bruna Morrone
Quentin de Laroussilhe
Andrea Gesmundo
Mona Attariyan
Sylvain Gelly
597
5,551
0
02 Feb 2019
A Multi-Resolution Word Embedding for Document Retrieval from Large Unstructured Knowledge Bases
Tolgahan Cakaloglu
Xiaowei Xu
RALM
260
5
0
02 Feb 2019
tax2vec: Constructing Interpretable Features from Taxonomies for Short Text Classification
Computer Speech and Language (CSL), 2019
Blaž Škrlj
Matej Martinc
Jan Kralj
Nada Lavrac
Senja Pollak
247
46
0
01 Feb 2019
Compressing Gradient Optimizers via Count-Sketches
International Conference on Machine Learning (ICML), 2019
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
132
37
0
01 Feb 2019
The Second Conversational Intelligence Challenge (ConvAI2)
Emily Dinan
V. Logacheva
Valentin Malykh
Alexander H. Miller
Kurt Shuster
...
Alexander I. Rudnicky
Jason Williams
Joelle Pineau
Andrey Kravchenko
Jason Weston
DRL
210
383
0
31 Jan 2019
Multi-Task Deep Neural Networks for Natural Language Understanding
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
AI4CE
272
1,323
0
31 Jan 2019
Learning Taxonomies of Concepts and not Words using Contextualized Word Representations: A Position Paper
Lukas Schmelzeisen
Steffen Staab
103
3
0
31 Jan 2019
A large-scale crowdsourced analysis of abuse against women journalists and politicians on Twitter
Laure Delisle
Freddie Kalaitzis
Krzysztof Majewski
A. D. Berker
M. Marin
Julien Cornebise
108
32
0
31 Jan 2019
Learning and Evaluating General Linguistic Intelligence
Dani Yogatama
Cyprien de Masson d'Autume
Jerome T. Connor
Tomás Kociský
Mike Chrzanowski
...
Angeliki Lazaridou
Wang Ling
Lei Yu
Chris Dyer
Phil Blunsom
ELM, AI4CE
324
217
0
31 Jan 2019
EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Jason W. Wei
Kai Zou
654
2,182
0
31 Jan 2019
Memory-Efficient Adaptive Optimization
Neural Information Processing Systems (NeurIPS), 2019
Rohan Anil
Vineet Gupta
Tomer Koren
Y. Singer
ODL
163
49
0
30 Jan 2019
The Evolved Transformer
International Conference on Machine Learning (ICML), 2019
David R. So
Chen Liang
Quoc V. Le
ViT
477
487
0
30 Jan 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
202
74
0
30 Jan 2019
Glyce: Glyph-vectors for Chinese Character Representations
Yuxian Meng
Wei Wu
Fei Wang
Xiaoya Li
Ping Nie
J. Mei
Muyu Li
Qinghong Han
Xiaofei Sun
Jiwei Li
VLM
420
205
0
29 Jan 2019
Evaluating Word Embedding Models: Methods and Experimental Results
Bin Wang
Angela Wang
Fenxiao Chen
Yun Cheng Wang
C.-C. Jay Kuo
ELM
194
293
0
28 Jan 2019
Language Independent Sequence Labelling for Opinion Target Extraction
Rodrigo Agerri
German Rigau
93
23
0
28 Jan 2019
Stiffness: A New Perspective on Generalization in Neural Networks
Stanislav Fort
Pawel Krzysztof Nowak
Stanislaw Jastrzebski
S. Narayanan
248
105
0
28 Jan 2019
Dual Co-Matching Network for Multi-choice Reading Comprehension
Shuailiang Zhang
Zhao Hai
Yuwei Wu
Zhuosheng Zhang
Xi Zhou
Xiaoping Zhou
274
134
0
27 Jan 2019
Context in Neural Machine Translation: A Review of Models and Evaluations
Andrei Popescu-Belis
MedIm
146
32
0
25 Jan 2019
Deep Learning on Small Datasets without Pre-Training using Cosine Loss
Björn Barz
Joachim Denzler
170
144
0
25 Jan 2019
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Jinhyuk Lee
Wonjin Yoon
Sungdong Kim
Donghyeon Kim
Sunkyu Kim
Chan Ho So
Jaewoo Kang
OOD
1.4K
6,562
0
25 Jan 2019
A BERT Baseline for the Natural Questions
Chris Alberti
Kenton Lee
Michael Collins
ELM, AI4MH
196
129
0
24 Jan 2019
Large-Batch Training for LSTM and Beyond
Yang You
Jonathan Hseu
Chris Ying
J. Demmel
Kurt Keutzer
Cho-Jui Hsieh
192
95
0
24 Jan 2019
TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
Thomas Wolf
Victor Sanh
Julien Chaumond
Clement Delangue
269
510
0
23 Jan 2019
A Question-Entailment Approach to Question Answering
Asma Ben Abacha
Dina Demner-Fushman
174
241
0
23 Jan 2019
Programmable Neural Network Trojan for Pre-Trained Feature Extractor
Yu Ji
Zixin Liu
Xing Hu
Peiqi Wang
Youhui Zhang
AAML
94
20
0
23 Jan 2019
Automated Essay Scoring based on Two-Stage Learning
Jiawei Liu
Yang Xu
Yaguang Zhu
94
68
0
23 Jan 2019
Deep learning and sub-word-unit approach in written art generation
K. Wołk
Emilia Zawadzka-Gosk
Wojciech Czarnowski
103
1
0
22 Jan 2019
Cross-lingual Language Model Pretraining
Guillaume Lample
Alexis Conneau
1.1K
2,884
0
22 Jan 2019
Spatial Broadcast Decoder: A Simple Architecture for Learning Disentangled Representations in VAEs
Nicholas Watters
Loic Matthey
Christopher P. Burgess
Alexander Lerchner
CoGe
371
181
0
21 Jan 2019
Mixed Formal Learning: A Path to Transparent Machine Learning
Sandra Carrico
AI4CE
42
1
0
20 Jan 2019
Physics-Constrained Deep Learning for High-dimensional Surrogate Modeling and Uncertainty Quantification without Labeled Data
Yinhao Zhu
N. Zabaras
P. Koutsourelakis
P. Perdikaris
PINN, AI4CE
324
954
0
18 Jan 2019
Learning from Dialogue after Deployment: Feed Yourself, Chatbot!
Braden Hancock
Antoine Bordes
Pierre-Emmanuel Mazaré
Jason Weston
478
210
0
16 Jan 2019
Assessing BERT's Syntactic Abilities
Yoav Goldberg
255
512
0
16 Jan 2019
Sentence transition matrix: An efficient approach that preserves sentence semantics
Myeongjun Jang
Pilsung Kang
74
3
0
16 Jan 2019
Investigating Antigram Behaviour using Distributional Semantics
Saptarshi Sengupta
69
0
0
15 Jan 2019