XLNet: Generalized Autoregressive Pretraining for Language Understanding

Neural Information Processing Systems (NeurIPS), 2019
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,732 papers shown
Multilingual Question Answering from Formatted Text applied to Conversational Agents
W. Siblini
Charlotte Pasqual
Axel Lavielle
Mohamed Challal
Cyril Cauchois
202
21
0
10 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
Conference on Machine Learning and Systems (MLSys), 2019
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
301
124
0
09 Oct 2019
Domain-Relevant Embeddings for Medical Question Similarity
Clara H. McCreery
Namit Katariya
A. Kannan
Manish Chablani
X. Amatriain
174
9
0
09 Oct 2019
HuggingFace's Transformers: State-of-the-art Natural Language Processing
Thomas Wolf
Lysandre Debut
Victor Sanh
Julien Chaumond
Clement Delangue
...
Teven Le Scao
Sylvain Gugger
Mariama Drame
Quentin Lhoest
Alexander M. Rush
AI4CE
443
3,286
0
09 Oct 2019
Knowledge Distillation from Internal Representations
AAAI Conference on Artificial Intelligence (AAAI), 2019
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
320
196
0
08 Oct 2019
Read, Highlight and Summarize: A Hierarchical Neural Semantic Encoder-based Approach
Rajeev Bhatt Ambati
Saptarashmi Bandyopadhyay
P. Mitra
61
0
0
08 Oct 2019
BERT for Evidence Retrieval and Claim Verification
European Conference on Information Retrieval (ECIR), 2019
Shrishti Saha Shetu
Christof Monz
E. Mabande
RALM
156
141
0
07 Oct 2019
MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
Pattern Recognition (Pattern Recognit.), 2019
Ning Lu
Wenwen Yu
Xianbiao Qi
Yihao Chen
Ping Gong
Rong Xiao
Xiang Bai
230
175
0
07 Oct 2019
Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data
Subhabrata Mukherjee
Ahmed Hassan Awadallah
189
25
0
04 Oct 2019
Cracking the Contextual Commonsense Code: Understanding Commonsense Reasoning Aptitude of Deep Contextual Representations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Jeff Da
Jungo Kasai
LRM
172
41
0
02 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
2.5K
9,006
0
02 Oct 2019
SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders
Peter J. Liu
Yu-An Chung
Jie Jessie Ren
222
20
0
02 Oct 2019
Exploiting BERT for End-to-End Aspect-based Sentiment Analysis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Xin Li
Lidong Bing
Wenxuan Zhang
W. Lam
226
311
0
02 Oct 2019
State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions
Automatic Speech Recognition & Understanding (ASRU), 2019
Kyu Jeong Han
R. Prieto
Kaixing (Kai) Wu
T. Ma
290
77
0
01 Oct 2019
Better Document-Level Machine Translation with Bayes' Rule
Lei Yu
Laurent Sartran
Wojciech Stokowiec
Wang Ling
Lingpeng Kong
Phil Blunsom
Chris Dyer
188
7
0
01 Oct 2019
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension
AAAI Conference on Artificial Intelligence (AAAI), 2019
Di Jin
Shuyang Gao
Jiun-Yu Kao
Tagyoung Chung
Dilek Z. Hakkani-Tür
235
72
0
01 Oct 2019
TMLab: Generative Enhanced Model (GEM) for adversarial attacks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
P. Niewinski
M. Pszona
M. Janicka
VLM, GAN
146
19
0
01 Oct 2019
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Ashok Thillaisundaram
Theodosia Togia
115
17
0
26 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
International Conference on Learning Representations (ICLR), 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat
1.2K
7,166
0
26 Sep 2019
Aspect and Opinion Term Extraction for Hotel Reviews using Transfer Learning and Auxiliary Labels
Yosef Ardhito Winatmoko
Ali Akbar Septiandri
Arie Pratama Sutiono
203
4
0
26 Sep 2019
Pre-train, Interact, Fine-tune: A Novel Interaction Representation for Text Classification
Information Processing & Management (IPM), 2019
Jianming Zheng
Fei Cai
Honghui Chen
Maarten de Rijke
99
23
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
International Conference on Learning Representations (ICLR), 2019
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
686
492
0
25 Sep 2019
UNITER: UNiversal Image-TExt Representation Learning
European Conference on Computer Vision (ECCV), 2019
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM, OT
374
465
0
25 Sep 2019
Extremely Small BERT Models from Mixed-Vocabulary Training
Sanqiang Zhao
Raghav Gupta
Yang Song
Denny Zhou
VLM
239
54
0
25 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
International Conference on Learning Representations (ICLR), 2019
Angela Fan
Edouard Grave
Armand Joulin
633
662
0
25 Sep 2019
Multi-task Batch Reinforcement Learning with Metric Learning
Jiachen Li
Q. Vuong
Shuang Liu
Minghua Liu
K. Ciosek
George Andriopoulos
Henrik I. Christensen
H. Su
OffRL
315
2
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
International Conference on Learning Representations (ICLR), 2019
Cheolhyoung Lee
Dong Wang
Wanmo Kang
MoE
506
228
0
25 Sep 2019
Understanding Semantics from Speech Through Pre-training
P. Wang
Liangchen Wei
Yong Cao
Jinghui Xie
Yuji Cao
Zaiqing Nie
SSL, VLM
106
6
0
24 Sep 2019
Technical report on Conversational Question Answering
Yingnan Ju
Fubang Zhao
Shijie Chen
Bowen Zheng
Xuefeng Yang
Yunfeng Liu
115
50
0
24 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
275
280
0
23 Sep 2019
Cross-Lingual Natural Language Generation via Pre-Training
AAAI Conference on Artificial Intelligence (AAAI), 2019
Zewen Chi
Li Dong
Furu Wei
Wenhui Wang
Xian-Ling Mao
Heyan Huang
238
141
0
23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Conference on Natural Language Processing (NLP), 2019
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
269
191
0
23 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Findings (Findings), 2019
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
635
2,164
0
23 Sep 2019
Teaching Pretrained Models with Commonsense Reasoning: A Preliminary KB-Based Approach
Shiyang Li
Jianshu Chen
Dian Yu
ReLM, LRM
172
21
0
20 Sep 2019
Representation Learning for Electronic Health Records
W. Weng
Peter Szolovits
170
21
0
19 Sep 2019
ASU at TextGraphs 2019 Shared Task: Explanation ReGeneration using Language Models and Iterative Re-Ranking
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Pratyay Banerjee
LRM
112
21
0
19 Sep 2019
Summary Level Training of Sentence Rewriting for Abstractive Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Sanghwan Bae
Taeuk Kim
Jihoon Kim
Sang-goo Lee
152
72
0
19 Sep 2019
Cross-Lingual Contextual Word Embeddings Mapping With Multi-Sense Words In Mind
Zheng Zhang
Ruiqing Yin
Jun Zhu
Pierre Zweigenbaum
110
4
0
18 Sep 2019
Language models and Automated Essay Scoring
Pedro Uría Rodríguez
Amir Jafari
C. Ormerod
185
108
0
18 Sep 2019
Extractive Summarization of Long Documents by Combining Global and Local Context
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Wen Xiao
Giuseppe Carenini
218
160
0
17 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
1.3K
2,442
0
17 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
AAAI Conference on Artificial Intelligence (AAAI), 2019
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
620
868
0
17 Sep 2019
I-MAD: Interpretable Malware Detector Using Galaxy Transformer
Computers & Security (Comput. Secur.), 2019
Miles Q. Li
Benjamin C. M. Fung
P. Charland
Steven H. H. Ding
311
39
0
15 Sep 2019
Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Neural Information Processing Systems (NeurIPS), 2019
Sawyer Birnbaum
Volodymyr Kuleshov
S. Enam
Pang Wei Koh
Stefano Ermon
AI4TS
247
84
0
14 Sep 2019
SANVis: Visual Analytics for Understanding Self-Attention Networks
Visual .. (VISUAL), 2019
Cheonbok Park
Inyoup Na
Yongjang Jo
Sungbok Shin
J. Yoo
Bum Chul Kwon
Jian Zhao
Hyungjong Noh
Yeonsoo Lee
Jaegul Choo
HAI
182
41
0
13 Sep 2019
Frustratingly Easy Natural Question Answering
Lin Pan
Rishav Chakravarti
Anthony Ferritto
Michael R. Glass
A. Gliozzo
Salim Roukos
Radu Florian
Avirup Sil
162
14
0
11 Sep 2019
Comprehensive Analysis of Aspect Term Extraction Methods using Various Text Embeddings
Computer Speech and Language (CSL), 2019
Lukasz Augustyniak
Tomasz Kajdanowicz
Przemyslaw Kazienko
113
45
0
11 Sep 2019
Question Generation by Transformers
Kettip Kriangchaivech
A. Wangperawong
175
30
0
09 Sep 2019
Span Selection Pre-training for Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
294
74
0
09 Sep 2019
Forecaster: A Graph Transformer for Forecasting Spatial and Time-Dependent Data
European Conference on Artificial Intelligence (ECAI), 2019
Yongqian Li
J. M. F. Moura
AI4TS
236
39
0
09 Sep 2019