ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding


19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 1,069 papers shown
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
28
72
0
16 Jun 2020
Results of the seventh edition of the BioASQ Challenge
A. Nentidis
K. Bougiatiotis
Anastasia Krithara
G. Paliouras
21
62
0
16 Jun 2020
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
Eyal Ben-David
Carmel Rabinovitz
Roi Reichart
SSL
50
61
0
16 Jun 2020
Preserving Dynamic Attention for Long-Term Spatial-Temporal Prediction
Haoxing Lin
Rufan Bai
Weijia Jia
Xinyu Yang
Yongjian You
HAI
AI4TS
16
63
0
16 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
31
1,586
0
15 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
19
432
0
11 Jun 2020
ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation
Saeedreza Shehnepoor
R. Togneri
Wei Liu
Bennamoun
19
4
0
11 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
30
441
0
10 Jun 2020
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Mingjie Li
Fuyu Wang
Xiaojun Chang
Xiaodan Liang
MedIm
21
101
0
06 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
45
98
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
62
2,614
0
05 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
25
7
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
28
229
0
05 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
22
72
0
31 May 2020
Stance Prediction for Contemporary Issues: Data and Experiments
Marjan Hosseinia
Eduard Constantin Dragut
Arjun Mukherjee
14
28
0
29 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
15
39,958
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
31
33
0
27 May 2020
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk
Vitaliy Atrasevych
Artem Chernodub
Oleksandr Skurzhanskyi
8
304
0
26 May 2020
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Saif M. Mohammad
22
312
0
25 May 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
29
79
0
24 May 2020
What Makes for Good Views for Contrastive Learning?
Yonglong Tian
Chen Sun
Ben Poole
Dilip Krishnan
Cordelia Schmid
Phillip Isola
SSL
13
1,302
0
20 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
25
30
0
20 May 2020
Contextual Embeddings: When Are They Worth It?
Simran Arora
Avner May
Jian Zhang
Christopher Ré
8
58
0
18 May 2020
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
8
316
0
18 May 2020
Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction
Cunjun Yu
Xiao Ma
Jiawei Ren
Haiyu Zhao
Shuai Yi
26
459
0
18 May 2020
T-VSE: Transformer-Based Visual Semantic Embedding
M. Bastan
Arnau Ramisa
Mehmet Tek
ViT
16
7
0
17 May 2020
Cross-Modality Relevance for Reasoning on Language and Vision
Chen Zheng
Quan Guo
Parisa Kordjamshidi
LRM
32
36
0
12 May 2020
A Report on the 2020 Sarcasm Detection Shared Task
Debanjan Ghosh
Avijit Vajpayee
Smaranda Muresan
16
59
0
12 May 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
Jie Lei
Liwei Wang
Yelong Shen
Dong Yu
Tamara L. Berg
Mohit Bansal
16
186
0
11 May 2020
A Deep Learning Approach for Automatic Detection of Fake News
Tanik Saikh
Arkadipta De
Asif Ekbal
P. Bhattacharyya
6
33
0
11 May 2020
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar S. Karnin
23
30
0
09 May 2020
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management
Dan Su
Yan Xu
Tiezheng Yu
Farhad Bin Siddique
Elham J. Barezi
Pascale Fung
RALM
11
31
0
04 May 2020
To Test Machine Comprehension, Start by Defining Comprehension
Jesse Dunietz
Greg Burnham
Akash Bharadwaj
Owen Rambow
Jennifer Chu-Carroll
D. Ferrucci
FaML
52
64
0
04 May 2020
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
Mostafa Abdou
Vinit Ravishankar
Maria Barrett
Yonatan Belinkov
Desmond Elliott
Anders Søgaard
ReLM
LRM
54
34
0
04 May 2020
A Simple Language Model for Task-Oriented Dialogue
Ehsan Hosseini-Asl
Bryan McCann
Chien-Sheng Wu
Semih Yavuz
R. Socher
31
523
0
02 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
41
492
0
01 May 2020
Zero-Shot Learning and its Applications from Autonomous Vehicles to COVID-19 Diagnosis: A Review
Mahdi Rezaei
Mahsa Shahidi
19
53
0
29 Apr 2020
Span-based Localizing Network for Natural Language Video Localization
Hao Zhang
Aixin Sun
Wei Jing
Joey Tianyi Zhou
15
311
0
29 Apr 2020
Revisiting Pre-Trained Models for Chinese Natural Language Processing
Yiming Cui
Wanxiang Che
Ting Liu
Bing Qin
Shijin Wang
Guoping Hu
26
681
0
29 Apr 2020
Exploring Self-attention for Image Recognition
Hengshuang Zhao
Jiaya Jia
V. Koltun
SSL
11
772
0
28 Apr 2020
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Ji Xin
Raphael Tang
Jaejun Lee
Yaoliang Yu
Jimmy J. Lin
6
363
0
27 Apr 2020
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching
Liu Yang
Mingyang Zhang
Cheng Li
Michael Bendersky
Marc Najork
27
86
0
26 Apr 2020
Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering
Alexander R. Fabbri
Patrick K. L. Ng
Zhiguo Wang
Ramesh Nallapati
Bing Xiang
24
77
0
24 Apr 2020
Learning the grammar of drug prescription: recurrent neural network grammars for medication information extraction in clinical texts
Ivan Lerner
Jordan Jouffroy
Anita Burgun
A. Neuraz
19
9
0
24 Apr 2020
QURIOUS: Question Generation Pretraining for Text Generation
Shashi Narayan
Gonçalo Simões
Ji Ma
Hannah Craighead
Ryan T. McDonald
26
15
0
23 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
19
350
0
21 Apr 2020
Train No Evil: Selective Masking for Task-Guided Pre-Training
Yuxian Gu
Zhengyan Zhang
Xiaozhi Wang
Zhiyuan Liu
Maosong Sun
24
59
0
21 Apr 2020
StereoSet: Measuring stereotypical bias in pretrained language models
Moin Nadeem
Anna Bethke
Siva Reddy
20
954
0
20 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network
Guanming Xiong
21
0
0
20 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun Chai
Jin Shuo
Xinwen Hou
23
16
0
17 Apr 2020