Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.08237
Cited By
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 1,069 papers shown
Title
GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
Shicheng Tan
Weng Lam Tam
Yuanchun Wang
Wenwen Gong
Yang Yang
...
Jiahao Liu
Jingang Wang
Shuo Zhao
Peng-Zhen Zhang
Jie Tang
ALM
MoE
17
11
0
11 Jun 2023
Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction
Simone Scaboro
Beatrice Portelli
Emmanuele Chersoni
Enrico Santus
Giuseppe Serra
16
8
0
08 Jun 2023
Evaluating Emotion Arcs Across Languages: Bridging the Global Divide in Sentiment Analysis
Daniela Teodorescu
Saif M. Mohammad
30
8
0
03 Jun 2023
The Utility of Large Language Models and Generative AI for Education Research
Andrew Katz
Umair Shakir
B. Chambers
AI4CE
20
6
0
29 May 2023
Modeling Adversarial Attack on Pre-trained Language Models as Sequential Decision Making
Xuanjie Fang
Sijie Cheng
Yang Liu
Wen Wang
AAML
28
9
0
27 May 2023
Entailment as Robust Self-Learner
Jiaxin Ge
Hongyin Luo
Yoon Kim
James R. Glass
39
3
0
26 May 2023
Neural Summarization of Electronic Health Records
Koyena Pal
Seyed Ali Bahrainian
Laura Y. Mercurio
Carsten Eickhoff
25
3
0
24 May 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL
SSL
24
4
0
23 May 2023
Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path
Zilong Wang
Jingbo Shang
36
0
0
23 May 2023
Should We Attend More or Less? Modulating Attention for Fairness
A. Zayed
Gonçalo Mordido
Samira Shabanian
Sarath Chandar
37
10
0
22 May 2023
Bidirectional Transformer Reranker for Grammatical Error Correction
Ying Zhang
Hidetaka Kamigaito
Manabu Okumura
11
2
0
22 May 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
23
12
0
21 May 2023
PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation
Eli Chien
Jiong Zhang
Cho-Jui Hsieh
Jyun-Yu Jiang
Wei-Cheng Chang
O. Milenkovic
Hsiang-Fu Yu
17
9
0
21 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
29
81
0
19 May 2023
Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models
Raj Sanjay Shah
Vijay Marupudi
Reba Koenen
Khushi Bhardwaj
Sashank Varma
19
6
0
18 May 2023
NoisywikiHow: A Benchmark for Learning with Real-world Noisy Labels in Natural Language Processing
Tingting Wu
Xiao Ding
Minji Tang
Haotian Zhang
Bing Qin
Ting Liu
NoLa
26
9
0
18 May 2023
Statistical Knowledge Assessment for Large Language Models
Qingxiu Dong
Jingjing Xu
Lingpeng Kong
Zhifang Sui
Lei Li
HILM
33
6
0
17 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
10
0
0
17 May 2023
Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites
Hans W. A. Hanley
Zakir Durumeric
DeLMO
21
29
0
16 May 2023
DLUE: Benchmarking Document Language Understanding
Ruoxi Xu
Hongyu Lin
Xinyan Guan
Xianpei Han
Yingfei Sun
Le Sun
ELM
18
0
0
16 May 2023
Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility
Wen-song Ye
Mingfeng Ou
Tianyi Li
Yipeng Chen
Xuetao Ma
...
Sai Wu
Jie Fu
Gang Chen
Haobo Wang
J. Zhao
42
36
0
15 May 2023
ParaLS: Lexical Substitution via Pretrained Paraphraser
Jipeng Qiang
Kang Liu
Yun Li
Yunhao Yuan
Yi Zhu
KELM
24
11
0
14 May 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
8
6
0
11 May 2023
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification
Anastasiia Grishina
Max Hort
Leon Moonen
22
6
0
08 May 2023
Differentially Private Attention Computation
Yeqi Gao
Zhao-quan Song
Xin Yang
42
19
0
08 May 2023
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng
Cole Hawkins
Min-Fong Hong
Aston Zhang
Nikolaos Pappas
Vikas Singh
Shuai Zheng
19
6
0
07 May 2023
Pre-training Language Model as a Multi-perspective Course Learner
Beiduo Chen
Shaohan Huang
Zi-qiang Zhang
Wu Guo
Zhen-Hua Ling
Haizhen Huang
Furu Wei
Weiwei Deng
Qi Zhang
21
0
0
06 May 2023
DiscoPrompt: Path Prediction Prompt Tuning for Implicit Discourse Relation Recognition
Chunkit Chan
Xin Liu
Cheng Jiayang
Zihan Li
Yangqiu Song
Ginny Y. Wong
Simon See
21
30
0
06 May 2023
VendorLink: An NLP approach for Identifying & Linking Vendor Migrants & Potential Aliases on Darknet Markets
V. Saxena
Nils Rethmeier
Gijs Van Dijck
Gerasimos Spanakis
24
6
0
04 May 2023
Using Language Models on Low-end Hardware
Silin Gao
Beatriz Borges
Saya Kanno
Antoine Bosselut
17
0
0
03 May 2023
Calibration Error Estimation Using Fuzzy Binning
Geetanjali Bihani
Julia Taylor Rayz
90
2
0
30 Apr 2023
MMViT: Multiscale Multiview Vision Transformers
Yuchen Liu
Natasha Ong
Kaiyan Peng
Bo Xiong
Qifan Wang
...
Madian Khabsa
Kaiyue Yang
David C. Liu
Donald Williamson
Hanchao Yu
ViT
22
4
0
28 Apr 2023
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark]
Alexandros Zeakis
G. Papadakis
Dimitrios Skoutas
Manolis Koubarakis
24
37
0
24 Apr 2023
MasakhaNEWS: News Topic Classification for African languages
David Ifeoluwa Adelani
Marek Masiak
Israel Abebe Azime
Jesujoba Oluwadara Alabi
A. Tonja
...
Moges Ahmed Mehamed
Evrard Ngabire
Jules Jules
Ivan Ssenkungu
Pontus Stenetorp
20
23
0
19 Apr 2023
MER 2023: Multi-label Learning, Modality Robustness, and Semi-Supervised Learning
Zheng Lian
Haiyang Sun
Licai Sun
Kang Chen
Mingyu Xu
...
Meng Wang
Erik Cambria
Guoying Zhao
Björn W. Schuller
Jianhua Tao
28
47
0
18 Apr 2023
MisRoBÆRTa: Transformers versus Misinformation
Ciprian-Octavian Truică
Elena Simona Apostol
19
37
0
16 Apr 2023
Fairness in Visual Clustering: A Novel Transformer Clustering Approach
Xuan-Bac Nguyen
C. Duong
Marios Savvides
Kaushik Roy
Hugh Churchill
Khoa Luu
26
9
0
14 Apr 2023
Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Zhiyong Wu
Shiyin Kang
H. Meng
25
6
0
13 Apr 2023
A Cheaper and Better Diffusion Language Model with Soft-Masked Noise
Jiaao Chen
Aston Zhang
Mu Li
Alexander J. Smola
Diyi Yang
DiffM
24
17
0
10 Apr 2023
Continual Graph Convolutional Network for Text Classification
Tiandeng Wu
Qijiong Liu
Yinhao Cao
yao. huang
Xiao-Ming Wu
Jiandong Ding
GNN
16
9
0
09 Apr 2023
Multi-class Categorization of Reasons behind Mental Disturbance in Long Texts
Muskan Garg
AI4MH
21
2
0
08 Apr 2023
MEGClass: Extremely Weakly Supervised Text Classification via Mutually-Enhancing Text Granularities
Priyanka Kargupta
Tanay Komarlu
Susik Yoon
Xuan Wang
Jiawei Han
23
8
0
04 Apr 2023
Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT
Yi Qi
Xingyu Zhao
Siddartha Khastgir
Xiaowei Huang
24
14
0
03 Apr 2023
MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model
Xin Yao
Ziqing Yang
Yiming Cui
Shijin Wang
20
3
0
03 Apr 2023
BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection
Sihao Hu
Zhen Zhang
B. Luo
Shengliang Lu
Bingsheng He
Ling Liu
14
39
0
29 Mar 2023
Informed Machine Learning, Centrality, CNN, Relevant Document Detection, Repatriation of Indigenous Human Remains
M. A. Bashar
R. Nayak
G. Knapman
Paul Turnbull
C. Fforde
31
1
0
25 Mar 2023
SPEC: Summary Preference Decomposition for Low-Resource Abstractive Summarization
Yi-Syuan Chen
Yun-Zhu Song
Hong-Han Shuai
33
6
0
24 Mar 2023
Paraphrase Detection: Human vs. Machine Content
Jonas Becker
Jan Philip Wahle
Terry Ruas
Bela Gipp
DeLMO
22
12
0
24 Mar 2023
Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function
A. B. Siddique
M. H. Maqbool
Kshitija Taywade
H. Foroosh
24
12
0
24 Mar 2023
Human Behavior in the Time of COVID-19: Learning from Big Data
Hanjia Lyu
Arsal Imtiaz
Yufei Zhao
Jiebo Luo
20
6
0
23 Mar 2023
Previous
1
2
3
4
5
...
20
21
22
Next