XLNet: Generalized Autoregressive Pretraining for Language Understanding

Neural Information Processing Systems (NeurIPS), 2019
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,732 papers shown
Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li
Jinyue Yang
Guoqi Li
Huan Wang
277
7
0
13 Mar 2025
KV-Distill: Nearly Lossless Learnable Context Compression for LLMs
Vivek Chari
Guanghui Qin
Benjamin Van Durme
VLM
271
8
0
13 Mar 2025
Sentiment Analysis in SemEval: A Review of Sentiment Identification Approaches
International Journal of Electrical and Computer Engineering (IJECE), 2023
Bousselham EL HADDAOUI
R. Chiheb
R. Faizi
A. E. Afia
264
1
0
13 Mar 2025
ARLED: Leveraging LED-based ARMAN Model for Abstractive Summarization of Persian Long Documents
Samira Zangooei
Amirhossein Darmani
Hossein Farahmand Nezhad
Laya Mahmoudi
232
1
0
13 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
339
8
0
13 Mar 2025
Towards Graph Foundation Models: A Transferability Perspective
Longji Xu
Wenqi Fan
Suhang Wang
Yao Ma
260
6
0
13 Mar 2025
CALLM: Understanding Cancer Survivors' Emotions and Intervention Opportunities via Mobile Diaries and Context-Aware Language Models
Zhiyuan Wang
Katharine E. Daniel
Laura E. Barnes
Philip I. Chow
186
0
0
12 Mar 2025
LabelCoRank: Revolutionizing Long Tail Multi-Label Classification with Co-Occurrence Reranking
Journal of Artificial Intelligence Research (JAIR), 2025
Yan Yan
Junyuan Liu
Bo Zhang
181
1
0
11 Mar 2025
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Computer Vision and Pattern Recognition (CVPR), 2025
Dušan Malić
Christian Fruhwirth-Reisinger
Samuel Schulter
Horst Possegger
3DV
292
1
0
11 Mar 2025
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
864
4
0
10 Mar 2025
Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs
Gonzalo Mancera
Daniel DeAlcala
Julian Fierrez
Ruben Tolosana
Aythami Morales
348
6
0
10 Mar 2025
Learning-Order Autoregressive Models with Application to Molecular Graph Generation
Zhe Wang
Jiaxin Shi
N. Heess
Arthur Gretton
Michalis K. Titsias
361
11
0
07 Mar 2025
UniNet: A Unified Multi-granular Traffic Modeling Framework for Network Security
IEEE Transactions on Cognitive Communications and Networking (TCCN), 2025
Binghui Wu
D. Divakaran
M. Gurusamy
342
2
0
06 Mar 2025
An Optimization Algorithm for Multimodal Data Alignment
Wei Zhang
Xinyu Wang
Lan Yu
S. Li
149
0
0
05 Mar 2025
Zero-Shot Complex Question-Answering on Long Scientific Documents
Wanting Wang
RALM
134
1
0
04 Mar 2025
Wavelet-Driven Masked Image Modeling: A Path to Efficient Visual Representation
AAAI Conference on Artificial Intelligence (AAAI), 2025
Wenzhao Xiang
Chang Liu
Hongyang Yu
Xilin Chen
250
1
0
02 Mar 2025
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Yifei Duan
Raphael Shang
Deng Liang
Yongqiang Cai
347
0
0
28 Feb 2025
Revisiting Kernel Attention with Correlated Gaussian Process Representation
Conference on Uncertainty in Artificial Intelligence (UAI), 2025
Long Minh Bui
Tho Tran Huu
Duy-Tung Dinh
T. Nguyen
Trong Nghia Hoang
366
5
0
27 Feb 2025
CAMEx: Curvature-aware Merging of Experts
International Conference on Learning Representations (ICLR), 2025
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
356
6
0
26 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
478
8
0
26 Feb 2025
Exploring Graph Learning Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Longji Xu
Xinnan Dai
Wenqi Fan
Yao Ma
326
9
0
26 Feb 2025
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Pengzhi Li
Pengfei Yu
Zide Liu
Wei He
Xuhao Pan
Xudong Rao
Tao Wei
Wei Chen
VLM
368
5
0
25 Feb 2025
How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching
Nuo Xu
Peijie Wang
Zi Liang
Junzhou Zhao
X. Guan
AILaw
293
0
0
25 Feb 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
Chun-Nam Yu
Mojtaba Soltanalian
Ivan Garibay
O. Garibay
Chen Chen
Niloofar Yousefi
AI4TS
553
1
0
25 Feb 2025
Enhancing Text Classification with a Novel Multi-Agent Collaboration Framework Leveraging BERT
Hediyeh Baban
Sai A Pidapar
Aashutosh Nema
Sichen Lu
LLMAG
308
1
0
25 Feb 2025
Detecting Code Vulnerabilities with Heterogeneous GNN Training
Yu Luo
Weifeng Xu
Dianxiang Xu
324
2
0
24 Feb 2025
Streaming Looking Ahead with Token-level Self-reward
Han Zhang
Ruixin Hong
Dong Yu
230
2
0
24 Feb 2025
Predictive Modeling: BIM Command Recommendation Based on Large-scale Usage Logs
Advanced Engineering Informatics (AEI), 2025
Changyu Du
Zihan Deng
Stavros Nousias
André Borrmann
AI4CE
221
1
0
23 Feb 2025
IPAD: Inverse Prompt for AI Detection - A Robust and Interpretable LLM-Generated Text Detector
Zheng Chen
Yushi Feng
Changyang He
Yue Deng
Hongxi Pu
Yue Liu
Haoxuan Li
Bo Li
DeLMO
268
1
0
21 Feb 2025
Model Privacy: A Unified Framework to Understand Model Stealing Attacks and Defenses
G. Wang
Yuhong Yang
Jie Ding
177
2
0
21 Feb 2025
CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mst. Fahmida Sultana Naznin
Adnan Ibney Faruq
Mostafa Rifat Tazwar
Md Jobayer
Md. Mehedi Hasan Shawon
Md Rakibul Hasan
MedIm
259
0
0
21 Feb 2025
Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models
Ranjan Sapkota
Shaina Raza
Manoj Karkee
277
15
0
21 Feb 2025
What Are They Filtering Out? An Experimental Benchmark of Filtering Strategies for Harm Reduction in Pretraining Datasets
Marco Antonio Stranisci
Christian Hardmeier
411
2
0
17 Feb 2025
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training
Matteo Saponati
Pascal Sager
Pau Vilimelis Aceituno
Thilo Stadelmann
Benjamin Grewe
211
4
0
15 Feb 2025
Handwritten Text Recognition: A Survey
Carlos Garrido-Munoz
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
318
8
0
12 Feb 2025
Context information can be more important than reasoning for time series forecasting with a large language model
International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), 2025
Janghoon Yang
AI4TS
LRM
202
2
0
08 Feb 2025
Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes
International Conference on Agents and Artificial Intelligence (ICAART), 2025
Juraj Vladika
Stephen Meisenbacher
Florian Matthes
555
0
0
06 Feb 2025
Lowering the Barrier of Machine Learning: Achieving Zero Manual Labeling in Review Classification Using LLMs
International Conference on Computing and Artificial Intelligence (ICCAI), 2025
Yejian Zhang
Shingo Takada
211
1
0
05 Feb 2025
A Framework for Double-Blind Federated Adaptation of Foundation Models
Nurbek Tastan
Karthik Nandakumar
FedML
322
0
0
03 Feb 2025
DEUCE: Dual-diversity Enhancement and Uncertainty-awareness for Cold-start Active Learning
Transactions of the Association for Computational Linguistics (TACL), 2024
Jiaxin Guo
Cheng Chen
Shuzhen Li
Tianze Zhang
421
1
0
01 Feb 2025
Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings
Ahmed K. Kadhim
Lei Jiao
Rishad Shafik
Ole-Christoffer Granmo
DeLMO
418
5
0
31 Jan 2025
Detecting harassment and defamation in cyberbullying with emotion-adaptive training
International Conference on Web and Social Media (ICWSM), 2025
Peiling Yi
A. Zubiaga
Yunfei Long
395
2
0
28 Jan 2025
Optimizing Sentence Embedding with Pseudo-Labeling and Model Ensembles: A Hierarchical Framework for Enhanced NLP Tasks
Ziwei Liu
Qi Zhang
Lifu Gao
165
1
0
28 Jan 2025
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data
P. Tiwald
Ivona Krchova
Andrey Sidorenko
Mariana Vargas-Vieyra
Mario Scriminaci
Michael Platzer
461
10
0
21 Jan 2025
Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities
M. Ferrag
Fatima Alwahedi
A. Battah
Bilel Cherif
Abdechakour Mechri
Norbert Tihanyi
Tamás Bisztray
Merouane Debbah
131
11
0
20 Jan 2025
Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters
IEEE Conference on Computer Communications (IEEE INFOCOM), 2025
Ziyue Luo
Jia-Wei Liu
Myungjin Lee
Ness B. Shroff
183
1
0
09 Jan 2025
AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Ten Modalities via Language as a Reference Framework
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2023
Run Shao
Cheng Yang
Qiujun Li
Qing Zhu
Yongjun Zhang
...
Yu Liu
Yong Tang
Dapeng Liu
Shizhong Yang
Haifeng Li
487
0
0
08 Jan 2025
Trust Modeling in Counseling Conversations: A Benchmark Study
Aseem Srivastava
Zuhair Hasan Shaik
Tanmoy Chakraborty
Md. Shad Akhtar
187
2
0
06 Jan 2025
GORAG: Graph-based Online Retrieval Augmented Generation for Dynamic Few-shot Social Media Text Classification
Yubo Wang
Haoyang Li
Fei Teng
Lei Chen
497
3
0
06 Jan 2025
Swift Cross-Dataset Pruning: Enhancing Fine-Tuning Efficiency in Natural Language Understanding
International Conference on Computational Linguistics (COLING), 2025
Binh-Nguyen Nguyen
Yang He
260
2
0
05 Jan 2025