ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

International Conference on Learning Representations (ICLR), 2019
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 3,044 papers shown
On the Origin of Algorithmic Progress in AI
Hans Gundlach
Alex Fogelson
Jayson Lynch
Ana Trisovic
Jonathan Rosenfeld
Anmol Sandhu
Neil Thompson
44
0
0
26 Nov 2025
Large Language Models and 3D Vision for Intelligent Robotic Perception and Autonomy
Italian National Conference on Sensors (INS), 2025
Vinit Mehta
Charu Sharma
Karthick Thiyagarajan
LM&Ro
324
0
0
14 Nov 2025
Stratified Knowledge-Density Super-Network for Scalable Vision Transformers
Longhua Li
Lei Qi
Xin Geng
ViT
112
0
0
12 Nov 2025
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Sean McLeish
Ang Li
John Kirchenbauer
Dayal Singh Kalra
Brian Bartoldson
B. Kailkhura
Avi Schwarzschild
Jonas Geiping
Tom Goldstein
Micah Goldblum
208
0
0
10 Nov 2025
Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding
Qian Ma
Ruoxiang Xu
Yongqiang Cai
68
0
0
09 Nov 2025
Comparing Reconstruction Attacks on Pretrained Versus Full Fine-tuned Large Language Model Embeddings on Homo Sapiens Splice Sites Genomic Data
Reem Al-Saidi
Erman Ayday
Ziad Kobti
AAML
60
0
0
09 Nov 2025
Plan of Knowledge: Retrieval-Augmented Large Language Models for Temporal Knowledge Graph Question Answering
Xinying Qian
Ying Zhang
Yu Zhao
Baohang Zhou
Xuhui Sui
Xiaojie Yuan
RALM
223
0
0
06 Nov 2025
The Curved Spacetime of Transformer Architectures
Riccardo Di Sipio
Jairo Diaz-Rodriguez
Luis Serrano
72
0
0
04 Nov 2025
ProM3E: Probabilistic Masked MultiModal Embedding Model for Ecology
Srikumar Sastry
Subash Khanal
Aayush Dhakal
Jiayu Lin
Dan Cher
Phoenix Jarosz
Nathan Jacobs
104
0
0
04 Nov 2025
TriCon-Fair: Triplet Contrastive Learning for Mitigating Social Bias in Pre-trained Language Models
Chong Lyu
Lin Li
Shiqing Wu
Jingling Yuan
68
0
0
02 Nov 2025
Mixture-of-Transformers Learn Faster: A Theoretical Study on Classification Problems
Hongbo Li
Qinhang Wu
Sen-Fon Lin
Yingbin Liang
Ness B. Shroff
MoE
88
0
0
30 Oct 2025
Beyond One-Size-Fits-All: Personalized Harmful Content Detection with In-Context Learning
Rufan Zhang
Lin Zhang
Xianghang Mi
40
0
0
29 Oct 2025
MERGE: Minimal Expression-Replacement GEneralization Test for Natural Language Inference
Mădălina Zgreabăn
Tejaswini Deoskar
Lasha Abzianidze
82
0
0
28 Oct 2025
Parallel Loop Transformer for Efficient Test-Time Computation Scaling
Bohong Wu
Mengzhao Chen
Xiang Luo
Shen Yan
Qifan Yu
...
Hongrui Zhan
Zheng Zhong
Xun Zhou
Siyuan Qiao
Xingyan Bin
96
2
0
28 Oct 2025
Key and Value Weights Are Probably All You Need: On the Necessity of the Query, Key, Value weight Triplet in Decoder-Only Transformers
Marko Karbevski
Antonij Mijoski
123
0
0
27 Oct 2025
Knocking-Heads Attention
Zhanchao Zhou
Xiaodong Chen
Haoxing Chen
Zhenzhong Lan
Jianguo Li
72
0
0
27 Oct 2025
Manifold Approximation leads to Robust Kernel Alignment
Mohammad Tariqul Islam
Du Liu
Deblina Sarkar
108
1
0
27 Oct 2025
SALSA: Single-pass Autoregressive LLM Structured Classification
Ruslan Berdichevsky
Shai Nahum-Gefen
Elad Ben Zaken
100
0
0
26 Oct 2025
NeoDictaBERT: Pushing the Frontier of BERT models for Hebrew
Shaltiel Shmidman
Avi Shmidman
Moshe Koppel
VLM
55
0
0
23 Oct 2025
Tibetan Language and AI: A Comprehensive Survey of Resources, Methods and Challenges
Cheng Huang
Nyima Tashi
Fan Gao
Yutong Liu
J. Li
...
Guojie Tang
Xiangxiang Wang
Jia Zhang
Tsengdar J. Lee
Yongbin Yu
96
0
0
22 Oct 2025
ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters
Zhiwei Hao
Jianyuan Guo
Li Shen
Kai Han
Yehui Tang
Han Hu
Yunhe Wang
175
0
0
21 Oct 2025
Engagement Undermines Safety: How Stereotypes and Toxicity Shape Humor in Language Models
Atharvan Dogra
Soumya Suvra Ghosal
Ameet Deshpande
Ashwin Kalyan
Dinesh Manocha
106
0
0
21 Oct 2025
Decoding Dynamic Visual Experience from Calcium Imaging via Cell-Pattern-Aware SSL
Sangyoon Bae
Mehdi Azabou
Jiook Cha
Blake Richards
100
0
0
21 Oct 2025
Explainability of Large Language Models: Opportunities and Challenges toward Generating Trustworthy Explanations
Shahin Atakishiyev
H. Babiker
Jiayi Dai
Nawshad Farruque
Teruaki Hayashi
...
Md Abed Rahman
Iain Smith
Mi-Young Kim
Osmar R. Zaïane
Randy Goebel
LRM
117
0
0
20 Oct 2025
Extending Audio Context for Long-Form Understanding in Large Audio-Language Models
Yuatyong Chaichana
Pittawat Taveekitworachai
Warit Sirichotedumrong
Potsawee Manakul
Kunat Pipatanakul
AuLLM
116
0
0
17 Oct 2025
MemoTime: Memory-Augmented Temporal Knowledge Graph Enhanced Large Language Model Reasoning
Xingyu Tan
Xiaoyang Wang
Xiwei Xu
Xin Yuan
Liming Zhu
Wenjie Zhang
KELM, LRM
105
0
0
15 Oct 2025
DiSTAR: Diffusion over a Scalable Token Autoregressive Representation for Speech Generation
Yakun Song
Xiaobin Zhuang
Jiawei Chen
Zhikang Niu
Guanrou Yang
...
Zhuo Chen
Yuping Wang
Yuping Wang
Xie Chen
Xie Chen
DiffM
128
0
0
14 Oct 2025
Traveling Salesman-Based Token Ordering Improves Stability in Homomorphically Encrypted Language Models
Donghwan Rho
Sieun Seo
Hyewon Sung
Chohong Min
Ernest K. Ryu
100
0
0
14 Oct 2025
ProtoSiTex: Learning Semi-Interpretable Prototypes for Multi-label Text Classification
Utsav Nareti
Suraj Kumar
Soumya Pandey
S. Chattopadhyay
Chandranath Adak
VLM
110
0
0
14 Oct 2025
FedHybrid: Breaking the Memory Wall of Federated Learning via Hybrid Tensor Management
ACM International Conference on Embedded Networked Sensor Systems (SenSys), 2024
Kahou Tam
Chunlin Tian
Li Li
Haikai Zhao
Chengzhong Xu
FedML
149
6
0
13 Oct 2025
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
Yeskendir Koishekenov
Aldo Lipani
Nicola Cancedda
LRM
86
2
0
08 Oct 2025
AgentDR Dynamic Recommendation with Implicit Item-Item Relations via LLM-based Agents
Mingdai Yang
Nurendra Choudhary
Jiangshu Du
Edward W. Huang
Philip S. Yu
Karthik Subbian
Danai Kourta
116
0
0
07 Oct 2025
MASA: Rethinking the Representational Bottleneck in LoRA with Multi-A Shared Adaptation
Qin Dong
Yuntian Tang
Heming Jia
Yunhang Shen
Bohan Jia
Wenxuan Huang
Lianyue Zhang
Jiao Xie
Shaohui Lin
MoE
68
0
0
07 Oct 2025
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Ryan Solgi
Parsa Madinei
Jiayi Tian
Rupak Vignesh Swaminathan
Jing Liu
Nathan Susanj
Zheng Zhang
58
1
0
07 Oct 2025
Downsized and Compromised?: Assessing the Faithfulness of Model Compression
Moumita Kamal
Douglas A. Talbert
96
0
0
07 Oct 2025
Neural Correlates of Language Models Are Specific to Human Language
Iñigo Parra
77
0
0
03 Oct 2025
Dissecting Transformers: A CLEAR Perspective towards Green AI
Hemang Jain
Shailender Goyal
Divyansh Pandey
Karthik Vaidhyanathan
68
0
0
03 Oct 2025
PrunedLoRA: Robust Gradient-Based structured pruning for Low-rank Adaptation in Fine-tuning
Xin Yu
Cong Xie
Ziyu Zhao
Tiantian Fan
Lingzhou Xue
Zhi-Li Zhang
192
0
0
30 Sep 2025
CustomIR: Unsupervised Fine-Tuning of Dense Embeddings for Known Document Corpora
Nathan Paull
78
0
0
30 Sep 2025
Text-Based Approaches to Item Alignment to Content Standards in Large-Scale Reading & Writing Tests
Yanbin Fu
Hong Jiao
Tianyi Zhou
Robert Lissitz
Nan Zhang
Ming Li
Qingshu Xu
Sydney Peters
199
0
0
30 Sep 2025
Efficient Layer-wise LLM Fine-tuning for Revision Intention Prediction
Zhexiong Liu
Diane Litman
KELM
96
0
0
30 Sep 2025
Federated Learning Meets LLMs: Feature Extraction From Heterogeneous Clients
Abdelrhman Gaber
Hassan Abd-Eltawab
Youssif Abuzied
Muhammad ElMahdy
Tamer ElBatt
76
0
0
29 Sep 2025
RedNote-Vibe: A Dataset for Capturing Temporal Dynamics of AI-Generated Text in Social Media
Yudong Li
Yufei Sun
Yuhan Yao
Peiru Yang
Wanyue Li
Jiajun Zou
Yongfeng Huang
LinLin Shen
113
0
0
26 Sep 2025
Detecting (Un)answerability in Large Language Models with Linear Directions
Maor Juliet Lavi
Tova Milo
Mor Geva
120
0
0
26 Sep 2025
A Formal Comparison Between Chain-of-Thought and Latent Thought
Kevin Xu
Issei Sato
ReLM, LRM
61
0
0
25 Sep 2025
RedHerring Attack: Testing the Reliability of Attack Detection
Jonathan Rusert
AAML
68
0
0
25 Sep 2025
Every Character Counts: From Vulnerability to Defense in Phishing Detection
Maria Chiper
Radu Tudor Ionescu
169
0
0
24 Sep 2025
An overview of neural architectures for self-supervised audio representation learning from masked spectrograms
Sarthak Yadav
Sergios Theodoridis
Zheng-Hua Tan
Mamba
151
0
0
23 Sep 2025
Uncertainty in Semantic Language Modeling with PIXELS
Stefania Radu
Marco Zullich
Matias Valdenegro-Toro
88
0
0
23 Sep 2025
Modeling the Attack: Detecting AI-Generated Text by Quantifying Adversarial Perturbations
Lekkala Sai Teja
Annepaka Yadagiri
Sangam Sai Anish
Siva Gopala Krishna Nuthakki
Partha Pakray
AAML, DeLMO
186
1
0
22 Sep 2025