ResearchTrend.AI
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
    SSL
    AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 2,911 papers shown
Turing Representational Similarity Analysis (RSA): A Flexible Method for Measuring Alignment Between Human and Artificial Intelligence
Mattson Ogg
Ritwik Bose
Jamie Scharf
Christopher Ratto
Michael Wolmetz
83
2
0
30 Nov 2024
Can bidirectional encoder become the ultimate winner for downstream applications of foundation models?
Lewen Yang
Xuanyu Zhou
Juao Fan
Xinyi Xie
Shengxin Zhu
AI4CE
64
0
0
27 Nov 2024
What Differentiates Educational Literature? A Multimodal Fusion Approach of Transformers and Computational Linguistics
Jordan J. Bird
63
0
0
26 Nov 2024
Dynamic Self-Distillation via Previous Mini-batches for Fine-tuning Small Language Models
Y. Fu
Yin Yu
Xiaotian Han
Runchao Li
Xianxuan Long
Haotian Yu
Pan Li
SyDa
57
0
0
25 Nov 2024
Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings
Carolin M. Schuster
Maria-Alexandra Dinisor
Shashwat Ghatiwala
Georg Groh
75
1
0
25 Nov 2024
A Comparative Analysis of Transformer and LSTM Models for Detecting Suicidal Ideation on Reddit
Khalid Hasan
Jamil Saquer
AI4MH
65
0
0
23 Nov 2024
FLARE: FP-Less PTQ and Low-ENOB ADC Based AMS-PiM for Error-Resilient, Fast, and Efficient Transformer Acceleration
Donghyeon Yi
Seoyoung Lee
Jongho Kim
Junyoung Kim
Sohmyung Ha
Ik Joon Chang
Minkyu Je
70
0
0
22 Nov 2024
BERT-Based Approach for Automating Course Articulation Matrix Construction with Explainable AI
Natenaile Asmamaw Shiferaw
Simpenzwe Honore Leandre
Aman Sinha
Dillip Rout
54
0
0
21 Nov 2024
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling
Daehoon Gwak
Junwoo Park
Minho Park
C. Park
Hyunchan Lee
E. Choi
Jaegul Choo
64
0
0
21 Nov 2024
Mitigating Gender Bias in Contextual Word Embeddings
Navya Yarrabelly
Vinay Damodaran
Feng-Guang Su
67
0
0
18 Nov 2024
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang
Tianqing Zhu
Chi Liu
Wanlei Zhou
Shui Yu
Philip S. Yu
AAML
ELM
PILM
56
1
0
12 Nov 2024
Clustering in Causal Attention Masking
Nikita Karagodin
Yury Polyanskiy
Philippe Rigollet
60
5
0
07 Nov 2024
PSformer: Parameter-efficient Transformer with Segment Attention for Time Series Forecasting
Yanlong Wang
J. Xu
Fei Ma
Shao-Lun Huang
Danny Dongning Sun
Xiao-Ping Zhang
AI4TS
40
1
0
03 Nov 2024
Human-inspired Perspectives: A Survey on AI Long-term Memory
Zihong He
Weizhe Lin
Hao Zheng
Fan Zhang
Matt Jones
Laurence Aitchison
X. Xu
Miao Liu
Per Ola Kristensson
Junxiao Shen
77
2
0
01 Nov 2024
ProTransformer: Robustify Transformers via Plug-and-Play Paradigm
Zhichao Hou
Weizhi Gao
Yuchen Shen
Feiyi Wang
Xiaorui Liu
VLM
28
2
0
30 Oct 2024
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA
Sangmin Bae
Adam Fisch
Hrayr Harutyunyan
Ziwei Ji
Seungyeon Kim
Tal Schuster
KELM
68
5
0
28 Oct 2024
Ensembling Finetuned Language Models for Text Classification
Sebastian Pineda Arango
Maciej Janowski
Lennart Purucker
Arber Zela
Frank Hutter
Josif Grabocka
33
0
0
25 Oct 2024
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques
David Ortiz-Perez
Manuel Benavent-Lledo
José García Rodríguez
David Tomás
M. Flores Vizcaya-Moreno
28
0
0
24 Oct 2024
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers
Zebin Yang
Renze Chen
Taiqiang Wu
Ngai Wong
Yun Liang
Runsheng Wang
R. Huang
Meng Li
MQ
17
1
0
23 Oct 2024
Quantifying the Risks of Tool-assisted Rephrasing to Linguistic Diversity
Mengying Wang
Andreas Spitz
16
0
0
23 Oct 2024
Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation
Victor Junqiu Wei
Weicheng Wang
Di Jiang
Conghui Tan
Rongzhong Lian
MoMe
30
0
0
21 Oct 2024
Causality for Large Language Models
Anpeng Wu
Kun Kuang
Minqin Zhu
Yingrong Wang
Yujia Zheng
Kairong Han
B. Li
Guangyi Chen
Fei Wu
Kun Zhang
LRM
46
7
0
20 Oct 2024
Fine-Tuning Pre-trained Language Models for Robust Causal Representation Learning
Jialin Yu
Yuxiang Zhou
Yulan He
Nevin L. Zhang
Ricardo Silva
31
0
0
18 Oct 2024
Pseudo-label Refinement for Improving Self-Supervised Learning Systems
Zia-ur-Rehman
Arif Mahmood
Wenxiong Kang
18
0
0
18 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
21
2
0
17 Oct 2024
Unitary Multi-Margin BERT for Robust Natural Language Processing
Hao-Yuan Chang
Kang L. Wang
AAML
16
0
0
16 Oct 2024
FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression
Zhenheng Tang
Xueze Kang
Yiming Yin
Xinglin Pan
Yuxin Wang
...
Shaohuai Shi
Amelie Chi Zhou
Bo Li
Bingsheng He
Xiaowen Chu
AI4CE
67
8
0
16 Oct 2024
Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models
Kai Yao
P. Gao
Lichun Li
Yuan Zhao
Xiaofeng Wang
W. Wang
Jianke Zhu
24
1
0
15 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning
Zifan Liu
Amin Karbasi
Theodoros Rekatsinas
29
3
0
15 Oct 2024
Arrhythmia Classification Using Graph Neural Networks Based on Correlation Matrix
Seungwoo Han
18
0
0
14 Oct 2024
Lambda-Skip Connections: the architectural component that prevents Rank Collapse
Federico Arangath Joseph
Jerome Sieber
M. Zeilinger
Carmen Amo Alonso
33
0
0
14 Oct 2024
Mental Disorders Detection in the Era of Large Language Models
Gleb Kuzmin
Petr Strepetov
Maksim Stankevich
Artem Shelmanov
Ivan Smirnov
24
1
0
09 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
Tong Wu
Shujian Zhang
Kaiqiang Song
Silei Xu
Sanqiang Zhao
Ravi Agrawal
Sathish Indurthi
Chong Xiang
Prateek Mittal
Wenxuan Zhou
37
7
0
09 Oct 2024
Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models
Kacper Sowka
Vasile Palade
Xiaorui Jiang
Hesam Jadidbonab
15
1
0
07 Oct 2024
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao
Francesco A. Aprile
Barbara Bravi
29
0
0
07 Oct 2024
Dynamic Post-Hoc Neural Ensemblers
Sebastian Pineda Arango
Maciej Janowski
Lennart Purucker
Arber Zela
Frank Hutter
Josif Grabocka
UQCV
31
0
0
06 Oct 2024
Variational Language Concepts for Interpreting Foundation Language Models
Hengyi Wang
Shiwei Tan
Zhiqing Hong
Desheng Zhang
Hao Wang
32
3
0
04 Oct 2024
Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding
Wei Yu Wu
Chao Wang
Liyi Chen
Mingze Yin
Yiheng Zhu
Kun Fu
Jieping Ye
Hui Xiong
Zheng Wang
28
1
0
04 Oct 2024
Demystifying the Token Dynamics of Deep Selective State Space Models
Thieu N. Vo
Tung D. Pham
Xin T. Tong
Tan Minh Nguyen
Mamba
44
0
0
04 Oct 2024
Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models
Mingxue Xu
Sadia Sharmin
Danilo P. Mandic
22
2
0
03 Oct 2024
Morphological evaluation of subwords vocabulary used by BETO language model
Óscar García-Sierra
Ana Fernández-Pampillón Cesteros
Miguel Ortega-Martín
36
0
0
03 Oct 2024
DeIDClinic: A Multi-Layered Framework for De-identification of Clinical Free-text Data
Angel Paul
Dhivin Shaji
Lifeng Han
Warren Del-Pinto
Goran Nenadic
OOD
34
0
0
02 Oct 2024
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models
Yuxuan Zhang
Ruizhe Li
MoMe
53
0
0
02 Oct 2024
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
Kevin Xu
Issei Sato
37
3
0
02 Oct 2024
Depression detection in social media posts using transformer-based models and auxiliary features
Marios Kerasiotis
Loukas Ilias
D. Askounis
16
4
0
30 Sep 2024
FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models
Yucheng Xie
Fu Feng
Ruixiao Shi
Jing Wang
Xin Geng
AI4CE
34
2
0
28 Sep 2024
On the Inductive Bias of Stacking Towards Improving Reasoning
Nikunj Saunshi
Stefani Karp
Shankar Krishnan
Sobhan Miryoosefi
Sashank J. Reddi
Sanjiv Kumar
LRM
AI4CE
34
4
0
27 Sep 2024
Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning
Yu Fu
Jie He
Yifan Yang
Qun Liu
Deyi Xiong
OffRL
LRM
36
0
0
27 Sep 2024
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking
Devrim Cavusoglu
Secil Sen
Ulas Sert
29
0
0
26 Sep 2024
Integrating Hierarchical Semantic into Iterative Generation Model for Entailment Tree Explanation
Qin Wang
Jianzhou Feng
Yiming Xu
23
0
0
26 Sep 2024