ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

International Conference on Learning Representations (ICLR), 2019
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 3,049 papers shown
Milestones in Bengali Sentiment Analysis leveraging Transformer-models: Fundamentals, Challenges and Future Directions
Saptarshi Sengupta
Shreya Ghosh
Prasenjit Mitra
Tarikul Islam Tamiti
360
3
0
15 Jan 2024
Developing ChatGPT for Biology and Medicine: A Complete Review of Biomedical Question Answering
Biophysics Reports (BR), 2024
Qing Li
Lei Li
Yu Li
LM&MA, AI4MH
450
19
0
15 Jan 2024
Harnessing Large Language Models Over Transformer Models for Detecting Bengali Depressive Social Media Text: A Comprehensive Study
Natural Language Processing Journal (JNLP), 2024
Ahmadul Karim Chowdhury
Saidur Rahman Sujon
Md. Shirajus Salekin Shafi
Tasin Ahmmad
Sifat Ahmed
Khan Md. Hasib
Faisal Muhammad Shah
AI4MH
203
31
0
14 Jan 2024
Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection
Muhammad Tayyab Zamir
Muhammad Asif Ayub
Asma Gul
Nasir Ahmad
Kashif Ahmad
107
5
0
12 Jan 2024
Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text
International Conference on Web and Social Media (ICWSM), 2024
Muskan Garg
Msvpj Sathvik
Amrit Chadha
Shaina Raza
Sunghwan Sohn
AI4MH
164
4
0
12 Jan 2024
LLMRS: Unlocking Potentials of LLM-Based Recommender Systems for Software Purchase
Angela John
Theophilus Aidoo
Hamayoon Behmanush
Irem B. Gunduz
Hewan Shrestha
M. R. Rahman
Wolfgang Maass
301
2
0
12 Jan 2024
Multi-Task Learning for Front-End Text Processing in TTS
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Wonjune Kang
Yun Wang
Shun Zhang
Arthur Hinsvark
Qing He
178
3
0
12 Jan 2024
Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models
BigData Congress [Services Society] (BSS), 2023
K M Sajjadul Islam
Ayesha Siddika Nipu
Praveen Madiraju
Priya Deshpande
LM&MA
173
7
0
11 Jan 2024
Phishing Website Detection through Multi-Model Analysis of HTML Content
Furkan Çolhak
Mert İlhan Ecevit
Bilal Emir Uçar
Reiner Creutzburg
Hasan Dag
202
15
0
09 Jan 2024
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
406
12
0
09 Jan 2024
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
Maciej Pióro
Kamil Ciebiera
Krystian Król
Jan Ludziejewski
Michał Krutul
Jakub Krajewski
Szymon Antoniak
Piotr Miłoś
Marek Cygan
Sebastian Jaszczur
MoE, Mamba
338
81
0
08 Jan 2024
TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment
Jiquan Yuan
Xinyan Cao
Jinming Che
Qinyuan Wang
Sen Liang
Wei Ren
Jinlong Lin
Xixin Cao
EGVM
239
10
0
08 Jan 2024
An Exploratory Study on Automatic Identification of Assumptions in the Development of Deep Learning Frameworks
Science of Computer Programming (SCP), 2024
Chen Yang
Peng Liang
Zinan Ma
229
0
0
08 Jan 2024
Building Efficient and Effective OpenQA Systems for Low-Resource Languages
Emrah Budur
Rıza Özçelik
Dilara Soylu
Omar Khattab
Tunga Güngör
Christopher Potts
269
8
0
07 Jan 2024
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian
Guoying Zhao
Yong Ren
Hao Gu
Haiyang Sun
Lan Chen
Yinan Han
Jianhua Tao
429
26
0
07 Jan 2024
Enhancing Context Through Contrast
Kshitij Ambilduke
Aneesh Shetye
Diksha Bagade
Rishika Bhagwatkar
Khurshed Fitter
P. Vagdargi
Shital S. Chiddarwar
232
0
0
06 Jan 2024
SecureReg: Combining NLP and MLP for Enhanced Detection of Malicious Domain Name Registrations
Furkan Çolhak
Mert İlhan Ecevit
Hasan Dag
Reiner Creutzburg
168
1
0
06 Jan 2024
Lotto: Secure Participant Selection against Adversarial Servers in Federated Learning
Zhifeng Jiang
Peng Ye
Shiqi He
Wei Wang
Ruichuan Chen
Bo Li
329
6
0
05 Jan 2024
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
Haoyuan Wu
Haisheng Zheng
Zhuolun He
Bei Yu
MoE, ALM
273
24
0
05 Jan 2024
Understanding LLMs: A Comprehensive Overview from Training to Inference
Yi-Hsueh Liu
Haoyang He
Tianle Han
Xu-Yao Zhang
Mengyuan Liu
...
Xiaoyan Cai
Tuo Zhang
Ning Qiang
Tianming Liu
Bao Ge
SyDa
464
124
0
04 Jan 2024
Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences
International Conference on AI in Finance (ICAF), 2023
Piotr Skalski
David Sutton
Stuart Burrell
Iker Perez
Jason Wong
AI4TS
223
8
0
03 Jan 2024
Evaluating Fairness in Self-supervised and Supervised Models for Sequential Data
Sofia Yfantidou
Dimitris Spathis
Marios Constantinides
Athena Vakali
Daniele Quercia
F. Kawsar
325
3
0
03 Jan 2024
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Anku Rani
Vipula Rawte
Vasu Sharma
Amitava Das
HILM
461
356
0
02 Jan 2024
Unifying Structured Data as Graph for Data-to-Text Pre-Training
Transactions of the Association for Computational Linguistics (TACL), 2024
Shujie Li
Liang Li
Ruiying Geng
Min Yang
Binhua Li
...
Wanwei He
Shao Yuan
Can Ma
Fei Huang
Yongbin Li
LMTD
293
25
0
02 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
301
28
0
31 Dec 2023
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
Computer Vision and Pattern Recognition (CVPR), 2023
Haiyang Liu
Zihao Zhu
Giorgio Becherini
Yichen Peng
Mingyang Su
You Zhou
Xuefei Zhe
Naoya Iwamoto
Bo Zheng
Michael J. Black
SLR
891
78
0
31 Dec 2023
Research on the Laws of Multimodal Perception and Cognition from a Cross-cultural Perspective -- Taking Overseas Chinese Gardens as an Example
Ran Chen
Xueqi Yao
Jing Zhao
Shuhan Xu
Sirui Zhang
Yijun Mao
110
0
0
29 Dec 2023
Multi-Task Multi-Agent Shared Layers are Universal Cognition of Multi-Agent Coordination
Jiawei Wang
Jian Zhao
Zhengtao Cao
Ruili Feng
Rongjun Qin
Yang Yu
207
1
0
25 Dec 2023
Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling
Fahime Shahrokh
Nasser Ghadiri
Rasoul Samani
M. Moradi
211
0
0
24 Dec 2023
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference
Hongzheng Chen
Jiahao Zhang
Yixiao Du
Shaojie Xiang
Zichao Yue
Niansong Zhang
Yaohui Cai
Zhiru Zhang
235
72
0
23 Dec 2023
Characterizing and Classifying Developer Forum Posts with their Intentions
Xingfang Wu
Eric Thibodeau-Laufer
Heng Li
Foutse Khomh
Santhosh Srinivasan
Jayden Luo
140
3
0
21 Dec 2023
DSFormer: Effective Compression of Text-Transformers by Dense-Sparse Weight Factorization
Rahul Chand
Yashoteja Prabhu
Pratyush Kumar
190
5
0
20 Dec 2023
Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment
Lingling Xu
Haoran Xie
S. J. Qin
Xiaohui Tao
F. Wang
300
272
0
19 Dec 2023
Assessing Logical Reasoning Capabilities of Encoder-Only Transformer Models
Paulo Pirozelli
M. M. José
Paulo de Tarso P. Filho
A. Brandão
Fabio Gagliardi Cozman
LRM, ELM
330
4
0
18 Dec 2023
A mathematical perspective on Transformers
Bulletin of the American Mathematical Society (BAMS), 2023
Borjan Geshkovski
Cyril Letrouit
Yury Polyanskiy
Philippe Rigollet
EDL, AI4CE
671
104
0
17 Dec 2023
RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding
Yuxin Zi
Hariram Veeramani
Kaushik Roy
Amit P. Sheth
AI4TS
243
2
0
15 Dec 2023
BinGo: Identifying Security Patches in Binary Code with Graph Representation Learning
ACM Asia Conference on Computer and Communications Security (AsiaCCS), 2023
Xu He
Shu Wang
Pengbin Feng
Xinda Wang
Shiyu Sun
Qi Li
Kun Sun
147
2
0
13 Dec 2023
One-Step Diffusion Distillation via Deep Equilibrium Models
Neural Information Processing Systems (NeurIPS), 2023
Zhengyang Geng
Ashwini Pokle
J. Zico Kolter
294
49
0
12 Dec 2023
Evaluating ChatGPT as a Question Answering System: A Comprehensive Analysis and Comparison with Existing Models
Hossein Bahak
Farzaneh Taheri
Zahra Zojaji
Arefeh Kazemi
ELM, AI4MH
196
27
0
11 Dec 2023
Why "classic" Transformers are shallow and how to make them go deep
Yueyao Yu
Yin Zhang
ViT
277
0
0
11 Dec 2023
Transformer as Linear Expansion of Learngene
AAAI Conference on Artificial Intelligence (AAAI), 2023
Shiyu Xia
Miaosen Zhang
Xu Yang
Ruiming Chen
Haokun Chen
Xin Geng
198
12
0
09 Dec 2023
Sim-GPT: Text Similarity via GPT Annotated Data
Shuhe Wang
Beiming Cao
Shengyu Zhang
Xiaoya Li
Jiwei Li
Leilei Gan
Guoyin Wang
Eduard Hovy
218
4
0
09 Dec 2023
Enhanced E-Commerce Attribute Extraction: Innovating with Decorative Relation Correction and LLAMA 2.0-Based Annotation
Jianghong Zhou
Weizhi Du
Md Omar Faruk Rokon
Zhaodong Wang
Jiaxuan Xu
Isha Shah
Kuang-chih Lee
Musen Wen
104
1
0
09 Dec 2023
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
411
12
0
07 Dec 2023
RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training
Jaehyung Kim
Yuning Mao
Rui Hou
Hanchao Yu
Davis Liang
Pascale Fung
Qifan Wang
Fuli Feng
Lifu Huang
Madian Khabsa
AAML
241
4
0
07 Dec 2023
Series2Vec: Similarity-based Self-supervised Representation Learning for Time Series Classification
Navid Mohammadi Foumani
Chang Wei Tan
Geoffrey I. Webb
Hamid Rezatofighi
Mahsa Salehi
SSL, AI4TS
297
14
0
07 Dec 2023
Detecting Rumor Veracity with Only Textual Information by Double-Channel Structure
International Workshop on Natural Language Processing for Social Media (SocialNLP), 2023
Alex G. Kim
Sangwon Yoon
173
4
0
06 Dec 2023
Large Language Models on Graphs: A Comprehensive Survey
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Bowen Jin
Gang Liu
Chi Han
Meng Jiang
Heng Ji
Jiawei Han
AI4CE
342
248
0
05 Dec 2023
Expand BERT Representation with Visual Information via Grounded Language Learning with Multimodal Partial Alignment
ACM Multimedia (ACM MM), 2023
Cong-Duy Nguyen
The-Anh Vu-Le
Thong Nguyen
Tho Quan
Anh Tuan Luu
338
7
0
04 Dec 2023
Unsupervised Approach to Evaluate Sentence-Level Fluency: Do We Really Need Reference?
Gopichand Kanumolu
Lokesh Madasu
Pavan Baswani
Ananya Mukherjee
Manish Shrivastava
171
2
0
03 Dec 2023
Page 13 of 61