ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

International Conference on Learning Representations (ICLR), 2019

26 September 2019

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques

Applied Soft Computing (Appl. Soft Comput.), 2024

David Ortiz-Perez

Manuel Benavent-Lledo

José García Rodríguez

David Tomás

M. Flores Vizcaya-Moreno

242

24 Oct 2024

MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers

International Conference on Computer Aided Design (ICCAD), 2024

Meng Li

258

23 Oct 2024

Quantifying the Risks of Tool-assisted Rephrasing to Linguistic Diversity

Mengying Wang

Andreas Spitz

113

23 Oct 2024

Acoustic Model Optimization over Multiple Data Sources: Merging and Valuation

297

21 Oct 2024

Causality for Large Language Models

Yingrong Wang

325

20 Oct 2024

Pseudo-label Refinement for Improving Self-Supervised Learning Systems

Zia-ur-Rehman

Arif Mahmood

Wenxiong Kang

239

18 Oct 2024

Attuned to Change: Causal Fine-Tuning under Latent-Confounded Shifts

386

18 Oct 2024

From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition

233

17 Oct 2024

Unitary Multi-Margin BERT for Robust Natural Language Processing

Hao-Yuan Chang

Kang L. Wang

AAML

173

16 Oct 2024

FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression

...

Bo Li

231

16 Oct 2024

Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Kai Yao

152

15 Oct 2024

TSDS: Data Selection for Task-Specific Model Finetuning

Neural Information Processing Systems (NeurIPS), 2024

Zifan Liu

Amin Karbasi

Theodoros Rekatsinas

309

15 Oct 2024

Arrhythmia Classification Using Graph Neural Networks Based on Correlation Matrix

IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2024

Seungwoo Han

376

14 Oct 2024

Lambda-Skip Connections: the architectural component that prevents Rank Collapse

International Conference on Learning Representations (ICLR), 2024

Federico Arangath Joseph

Jerome Sieber

Melanie Zeilinger

Carmen Amo Alonso

465

14 Oct 2024

Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

International Conference on Learning Representations (ICLR), 2024

406

09 Oct 2024

Exploring Large Language Models for Detecting Mental Disorders

230

09 Oct 2024

Towards the generation of hierarchical attack models from cybersecurity vulnerabilities using language models

Applied Soft Computing (Appl. Soft Comput.), 2024

212

07 Oct 2024

Computational design of target-specific linear peptide binders with TransformerBeta

Haowen Zhao

Francesco A. Aprile

Barbara Bravi

262

07 Oct 2024

Regularized Neural Ensemblers

Sebastian Pineda Arango

301

06 Oct 2024

Variational Language Concepts for Interpreting Foundation Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Hao Wang

397

04 Oct 2024

Demystifying the Token Dynamics of Deep Selective State Space Models

International Conference on Learning Representations (ICLR), 2024

Tan Minh Nguyen

318

04 Oct 2024

Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding with LLMs

389

04 Oct 2024

Geometry is All You Need: A Unified Taxonomy of Matrix and Tensor Factorization for Compression of Generative Language Models

Mingxue Xu

Sadia Sharmin

Danilo Mandic

262

03 Oct 2024

Morphological evaluation of subwords vocabulary used by BETO language model

Óscar García-Sierra

Ana Fernández-Pampillón Cesteros

Miguel Ortega-Martín

216

03 Oct 2024

DeIDClinic: A Multi-Layered Framework for De-identification of Clinical Free-text Data

225

02 Oct 2024

DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models

Yuxuan Zhang

Ruizhe Li

MoMe

487

02 Oct 2024

On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding

Kevin Xu

Issei Sato

761

02 Oct 2024

Depression detection in social media posts using transformer-based models and auxiliary features

Social Network Analysis and Mining (SNAM), 2024

Marios Kerasiotis

Loukas Ilias

D. Askounis

184

30 Sep 2024

FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models

Xin Geng

200

28 Sep 2024

On the Inductive Bias of Stacking Towards Improving Reasoning

Neural Information Processing Systems (NeurIPS), 2024

280

27 Sep 2024

Meta-RTL: Reinforcement-Based Meta-Transfer Learning for Low-Resource Commonsense Reasoning

418

27 Sep 2024

DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Devrim Cavusoglu

Secil Sen

Ulas Sert

153

26 Sep 2024

Integrating Hierarchical Semantic into Iterative Generation Model for Entailment Tree Explanation

Qin Wang

Jianzhou Feng

Yiming Xu

192

26 Sep 2024

SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion

Neural Information Processing Systems (NeurIPS), 2024

Wankou Yang

452

26 Sep 2024

Pre-trained Language Models Return Distinguishable Probability Distributions to Unfaithfully Hallucinated Texts

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Taehun Cha

Donghun Lee

HILM

219

25 Sep 2024

dnaGrinder: a lightweight and high-capacity genomic foundation model

Qihang Zhao

Chi Zhang

Weixiong Zhang

183

24 Sep 2024

ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

383

23 Sep 2024

Data-centric NLP Backdoor Defense from the Lens of Memorization

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Zhenting Wang

Zhizhi Wang

Haoyang Ling

Mengnan Du

Juan Zhai

Shiqing Ma

266

21 Sep 2024

Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Amrit Diggavi Seshadri

160

21 Sep 2024

FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs

International Conference on Field-Programmable Technology (ICFPT), 2024

276

21 Sep 2024

Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection

Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2024

Chin-Po Chen

Jeng-Lin Li

LM&MA

19 Sep 2024

Evaluation of pretrained language models on music understanding

Yannis Vasilakis

Rachel M. Bittner

Johan Pauwels

261

17 Sep 2024

OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities

Hanane Azzag

M. Lebbah

ObjD

350

17 Sep 2024

Towards Data-Centric RLHF: Simple Metrics for Preference Dataset Comparison

Judy Hanwen Shen

Archit Sharma

Jun Qin

182

15 Sep 2024

Deep Fast Machine Learning Utils: A Python Library for Streamlined Machine Learning Prototyping

Fabi Prezja

AI4CE

132

14 Sep 2024

Multi-intent Aware Contrastive Learning for Sequential Recommendation

International Conference on Artificial Neural Networks (ICANN), 2024

Xianghua Fu

147

13 Sep 2024

A BERT-Based Summarization approach for depression detection

Hossein Salahshoor Gavalan

Mohmmad Naim Rastgoo

Bahareh Nakisa

136

13 Sep 2024

TheraGen: Therapy for Every Generation

175

12 Sep 2024

Enhancing adversarial robustness in Natural Language Inference using explanations

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024

Alexandros Koulakos

Maria Lymperaiou

Giorgos Filandrianos

Giorgos Stamou

SILM AAML

393

11 Sep 2024

DA-MoE: Towards Dynamic Expert Allocation for Mixture-of-Experts Models

Maryam Akhavan Aghdam

Hongpeng Jin

Yanzhao Wu

MoE

225

10 Sep 2024