ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
International Conference on Learning Representations (ICLR), 2020
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL, AIMat

Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 3,050 papers shown
A Framework for Evaluation of Machine Reading Comprehension Gold Standards
International Conference on Language Resources and Evaluation (LREC), 2020
Viktor Schlegel
Marco Valentino
André Freitas
Goran Nenadic
Riza Batista-Navarro
10 Mar 2020
What the [MASK]? Making Sense of Language-Specific BERT Models
Debora Nozza
Federico Bianchi
Dirk Hovy
05 Mar 2020
Talking-Heads Attention
Noam M. Shazeer
Zhenzhong Lan
Youlong Cheng
Nan Ding
L. Hou
05 Mar 2020
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Yada Pruksachatkun
Philip Yeres
Haokun Liu
Jason Phang
Phu Mon Htut
Alex Jinpeng Wang
Ian Tenney
Samuel R. Bowman
SSeg
04 Mar 2020
AraBERT: Transformer-based Model for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
28 Feb 2020
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
International Conference on Machine Learning (ICML), 2020
Hangbo Bao
Li Dong
Furu Wei
Wenhui Wang
Nan Yang
...
Yu Wang
Songhao Piao
Jianfeng Gao
Ming Zhou
H. Hon
AI4CE
28 Feb 2020
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Ziqing Yang
Yiming Cui
Zhipeng Chen
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
VLM
28 Feb 2020
On Biased Compression for Distributed Learning
Journal of Machine Learning Research (JMLR), 2020
Aleksandr Beznosikov
Samuel Horváth
Peter Richtárik
M. Safaryan
27 Feb 2020
A Primer in BERTology: What we know about how BERT works
Transactions of the Association for Computational Linguistics (TACL), 2020
Anna Rogers
Olga Kovaleva
Anna Rumshisky
OffRL
27 Feb 2020
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Transactions of the Association for Computational Linguistics (TACL), 2020
Prakhar Ganesh
Yao Chen
Xin Lou
Mohammad Ali Khan
Yifan Yang
Hassan Sajjad
Preslav Nakov
Deming Chen
Marianne Winslett
AI4CE
27 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
26 Feb 2020
Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension
H. Wan
26 Feb 2020
KEML: A Knowledge-Enriched Meta-Learning Framework for Lexical Relation Classification
AAAI Conference on Artificial Intelligence (AAAI), 2020
Chengyu Wang
Minghui Qiu
Yanjie Liang
Xiaofeng He
VLM, KELM
25 Feb 2020
Exploring BERT Parameter Efficiency on the Stanford Question Answering Dataset v2.0
Eric Hulburd
25 Feb 2020
Do Multi-Hop Question Answering Systems Know How to Answer the Single-Hop Sub-Questions?
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2020
Yixuan Tang
Hwee Tou Ng
A. Tung
23 Feb 2020
Investigating Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network
Xuefeng Bai
Pengbo Liu
Yue Zhang
GNN
22 Feb 2020
Training Question Answering Models From Synthetic Data
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Raul Puri
Ryan Spring
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
ELM
22 Feb 2020
CoLES: Contrastive Learning for Event Sequences with Self-Supervision
Dmitrii Babaev
Ivan Kireev
Nikita Ovsov
Maria Ivanova
Gleb Gusev
Ivan Nazarov
Alexander Tuzhilin
SSL, AI4TS
19 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning
Zixin Wen
SSL
17 Feb 2020
SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2020
Sijin Yu
C.-C. Jay Kuo
16 Feb 2020
Towards Detection of Subjective Bias using Contextualized Word Embeddings
The Web Conference (WWW), 2020
Tanvi Dadu
Kartikey Pant
R. Mamidi
16 Feb 2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge
Gabriel Ilharco
Roy Schwartz
Ali Farhadi
Hannaneh Hajishirzi
Noah A. Smith
15 Feb 2020
TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval
Wenhao Lu
Jian Jiao
Ruofei Zhang
14 Feb 2020
Transformer on a Diet
Chenguang Wang
Zihao Ye
Aston Zhang
Zheng Zhang
Alex Smola
14 Feb 2020
HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2020
Xiyou Zhou
Zhiyu Zoey Chen
Xiaoyong Jin
Wenjie Wang
14 Feb 2020
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Adam Roberts
Colin Raffel
Noam M. Shazeer
KELM
10 Feb 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
07 Feb 2020
Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang Pierse
Jing Lu
AI4CE
05 Feb 2020
Pseudo-Bidirectional Decoding for Local Sequence Transduction
Findings, 2020
Wangchunshu Zhou
Tao Ge
Ke Xu
31 Jan 2020
Bringing Stories Alive: Generating Interactive Fiction Worlds
Artificial Intelligence and Interactive Digital Entertainment Conference (AIIDE), 2020
Prithviraj Ammanabrolu
W. Cheung
Dan Tu
William Broniec
Mark O. Riedl
28 Jan 2020
Retrospective Reader for Machine Reading Comprehension
AAAI Conference on Artificial Intelligence (AAAI), 2020
Zhuosheng Zhang
Junjie Yang
Hai Zhao
RALM
27 Jan 2020
DUMA: Reading Comprehension with Transposition Thinking
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2020
Q. Hu
Hai Zhao
Xiaoguang Li
AI4CE
26 Jan 2020
ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Dongling Xiao
Han Zhang
Yukun Li
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
26 Jan 2020
BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT
Wei-Tsung Kao
Tsung-Han Wu
Po-Han Chi
Chun-Cheng Hsieh
Hung-yi Lee
SSL
25 Jan 2020
Multi-task self-supervised learning for Robust Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
25 Jan 2020
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
Saurabh Goyal
Anamitra R. Choudhury
Saurabh ManishRaje
Venkatesan T. Chakaravarthy
Yogish Sabharwal
Ashish Verma
24 Jan 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
23 Jan 2020
Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu
Yujia Zhai
Zizhong Chen
22 Jan 2020
A multimodal deep learning approach for named entity recognition from social media
M. Asgari-Chenaghlu
M. Feizi-Derakhshi
Leili Farzinvash
M. Balafar
C. Motamed
19 Jan 2020
RobBERT: a Dutch RoBERTa-based Language Model
Findings, 2020
Pieter Delobelle
Thomas Winters
Bettina Berendt
17 Jan 2020
Graph-Bert: Only Attention is Needed for Learning Graph Representations
Jiawei Zhang
Haopeng Zhang
Congying Xia
Li Sun
15 Jan 2020
A BERT based Sentiment Analysis and Key Entity Detection Approach for Online Financial Texts
International Conference on Computer Supported Cooperative Work in Design (CSCWD), 2020
Lin Zhao
Lin Li
Xinhao Zheng
14 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
13 Jan 2020
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Daoyuan Chen
Yaliang Li
Minghui Qiu
Zhen Wang
Bofang Li
Bolin Ding
Hongbo Deng
Yanjie Liang
Jialin Li
Jingren Zhou
MQ
13 Jan 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Findings, 2020
Weizhen Qi
Yu Yan
Yeyun Gong
Dayiheng Liu
Nan Duan
Jiusheng Chen
Ruofei Zhang
Ming Zhou
AI4TS
13 Jan 2020
Assessment Modeling: Fundamental Pre-training Tasks for Interactive Educational Systems
Youngduck Choi
Youngnam Lee
Junghyun Cho
Jineon Baek
Dongmin Shin
...
Seewoo Lee
Youngmin Cha
Chan Bae
Byungsoo Kim
Jaewe Heo
AI4Ed
01 Jan 2020
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
Clinical Natural Language Processing Workshop (ClinicalNLP), 2019
Kexin Huang
Abhishek Singh
Sitong Chen
E. Moseley
Chih-ying Deng
Naomi George
C. Lindvall
27 Dec 2019
Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Thomas D. Dowdell
Hongyu Zhang
27 Dec 2019
BERTje: A Dutch BERT Model
Wietse de Vries
Andreas van Cranenburgh
Arianna Bisazza
Tommaso Caselli
Gertjan van Noord
Malvina Nissim
VLM, SSeg
19 Dec 2019
WaLDORf: Wasteless Language-model Distillation On Reading-comprehension
J. Tian
A. Kreuzer
Pai-Hung Chen
Hans-Martin Will
VLM
13 Dec 2019
Page 60 of 61