
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

International Conference on Learning Representations (ICLR), 2019

26 September 2019


Papers citing "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"

50 / 3,047 papers shown

Title
Attention is Not Only a Weight: Analyzing Transformers with Vector Norms Goro Kobayashi Tatsuki Kuribayashi Sho Yokoi Kentaro Inui 183 15 0 21 Apr 2020
A Generic Network Compression Framework for Sequential Recommender Systems Yang Sun Fajie Yuan Ming Yang Guoao Wei Zhou Zhao Duo Liu 203 58 0 21 Apr 2020
Investigating the Effectiveness of Representations Based on Pretrained Transformer-based Language Models in Active Learning for Labelling Text Datasets Jinghui Lu B. MacNamee 108 19 0 21 Apr 2020
Fine-tuning Multi-hop Question Answering with Hierarchical Graph Network Guanming Xiong 364 0 0 20 Apr 2020
The Cost of Training NLP Models: A Concise Overview Or Sharir Barak Peleg Y. Shoham 219 230 0 19 Apr 2020
ETC: Encoding Long and Structured Inputs in Transformers Joshua Ainslie Santiago Ontanon Chris Alberti Vaclav Cvicek Zachary Kenneth Fisher Philip Pham Anirudh Ravula Sumit Sanghai Qifan Wang Li Yang 291 56 0 17 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Yekun Chai Jin Shuo Xinwen Hou 252 22 0 17 Apr 2020
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Joongbo Shin Yoonhyung Lee Seunghyun Yoon Kyomin Jung OOD 141 12 0 17 Apr 2020
Transform and Tell: Entity-Aware News Image Captioning. Computer Vision and Pattern Recognition (CVPR), 2020 Alasdair Tran A. Mathews Lexing Xie VLM 173 108 0 17 Apr 2020
Training with Quantization Noise for Extreme Model Compression. International Conference on Learning Representations (ICLR), 2020 Angela Fan Pierre Stock Benjamin Graham Edouard Grave Remi Gribonval Hervé Jégou Armand Joulin MQ 255 256 0 15 Apr 2020
lamBERT: Language and Action Learning Using Multimodal BERT Kazuki Miyazawa Tatsuya Aoki Takato Horii Takayuki Nagai SSL LM&Ro 164 12 0 15 Apr 2020
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020 Chien-Sheng Wu Guosheng Lin R. Socher Caiming Xiong 327 337 0 15 Apr 2020
Cascade Neural Ensemble for Identifying Scientifically Sound Articles Ashwin Karthik Ambalavanan M. Devarakonda 91 1 0 13 Apr 2020
Robustly Pre-trained Neural Model for Direct Temporal Relation Extraction. IEEE International Conference on Healthcare Informatics (ICHI), 2020 Hong Guan Jianfu Li Hua Xu M. Devarakonda 101 13 0 13 Apr 2020
Pretrained Transformers Improve Out-of-Distribution Robustness. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Dan Hendrycks Xiaoyuan Liu Eric Wallace Adam Dziedzic R. Krishnan Basel Alomair OOD 453 459 0 13 Apr 2020
CLUE: A Chinese Language Understanding Evaluation Benchmark. International Conference on Computational Linguistics (COLING), 2020 Liang Xu Hai Hu Xuanwei Zhang Lu Li Chenjie Cao ... Cong Yue Xinrui Zhang Zhen-Yi Yang Kyle Richardson Zhenzhong Lan ELM 308 429 0 13 Apr 2020
Explaining Question Answering Models through Text Generation Veronica Latcinnik Jonathan Berant LRM 220 53 0 12 Apr 2020
Multimodal Categorization of Crisis Events in Social Media. Computer Vision and Pattern Recognition (CVPR), 2020 Mahdi Abavisani Liwei Wu Shengli Hu Joel R. Tetreault A. Jaimes 252 110 0 10 Apr 2020
Designing Precise and Robust Dialogue Response Evaluators. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Tianyu Zhao Divesh Lala Tatsuya Kawahara 145 55 0 10 Apr 2020
Telling BERT's full story: from Local Attention to Global Aggregation. Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2020 Damian Pascual Gino Brunner Roger Wattenhofer 183 20 0 10 Apr 2020
Injecting Numerical Reasoning Skills into Language Models. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Mor Geva Ankit Gupta Jonathan Berant AIMat LRM 238 237 0 09 Apr 2020
Generating Counter Narratives against Online Hate Speech: Data and Strategies. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Serra Sinem Tekiroğlu Yi-Ling Chung Marco Guerini 117 123 0 08 Apr 2020
DynaBERT: Dynamic BERT with Adaptive Width and Depth. Neural Information Processing Systems (NeurIPS), 2020 Lu Hou Zhiqi Huang Lifeng Shang Xin Jiang Xiao Chen Qun Liu MQ 242 352 0 08 Apr 2020
Analyzing Redundancy in Pretrained Transformer Models Fahim Dalvi Hassan Sajjad Nadir Durrani Yonatan Belinkov 156 3 0 08 Apr 2020
On the Effect of Dropping Layers of Pre-trained Transformer Models. Computer Speech and Language (CSL), 2020 Hassan Sajjad Fahim Dalvi Nadir Durrani Preslav Nakov 263 172 0 08 Apr 2020
DialBERT: A Hierarchical Pre-Trained Model for Conversation Disentanglement Tianda Li Jia-Chen Gu Xiao-Dan Zhu Quan Liu Zhenhua Ling Zhiming Su Si Wei 164 30 0 08 Apr 2020
Towards Evaluating the Robustness of Chinese BERT Classifiers Wei Ping Boyuan Pan Xin Li Yue Liu AAML 142 9 0 07 Apr 2020
Byte Pair Encoding is Suboptimal for Language Model Pretraining. Findings (Findings), 2020 Kaj Bostrom Greg Durrett 215 255 0 07 Apr 2020
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Changmao Li Jinho Choi 164 26 0 07 Apr 2020
A Few Topical Tweets are Enough for Effective User-Level Stance Detection Younes Samih Kareem Darwish 123 7 0 07 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review. ACM Computing Surveys (ACM CSUR), 2020 Shervin Minaee Nal Kalchbrenner Xiaoshi Zhong Narjes Nikzad M. Asgari-Chenaghlu Jianfeng Gao AILaw VLM AI4TS 265 1,214 0 06 Apr 2020
Continual Domain-Tuning for Pretrained Language Models Subendhu Rongali Abhyuday N. Jagannatha Bhanu Pratap Singh Rawat Hong-ye Yu CLL KELM 143 7 0 05 Apr 2020
FastBERT: a Self-distilling BERT with Adaptive Inference Time. Annual Meeting of the Association for Computational Linguistics (ACL), 2020 Weijie Liu Peng Zhou Zhe Zhao Zhiruo Wang Haotang Deng Qi Ju 228 392 0 05 Apr 2020
Finding Black Cat in a Coal Cellar -- Keyphrase Extraction & Keyphrase-Rubric Relationship Classification from Complex Assignments Manikandan Ravikiran 183 0 0 03 Apr 2020
Gestalt: a Stacking Ensemble for SQuAD2.0 Mohamed El-Geish 93 5 0 02 Apr 2020
Deep Entity Matching with Pre-Trained Language Models. Proceedings of the VLDB Endowment (PVLDB), 2020 Yuliang Li Jinfeng Li Yoshihiko Suhara A. Doan W. Tan VLM 286 441 0 01 Apr 2020
Information Leakage in Embedding Models. Conference on Computer and Communications Security (CCS), 2020 Congzheng Song A. Raghunathan MIACV 385 320 0 31 Mar 2020
Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020 Chengyu Wang Minghui Qiu Yanjie Liang Xiaofeng He AI4CE 214 24 0 29 Mar 2020
Felix: Flexible Text Editing Through Tagging and Insertion. Findings (Findings), 2020 Jonathan Mallinson Aliaksei Severyn Eric Malmi Guillermo Garrido 165 81 0 24 Mar 2020
Data-driven models and computational tools for neurolinguistics: a language technology perspective Ekaterina Artemova Amir Bakarov A. Artemov Evgeny Burnaev M. Sharaev 116 4 0 23 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey. Science China Technological Sciences (Sci China Technol Sci), 2020 Xipeng Qiu Tianxiang Sun Yige Xu Yunfan Shao Ning Dai Xuanjing Huang LM&MA VLM 965 1,609 0 18 Mar 2020
Calibration of Pre-trained Transformers. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020 Shrey Desai Greg Durrett UQLM 577 353 0 17 Mar 2020
A Survey on Contextual Embeddings Qi Liu Matt J. Kusner Phil Blunsom 433 169 0 16 Mar 2020
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding Zhiheng Huang Peng Xu Davis Liang Ajay K. Mishra Bing Xiang 149 33 0 16 Mar 2020
A Survey of End-to-End Driving: Architectures and Training Methods. IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), 2020 Ardi Tampuu Maksym Semikin Naveed Muhammad D. Fishman Tambet Matiisen 3DV 324 276 0 13 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical Model. International Conference on Machine Learning (ICML), 2020 Xuanqing Liu Hsiang-Fu Yu Inderjit Dhillon Cho-Jui Hsieh 169 131 0 13 Mar 2020
Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity. Computational Linguistics (CL), 2020 Ivan Vulić Simon Baker Edoardo Ponti Ulla Petti Ira Leviant ... Eden Bar Matt Malone Thierry Poibeau Roi Reichart Anna Korhonen 196 87 0 10 Mar 2020
A Framework for Evaluation of Machine Reading Comprehension Gold Standards. International Conference on Language Resources and Evaluation (LREC), 2020 Viktor Schlegel Marco Valentino André Freitas Goran Nenadic Riza Batista-Navarro 134 34 0 10 Mar 2020
What the [MASK]? Making Sense of Language-Specific BERT Models Debora Nozza Federico Bianchi Dirk Hovy 282 119 0 05 Mar 2020
Talking-Heads Attention Noam M. Shazeer Zhenzhong Lan Youlong Cheng Nan Ding L. Hou 247 91 0 05 Mar 2020