
Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling

4 November 2016

Papers citing "Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling"

50 / 237 papers shown

Title
Modeling Local Dependence in Natural Language with Multi-channel Recurrent Neural Networks Chang Xu Weiran Huang Hongwei Wang G. Wang Tie-Yan Liu 122 13 0 13 Nov 2018
Federated Learning for Mobile Keyboard Prediction Andrew Straiton Hard Kanishka Rao Zhifeng Lin Swaroop Indra Ramaswamy Youjie Li S. Augenstein Alex Schwing M. Annavaram A. Avestimehr FedML 532 1,691 0 08 Nov 2018
Analysing Dropout and Compounding Errors in Neural Language Models James O'Neill Danushka Bollegala 76 1 0 02 Nov 2018
Progress and Tradeoffs in Neural Language Models Raphael Tang Jimmy J. Lin 84 5 0 02 Nov 2018
You May Not Need Attention Ofir Press Noah A. Smith 120 28 0 31 Oct 2018
A Simple Recurrent Unit with Reduced Tensor Product Representations Shuai Tang P. Smolensky V. D. Sa 212 2 0 29 Oct 2018
Language Modeling with Sparse Product of Sememe Experts Yihong Gu Jun Yan Hao Zhu Zhiyuan Liu Ruobing Xie Maosong Sun Fen Lin Leyu Lin MoE 123 31 0 29 Oct 2018
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks Songlin Yang Shawn Tan Alessandro Sordoni Aaron Courville 351 341 0 22 Oct 2018
Adaptive Input Representations for Neural Language Modeling International Conference on Learning Representations (ICLR), 2018 Alexei Baevski Michael Auli 544 421 0 28 Sep 2018
Adaptive Pruning of Neural Language Models for Mobile Devices Raphael Tang Jimmy J. Lin 101 7 0 27 Sep 2018
Distilled Wasserstein Learning for Word Embedding and Topic Modeling Hongteng Xu Wenlin Wang Wen Liu Lawrence Carin MedIm FedML 188 86 0 12 Sep 2018
Towards one-shot learning for rare-word translation with external experts Ngoc-Quan Pham Jan Niehues A. Waibel AAML 117 25 0 10 Sep 2018
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction Kazuma Hashimoto Yoshimasa Tsuruoka 160 7 0 05 Sep 2018
Parameter Sharing Methods for Multilingual Self-Attentional Translation Models Devendra Singh Sachan Graham Neubig MoE 175 115 0 01 Sep 2018
Beyond Weight Tying: Learning Joint Input-Output Embeddings for Neural Machine Translation Nikolaos Pappas Lesly Miculicich James Henderson 135 17 0 31 Aug 2018
Direct Output Connection for a High-Rank Language Model Sho Takase Jun Suzuki Masaaki Nagata 212 37 0 30 Aug 2018
Pyramidal Recurrent Unit for Language Modeling Sachin Mehta Rik Koncel-Kedziorski Mohammad Rastegari Hannaneh Hajishirzi 129 11 0 27 Aug 2018
Improving Abstraction in Text Summarization Wojciech Kryściński Romain Paulus Caiming Xiong R. Socher 164 150 0 23 Aug 2018
Neural Architecture Optimization Renqian Luo Fei Tian Tao Qin Enhong Chen Tie-Yan Liu 3DV 402 690 0 22 Aug 2018
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference Rowan Zellers Yonatan Bisk Roy Schwartz Yejin Choi 386 757 0 16 Aug 2018
Improved Language Modeling by Decoding the Past Siddhartha Brahma BDL AI4TS 267 6 0 14 Aug 2018
Recurrent Neural Networks for Long and Short-Term Sequential Recommendation Kiewan Villatel E. Smirnova Jérémie Mary Philippe Preux HAI 84 27 0 23 Jul 2018
Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme Jey Han Lau Trevor Cohn Timothy Baldwin Julian Brooke Adam Hammond 158 78 0 10 Jul 2018
Neural Document Summarization by Jointly Learning to Score and Select Sentences Annual Meeting of the Association for Computational Linguistics (ACL), 2018 Qingyu Zhou Nan Yang Furu Wei Shaohan Huang M. Zhou Tiejun Zhao 197 337 0 06 Jul 2018
How To Backdoor Federated Learning International Conference on Artificial Intelligence and Statistics (AISTATS), 2018 Eugene Bagdasaryan Andreas Veit Yiqing Hua D. Estrin Vitaly Shmatikov SILM FedML 489 2,231 0 02 Jul 2018
Learning Visually-Grounded Semantics from Contrastive Adversarial Samples International Conference on Computational Linguistics (COLING), 2018 Freda Shi Jiayuan Mao Tete Xiao Yuning Jiang Jian Sun ObjD 168 52 0 27 Jun 2018
DARTS: Differentiable Architecture Search Hanxiao Liu Karen Simonyan Yiming Yang 648 4,707 0 24 Jun 2018
GILE: A Generalized Input-Label Embedding for Text Classification Nikolaos Pappas James Henderson AI4TS AILaw VLM 343 80 0 16 Jun 2018
Towards Binary-Valued Gates for Robust LSTM Training Zhuohan Li Di He Fei Tian Wei-neng Chen Tao Qin Liwei Wang Tie-Yan Liu MQ 135 49 0 08 Jun 2018
Learn from Your Neighbor: Learning Multi-modal Mappings from Sparse Annotations Ashwin Kalyan Stefan Lee A. Kannan Dhruv Batra 122 6 0 08 Jun 2018
Like a Baby: Visually Situated Neural Language Acquisition Alexander Ororbia A. Mali Mary Alexandria Kelly David Reitter 125 4 0 29 May 2018
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting Yen-Chun Chen Joey Tianyi Zhou BDL 295 604 0 28 May 2018
Inducing Grammars with and for Neural Machine Translation Ke M. Tran Yonatan Bisk 120 22 0 28 May 2018
Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication Felix Sattler Simon Wiedemann K. Müller Wojciech Samek MQ 147 230 0 22 May 2018
Breaking the Activation Function Bottleneck through Adaptive Parameterization Sebastian Flennerhag Hujun Yin J. Keane Mark Elliot 165 12 0 22 May 2018
Learning to Write with Cooperative Discriminators Ari Holtzman Jan Buys Maxwell Forbes Antoine Bosselut David Golub Yejin Choi 184 245 0 16 May 2018
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context Urvashi Khandelwal He He Peng Qi Dan Jurafsky RALM 176 309 0 12 May 2018
Noisin: Unbiased Regularization for Recurrent Neural Networks Adji Bousso Dieng Rajesh Ranganath Jaan Altosaar David M. Blei 131 24 0 03 May 2018
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018 Liyuan Liu Xiang Ren Jingbo Shang Jian-wei Peng Jiawei Han 221 46 0 20 Apr 2018
Value-aware Quantization for Training and Inference of Neural Networks European Conference on Computer Vision (ECCV), 2018 Eunhyeok Park S. Yoo Peter Vajda MQ 134 169 0 20 Apr 2018
An Analysis of Neural Language Modeling at Multiple Scales Stephen Merity N. Keskar R. Socher 169 172 0 22 Mar 2018
Neural Lattice Language Models Transactions of the Association for Computational Linguistics (TACL), 2018 Jacob Buckman Graham Neubig 140 30 0 13 Mar 2018
Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN Shuai Li W. Li Chris Cook Ce Zhu Yanbo Gao 278 795 0 13 Mar 2018
The Importance of Being Recurrent for Modeling Hierarchical Structure Ke M. Tran Arianna Bisazza Christof Monz 214 153 0 09 Mar 2018
Learning Sparse Structured Ensembles with SG-MCMC and Network Pruning Yichi Zhang Zhijian Ou 180 0 0 01 Mar 2018
Reusing Weights in Subword-aware Neural Language Models Z. Assylbekov Rustem Takhanov 115 4 0 23 Feb 2018
Efficient Neural Architecture Search via Parameter Sharing Hieu H. Pham M. Guan Barret Zoph Quoc V. Le J. Dean 384 2,909 0 09 Feb 2018
MaskGAN: Better Text Generation via Filling in the ______ W. Fedus Ian Goodfellow Andrew M. Dai 339 485 0 23 Jan 2018
Fix your classifier: the marginal value of training the last weight layer Elad Hoffer Itay Hubara Daniel Soudry 244 104 0 14 Jan 2018
Character-level Recurrent Neural Networks in Practice: Comparing Training and Sampling Schemes Cedric De Boom Thomas Demeester Bart Dhoedt 149 8 0 02 Jan 2018