ResearchTrend.AI
arXiv: 1611.01462
Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling
4 November 2016
Hakan Inan, Khashayar Khosravi, R. Socher

Papers citing "Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling"

Showing 50 of 237 citing papers.
Data Augmentation Using Many-To-Many RNNs for Session-Aware Recommender Systems
Martín Baigorria Alonso
22 Aug 2021

Transformers with multi-modal features and post-fusion context for e-commerce session-based recommendation
Gabriel de Souza P. Moreira, Sara Rabhi, Ronay Ak, Md Yasin Kabir, Even Oldridge
11 Jul 2021

Training Graph Neural Networks with 1000 Layers
International Conference on Machine Learning (ICML), 2021
Guohao Li, Matthias Muller, V. Koltun
Topics: GNN, AI4CE
14 Jun 2021

Exploring Unsupervised Pretraining Objectives for Machine Translation
Findings, 2021
Christos Baziotis, Ivan Titov, Alexandra Birch, Barry Haddow
Topics: AAML, AI4CE
10 Jun 2021

Spectral Pruning for Recurrent Neural Networks
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Takashi Furuya, Kazuma Suetake, K. Taniguchi, Hiroyuki Kusumoto, Ryuji Saiin, Tomohiro Daimon
23 May 2021

Attention vs non-attention for a Shapley-based explanation method
Workshop on Knowledge Extraction and Integration for Deep Learning Architectures; Deep Learning Inside Out (DEELIO), 2021
T. Kersten, Hugh Mee Wong, Jaap Jumelet, Dieuwke Hupkes
26 Apr 2021
When FastText Pays Attention: Efficient Estimation of Word Representations using Constrained Positional Weighting
Vít Novotný, Michal Štefánik, E. F. Ayetiran, Petr Sojka, Radim Řehůřek
19 Apr 2021

Neural Architecture Search for Image Super-Resolution Using Densely Constructed Search Space: DeCoNAS
International Conference on Pattern Recognition (ICPR), 2021
Joonyoung Ahn, N. Cho
19 Apr 2021

Convex Aggregation for Opinion Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Hayate Iso, Xiaolan Wang, Yoshihiko Suhara, Stefanos Angelidis, W. Tan
03 Apr 2021

Finetuning Pretrained Transformers into RNNs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith
24 Mar 2021

Alleviate Exposure Bias in Sequence Prediction with Recurrent Neural Networks
Liping Yuan, Jiangtao Feng, Xiaoqing Zheng, Xuanjing Huang
22 Mar 2021
Learning a Word-Level Language Model with Sentence-Level Noise Contrastive Estimation for Contextual Sentence Probability Estimation
Heewoong Park, Sukhyun Cho, Jonghun Park
14 Mar 2021

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Journal of Artificial Intelligence Research (JAIR), 2021
Vassilina Nikoulina, Maxat Tezekbayev, Nuradil Kozhakhmet, Madina Babazhanova, Matthias Gallé, Z. Assylbekov
02 Mar 2021

Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices
Conference on Machine Learning and Systems (MLSys), 2021
Urmish Thakker, P. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse G. Beu
14 Feb 2021

Adaptive Semiparametric Language Models
Transactions of the Association for Computational Linguistics (TACL), 2021
Dani Yogatama, Cyprien de Masson d'Autume, Lingpeng Kong
Topics: KELM, RALM
04 Feb 2021

Shortformer: Better Language Modeling using Shorter Inputs
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Ofir Press, Noah A. Smith, M. Lewis
31 Dec 2020
MSVD-Turkish: A Comprehensive Multimodal Dataset for Integrated Vision and Language Research in Turkish
Machine Translation (MT), 2020
Begum Citamak, Ozan Caglayan, Menekse Kuyu, Erkut Erdem, Aykut Erdem, Pranava Madhyastha, Lucia Specia
13 Dec 2020

Learning Contextualised Cross-lingual Word Embeddings and Alignments for Extremely Low-Resource Languages Using Parallel Corpora
Takashi Wada, Tomoharu Iwata, Yuji Matsumoto, Timothy Baldwin, Jey Han Lau
27 Oct 2020

Rethinking embedding coupling in pre-trained language models
International Conference on Learning Representations (ICLR), 2020
Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
24 Oct 2020

Improving Low Compute Language Modeling with In-Domain Embedding Initialisation
Charles F Welch, Amélie Reymond, Jonathan K. Kummerfeld
Topics: AI4CE
29 Sep 2020

Grounded Compositional Outputs for Adaptive Language Modeling
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Nikolaos Pappas, Phoebe Mulcaire, Noah A. Smith
Topics: KELM
24 Sep 2020
Pruning Convolutional Filters using Batch Bridgeout
IEEE Access, 2020
Najeeb Khan, Ian Stavness
23 Sep 2020

Local and Central Differential Privacy for Robustness and Privacy in Federated Learning
Network and Distributed System Security Symposium (NDSS), 2020
Mohammad Naseri, Jamie Hayes, Emiliano De Cristofaro
Topics: FedML
08 Sep 2020

Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
IEEE Symposium on Security and Privacy (IEEE S&P), 2020
Sahar Abdelnabi, Mario Fritz
Topics: WaLM
07 Sep 2020

DeLighT: Deep and Light-weight Transformer
Sachin Mehta, Marjan Ghazvininejad, Srini Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
Topics: VLM
03 Aug 2020

A High-Quality Multilingual Dataset for Structured Documentation Translation
Conference on Machine Translation (WMT), 2020
Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, R. Socher, Caiming Xiong
24 Jun 2020

Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah A. Smith
18 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Computer Vision and Pattern Recognition (CVPR), 2020
Karan Desai, Justin Johnson
Topics: SSL, VLM
11 Jun 2020

An Overview of Neural Network Compression
James O'Neill
Topics: AI4CE
05 Jun 2020

rTop-k: A Statistical Estimation Approach to Distributed SGD
L. P. Barnes, Huseyin A. Inan, Berivan Isik, Ayfer Özgür
21 May 2020

Staying True to Your Word: (How) Can Attention Become Explanation?
Martin Tutek, Jan Snajder
19 May 2020

Language Model Prior for Low-Resource Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Christos Baziotis, Barry Haddow, Alexandra Birch
30 Apr 2020

Preventing Posterior Collapse with Levenshtein Variational Autoencoder
Serhii Havrylov, Ivan Titov
Topics: DRL
30 Apr 2020

Fast and Memory-Efficient Neural Code Completion
IEEE Working Conference on Mining Software Repositories (MSR), 2020
Alexey Svyatkovskiy, Sebastian Lee, A. Hadjitofi, M. Riechert, Juliana Franco, Miltiadis Allamanis
28 Apr 2020
An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Hiroshi Noji, Hiroya Takamura
06 Apr 2020

Dynamic Sampling and Selective Masking for Communication-Efficient Federated Learning
IEEE Intelligent Systems, 2020
Shaoxiong Ji, Wenqi Jiang, A. Walid, Xue Li
Topics: FedML
21 Mar 2020

PowerNorm: Rethinking Batch Normalization in Transformers
International Conference on Machine Learning (ICML), 2020
Sheng Shen, Z. Yao, A. Gholami, Michael W. Mahoney, Kurt Keutzer
Topics: BDL
17 Mar 2020

ProGen: Language Modeling for Protein Generation
bioRxiv, 2020
Ali Madani, Bryan McCann, Nikhil Naik, N. Keskar, N. Anand, Raphael R. Eguchi, Po-Ssu Huang, R. Socher
08 Mar 2020

Modelling Latent Skills for Multitask Language Generation
Kris Cao, Dani Yogatama
21 Feb 2020

Assessing the Memory Ability of Recurrent Neural Networks
European Conference on Artificial Intelligence (ECAI), 2020
Cheng Zhang, Qiuchi Li, L. Hua, D. Song
18 Feb 2020
Transformer on a Diet
Chenguang Wang, Zihao Ye, Aston Zhang, Zheng Zhang, Alex Smola
14 Feb 2020

Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges
ACM Computing Surveys (ACM CSUR), 2020
T. H. Le, Hao Chen, Muhammad Ali Babar
Topics: VLM
13 Feb 2020

Regularizing activations in neural networks via distribution matching with the Wasserstein metric
International Conference on Learning Representations (ICLR), 2020
Taejong Joo, Donggu Kang, Byunghoon Kim
13 Feb 2020

A deep-learning view of chemical space designed to facilitate drug discovery
Journal of Chemical Information and Modeling (JCIM), 2020
P. Maragakis, Hunter M. Nisonoff, B. Cole, D. Shaw
07 Feb 2020

Applying Recent Innovations from NLP to MOOC Student Course Trajectory Modeling
Educational Data Mining (EDM), 2020
Clare Chen, Z. Pardos
23 Jan 2020

CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity
Konpat Preechakul, B. Kijsirikul
Topics: ODL
24 Dec 2019
Pythia: AI-assisted Code Completion System
Knowledge Discovery and Data Mining (KDD), 2019
Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
29 Nov 2019

DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
International Conference on Learning Representations (ICLR), 2019
Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi
Topics: AI4TS
27 Nov 2019

Single Headed Attention RNN: Stop Thinking With Your Head
Stephen Merity
26 Nov 2019

Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
International Conference on Learning Representations (ICLR), 2019
Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, R. Socher, Caiming Xiong
Topics: RALM, KELM, LRM
24 Nov 2019