Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
    VLM
Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

22 / 2,022 papers shown
Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models
Oren Melamud
Chaitanya P. Shivade
SyDa, MedIm
207
41
0
16 May 2019
Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation. Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Ning Dai
Jianze Liang
Xipeng Qiu
Xuanjing Huang
DRL
403
216
0
14 May 2019
Language Modeling with Deep Transformers. Interspeech, 2019
Kazuki Irie
Albert Zeyer
Ralf Schluter
Hermann Ney
KELM
374
187
0
10 May 2019
Towards Efficient Model Compression via Learned Global Ranking. Computer Vision and Pattern Recognition (CVPR), 2019
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
224
184
0
28 Apr 2019
Think Again Networks and the Delta Loss
Alexandre Salle
Marcelo O. R. Prates
155
2
0
26 Apr 2019
Language Models with Transformers
Chenguang Wang
Mu Li
Alex Smola
265
132
0
20 Apr 2019
Dynamic Evaluation of Transformer Language Models
Ben Krause
Emmanuel Kahembwe
Iain Murray
Steve Renals
224
45
0
17 Apr 2019
Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis
Feiyang Chen
Ziqian Luo
Yanyan Xu
Dengfeng Ke
218
86
0
17 Apr 2019
An Empirical Study of Spatial Attention Mechanisms in Deep Networks
Xizhou Zhu
Dazhi Cheng
Zheng Zhang
Stephen Lin
Jifeng Dai
188
495
0
11 Apr 2019
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
450
723
0
05 Apr 2019
Visualizing Attention in Transformer-Based Language Representation Models
Jesse Vig
MILM
134
23
0
04 Apr 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
476
142
0
02 Apr 2019
Star-Transformer
Qipeng Guo
Xipeng Qiu
Pengfei Liu
Yunfan Shao
Xiangyang Xue
Zheng Zhang
343
286
0
25 Feb 2019
Re-examination of the Role of Latent Variables in Sequence Modeling
Zihang Dai
Guokun Lai
Yiming Yang
Shinjae Yoo
BDL, DRL
217
4
0
04 Feb 2019
Compressing Gradient Optimizers via Count-Sketches. International Conference on Machine Learning (ICML), 2019
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
172
38
0
01 Feb 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
248
75
0
30 Jan 2019
Recurrent Neural Filters: Learning Independent Bayesian Filtering Steps for Time Series Prediction
Bryan Lim
S. Zohren
Stephen J. Roberts
BDL, AI4TS
201
47
0
23 Jan 2019
Extractive Summary as Discrete Latent Variables
Aran Komatsuzaki
160
3
0
14 Nov 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
213
160
0
15 Oct 2018
A Survey of the Usages of Deep Learning in Natural Language Processing
Dan Otter
Julian R. Medina
Jugal Kalita
VLM
375
12
0
27 Jul 2018
Deep Learning for Genomics: A Concise Overview
Tianwei Yue
Yuanxin Wang
Longxiang Zhang
Chunming Gu
Haohan Wang
Wenping Wang
Qi Lyu
Yujie Dun
AILaw, VLM, BDL
292
98
0
02 Feb 2018
Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana
Aditya Koli
Kiran Khatter
Sukhdev Singh
170
1,378
0
17 Aug 2017
Page 41 of 41