Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
    VLM
Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

22 / 2,022 papers shown
Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models
Oren Melamud
Chaitanya P. Shivade
SyDa, MedIm
207
41
0
16 May 2019
Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation. Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Ning Dai
Jianze Liang
Xipeng Qiu
Xuanjing Huang
DRL
403
216
0
14 May 2019
Language Modeling with Deep Transformers. Interspeech, 2019
Kazuki Irie
Albert Zeyer
Ralf Schluter
Hermann Ney
KELM
374
187
0
10 May 2019
Towards Efficient Model Compression via Learned Global Ranking. Computer Vision and Pattern Recognition (CVPR), 2019
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
224
184
0
28 Apr 2019
Think Again Networks and the Delta Loss
Alexandre Salle
Marcelo O. R. Prates
155
2
0
26 Apr 2019
Language Models with Transformers
Chenguang Wang
Mu Li
Alex Smola
265
132
0
20 Apr 2019
Dynamic Evaluation of Transformer Language Models
Ben Krause
Emmanuel Kahembwe
Iain Murray
Steve Renals
224
45
0
17 Apr 2019
Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis
Feiyang Chen
Ziqian Luo
Yanyan Xu
Dengfeng Ke
218
86
0
17 Apr 2019
An Empirical Study of Spatial Attention Mechanisms in Deep Networks
Xizhou Zhu
Dazhi Cheng
Zheng Zhang
Stephen Lin
Jifeng Dai
188
495
0
11 Apr 2019
An Attentive Survey of Attention Models
S. Chaudhari
Varun Mithal
Gungor Polatkan
R. Ramanath
450
723
0
05 Apr 2019
Visualizing Attention in Transformer-Based Language Representation Models
Jesse Vig
MILM
134
23
0
04 Apr 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
476
142
0
02 Apr 2019
Star-Transformer
Qipeng Guo
Xipeng Qiu
Pengfei Liu
Yunfan Shao
Xiangyang Xue
Zheng Zhang
343
286
0
25 Feb 2019
Re-examination of the Role of Latent Variables in Sequence Modeling
Zihang Dai
Guokun Lai
Yiming Yang
Shinjae Yoo
BDL, DRL
217
4
0
04 Feb 2019
Compressing Gradient Optimizers via Count-Sketches. International Conference on Machine Learning (ICML), 2019
Ryan Spring
Anastasios Kyrillidis
Vijai Mohan
Anshumali Shrivastava
172
38
0
01 Feb 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
248
75
0
30 Jan 2019
Recurrent Neural Filters: Learning Independent Bayesian Filtering Steps for Time Series Prediction
Bryan Lim
S. Zohren
Stephen J. Roberts
BDL, AI4TS
201
47
0
23 Jan 2019
Extractive Summary as Discrete Latent Variables
Aran Komatsuzaki
160
3
0
14 Nov 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
213
160
0
15 Oct 2018
A Survey of the Usages of Deep Learning in Natural Language Processing
Dan Otter
Julian R. Medina
Jugal Kalita
VLM
375
12
0
27 Jul 2018
Deep Learning for Genomics: A Concise Overview
Tianwei Yue
Yuanxin Wang
Longxiang Zhang
Chunming Gu
Haohan Wang
Wenping Wang
Qi Lyu
Yujie Dun
AILaw, VLM, BDL
292
98
0
02 Feb 2018
Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana
Aditya Koli
Kiran Khatter
Sukhdev Singh
170
1,378
0
17 Aug 2017
Page 41 of 41