Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context


9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
    VLM

Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

21 / 621 papers shown
Title
Span-based Joint Entity and Relation Extraction with Transformer Pre-training
Markus Eberts
A. Ulges
LRM
ViT
164
380
0
17 Sep 2019
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
L. Varshney
Caiming Xiong
R. Socher
AI4CE
55
1,233
0
11 Sep 2019
PaLM: A Hybrid Parser and Language Model
Hao Peng
Roy Schwartz
Noah A. Smith
AIMat
23
15
0
04 Sep 2019
AutoML: A Survey of the State-of-the-Art
Xin He
Kaiyong Zhao
X. Chu
20
1,419
0
02 Aug 2019
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
S. Rothe
Shashi Narayan
Aliaksei Severyn
SILM
69
433
0
29 Jul 2019
EmotionX-HSU: Adopting Pre-trained BERT for Emotion Classification
Li Luo
Yue Wang
25
26
0
23 Jul 2019
R-Transformer: Recurrent Neural Network Enhanced Transformer
Z. Wang
Yao Ma
Zitao Liu
Jiliang Tang
ViT
11
105
0
12 Jul 2019
LakhNES: Improving multi-instrumental music generation with cross-domain pre-training
Chris Donahue
H. H. Mao
Yiting Li
G. Cottrell
Julian McAuley
30
116
0
10 Jul 2019
Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets
Yifan Peng
Shankai Yan
Zhiyong Lu
LM&MA
AI4MH
13
830
0
13 Jun 2019
Analyzing the Structure of Attention in a Transformer Language Model
Jesse Vig
Yonatan Belinkov
21
357
0
07 Jun 2019
Large-Scale Multi-Label Text Classification on EU Legislation
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Ion Androutsopoulos
AILaw
11
212
0
05 Jun 2019
Adversarial Generation and Encoding of Nested Texts
A. Rozental
GAN
11
0
0
01 Jun 2019
Why gradient clipping accelerates training: A theoretical justification for adaptivity
Junzhe Zhang
Tianxing He
S. Sra
Ali Jadbabaie
22
441
0
28 May 2019
Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)
Mariya Toneva
Leila Wehbe
MILM
AI4CE
36
219
0
28 May 2019
Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation
Ning Dai
Jianze Liang
Xipeng Qiu
Xuanjing Huang
DRL
8
202
0
14 May 2019
Language Modeling with Deep Transformers
Kazuki Irie
Albert Zeyer
Ralf Schlüter
Hermann Ney
KELM
27
172
0
10 May 2019
Towards Efficient Model Compression via Learned Global Ranking
Ting-Wu Chin
Ruizhou Ding
Cha Zhang
Diana Marculescu
16
170
0
28 Apr 2019
Language Models with Transformers
Chenguang Wang
Mu Li
Alex Smola
12
120
0
20 Apr 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
18
128
0
02 Apr 2019
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
17
145
0
15 Oct 2018
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
271
5,329
0
05 Nov 2016