1901.02860
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Papers citing
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 604 papers shown
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
26
12
0
22 May 2023
EE-TTS: Emphatic Expressive TTS with Linguistic Information
Yifan Zhong
Chen Zhang
Xule Liu
Chenxi Sun
Weishan Deng
Haifeng Hu
Zhongqian Sun
13
3
0
20 May 2023
Overcoming Topology Agnosticism: Enhancing Skeleton-Based Action Recognition through Redefined Skeletal Topology Awareness
Yuxuan Zhou
Zhi-Qi Cheng
Ju He
Bin Luo
Yifeng Geng
Xuansong Xie
29
11
0
19 May 2023
A Comparative Study on E-Branchformer vs Conformer in Speech Recognition, Translation, and Understanding Tasks
Yifan Peng
Kwangyoun Kim
Felix Wu
Brian Yan
Siddhant Arora
William Chen
Jiyang Tang
Suwon Shon
Prashant Sridhar
Shinji Watanabe
21
17
0
18 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
20
0
0
17 May 2023
Mimetic Initialization of Self-Attention Layers
Asher Trockman
J. Zico Kolter
30
30
0
16 May 2023
Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites
Hans W. A. Hanley
Zakir Durumeric
DeLMO
23
29
0
16 May 2023
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation
Yuxin Ren
Zi-Qi Zhong
Xingjian Shi
Yi Zhu
Chun Yuan
Mu Li
21
7
0
16 May 2023
EvoluNet: Advancing Dynamic Non-IID Transfer Learning on Graphs
Haohui Wang
Yuzhen Mao
Yujun Yan
Yaoqing Yang
Jianhui Sun
...
Si Zhang
Alison Hu
Edward Bowen
Tyler Cody
Dawei Zhou
44
2
0
01 May 2023
TransFlow: Transformer as Flow Learner
Yawen Lu
Qifan Wang
Siqi Ma
Tong Geng
Victor Y. Chen
Huaijin Chen
Dongfang Liu
ViT
32
45
0
23 Apr 2023
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov
Yuri Kuratov
Yermek Kapushev
Mikhail Burtsev
LRM
16
87
0
19 Apr 2023
From Words to Music: A Study of Subword Tokenization Techniques in Symbolic Music Generation
Adarsh Kumar
Pedro Sarmento
28
4
0
18 Apr 2023
Learning to Compress Prompts with Gist Tokens
Jesse Mu
Xiang Lisa Li
Noah D. Goodman
VLM
44
204
0
17 Apr 2023
Improving Autoregressive NLP Tasks via Modular Linearized Attention
Victor Agostinelli
Lizhong Chen
22
1
0
17 Apr 2023
MisRoBÆRTa: Transformers versus Misinformation
Ciprian-Octavian Truică
Elena Simona Apostol
21
37
0
16 Apr 2023
Fairness in Visual Clustering: A Novel Transformer Clustering Approach
Xuan-Bac Nguyen
C. Duong
Marios Savvides
Kaushik Roy
Hugh Churchill
Khoa Luu
37
9
0
14 Apr 2023
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang
Jiaming Han
Chris Liu
Peng Gao
Aojun Zhou
Xiangfei Hu
Shilin Yan
Pan Lu
Hongsheng Li
Yu Qiao
MLLM
38
741
0
28 Mar 2023
Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu
G. Chen
Yufei Wang
Libo Zhang
Tiejian Luo
Longyin Wen
27
47
0
22 Mar 2023
Transformers in Speech Processing: A Survey
S. Latif
Aun Zaidi
Heriberto Cuayáhuitl
Fahad Shamshad
Moazzam Shoukat
Junaid Qadir
42
47
0
21 Mar 2023
HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals
Ella Lan
MedIm
20
1
0
17 Mar 2023
AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+
Tianlin Li
Ying Wang
Ziwei Xuan
Guo-Jun Qi
ViT
42
3
0
14 Mar 2023
Transformer Models for Acute Brain Dysfunction Prediction
B. Silva
Miguel Contreras
T. Ozrazgat-Baslanti
Yuanfang Ren
Ziyuan Guan
Kia Khezeli
A. Bihorac
Parisa Rashidi
19
0
0
13 Mar 2023
An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei
Yun Cheng Wang
Bin Wang
C.-C. Jay Kuo
25
42
0
10 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
29
506
0
07 Mar 2023
LooperGP: A Loopable Sequence Model for Live Coding Performance using GuitarPro Tablature
Sara Adkins
Pedro Sarmento
M. Barthet
19
9
0
03 Mar 2023
FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation
Xiaoyu Shi
Zhaoyang Huang
Dasong Li
Manyuan Zhang
Ka Chun Cheung
Simon See
Hongwei Qin
Jifeng Dai
Hongsheng Li
27
82
0
02 Mar 2023
Applying Plain Transformers to Real-World Point Clouds
Lanxiao Li
M. Heizmann
3DPC
ViT
26
3
0
28 Feb 2023
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang
Arsha Nagrani
Paul Hongsuck Seo
Antoine Miech
Jordi Pont-Tuset
Ivan Laptev
Josef Sivic
Cordelia Schmid
AI4TS
VLM
36
220
0
27 Feb 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
36
101
0
27 Feb 2023
Simple Hardware-Efficient Long Convolutions for Sequence Modeling
Daniel Y. Fu
Elliot L. Epstein
Eric N. D. Nguyen
A. Thomas
Michael Zhang
Tri Dao
Atri Rudra
Christopher Ré
16
52
0
13 Feb 2023
A Study on ReLU and Softmax in Transformer
Kai Shen
Junliang Guo
Xuejiao Tan
Siliang Tang
Rui Wang
Jiang Bian
19
53
0
13 Feb 2023
GTR-CTRL: Instrument and Genre Conditioning for Guitar-Focused Music Generation with Transformers
Pedro Sarmento
Adarsh Kumar
Yu-Hua Chen
CJ Carr
Zack Zukowski
M. Barthet
51
20
0
10 Feb 2023
Cut your Losses with Squentropy
Like Hui
M. Belkin
S. Wright
UQCV
15
8
0
08 Feb 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
K. Choromanski
Shanda Li
Valerii Likhosherstov
Kumar Avinava Dubey
Shengjie Luo
Di He
Yiming Yang
Tamás Sarlós
Thomas Weingarten
Adrian Weller
37
8
0
03 Feb 2023
An Comparative Analysis of Different Pitch and Metrical Grid Encoding Methods in the Task of Sequential Music Generation
Yuqiang Li
Shengchen Li
Georgy Fazekas
39
2
0
31 Jan 2023
A Comparative Study of Pretrained Language Models for Long Clinical Text
Yikuan Li
R. M. Wehbe
F. Ahmad
Hanyin Wang
Yuan Luo
LM&MA
ELM
VLM
MedIm
24
79
0
27 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
42
2
0
26 Jan 2023
Out of Distribution Performance of State of Art Vision Model
Salman Rahman
W. Lee
37
2
0
25 Jan 2023
Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team
Jakob Bauer
Kate Baumli
Satinder Baveja
Feryal M. P. Behbahani
...
Jakub Sygnowski
K. Tuyls
Sarah York
Alexander Zacherl
Lei Zhang
LM&Ro
OffRL
AI4CE
LRM
35
108
0
18 Jan 2023
Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling
Ahmed Elnaggar
Hazem Essam
Wafaa Salah-Eldin
Walid Moustafa
Mohamed Elkerdawy
Charlotte Rochereau
B. Rost
167
86
0
16 Jan 2023
WuYun: Exploring hierarchical skeleton-guided melody generation using knowledge-enhanced deep learning
Kejun Zhang
Xinda Wu
Tieyao Zhang
Zhijie Huang
Xu Tan
Qihao Liang
Songruoyao Wu
Lingyun Sun
40
10
0
11 Jan 2023
A Survey on Transformers in Reinforcement Learning
Wenzhe Li
Hao Luo
Zichuan Lin
Chongjie Zhang
Zongqing Lu
Deheng Ye
OffRL
MU
AI4CE
37
55
0
08 Jan 2023
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition
David M. Chan
Shalini Ghosh
Ariya Rastrow
Björn Hoffmeister
OffRL
18
6
0
06 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
50
11
0
30 Dec 2022
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré
70
370
0
28 Dec 2022
Part-guided Relational Transformers for Fine-grained Visual Recognition
Yifan Zhao
Jia Li
Xiaowu Chen
Yonghong Tian
ViT
36
34
0
28 Dec 2022
On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective
Ying Wen
Bo Liu
M. Zhou
Shufang Hou
Zhe Cao
Chenyang Le
Jingxiao Chen
Zheng Tian
Weinan Zhang
Jun Wang
AI4CE
20
10
0
24 Dec 2022
A Length-Extrapolatable Transformer
Yutao Sun
Li Dong
Barun Patra
Shuming Ma
Shaohan Huang
Alon Benhaim
Vishrav Chaudhary
Xia Song
Furu Wei
30
115
0
20 Dec 2022
Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model
Yeskendir Koishekenov
Alexandre Berard
Vassilina Nikoulina
MoE
35
29
0
19 Dec 2022
Inductive Attention for Video Action Anticipation
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Simon See
Oswald Lanz
39
1
0
17 Dec 2022