arXiv:1901.02860
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 604 papers shown
Title
Efficient Long Sequence Modeling via State Space Augmented Transformer
Simiao Zuo
Xiaodong Liu
Jian Jiao
Denis Xavier Charles
Eren Manavoglu
Tuo Zhao
Jianfeng Gao
125
36
0
15 Dec 2022
Gaussian Radar Transformer for Semantic Segmentation in Noisy Radar Data
Matthias Zeller
Jens Behley
Michael Heidingsfeld
C. Stachniss
29
23
0
07 Dec 2022
Transformers for End-to-End InfoSec Tasks: A Feasibility Study
Ethan M. Rudd
Mohammad Saidur Rahman
Philip Tully
22
5
0
05 Dec 2022
Meta-Learning Fast Weight Language Models
Kevin Clark
Kelvin Guu
Ming-Wei Chang
Panupong Pasupat
Geoffrey E. Hinton
Mohammad Norouzi
KELM
32
13
0
05 Dec 2022
LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition
Yuguang Yang
Y. Pan
Jingjing Yin
Heng Lu
24
3
0
05 Dec 2022
A Domain-Knowledge-Inspired Music Embedding Space and a Novel Attention Mechanism for Symbolic Music Modeling
Z. Guo
J. Kang
Dorien Herremans
14
18
0
02 Dec 2022
Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images
Meng Wang
Kai-An Yu
Chun-Mei Feng
K. Zou
Yanyu Xu
Qingquan Meng
Rick Siow Mong Goh
Yong Liu
Huazhu Fu
MedIm
27
3
0
01 Dec 2022
VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
Kashu Yamazaki
Khoa T. Vo
Sang Truong
Bhiksha Raj
Ngan Le
26
35
0
28 Nov 2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao
Yujie Wang
Youhe Jiang
Chunan Shi
Xiaonan Nie
Hailin Zhang
Bin Cui
GNN
MoE
37
60
0
25 Nov 2022
Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling
Zhijun Wang
Xuebo Liu
Min Zhang
25
11
0
23 Nov 2022
Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot
J. Chorowski
Adrian Łańcucki
E. Ponti
20
42
0
17 Nov 2022
Hypergraph Transformer for Skeleton-based Action Recognition
Yuxuan Zhou
Zhi-Qi Cheng
Chuan Li
Yanwen Fang
Yifeng Geng
Xuansong Xie
M. Keuper
ViT
29
52
0
17 Nov 2022
Parameter-Efficient Transformer with Hybrid Axial-Attention for Medical Image Segmentation
Yiyue Hu
Lei Zhang
Nan Mu
Leijun Liu
ViT
MedIm
22
1
0
17 Nov 2022
ComMU: Dataset for Combinatorial Music Generation
Lee Hyun
Taehyun Kim
Hyolim Kang
Minjoo Ki
H. Hwang
Kwanho Park
Sharang Han
Seon Joo Kim
35
14
0
17 Nov 2022
Deep Emotion Recognition in Textual Conversations: A Survey
Patrícia Pereira
Helena Moniz
Joao Paulo Carvalho
34
15
0
16 Nov 2022
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
27
21
0
16 Nov 2022
An Overview on Controllable Text Generation via Variational Auto-Encoders
Haoqin Tu
Yitong Li
BDL
24
2
0
15 Nov 2022
BERT-Deep CNN: State-of-the-Art for Sentiment Analysis of COVID-19 Tweets
Javad Hassannataj Joloudari
Sadiq Hussain
M. Nematollahi
Rouhollah Bagheri
Fatemeh Fazl
R. Alizadehsani
Reza Lashgari
Ashis Talukder
18
38
0
04 Nov 2022
Circling Back to Recurrent Models of Language
Gábor Melis
32
0
0
03 Nov 2022
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
P. Swietojanski
Stefan Braun
Dogan Can
Thiago Fraga da Silva
Arnab Ghoshal
...
Henry Mason
Erik McDermott
Honza Silovsky
R. Travadi
Xiaodan Zhuang
32
13
0
02 Nov 2022
Structured State Space Decoder for Speech Recognition and Synthesis
Koichi Miyazaki
Masato Murata
Tomoki Koriyama
34
12
0
31 Oct 2022
Interpretable CNN-Multilevel Attention Transformer for Rapid Recognition of Pneumonia from Chest X-Ray Images
Shengchao Chen
Sufen Ren
Guanjun Wang
Mengxing Huang
Chenyang Xue
ViT
MedIm
47
16
0
29 Oct 2022
OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks
Benoit Steiner
Mostafa Elhoushi
Jacob Kahn
James Hegarty
29
8
0
24 Oct 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
87
15
0
23 Oct 2022
Is Encoder-Decoder Redundant for Neural Machine Translation?
Yingbo Gao
Christian Herold
Zijian Yang
Hermann Ney
19
4
0
21 Oct 2022
Graphically Structured Diffusion Models
Christian Weilbach
William Harvey
Frank D. Wood
DiffM
35
7
0
20 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
29
40
0
19 Oct 2022
Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation
Botao Yu
Peiling Lu
Rui Wang
Wei Hu
Xu Tan
Wei Ye
Shikun Zhang
Tao Qin
Tie-Yan Liu
MGen
25
54
0
19 Oct 2022
Tiny-Attention Adapter: Contexts Are More Important Than the Number of Parameters
Hongyu Zhao
Hao Tan
Hongyuan Mei
MoE
31
16
0
18 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Lingpeng Kong
3DV
43
9
0
14 Oct 2022
CTL++: Evaluating Generalization on Never-Seen Compositional Patterns of Known Functions, and Compatibility of Neural Representations
Róbert Csordás
Kazuki Irie
Jürgen Schmidhuber
NAI
19
11
0
12 Oct 2022
Bird-Eye Transformers for Text Generation Models
Lei Sha
Yuhang Song
Yordan Yordanov
Tommaso Salvatori
Thomas Lukasiewicz
30
0
0
08 Oct 2022
Melody Infilling with User-Provided Structural Context
Chih-Pin Tan
A. Su
Yi-Hsuan Yang
31
3
0
06 Oct 2022
Memory in humans and deep language models: Linking hypotheses for model augmentation
Omri Raccah
Phoebe Chen
Ted Willke
David Poeppel
Vy A. Vo
RALM
18
1
0
04 Oct 2022
Enhancing Fine-Grained 3D Object Recognition using Hybrid Multi-Modal Vision Transformer-CNN Models
Songsong Xiong
Georgios Tziafas
H. Kasaei
ViT
23
3
0
03 Oct 2022
Grouped self-attention mechanism for a memory-efficient Transformer
Bumjun Jung
Yusuke Mukuta
Tatsuya Harada
AI4TS
12
3
0
02 Oct 2022
Effective General-Domain Data Inclusion for the Machine Translation Task by Vanilla Transformers
H. Soliman
32
0
0
28 Sep 2022
Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models
Xiuying Wei
Yunchen Zhang
Xiangguo Zhang
Ruihao Gong
Shanghang Zhang
Qi Zhang
F. Yu
Xianglong Liu
MQ
22
145
0
27 Sep 2022
Searching a High-Performance Feature Extractor for Text Recognition Network
Hui Zhang
Quanming Yao
James T. Kwok
X. Bai
28
7
0
27 Sep 2022
Optimizing DNN Compilation for Distributed Training with Joint OP and Tensor Fusion
Xiaodong Yi
Shiwei Zhang
Lansong Diao
Chuan Wu
Zhen Zheng
Shiqing Fan
Siyu Wang
Jun Yang
W. Lin
36
4
0
26 Sep 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
14
183
0
21 Sep 2022
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition
Ye Bai
Jie Li
W. Han
Hao Ni
Kaituo Xu
Zhuo Zhang
Cheng Yi
Xiaorui Wang
MoE
21
1
0
17 Sep 2022
Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach
Shih-Lun Wu
Yi-Hsuan Yang
40
14
0
17 Sep 2022
Out of One, Many: Using Language Models to Simulate Human Samples
Lisa P. Argyle
Ethan C. Busby
Nancy Fulda
Joshua R Gubler
Christopher Rytting
David Wingate
SyDa
39
549
0
14 Sep 2022
Activity report analysis with automatic single or multispan answer extraction
R. Choudhary
A. Sridhar
Erik M. Visser
16
1
0
09 Sep 2022
Features Fusion Framework for Multimodal Irregular Time-series Events
Peiwang Tang
Xianchao Zhang
AI4TS
26
2
0
05 Sep 2022
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
Surbhi Goel
Sham Kakade
Adam Tauman Kalai
Cyril Zhang
32
1
0
01 Sep 2022
Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
Nadine Behrmann
S. Golestaneh
Zico Kolter
Juergen Gall
M. Noroozi
22
72
0
01 Sep 2022
Deep Sparse Conformer for Speech Recognition
Xianchao Wu
20
2
0
01 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022