Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1901.02860
Cited By
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
9 January 2019
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"
50 / 621 papers shown
Title
Out of One, Many: Using Language Models to Simulate Human Samples
Lisa P. Argyle
Ethan C. Busby
Nancy Fulda
Joshua R Gubler
Christopher Rytting
David Wingate
SyDa
45
549
0
14 Sep 2022
Activity report analysis with automatic single or multispan answer extraction
R. Choudhary
A. Sridhar
Erik M. Visser
16
1
0
09 Sep 2022
Features Fusion Framework for Multimodal Irregular Time-series Events
Peiwang Tang
Xianchao Zhang
AI4TS
26
2
0
05 Sep 2022
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
Surbhi Goel
Sham Kakade
Adam Tauman Kalai
Cyril Zhang
32
1
0
01 Sep 2022
Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation
Nadine Behrmann
S. Golestaneh
Zico Kolter
Juergen Gall
M. Noroozi
22
72
0
01 Sep 2022
Deep Sparse Conformer for Speech Recognition
Xianchao Wu
20
2
0
01 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
Efficient Sparsely Activated Transformers
Salar Latifi
Saurav Muralidharan
M. Garland
MoE
19
2
0
31 Aug 2022
K-Order Graph-oriented Transformer with GraAttention for 3D Pose and Shape Estimation
Weixi Zhao
Weiqiang Wang
ViT
3DPC
21
2
0
24 Aug 2022
Lost in Context? On the Sense-wise Variance of Contextualized Word Embeddings
Yile Wang
Yue Zhang
19
4
0
20 Aug 2022
Adam Can Converge Without Any Modification On Update Rules
Yushun Zhang
Congliang Chen
Naichen Shi
Ruoyu Sun
Zhimin Luo
18
62
0
20 Aug 2022
Parallel Hierarchical Transformer with Attention Alignment for Abstractive Multi-Document Summarization
Ye Ma
Lu Zong
24
0
0
16 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
32
15
0
08 Aug 2022
Enhancing the Robustness via Adversarial Learning and Joint Spatial-Temporal Embeddings in Traffic Forecasting
Juyong Jiang
Binqing Wu
Ling-Hao Chen
Kai Zhang
Sunghun Kim
AI4TS
30
18
0
05 Aug 2022
PointConvFormer: Revenge of the Point-based Convolution
Wenxuan Wu
Li Fuxin
Qi Shan
3DPC
25
30
0
04 Aug 2022
Learning from flowsheets: A generative transformer model for autocompletion of flowsheets
Gabriel Vogel
Lukas Schulze Balhorn
Artur M. Schweidtmann
AI4CE
35
33
0
01 Aug 2022
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen
Richard G. Baraniuk
Robert M. Kirby
Stanley J. Osher
Bao Wang
26
9
0
01 Aug 2022
Is Attention All That NeRF Needs?
T. MukundVarma
Peihao Wang
Xuxi Chen
Tianlong Chen
Subhashini Venugopalan
Zhangyang Wang
ViT
30
107
0
27 Jul 2022
3D Siamese Transformer Network for Single Object Tracking on Point Clouds
Le Hui
Lingpeng Wang
Ling-Yu Tang
Kaihao Lan
Jin Xie
Jian Yang
ViT
3DPC
31
59
0
25 Jul 2022
Improving Mandarin Speech Recogntion with Block-augmented Transformer
Xiaoming Ren
Huifeng Zhu
Liuwei Wei
Minghui Wu
Jie Hao
33
9
0
24 Jul 2022
Learning Object Placement via Dual-path Graph Completion
Siyuan Zhou
Liu Liu
Li Niu
Liqing Zhang
31
24
0
23 Jul 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
27
72
0
20 Jul 2022
Learning Sequence Representations by Non-local Recurrent Neural Memory
Wenjie Pei
Xin Feng
Canmiao Fu
Qi Cao
Guangming Lu
Yu-Wing Tai
AI4TS
24
1
0
20 Jul 2022
Vision Transformers: From Semantic Segmentation to Dense Prediction
Li Zhang
Jiachen Lu
Sixiao Zheng
Xinxuan Zhao
Xiatian Zhu
Yanwei Fu
Tao Xiang
Jianfeng Feng
Philip H. S. Torr
ViT
27
7
0
19 Jul 2022
Conditional DETR V2: Efficient Detection Transformer with Box Queries
Xiaokang Chen
Fangyun Wei
Gang Zeng
Jingdong Wang
ViT
27
33
0
18 Jul 2022
Recurrent Memory Transformer
Aydar Bulatov
Yuri Kuratov
Mikhail Burtsev
CLL
13
102
0
14 Jul 2022
Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Zekun Li
Zhengyang Geng
Zhao Kang
Wenyu Chen
Yibo Yang
21
35
0
13 Jul 2022
ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships
Chen Zhang
Luchin Chang
Songruoyao Wu
Xu Tan
Tao Qin
Tie-Yan Liu
Kecheng Zhang
19
15
0
12 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
21
143
0
06 Jul 2022
Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism
Kun Wei
Pengcheng Guo
Ning Jiang
48
11
0
02 Jul 2022
Long Range Language Modeling via Gated State Spaces
Harsh Mehta
Ankit Gupta
Ashok Cutkosky
Behnam Neyshabur
Mamba
34
231
0
27 Jun 2022
VLCap: Vision-Language with Contrastive Learning for Coherent Video Paragraph Captioning
Kashu Yamazaki
Sang Truong
Khoa T. Vo
Michael Kidd
Chase Rainwater
Khoa Luu
Ngan Le
VLM
CoGe
11
25
0
26 Jun 2022
CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation
Qihang Yu
Huiyu Wang
Dahun Kim
Siyuan Qiao
Maxwell D. Collins
Yukun Zhu
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
ViT
MedIm
32
90
0
17 Jun 2022
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
Maribeth Rauh
John F. J. Mellor
J. Uesato
Po-Sen Huang
Johannes Welbl
...
Amelia Glaese
G. Irving
Iason Gabriel
William S. Isaac
Lisa Anne Hendricks
25
49
0
16 Jun 2022
PInKS: Preconditioned Commonsense Inference with Minimal Supervision
Ehsan Qasemi
Piyush Khanna
Qiang Ning
Muhao Chen
ReLM
LRM
27
8
0
16 Jun 2022
Identifying Electrocardiogram Abnormalities Using a Handcrafted-Rule-Enhanced Neural Network
Yu Bian
Jintai Chen
Xiaojun Chen
Xiaoxian Yang
Da Chen
Jian Wu
33
9
0
16 Jun 2022
Recurrent Transformer Variational Autoencoders for Multi-Action Motion Synthesis
Rania Briq
Chuhang Zou
L. Pishchulin
Christopher Broaddus
Juergen Gall
21
1
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
54
527
0
13 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
41
33
0
09 Jun 2022
Meet You Halfway: Explaining Deep Learning Mysteries
Oriel BenShmuel
AAML
FedML
FAtt
OOD
22
0
0
09 Jun 2022
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors
Shota Horiguchi
Shinji Watanabe
Leibny Paola García-Perera
Yuki Takashima
Y. Kawaguchi
39
23
0
06 Jun 2022
Learning Speaker-specific Lip-to-Speech Generation
Munender Varshney
Ravindra Yadav
Vinay P. Namboodiri
R. Hegde
16
7
0
04 Jun 2022
Deep Transformer Q-Networks for Partially Observable Reinforcement Learning
Kevin Esslinger
Robert W. Platt
Chris Amato
OffRL
29
35
0
02 Jun 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
115
17
0
30 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
63
2,024
0
27 May 2022
Contrastive Siamese Network for Semi-supervised Speech Recognition
S. Khorram
Jaeyoung Kim
Anshuman Tripathi
Han Lu
Qian Zhang
Hasim Sak
SSL
19
11
0
27 May 2022
Do we really need temporal convolutions in action segmentation?
Dazhao Du
Bing-Huang Su
Yu Li
Zhongang Qi
Hui Xiong
Ying Shan
ViT
21
16
0
26 May 2022
DT-SV: A Transformer-based Time-domain Approach for Speaker Verification
Nan Zhang
Jianzong Wang
Zhenhou Hong
Chendong Zhao
Xiaoyang Qu
Jing Xiao
29
5
0
26 May 2022
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
234
128
0
25 May 2022
Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code
Changan Niu
Chuanyi Li
Bin Luo
Vincent Ng
SyDa
VLM
47
48
0
24 May 2022
Previous
1
2
3
4
5
6
...
11
12
13
Next