v1, v2, v3 (latest)

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

9 January 2019

Papers citing "Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context"

50 / 2,022 papers shown

SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers

Hongyi Yuan

Zheng Yuan

Chuanqi Tan

Fei Huang

Songfang Huang

DiffM

252

20 Dec 2022

Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Yeskendir Koishekenov

Alexandre Berard

Vassilina Nikoulina

MoE

264

19 Dec 2022

Inductive Attention for Video Action Anticipation

209

17 Dec 2022

Speech Aware Dialog System Technology Challenge (DSTC11)

194

16 Dec 2022

Rarely a problem? Language models exhibit inverse scaling in their predictions following few-type quantifiers

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

J. Michaelov

Benjamin Bergen

198

16 Dec 2022

GeneFormer: Learned Gene Compression using Transformer-based Context Modeling

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

125

16 Dec 2022

Efficient Long Sequence Modeling via State Space Augmented Transformer

Xiaodong Liu

332

15 Dec 2022

Jointly Learning Visual and Auditory Speech Representations from Raw Data

International Conference on Learning Representations (ICLR), 2022

309

12 Dec 2022

Contextual Explainable Video Representation: Human Perception-based Understanding

Asilomar Conference on Signals, Systems and Computers (ACSSC), 2022

Ngan Le

229

12 Dec 2022

P-Transformer: Towards Better Document-to-Document Neural Machine Translation

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

157

12 Dec 2022

CLIP-TSA: CLIP-Assisted Temporal Self-Attention for Weakly-Supervised Video Anomaly Detection

International Conference on Information Photonics (ICIP), 2022

Kevin Hyekang Joo

Khoa T. Vo

Kashu Yamazaki

Ngan Le

234

09 Dec 2022

Gaussian Radar Transformer for Semantic Segmentation in Noisy Radar Data

IEEE Robotics and Automation Letters (RA-L), 2022

Matthias Zeller

Jens Behley

Michael Heidingsfeld

C. Stachniss

249

07 Dec 2022

Hierarchical multimodal transformers for Multi-Page DocVQA

Pattern Recognition (Pattern Recogn.), 2022

Rubèn Pérez Tito

Dimosthenis Karatzas

Ernest Valveny

266

07 Dec 2022

Transformers for End-to-End InfoSec Tasks: A Feasibility Study

Ethan M. Rudd

Mohammad Saidur Rahman

Philip Tully

212

05 Dec 2022

Meta-Learning Fast Weight Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

208

05 Dec 2022

LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition

Yuguang Yang

Yu Pan

Jingjing Yin

Heng Lu

252

05 Dec 2022

NBC2: Multichannel Speech Separation with Revised Narrow-band Conformer

Changsheng Quan

Xiaofei Li

161

05 Dec 2022

Language Models as Agent Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Jacob Andreas

LLMAG

272

169

03 Dec 2022

A Domain-Knowledge-Inspired Music Embedding Space and a Novel Attention Mechanism for Symbolic Music Modeling

AAAI Conference on Artificial Intelligence (AAAI), 2022

Z. Guo

J. Kang

Dorien Herremans

133

02 Dec 2022

ResFormer: Scaling ViTs with Multi-Resolution Training

Computer Vision and Pattern Recognition (CVPR), 2022

Zuxuan Wu

Yu Qiao

257

01 Dec 2022

Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images

Meng Wang

Yong Liu

232

01 Dec 2022

Protein Language Models and Structure Prediction: Connection and Progression

Cheng Tan

Stan Z. Li

220

30 Nov 2022

Survey on Self-Supervised Multimodal Representation Learning and Foundation Models

Sushil Thapa

AI4TS SSL

103

29 Nov 2022

VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning

AAAI Conference on Artificial Intelligence (AAAI), 2022

Kashu Yamazaki

Khoa T. Vo

Sang Truong

Bhiksha Raj

Ngan Le

277

28 Nov 2022

MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Jiuxiang Gu

165

27 Nov 2022

Deep representation learning: Fundamentals, Perspectives, Applications, and Open Challenges

Somayeh Bakhtiari Ramezani

FaML AI4TS

226

27 Nov 2022

A Survey of Text Representation Methods and Their Genealogy

IEEE Access (IEEE Access), 2022

119

26 Nov 2022

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

Proceedings of the VLDB Endowment (PVLDB), 2022

Xupeng Miao

Yujie Wang

Youhe Jiang

Xiaonan Nie

238

25 Nov 2022

DBA: Efficient Transformer with Dynamic Bilinear Low-Rank Attention

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022

178

24 Nov 2022

Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Zhijun Wang

Xuebo Liu

Min Zhang

343

23 Nov 2022

LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

217

20 Nov 2022

Efficient Transformers with Dynamic Token Pooling

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

251

17 Nov 2022

Hypergraph Transformer for Skeleton-based Action Recognition

328

17 Nov 2022

ReLER@ZJU Submission to the Ego4D Moment Queries Challenge 2022

Jiayi Shao

Xiaohan Wang

Yi Yang

149

17 Nov 2022

Parameter-Efficient Transformer with Hybrid Axial-Attention for Medical Image Segmentation

111

17 Nov 2022

ComMU: Dataset for Combinatorial Music Generation

Neural Information Processing Systems (NeurIPS), 2022

182

17 Nov 2022

Deep Emotion Recognition in Textual Conversations: A Survey

Artificial Intelligence Review (Artif Intell Rev), 2022

Patrícia Pereira

Helena Moniz

Joao Paulo Carvalho

460

16 Nov 2022

Token Turing Machines

Computer Vision and Pattern Recognition (CVPR), 2022

258

16 Nov 2022

An Overview on Controllable Text Generation via Variational Auto-Encoders

Haoqin Tu

Yitong Li

BDL

182

15 Nov 2022

YM2413-MDB: A Multi-Instrumental FM Video Game Music Dataset with Emotion Annotations

International Society for Music Information Retrieval Conference (ISMIR), 2022

160

14 Nov 2022

Creative Writing with an AI-Powered Writing Assistant: Perspectives from Professional Writers

231

124

09 Nov 2022

Cross-Attention is all you need: Real-Time Streaming Transformers for Personalised Speech Enhancement

Shucong Zhang

Malcolm Chadwick

Alberto Gil C. P. Ramos

S. Bhattacharya

134

08 Nov 2022

Self-conditioned Embedding Diffusion for Text Generation

...

239

107

08 Nov 2022

Linear Self-Attention Approximation via Trainable Feedforward Kernel

International Conference on Artificial Neural Networks (ICANN), 2022

Uladzislau Yorsh

Alexander Kovalenko

271

08 Nov 2022

BERT-Deep CNN: State-of-the-Art for Sentiment Analysis of COVID-19 Tweets

Social Network Analysis and Mining (SNAM), 2022

Javad Hassannataj Joloudari

209

04 Nov 2022

Circling Back to Recurrent Models of Language

Gábor Melis

238

03 Nov 2022

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

P. Swietojanski

Stefan Braun

Dogan Can

Thiago Fraga da Silva

...

246

02 Nov 2022

Processing Long Legal Documents with Pre-trained Transformers: Modding LegalBERT and Longformer

264

02 Nov 2022

SDMuse: Stochastic Differential Music Editing and Generation via Hybrid Representation

IEEE Transactions on Multimedia (IEEE TMM), 2022

240

01 Nov 2022

Accelerating Distributed MoE Training and Inference with Lina

USENIX Annual Technical Conference (USENIX ATC), 2022

224

110

31 Oct 2022