1904.10509
Cited By
Generating Long Sequences with Sparse Transformers
23 April 2019
R. Child
Scott Gray
Alec Radford
Ilya Sutskever
Papers citing
"Generating Long Sequences with Sparse Transformers"
50 / 1,283 papers shown
Stepwise Extractive Summarization and Planning with Structured Transformers
Shashi Narayan
Joshua Maynez
Jakub Adamek
Daniele Pighin
Blaž Bratanič
Ryan T. McDonald
179
33
0
06 Oct 2020
Scene Graph Modification Based on Natural Language Commands
Findings, 2020
Xuanli He
Quan Hung Tran
Gholamreza Haffari
Walter Chang
Trung Bui
Zhe Lin
Franck Dernoncourt
Nhan Dam
GNN
203
9
0
06 Oct 2020
Guiding Attention for Self-Supervised Learning with Transformers
Findings, 2020
Ameet Deshpande
Karthik Narasimhan
157
22
0
06 Oct 2020
Which *BERT? A Survey Organizing Contextualized Encoders
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Patrick Xia
Shijie Wu
Benjamin Van Durme
223
53
0
02 Oct 2020
Rethinking Attention with Performers
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
...
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
781
1,979
0
30 Sep 2020
Learning Hard Retrieval Decoder Attention for Transformers
Hongfei Xu
Qiuhui Liu
Josef van Genabith
Deyi Xiong
118
1
0
30 Sep 2020
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems
Andrea Madotto
Samuel Cahyawijaya
Genta Indra Winata
Yan Xu
Zihan Liu
Mohammad Kachuee
Pascale Fung
296
66
0
28 Sep 2020
Current Limitations of Language Models: What You Need is Retrieval
Aran Komatsuzaki
LRM
124
3
0
15 Sep 2020
Efficient Transformers: A Survey
ACM Computing Surveys (ACM CSUR), 2020
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
866
1,362
0
14 Sep 2020
Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding
Findings, 2020
Shuohang Wang
Luowei Zhou
Zhe Gan
Yen-Chun Chen
Yuwei Fang
S. Sun
Yu Cheng
Jingjing Liu
249
32
0
13 Sep 2020
Sparsifying Transformer Models with Trainable Representation Pooling
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Michal Pietruszka
Łukasz Borchmann
Lukasz Garncarek
259
13
0
10 Sep 2020
Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2020
Rui Li
Shunyi Zheng
Chenxi Duan
Ce Zhang
Jianlin Su
P. M. Atkinson
SSeg
422
496
0
03 Sep 2020
Accelerating Sparse DNN Models without Hardware-Support via Tile-Wise Sparsity
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2020
Cong Guo
B. Hsueh
Jingwen Leng
Yuxian Qiu
Yue Guan
Zehuan Wang
Xiaoying Jia
Xipeng Li
Minyi Guo
Yuhao Zhu
167
90
0
29 Aug 2020
Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation
IEEE Transactions on Image Processing (TIP), 2020
Yurui Ren
Ge Li
Shan Liu
Thomas H. Li
3DH
270
75
0
27 Aug 2020
AMBERT: A Pre-trained Language Model with Multi-Grained Tokenization
Findings, 2020
Xinsong Zhang
Pengshuai Li
Hang Li
391
56
0
27 Aug 2020
Generating Music with a Self-Correcting Non-Chronological Autoregressive Model
Wayne Chi
Prachi Kumar
Suri Yaddanapudi
Rahul Suresh
Umut Isik
KELM
216
10
0
18 Aug 2020
PopMAG: Pop Music Accompaniment Generation
Yi Ren
Jinzheng He
Xu Tan
Tao Qin
Zhou Zhao
Tie-Yan Liu
225
132
0
18 Aug 2020
HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu
Tri Dao
Stefano Ermon
Atri Rudra
Christopher Ré
407
813
0
17 Aug 2020
Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size
Davis Yoshida
Allyson Ettinger
Kevin Gimpel
AI4CE
200
7
0
16 Aug 2020
Compression of Deep Learning Models for Text: A Survey
ACM Transactions on Knowledge Discovery from Data (TKDD), 2020
Manish Gupta
Puneet Agrawal
VLM
MedIm
AI4CE
511
134
0
12 Aug 2020
DeLighT: Deep and Light-weight Transformer
Sachin Mehta
Marjan Ghazvininejad
Srini Iyer
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
249
34
0
03 Aug 2020
The Chess Transformer: Mastering Play using Generative Language Models
David Noever
Matt Ciolino
Josh Kalin
574
45
0
02 Aug 2020
Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea
Qiaozhu Mei
362
30
0
31 Jul 2020
Linear Attention Mechanism: An Efficient Attention for Semantic Segmentation
Rui Li
Jianlin Su
Chenxi Duan
Shunyi Zheng
3DV
172
47
0
29 Jul 2020
TensorCoder: Dimension-Wise Attention via Tensor Representation for Natural Language Modeling
Shuai Zhang
Peng Zhang
Xindian Ma
Junqiu Wei
Ning Wang
Qun Liu
122
5
0
28 Jul 2020
Big Bird: Transformers for Longer Sequences
Neural Information Processing Systems (NeurIPS), 2020
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
1.3K
2,532
0
28 Jul 2020
Cloud Transformers: A Universal Approach To Point Cloud Processing Tasks
Kirill Mazur
Victor Lempitsky
3DPC
400
49
0
22 Jul 2020
DeepSVG: A Hierarchical Generative Network for Vector Graphics Animation
Alexandre Carlier
Martin Danelljan
Alexandre Alahi
Radu Timofte
517
178
0
22 Jul 2020
Conformer-Kernel with Query Term Independence for Document Retrieval
Bhaskar Mitra
Sebastian Hofstätter
Hamed Zamani
Nick Craswell
172
22
0
20 Jul 2020
Autoregressive Unsupervised Image Segmentation
European Conference on Computer Vision (ECCV), 2020
Yassine Ouali
Céline Hudelot
Myriam Tami
SSL
236
89
0
16 Jul 2020
ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing
bioRxiv (bioRxiv), 2020
Ahmed Elnaggar
M. Heinzinger
Christian Dallago
Ghalia Rehawi
Yu Wang
...
Tamas B. Fehér
Christoph Angerer
Martin Steinegger
D. Bhowmik
B. Rost
DRL
472
1,156
0
13 Jul 2020
Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
725
0
0
12 Jul 2020
Variable Skipping for Autoregressive Range Density Estimation
International Conference on Machine Learning (ICML), 2020
Eric Liang
Zongheng Yang
Ion Stoica
Pieter Abbeel
Yan Duan
Xi Chen
193
4
0
10 Jul 2020
Fast Transformers with Clustered Attention
Neural Information Processing Systems (NeurIPS), 2020
Apoorv Vyas
Angelos Katharopoulos
François Fleuret
281
171
0
09 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
418
168
0
30 Jun 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
François Fleuret
724
2,328
0
29 Jun 2020
Matrix Shuffle-Exchange Networks for Hard 2D Tasks
Emīls Ozoliņš
Kārlis Freivalds
A. Sostaks
99
0
0
29 Jun 2020
Streaming Transformer ASR with Blockwise Synchronous Beam Search
E. Tsunoo
Yosuke Kashiwagi
Shinji Watanabe
313
11
0
25 Jun 2020
Locally Masked Convolution for Autoregressive Models
Ajay Jain
Pieter Abbeel
Deepak Pathak
DiffM
OffRL
203
32
0
22 Jun 2020
Memory Transformer
Andrey Kravchenko
Yuri Kuratov
Anton Peganov
Grigory V. Sapunov
RALM
248
85
0
20 Jun 2020
Denoising Diffusion Probabilistic Models
Jonathan Ho
Ajay Jain
Pieter Abbeel
DiffM
5.1K
26,105
0
19 Jun 2020
Sparse GPU Kernels for Deep Learning
Trevor Gale
Matei A. Zaharia
C. Young
Erich Elsen
270
265
0
18 Jun 2020
A Tutorial on VAEs: From Bayes' Rule to Lossless Compression
Ronald Yu
BDL
160
26
0
18 Jun 2020
SEAL: Segment-wise Extractive-Abstractive Long-form Text Summarization
Yao-Min Zhao
Mohammad Saleh
Peter J. Liu
RALM
167
27
0
18 Jun 2020
Untangling tradeoffs between recurrence and self-attention in neural networks
Giancarlo Kerg
Bhargav Kanuparthi
Anirudh Goyal
Kyle Goyette
Yoshua Bengio
Guillaume Lajoie
170
9
0
16 Jun 2020
AlgebraNets
Jordan Hoffmann
Simon Schmitt
Simon Osindero
Karen Simonyan
Erich Elsen
MoE
410
6
0
12 Jun 2020
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
International Conference on Learning Representations (ICLR), 2020
Ruozi Huang
Huang Hu
Wei Wu
Kei Sawada
Mi Zhang
Daxin Jiang
515
133
0
11 Jun 2020
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers
Tsung-Han Wu
Chun-Chen Hsieh
Yen-Hao Chen
Po-Han Chi
Hung-yi Lee
223
1
0
09 Jun 2020
O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers
Chulhee Yun
Yin-Wen Chang
Srinadh Bhojanapalli
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
230
95
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
450
2,075
0
08 Jun 2020
Page 24 of 26