ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2001.04451
  4. Cited By
Reformer: The Efficient Transformer

Reformer: The Efficient Transformer

13 January 2020
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
    VLM
ArXivPDFHTML

Papers citing "Reformer: The Efficient Transformer"

25 / 375 papers shown
Title
An Empirical Study on Large-Scale Multi-Label Text Classification
  Including Few and Zero-Shot Labels
An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels
Ilias Chalkidis
Manos Fergadiotis
Sotiris Kotitsas
Prodromos Malakasiotis
Nikolaos Aletras
Ion Androutsopoulos
VLM
AI4TS
12
84
0
04 Oct 2020
Rethinking Attention with Performers
Rethinking Attention with Performers
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
...
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
8
1,517
0
30 Sep 2020
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue
  Systems
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems
Andrea Madotto
Samuel Cahyawijaya
Genta Indra Winata
Yan Xu
Zihan Liu
Zhaojiang Lin
Pascale Fung
34
59
0
28 Sep 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot
  Learners
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick
Hinrich Schütze
22
953
0
15 Sep 2020
Efficient Transformers: A Survey
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
74
1,101
0
14 Sep 2020
Sparsifying Transformer Models with Trainable Representation Pooling
Sparsifying Transformer Models with Trainable Representation Pooling
Michal Pietruszka
Łukasz Borchmann
Lukasz Garncarek
13
10
0
10 Sep 2020
Aligning AI With Shared Human Values
Aligning AI With Shared Human Values
Dan Hendrycks
Collin Burns
Steven Basart
Andrew Critch
J. Li
D. Song
Jacob Steinhardt
32
515
0
05 Aug 2020
S2RMs: Spatially Structured Recurrent Modules
S2RMs: Spatially Structured Recurrent Modules
Nasim Rahaman
Anirudh Goyal
Muhammad Waleed Gondal
M. Wuthrich
Stefan Bauer
Yash Sharma
Yoshua Bengio
Bernhard Schölkopf
21
14
0
13 Jul 2020
Recurrent Quantum Neural Networks
Recurrent Quantum Neural Networks
Johannes Bausch
21
151
0
25 Jun 2020
Sparse GPU Kernels for Deep Learning
Sparse GPU Kernels for Deep Learning
Trevor Gale
Matei A. Zaharia
C. Young
Erich Elsen
10
227
0
18 Jun 2020
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine
  Translation
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Jungo Kasai
Nikolaos Pappas
Hao Peng
James Cross
Noah A. Smith
30
134
0
18 Jun 2020
Dynamic Tensor Rematerialization
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
16
93
0
17 Jun 2020
Input-independent Attention Weights Are Expressive Enough: A Study of
  Attention in Self-supervised Audio Transformers
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers
Tsung-Han Wu
Chun-Chen Hsieh
Yen-Hao Chen
Po-Han Chi
Hung-yi Lee
18
1
0
09 Jun 2020
Linformer: Self-Attention with Linear Complexity
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
58
1,645
0
08 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
62
2,614
0
05 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
25
7
0
05 Jun 2020
General-Purpose User Embeddings based on Mobile App Usage
General-Purpose User Embeddings based on Mobile App Usage
Junqi Zhang
Bing Bai
Ye Lin
Jian Liang
Kun Bai
Fei-Yue Wang
27
35
0
27 May 2020
Multiresolution and Multimodal Speech Recognition with Transformers
Multiresolution and Multimodal Speech Recognition with Transformers
Georgios Paraskevopoulos
Srinivas Parthasarathy
Aparna Khare
Shiva Sundaram
18
29
0
29 Apr 2020
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical
  Encoder for Long-Form Document Matching
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching
Liu Yang
Mingyang Zhang
Cheng Li
Michael Bendersky
Marc Najork
27
86
0
26 Apr 2020
Vector Quantized Contrastive Predictive Coding for Template-based Music
  Generation
Vector Quantized Contrastive Predictive Coding for Template-based Music Generation
Gaëtan Hadjeres
Léopold Crestel
26
18
0
21 Apr 2020
Residual Attention U-Net for Automated Multi-Class Segmentation of
  COVID-19 Chest CT Images
Residual Attention U-Net for Automated Multi-Class Segmentation of COVID-19 Chest CT Images
Xiaocong Chen
Lina Yao
Yu Zhang
28
197
0
12 Apr 2020
Longformer: The Long-Document Transformer
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
28
3,913
0
10 Apr 2020
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine
  Translation
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato
Yves Scherrer
Jörg Tiedemann
22
92
0
24 Feb 2020
Multivariate Probabilistic Time Series Forecasting via Conditioned
  Normalizing Flows
Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows
Kashif Rasul
Abdul-Saboor Sheikh
Ingmar Schuster
Urs M. Bergmann
Roland Vollgraf
BDL
AI4TS
AI4CE
22
179
0
14 Feb 2020
Faster Neural Network Training with Approximate Tensor Operations
Faster Neural Network Training with Approximate Tensor Operations
Menachem Adelman
Kfir Y. Levy
Ido Hakimi
M. Silberstein
21
26
0
21 May 2018
Previous
12345678