arXiv: 2006.03236
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
5 June 2020
Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
Papers citing "Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing"
Showing 50 of 141 citing papers
How Good Are SOTA Fake News Detectors
Matthew Iceland · 04 Aug 2023

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition
Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li · 26 Jul 2023

X-CapsNet For Fake News Detection
Mohammad Hadi Goldani, Reza Safabakhsh, S. Momtazi · GNN, MedIm · 23 Jul 2023

Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks
Ryosuke Korekata, Motonari Kambara, Yusuke Yoshida, Shintaro Ishikawa, Yosuke Kawasaki, Masaki Takahashi, K. Sugiura · LM&Ro · 14 Jul 2023

Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation
Dahyun Kang, Piotr Koniusz, Minsu Cho, Naila Murray · VLM, ViT · 07 Jul 2023

Enhancing the Protein Tertiary Structure Prediction by Multiple Sequence Alignment Generation
Le Zhang, Jiayang Chen, Tao Shen, Yu-Hu Li, S. Sun · 02 Jun 2023

Hierarchical Attention Encoder Decoder
Asier Mujika · BDL · 01 Jun 2023

A Quantitative Review on Language Model Efficiency Research
Meng-Long Jiang, Hy Dang, Lingbo Tong · 28 May 2023

Bridging the Granularity Gap for Acoustic Modeling
Chen Xu, Yuhao Zhang, Chengbo Jiao, Xiaoqian Liu, Chi Hu, Xin Zeng, Tong Xiao, Anxiang Ma, Huizhen Wang, JingBo Zhu · 27 May 2023

Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurélien Lucchi, Thomas Hofmann · 25 May 2023

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Ziwei He, Meng-Da Yang, Minwei Feng, Jingcheng Yin, X. Wang, Jingwen Leng, Zhouhan Lin · ViT · 24 May 2023

FIT: Far-reaching Interleaved Transformers
Ting-Li Chen, Lala Li · 22 May 2023

Fast-StrucTexT: An Efficient Hourglass Transformer with Modality-guided Dynamic Token Merge for Document Understanding
Mingliang Zhai, Yulin Li, Xiameng Qin, Chen Yi, Qunyi Xie, Chengquan Zhang, Kun Yao, Yuwei Wu, Yunde Jia · 19 May 2023

Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng, Cole Hawkins, Min-Fong Hong, Aston Zhang, Nikolaos Pappas, Vikas Singh, Shuai Zheng · 07 May 2023

Multimodal Graph Transformer for Multimodal Question Answering
Xuehai He, Xin Eric Wang · 30 Apr 2023

Multiscale Audio Spectrogram Transformer for Efficient Audio Classification
Wenjie Zhu, M. Omar · 19 Mar 2023

How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks
Xuanting Chen, Junjie Ye, Can Zu, Nuo Xu, Rui Zheng, Minlong Peng, Jie Zhou, Tao Gui, Qi Zhang, Xuanjing Huang · AI4MH, ELM · 01 Mar 2023

The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment
Jared Fernandez, Jacob Kahn, Clara Na, Yonatan Bisk, Emma Strubell · FedML · 13 Feb 2023

Efficient Attention via Control Variates
Lin Zheng, Jianbo Yuan, Chong-Jun Wang, Lingpeng Kong · 09 Feb 2023

Alternating Updates for Efficient Transformers
Cenk Baykal, D. Cutler, Nishanth Dikkala, Nikhil Ghosh, Rina Panigrahy, Xin Wang · MoE · 30 Jan 2023

NarrowBERT: Accelerating Masked Language Model Pretraining and Inference
Haoxin Li, Phillip Keung, Daniel Cheng, Jungo Kasai, Noah A. Smith · 11 Jan 2023

Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping, Tom Goldstein · MoE · 28 Dec 2022

A Survey of Text Representation Methods and Their Genealogy
Philipp Siebers, Christian Janiesch, Patrick Zschech · AI4TS · 26 Nov 2022

Efficient Transformers with Dynamic Token Pooling
Piotr Nawrot, J. Chorowski, Adrian Lañcucki, E. Ponti · 17 Nov 2022

Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan Pisula, Katarzyna Bozek · VLM, MedIm · 14 Nov 2022

Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu, Sang Michael Xie, Zhiyuan Li, Tengyu Ma · AI4CE · 25 Oct 2022

DiscoSense: Commonsense Reasoning with Discourse Connectives
Prajjwal Bhargava, Vincent Ng · LRM · 22 Oct 2022

RedApt: An Adaptor for wav2vec 2 Encoding, Faster and Smaller Speech Translation without Quality Compromise
Jinming Zhao, Haomiao Yang, Gholamreza Haffari, Ehsan Shareghi · VLM · 16 Oct 2022

CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang, Shuyang Jiang, Jiangtao Feng, Lin Zheng, Lingpeng Kong · 3DV · 14 Oct 2022

Survey: Exploiting Data Redundancy for Optimization of Deep Learning
Jou-An Chen, Wei Niu, Bin Ren, Yanzhi Wang, Xipeng Shen · 29 Aug 2022

Generating Coherent Narratives by Learning Dynamic and Discrete Entity States with a Contrastive Framework
Jian-Yu Guan, Zhenyu Yang, Rongsheng Zhang, Zhipeng Hu, Minlie Huang · 08 Aug 2022

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?
Yi Tay, Mostafa Dehghani, Samira Abnar, Hyung Won Chung, W. Fedus, J. Rao, Sharan Narang, Vinh Q. Tran, Dani Yogatama, Donald Metzler · AI4CE · 21 Jul 2022

Long Range Language Modeling via Gated State Spaces
Harsh Mehta, Ankit Gupta, Ashok Cutkosky, Behnam Neyshabur · Mamba · 27 Jun 2022

Megapixel Image Generation with Step-Unrolled Denoising Autoencoders
Alex F. McKinney, Chris G. Willcocks · DiffM · 24 Jun 2022

A Survey of Deep Learning Models for Structural Code Understanding
Ruoting Wu, Yu-xin Zhang, Qibiao Peng, Liang Chen, Zibin Zheng · 03 May 2022

Triformer: Triangular, Variable-Specific Attentions for Long Sequence Multivariate Time Series Forecasting--Full Version
Razvan-Gabriel Cirstea, Chenjuan Guo, B. Yang, Tung Kieu, Xuanyi Dong, Shirui Pan · AI4TS · 28 Apr 2022

ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, ..., Ji Ma, Jai Gupta, Cicero Nogueira dos Santos, Yi Tay, Donald Metzler · 25 Apr 2022

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes
Shaojin Ding, Weiran Wang, Ding Zhao, Tara N. Sainath, Yanzhang He, ..., Qiao Liang, Dongseong Hwang, Ian McGraw, Rohit Prabhavalkar, Trevor Strohman · 13 Apr 2022

Linear Complexity Randomized Self-attention Mechanism
Lin Zheng, Chong-Jun Wang, Lingpeng Kong · 10 Apr 2022

Parameter-efficient Model Adaptation for Vision Transformers
Xuehai He, Chunyuan Li, Pengchuan Zhang, Jianwei Yang, X. Wang · 29 Mar 2022

Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection
Xin Huang, A. Khetan, Rene Bidart, Zohar S. Karnin · 27 Mar 2022

Token Dropping for Efficient BERT Pretraining
Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou · 24 Mar 2022

A Context-Aware Feature Fusion Framework for Punctuation Restoration
Yangjun Wu, Kebin Fang, Yao Zhao · 23 Mar 2022

Clickbait Spoiling via Question Answering and Passage Retrieval
Matthias Hagen, Maik Frobe, Artur Jurk, Martin Potthast · 19 Mar 2022

Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video
Bin Li, Yixuan Weng, Bin Sun, Shutao Li · 13 Mar 2022

Block-Recurrent Transformers
DeLesley S. Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur · 11 Mar 2022

Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents
Yicheng Zou, Hongwei Liu, Tao Gui, Junzhe Wang, Qi Zhang, M. Tang, Haixiang Li, Dan Wang · DRL · 06 Mar 2022

HiP: Hierarchical Perceiver
João Carreira, Skanda Koppula, Daniel Zoran, Adrià Recasens, Catalin Ionescu, ..., M. Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle · VLM · 22 Feb 2022

General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne, Andrew Jaegle, Cătălina Cangea, Sebastian Borgeaud, C. Nash, ..., Hannah R. Sheahan, Neil Zeghidour, Jean-Baptiste Alayrac, João Carreira, Jesse Engel · 15 Feb 2022

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data
Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney · 06 Feb 2022