Perceiver: General Perception with Iterative Attention

International Conference on Machine Learning (ICML), 2021
4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM ViT MDE

Papers citing "Perceiver: General Perception with Iterative Attention"

50 / 790 papers shown
Wayformer: Motion Forecasting via Simple & Efficient Attention Networks
IEEE International Conference on Robotics and Automation (ICRA), 2022
Nigamaa Nayakanti
Rami Al-Rfou
Aurick Zhou
Kratarth Goel
Khaled S. Refaat
Benjamin Sapp
AI4TS
303
350
0
12 Jul 2022
MaiT: Leverage Attention Masks for More Efficient Image Transformers
Ling Li
Ali Shafiee Ardestani
Joseph Hassoun
123
1
0
06 Jul 2022
Pure Transformers are Powerful Graph Learners
Neural Information Processing Systems (NeurIPS), 2022
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
384
248
0
06 Jul 2022
Softmax-free Linear Transformers
International Journal of Computer Vision (IJCV), 2022
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
209
14
0
05 Jul 2022
Conditioned Human Trajectory Prediction using Iterative Attention Blocks
IEEE International Conference on Robotics and Automation (ICRA), 2022
A. Postnikov
A. Gamayunov
Gonzalo Ferrer
168
4
0
29 Jun 2022
Deformable Graph Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jinyoung Park
Seongjun Yun
Hyeon-ju Park
Jaewoo Kang
Jisu Jeong
KyungHyun Kim
Jung-Woo Ha
Hyunwoo J. Kim
244
11
0
29 Jun 2022
A Unified Sequence Interface for Vision Tasks
Neural Information Processing Systems (NeurIPS), 2022
Ting-Li Chen
Saurabh Saxena
Lala Li
Nayeon Lee
David J. Fleet
Geoffrey E. Hinton
VLM MLLM
208
171
0
15 Jun 2022
Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises
Neural Computation (Neural Comput.), 2022
Minkyu Choi
Yizhen Zhang
Kuan Han
Xiaokai Wang
Zhongming Liu
AAML GAN
139
6
0
15 Jun 2022
It's Time for Artistic Correspondence in Music and Video
Computer Vision and Pattern Recognition (CVPR), 2022
Dídac Surís
Carl Vondrick
Bryan C. Russell
Justin Salamon
147
42
0
14 Jun 2022
Peripheral Vision Transformer
Neural Information Processing Systems (NeurIPS), 2022
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT MDE
238
35
0
14 Jun 2022
Multimodal Learning with Transformers: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Peng Xu
Xiatian Zhu
David Clifton
ViT
538
836
0
13 Jun 2022
Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens
Elad Ben-Avraham
Roei Herzig
K. Mangalam
Amir Bar
Anna Rohrbach
Leonid Karlinsky
Trevor Darrell
Amir Globerson
300
0
0
13 Jun 2022
ChordMixer: A Scalable Neural Attention Model for Sequences with Different Lengths
International Conference on Learning Representations (ICLR), 2022
Ruslan Khalitov
Tong Yu
Lei Cheng
Zhirong Yang
201
15
0
12 Jun 2022
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs
Neural Information Processing Systems (NeurIPS), 2022
Jinguo Zhu
Xizhou Zhu
Wenhai Wang
Xiaohua Wang
Jiaming Song
Xiaogang Wang
Jifeng Dai
MoMe MoE
301
84
0
09 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Computer Vision and Pattern Recognition (CVPR), 2022
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
237
52
0
09 Jun 2022
Revealing Single Frame Bias for Video-and-Language Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Jie Lei
Tamara L. Berg
Joey Tianyi Zhou
236
141
0
07 Jun 2022
Fair Classification via Transformer Neural Networks: Case Study of an Educational Domain
Modar Sulaiman
Kallol Roy
226
0
0
03 Jun 2022
SymFormer: End-to-end symbolic regression using transformer-based architecture
IEEE Access (IEEE Access), 2022
Martin Vastl
Jonáš Kulhánek
Jiří Kubalík
Erik Derner
Robert Babuška
379
77
0
31 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Neural Information Processing Systems (NeurIPS), 2022
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
454
21
0
30 May 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng
Hao Liu
Lisa Lee
Dale Schuurmans
Sergey Levine
Pieter Abbeel
346
132
0
27 May 2022
Transformer for Partial Differential Equations' Operator Learning
Zijie Li
Kazem Meidani
A. Farimani
358
252
0
26 May 2022
Semi-Parametric Inducing Point Networks and Neural Processes
International Conference on Learning Representations (ICLR), 2022
R. Rastogi
Yair Schiff
Alon Hacohen
Zhaozhi Li
I-Hsiang Lee
Yuntian Deng
M. Sabuncu
Volodymyr Kuleshov
3DPC
288
8
0
24 May 2022
Dynamic Query Selection for Fast Visual Perceiver
Corentin Dancette
Matthieu Cord
134
1
0
22 May 2022
Equivariant Mesh Attention Networks
Sourya Basu
Jose Gallego-Posada
Francesco Vigano
J. Rowbottom
Taco S. Cohen
3DPC MDE AI4CE
220
11
0
21 May 2022
Visual Concepts Tokenization
Neural Information Processing Systems (NeurIPS), 2022
Tao Yang
Yuwang Wang
Yan Lu
Nanning Zheng
OCL ViT
234
16
0
20 May 2022
Towards Unified Keyframe Propagation Models
Patrick Esser
Peter Michael
Soumyadip Sengupta
VGen
122
0
0
19 May 2022
Meta-Learning Sparse Compression Networks
Jonathan Richard Schwarz
Yee Whye Teh
255
29
0
18 May 2022
Vision Transformer Adapter for Dense Predictions
International Conference on Learning Representations (ICLR), 2022
Zhe Chen
Yuchen Duan
Wenhai Wang
Junjun He
Tong Lu
Jifeng Dai
Yu Qiao
878
755
0
17 May 2022
CONSENT: Context Sensitive Transformer for Bold Words Classification
Ionut Sandu
Daniel Voinea
A. Popa
137
4
0
16 May 2022
ImageSig: A signature transform for ultra-lightweight image recognition
Mohamed Ramzy Ibrahim
Terry Lyons
VLM
205
7
0
13 May 2022
Cross Domain Object Detection by Target-Perceived Dual Branch Distillation
Computer Vision and Pattern Recognition (CVPR), 2022
Meng He
Yali Wang
Jiaxi Wu
Yiru Wang
Hanqing Li
Yue Liu
Weihao Gan
Wei Wu
Yu Qiao
217
94
0
03 May 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Neural Information Processing Systems (NeurIPS), 2022
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM VLM
695
4,826
0
29 Apr 2022
Autonomous In-Situ Soundscape Augmentation via Joint Selection of Masker and Gain
IEEE Signal Processing Letters (SPL), 2022
Karn N. Watcharasupat
Kenneth Ooi
Bhan Lam
Trevor Wong
Zhen-Ting Ong
W. Gan
146
8
0
29 Apr 2022
Pseudo strong labels for large scale weakly supervised audio tagging
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
109
8
0
28 Apr 2022
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction
Computer Vision and Pattern Recognition (CVPR), 2022
Alexandros Stergiou
Dima Damen
AI4TS EgoV EDL
170
14
0
28 Apr 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
279
290
0
27 Apr 2022
Revealing Occlusions with 4D Neural Fields
Computer Vision and Pattern Recognition (CVPR), 2022
Basile Van Hoorick
Purva Tendulkar
Dídac Surís
Dennis Park
Simon Stent
Carl Vondrick
145
23
0
22 Apr 2022
Future Object Detection with Spatiotemporal Transformers
Adam Tonderski
Joakim Johnander
Christoffer Petersson
Kalle Åström
ViT
186
1
0
21 Apr 2022
Visio-Linguistic Brain Encoding
International Conference on Computational Linguistics (COLING), 2022
R. Mamidi
Jashn Arora
Vijay Rowtula
Subba Reddy Oota
R. Bapi
AI4CE
99
20
0
18 Apr 2022
Visual Attention Methods in Deep Learning: An In-Depth Survey
Information Fusion (Inf. Fusion), 2022
Mohammed Hassanin
Saeed Anwar
Ibrahim Radwan
Fahad Shahbaz Khan
Lin Wang
338
245
0
16 Apr 2022
Malceiver: Perceiver with Hierarchical and Multi-modal Features for Android Malware Detection
Niall McLaughlin
143
2
0
12 Apr 2022
Probabilistic Compositional Embeddings for Multimodal Image Retrieval
Andrei Neculai
Yanbei Chen
Zeynep Akata
CoGe
260
42
0
12 Apr 2022
Linear Complexity Randomized Self-attention Mechanism
International Conference on Machine Learning (ICML), 2022
Lin Zheng
Chong-Jun Wang
Lingpeng Kong
183
34
0
10 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Interspeech (Interspeech), 2022
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
241
119
0
07 Apr 2022
Event Transformer. A sparse-aware solution for efficient event data processing
Alberto Sabater
Luis Montesano
Ana C. Murillo
222
67
0
07 Apr 2022
ReSTR: Convolution-free Referring Image Segmentation Using Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
N. Kim
Dongwon Kim
Cuiling Lan
Wenjun Zeng
Suha Kwak
342
178
0
31 Mar 2022
RFNet-4D++: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds with Cross-Attention Spatio-Temporal Features
Tuan-Anh Vu
D. Nguyen
Binh-Son Hua
Quang Pham
Sai-Kit Yeung
3DPC
231
5
0
30 Mar 2022
Unsupervised Learning of Temporal Abstractions with Slot-based Transformers
Neural Computation (Neural Comput.), 2022
Anand Gopalakrishnan
Kazuki Irie
Jürgen Schmidhuber
Sjoerd van Steenkiste
OffRL
387
18
0
25 Mar 2022
Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild
European Conference on Computer Vision (ECCV), 2022
A. S. Tripathi
Martin Danelljan
Samarth Shukla
Radu Timofte
Luc Van Gool
287
12
0
20 Mar 2022
Integrating Language Guidance into Vision-based Deep Metric Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Karsten Roth
Oriol Vinyals
Zeynep Akata
VLM
207
31
0
16 Mar 2022