Perceiver: General Perception with Iterative Attention

International Conference on Machine Learning (ICML), 2021
4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM, ViT, MDE

Papers citing "Perceiver: General Perception with Iterative Attention"

50 / 792 papers shown
Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild
European Conference on Computer Vision (ECCV), 2022
A. S. Tripathi
Martin Danelljan
Samarth Shukla
Radu Timofte
Luc Van Gool
301
12
0
20 Mar 2022
Integrating Language Guidance into Vision-based Deep Metric Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Karsten Roth
Oriol Vinyals
Zeynep Akata
VLM
214
31
0
16 Mar 2022
Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs
Taichi Iki
Akiko Aizawa
LLMAG
203
6
0
15 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
European Conference on Computer Vision (ECCV), 2022
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
Wen Liu
Yonghong Tian
Liuliang Yuan
3DPC, ViT
290
626
0
13 Mar 2022
Block-Recurrent Transformers
Neural Information Processing Systems (NeurIPS), 2022
DeLesley S. Hutchins
Imanol Schlag
Yuhuai Wu
Ethan Dyer
Behnam Neyshabur
450
132
0
11 Mar 2022
Geodesic Multi-Modal Mixup for Robust Fine-Tuning
Neural Information Processing Systems (NeurIPS), 2022
Changdae Oh
Junhyuk So
Hoyoon Byun
Yongtaek Lim
Minchul Shin
Jong-June Jeon
Kyungwoo Song
458
39
0
08 Mar 2022
High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Jeffrey Tsaw
Yudong Liu
Shentong Mo
Dani Yogatama
Louis-Philippe Morency
Ruslan Salakhutdinov
230
43
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
239
20
0
01 Mar 2022
Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luís Vilacca
Yi Yu
Paula Viana
242
11
0
28 Feb 2022
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph
International Conference on Learning Representations (ICLR), 2022
Dacheng Yin
Xuanchi Ren
Chong Luo
Yuwang Wang
Zhiwei Xiong
Wenjun Zeng
272
13
0
24 Feb 2022
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models
International Conference on Learning Representations (ICLR), 2022
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
272
4
0
24 Feb 2022
Learning to Merge Tokens in Vision Transformers
Cédric Renggli
André Susano Pinto
N. Houlsby
Basil Mustafa
J. Puigcerver
C. Riquelme
MoMe
227
79
0
24 Feb 2022
Better Modelling Out-of-Distribution Regression on Distributed Acoustic Sensor Data Using Anchored Hidden State Mixup
IEEE Transactions on Industrial Informatics (IEEE TII), 2022
Hasan Asy’ari Arief
P. J. Thomas
T. Wiktorski
OOD
99
6
0
23 Feb 2022
HiP: Hierarchical Perceiver
João Carreira
Skanda Koppula
Daniel Zoran
Adrià Recasens
Catalin Ionescu
...
M. Botvinick
Oriol Vinyals
Karen Simonyan
Andrew Zisserman
Andrew Jaegle
VLM
362
14
0
22 Feb 2022
Transformer Quality in Linear Time
International Conference on Machine Learning (ICML), 2022
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
493
302
0
21 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
International Conference on Machine Learning (ICML), 2022
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
242
76
0
15 Feb 2022
SpeechPainter: Text-conditioned Speech Inpainting
Interspeech, 2022
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
214
35
0
15 Feb 2022
Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens
International Journal on Document Analysis and Recognition (IJDAR), 2022
Felix Ott
David Rügamer
Lucas Heublein
Tim Hamann
Jens Barth
B. Bischl
Christopher Mutschler
418
20
0
14 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL, VLM, ViT
584
1,037
0
07 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
International Conference on Machine Learning (ICML), 2022
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM, ObjD
525
1,017
0
07 Feb 2022
Webly Supervised Concept Expansion for General Purpose Vision Models
European Conference on Computer Vision (ECCV), 2022
Amita Kamath
Christopher Clark
Tanmay Gupta
Eric Kolve
Derek Hoiem
Aniruddha Kembhavi
VLM
301
68
0
04 Feb 2022
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation
Yi-Fan Zhang
Hanlin Zhang
Zachary Chase Lipton
Li Erran Li
Eric P. Xing
OODD
384
35
0
02 Feb 2022
Learning Super-Features for Image Retrieval
International Conference on Learning Representations (ICLR), 2022
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
SupR, VLM
224
56
0
31 Jan 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices
ACM Computing Surveys (ACM CSUR), 2022
Mikołaj Małkiński
Jacek Mańdziuk
492
54
0
28 Jan 2022
From data to functa: Your data point is a function and you can treat it like one
International Conference on Machine Learning (ICML), 2022
Emilien Dupont
Hyunjik Kim
S. M. Ali Eslami
Danilo Jimenez Rezende
Dan Rosenbaum
TDI, 3DPC
587
186
0
28 Jan 2022
Density-Aware Hyper-Graph Neural Networks for Graph-based Semi-supervised Node Classification
Jianpeng Liao
Qian Tao
Jun Yan
GNN
176
3
0
27 Jan 2022
Omnivore: A Single Model for Many Visual Modalities
Computer Vision and Pattern Recognition (CVPR), 2022
Rohit Girdhar
Mannat Singh
Nikhil Ravi
Laurens van der Maaten
Armand Joulin
Ishan Misra
611
287
0
20 Jan 2022
Video Transformers: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
460
139
0
16 Jan 2022
Latency Adjustable Transformer Encoder for Language Understanding
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Sajjad Kachuee
M. Sharifkhani
590
1
0
10 Jan 2022
Vision Transformer with Deformable Attention
Computer Vision and Pattern Recognition (CVPR), 2022
Zhuofan Xia
Xuran Pan
Qing Xiao
Li Erran Li
Gao Huang
ViT
451
704
0
03 Jan 2022
SeMask: Semantically Masked Transformers for Semantic Segmentation
Jitesh Jain
Anukriti Singh
Nikita Orlov
Zilong Huang
Jiachen Li
Steven Walton
Humphrey Shi
ViT
283
121
0
23 Dec 2021
Learned Queries for Efficient Local Attention
Computer Vision and Pattern Recognition (CVPR), 2021
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
271
36
0
21 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2021
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
DiffM
3.1K
21,434
0
20 Dec 2021
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain
N. Gkanatsios
Ishita Mediratta
Katerina Fragkiadaki
ObjD
492
148
0
16 Dec 2021
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
225
49
0
08 Dec 2021
Input-level Inductive Biases for 3D Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2021
Yifan Wang
Carl Doersch
Relja Arandjelović
João Carreira
Andrew Zisserman
3DV
370
32
0
06 Dec 2021
Hybrid Instance-aware Temporal Fusion for Online Video Instance Segmentation
Xiang Li
Jinglu Wang
Xiao Li
Yan Lu
201
20
0
03 Dec 2021
Efficient Self-Ensemble for Semantic Segmentation
British Machine Vision Conference (BMVC), 2021
Walid Bousselham
Guillaume Thibault
Lucas Pagano
Archana Machireddy
Joe W. Gray
Y. Chang
Xubo B. Song
ViT
292
33
0
26 Nov 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
196
83
0
25 Nov 2021
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
357
264
0
24 Nov 2021
Sparse Fusion for Multimodal Transformers
Yi Ding
Alex Rich
Mason Wang
Noah Stier
M. Turk
P. Sen
Tobias Höllerer
ViT
169
9
0
23 Nov 2021
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture
Daria Bakshandaeva
Denis Dimitrov
V.Ya. Arkhipkin
Alex Shonenkov
M. Potanin
...
Mikhail Martynov
Anton Voronov
Vera Davydova
E. Tutubalina
Aleksandr Petiushko
381
0
0
22 Nov 2021
Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
Jaesin Ahn
Jiuk Hong
Jeongwoo Ju
Heechul Jung
ViT
241
4
0
19 Nov 2021
Edge-Native Intelligence for 6G Communications Driven by Federated Learning: A Survey of Trends and Challenges
IEEE Transactions on Emerging Topics in Computational Intelligence (IEEE TETCI), 2021
Mohammad M. Al-Quraan
Lina S. Mohjazi
Lina Bariah
A. Centeno
A. Zoha
Sami Muhaidat
Mérouane Debbah
Muhammad Ali Imran
194
94
0
14 Nov 2021
Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention
British Machine Vision Conference (BMVC), 2021
S. Tan
Runpei Dong
Kaisheng Ma
344
2
0
03 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
British Machine Vision Conference (BMVC), 2021
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
297
55
0
01 Nov 2021
Hyper-Representations: Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction
Konstantin Schurholt
Dimche Kostadinov
Damian Borth
SSL
416
15
0
28 Oct 2021
SOFT: Softmax-free Transformer with Linear Complexity
Neural Information Processing Systems (NeurIPS), 2021
Jiachen Lu
Jinghan Yao
Junge Zhang
Martin Danelljan
Hang Xu
Weiguo Gao
Chunjing Xu
Thomas B. Schon
Li Zhang
241
193
0
22 Oct 2021
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Cyril Zhang
359
150
0
19 Oct 2021
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
173
4
0
18 Oct 2021