ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2103.03206
  4. Cited By
Perceiver: General Perception with Iterative Attention

Perceiver: General Perception with Iterative Attention

4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
    VLM
    ViT
    MDE
ArXivPDFHTML

Papers citing "Perceiver: General Perception with Iterative Attention"

50 / 680 papers shown
Title
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple
  Sequence-to-Sequence Learning Framework
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
26
848
0
07 Feb 2022
Webly Supervised Concept Expansion for General Purpose Vision Models
Webly Supervised Concept Expansion for General Purpose Vision Models
Amita Kamath
Christopher Clark
Tanmay Gupta
Eric Kolve
Derek Hoiem
Aniruddha Kembhavi
VLM
17
54
0
04 Feb 2022
Exploring Transformer Backbones for Heterogeneous Treatment Effect
  Estimation
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation
Yi-Fan Zhang
Hanlin Zhang
Zachary Chase Lipton
Li Erran Li
Eric P. Xing
OODD
8
29
0
02 Feb 2022
Learning Super-Features for Image Retrieval
Learning Super-Features for Image Retrieval
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
SupR
VLM
14
45
0
31 Jan 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's
  Progressive Matrices
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices
Mikolaj Malkiñski
Jacek Mañdziuk
107
41
0
28 Jan 2022
From data to functa: Your data point is a function and you can treat it
  like one
From data to functa: Your data point is a function and you can treat it like one
Emilien Dupont
Hyunjik Kim
S. M. Ali Eslami
Danilo Jimenez Rezende
Dan Rosenbaum
TDI
3DPC
160
136
0
28 Jan 2022
Density-Aware Hyper-Graph Neural Networks for Graph-based
  Semi-supervised Node Classification
Density-Aware Hyper-Graph Neural Networks for Graph-based Semi-supervised Node Classification
Jianpeng Liao
Qian Tao
Jun Yan
GNN
12
3
0
27 Jan 2022
Omnivore: A Single Model for Many Visual Modalities
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. V. D. van der Maaten
Armand Joulin
Ishan Misra
209
222
0
20 Jan 2022
Video Transformers: A Survey
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
101
0
16 Jan 2022
Latency Adjustable Transformer Encoder for Language Understanding
Latency Adjustable Transformer Encoder for Language Understanding
Sajjad Kachuee
M. Sharifkhani
24
0
0
10 Jan 2022
Vision Transformer with Deformable Attention
Vision Transformer with Deformable Attention
Zhuofan Xia
Xuran Pan
S. Song
Li Erran Li
Gao Huang
ViT
11
450
0
03 Jan 2022
SeMask: Semantically Masked Transformers for Semantic Segmentation
SeMask: Semantically Masked Transformers for Semantic Segmentation
Jitesh Jain
Anukriti Singh
Nikita Orlov
Zilong Huang
Jiachen Li
Steven Walton
Humphrey Shi
ViT
24
92
0
23 Dec 2021
Learned Queries for Efficient Local Attention
Learned Queries for Efficient Local Attention
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
28
29
0
21 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
50
14,533
0
20 Dec 2021
Bottom Up Top Down Detection Transformers for Language Grounding in
  Images and Point Clouds
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain
N. Gkanatsios
Ishita Mediratta
Katerina Fragkiadaki
ObjD
23
98
0
16 Dec 2021
Audio-Visual Synchronisation in the wild
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
10
37
0
08 Dec 2021
Input-level Inductive Biases for 3D Reconstruction
Input-level Inductive Biases for 3D Reconstruction
Yifan Wang
Carl Doersch
Relja Arandjelović
João Carreira
Andrew Zisserman
3DV
21
24
0
06 Dec 2021
Hybrid Instance-aware Temporal Fusion for Online Video Instance
  Segmentation
Hybrid Instance-aware Temporal Fusion for Online Video Instance Segmentation
Xiang Li
Jinglu Wang
Xiao Li
Yan Lu
22
19
0
03 Dec 2021
Efficient Self-Ensemble for Semantic Segmentation
Efficient Self-Ensemble for Semantic Segmentation
Walid Bousselham
Guillaume Thibault
Lucas Pagano
Archana Machireddy
Joe W. Gray
Y. Chang
Xubo B. Song
ViT
14
24
0
26 Nov 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
33
73
0
25 Nov 2021
Conditional Object-Centric Learning from Video
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
39
213
0
24 Nov 2021
Sparse Fusion for Multimodal Transformers
Sparse Fusion for Multimodal Transformers
Yi Ding
Alex Rich
Mason Wang
Noah Stier
M. Turk
P. Sen
Tobias Höllerer
ViT
17
7
0
23 Nov 2021
Many Heads but One Brain: Fusion Brain -- a Competition and a Single
  Multimodal Multitask Architecture
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture
Daria Bakshandaeva
Denis Dimitrov
V.Ya. Arkhipkin
Alex Shonenkov
M. Potanin
...
Mikhail Martynov
Anton Voronov
Vera Davydova
E. Tutubalina
Aleksandr Petiushko
33
0
0
22 Nov 2021
Rethinking Query, Key, and Value Embedding in Vision Transformer under
  Tiny Model Constraints
Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
Jaesin Ahn
Jiuk Hong
Jeongwoo Ju
Heechul Jung
ViT
11
3
0
19 Nov 2021
Edge-Native Intelligence for 6G Communications Driven by Federated
  Learning: A Survey of Trends and Challenges
Edge-Native Intelligence for 6G Communications Driven by Federated Learning: A Survey of Trends and Challenges
Mohammad M. Al-Quraan
Lina S. Mohjazi
Lina Bariah
A. Centeno
A. Zoha
Sami Muhaidat
Mérouane Debbah
Muhammad Ali Imran
8
62
0
14 Nov 2021
Multi-Glimpse Network: A Robust and Efficient Classification
  Architecture based on Recurrent Downsampled Attention
Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention
S. Tan
Runpei Dong
Kaisheng Ma
12
2
0
03 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric
  Action Recognition
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
34
45
0
01 Nov 2021
Hyper-Representations: Self-Supervised Representation Learning on Neural
  Network Weights for Model Characteristic Prediction
Hyper-Representations: Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction
Konstantin Schurholt
Dimche Kostadinov
Damian Borth
SSL
14
14
0
28 Oct 2021
SOFT: Softmax-free Transformer with Linear Complexity
SOFT: Softmax-free Transformer with Linear Complexity
Jiachen Lu
Jinghan Yao
Junge Zhang
Martin Danelljan
Hang Xu
Weiguo Gao
Chunjing Xu
Thomas B. Schon
Li Zhang
11
159
0
22 Oct 2021
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Cyril Zhang
11
114
0
19 Oct 2021
BERMo: What can BERT learn from ELMo?
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
26
3
0
18 Oct 2021
EncT5: A Framework for Fine-tuning T5 as Non-autoregressive Models
EncT5: A Framework for Fine-tuning T5 as Non-autoregressive Models
Frederick Liu
T. Huang
Shihang Lyu
Siamak Shakeri
Hongkun Yu
Jing Li
29
7
0
16 Oct 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
221
1,017
0
13 Oct 2021
Two-argument activation functions learn soft XOR operations like
  cortical neurons
Two-argument activation functions learn soft XOR operations like cortical neurons
Kijung Yoon
Emin Orhan
Juhyeon Kim
Xaq Pitkow
MLT
27
0
0
13 Oct 2021
Dynamic Inference with Neural Interpreters
Dynamic Inference with Neural Interpreters
Nasim Rahaman
Muhammad Waleed Gondal
S. Joshi
Peter V. Gehler
Yoshua Bengio
Francesco Locatello
Bernhard Schölkopf
21
31
0
12 Oct 2021
Efficient Training of Audio Transformers with Patchout
Efficient Training of Audio Transformers with Patchout
Khaled Koutini
Jan Schluter
Hamid Eghbalzadeh
Gerhard Widmer
ViT
30
251
0
11 Oct 2021
Recurrent Attention Models with Object-centric Capsule Representation
  for Multi-object Recognition
Recurrent Attention Models with Object-centric Capsule Representation for Multi-object Recognition
Hossein Adeli
Seoyoung Ahn
G. Zelinsky
OCL
23
3
0
11 Oct 2021
Cross-lingual Transfer of Monolingual Models
Cross-lingual Transfer of Monolingual Models
Evangelia Gogoulou
Ariel Ekgren
T. Isbister
Magnus Sahlgren
27
16
0
15 Sep 2021
Patch-based Medical Image Segmentation using Matrix Product State Tensor
  Networks
Patch-based Medical Image Segmentation using Matrix Product State Tensor Networks
Raghavendra Selvan
Erik Dam
Soren Alexander Flensborg
Jens Petersen
MedIm
16
2
0
15 Sep 2021
The Sensory Neuron as a Transformer: Permutation-Invariant Neural
  Networks for Reinforcement Learning
The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning
Yujin Tang
David R Ha
20
75
0
07 Sep 2021
$\infty$-former: Infinite Memory Transformer
∞\infty∞-former: Infinite Memory Transformer
Pedro Henrique Martins
Zita Marinho
André F. T. Martins
23
11
0
01 Sep 2021
Transformers predicting the future. Applying attention in next-frame and
  time series forecasting
Transformers predicting the future. Applying attention in next-frame and time series forecasting
Radostin Cholakov
T. Kolev
AI4TS
9
15
0
18 Aug 2021
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle
Sebastian Borgeaud
Jean-Baptiste Alayrac
Carl Doersch
Catalin Ionescu
...
Olivier J. Hénaff
M. Botvinick
Andrew Zisserman
Oriol Vinyals
João Carreira
MLLM
VLM
GNN
17
561
0
30 Jul 2021
Proposal-based Few-shot Sound Event Detection for Speech and
  Environmental Sounds with Perceivers
Proposal-based Few-shot Sound Event Detection for Speech and Environmental Sounds with Perceivers
Piper Wolters
Logan Sizemore
Chris Daw
Brian Hutchinson
Lauren A. Phillips
12
11
0
28 Jul 2021
A3GC-IP: Attention-Oriented Adjacency Adaptive Recurrent Graph
  Convolutions for Human Pose Estimation from Sparse Inertial Measurements
A3GC-IP: Attention-Oriented Adjacency Adaptive Recurrent Graph Convolutions for Human Pose Estimation from Sparse Inertial Measurements
Patrik Puchert
Timo Ropinski
3DH
17
3
0
23 Jul 2021
Sequence-to-Sequence Piano Transcription with Transformers
Sequence-to-Sequence Piano Transcription with Transformers
Curtis Hawthorne
Ian Simon
Rigel Swavely
Ethan Manilow
Jesse Engel
22
80
0
19 Jul 2021
Long Short-Term Transformer for Online Action Detection
Long Short-Term Transformer for Online Action Detection
Mingze Xu
Yuanjun Xiong
Hao Chen
Xinyu Li
Wei Xia
Z. Tu
Stefano Soatto
ViT
24
128
0
07 Jul 2021
Long-Short Transformer: Efficient Transformers for Language and Vision
Long-Short Transformer: Efficient Transformers for Language and Vision
Chen Zhu
Wei Ping
Chaowei Xiao
M. Shoeybi
Tom Goldstein
Anima Anandkumar
Bryan Catanzaro
ViT
VLM
13
130
0
05 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
23
536
0
30 Jun 2021
Multi-Exit Vision Transformer for Dynamic Inference
Multi-Exit Vision Transformer for Dynamic Inference
Arian Bakhtiarnia
Qi Zhang
Alexandros Iosifidis
28
26
0
29 Jun 2021
Previous
123...121314
Next