2103.03206
Perceiver: General Perception with Iterative Attention
International Conference on Machine Learning (ICML), 2021
4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM
ViT
MDE
Papers citing
"Perceiver: General Perception with Iterative Attention"
50 / 792 papers shown
Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild
European Conference on Computer Vision (ECCV), 2022
A. S. Tripathi
Martin Danelljan
Samarth Shukla
Radu Timofte
Luc Van Gool
301
12
0
20 Mar 2022
Integrating Language Guidance into Vision-based Deep Metric Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Karsten Roth
Oriol Vinyals
Zeynep Akata
VLM
214
31
0
16 Mar 2022
Do BERTs Learn to Use Browser User Interface? Exploring Multi-Step Tasks with Unified Vision-and-Language BERTs
Taichi Iki
Akiko Aizawa
LLMAG
203
6
0
15 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
European Conference on Computer Vision (ECCV), 2022
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
Wen Liu
Yonghong Tian
Liuliang Yuan
3DPC
ViT
290
626
0
13 Mar 2022
Block-Recurrent Transformers
Neural Information Processing Systems (NeurIPS), 2022
DeLesley S. Hutchins
Imanol Schlag
Yuhuai Wu
Ethan Dyer
Behnam Neyshabur
450
132
0
11 Mar 2022
Geodesic Multi-Modal Mixup for Robust Fine-Tuning
Neural Information Processing Systems (NeurIPS), 2022
Changdae Oh
Junhyuk So
Hoyoon Byun
Yongtaek Lim
Minchul Shin
Jong-June Jeon
Kyungwoo Song
458
39
0
08 Mar 2022
High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Jeffrey Tsaw
Yudong Liu
Shentong Mo
Dani Yogatama
Louis-Philippe Morency
Ruslan Salakhutdinov
230
43
0
02 Mar 2022
Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jing Tan
Yuhong Wang
Gangshan Wu
Limin Wang
239
20
0
01 Mar 2022
Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luís Vilaça
Yi Yu
Paula Viana
242
11
0
28 Feb 2022
Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph
International Conference on Learning Representations (ICLR), 2022
Dacheng Yin
Xuanchi Ren
Chong Luo
Yuwang Wang
Zhiwei Xiong
Wenjun Zeng
272
13
0
24 Feb 2022
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models
International Conference on Learning Representations (ICLR), 2022
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
272
4
0
24 Feb 2022
Learning to Merge Tokens in Vision Transformers
Cédric Renggli
André Susano Pinto
N. Houlsby
Basil Mustafa
J. Puigcerver
C. Riquelme
MoMe
227
79
0
24 Feb 2022
Better Modelling Out-of-Distribution Regression on Distributed Acoustic Sensor Data Using Anchored Hidden State Mixup
IEEE Transactions on Industrial Informatics (IEEE TII), 2022
Hasan Asy’ari Arief
P. J. Thomas
T. Wiktorski
OOD
99
6
0
23 Feb 2022
HiP: Hierarchical Perceiver
João Carreira
Skanda Koppula
Daniel Zoran
Adrià Recasens
Catalin Ionescu
...
M. Botvinick
Oriol Vinyals
Karen Simonyan
Andrew Zisserman
Andrew Jaegle
VLM
362
14
0
22 Feb 2022
Transformer Quality in Linear Time
International Conference on Machine Learning (ICML), 2022
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
493
302
0
21 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
International Conference on Machine Learning (ICML), 2022
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
242
76
0
15 Feb 2022
SpeechPainter: Text-conditioned Speech Inpainting
Interspeech, 2022
Zalan Borsos
Matthew Sharifi
Marco Tagliasacchi
214
35
0
15 Feb 2022
Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens
International Journal on Document Analysis and Recognition (IJDAR), 2022
Felix Ott
David Rügamer
Lucas Heublein
Tim Hamann
Jens Barth
B. Bischl
Christopher Mutschler
418
20
0
14 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
584
1,037
0
07 Feb 2022
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
International Conference on Machine Learning (ICML), 2022
Peng Wang
An Yang
Rui Men
Junyang Lin
Shuai Bai
Zhikang Li
Jianxin Ma
Chang Zhou
Jingren Zhou
Hongxia Yang
MLLM
ObjD
525
1,017
0
07 Feb 2022
Webly Supervised Concept Expansion for General Purpose Vision Models
European Conference on Computer Vision (ECCV), 2022
Amita Kamath
Christopher Clark
Tanmay Gupta
Eric Kolve
Derek Hoiem
Aniruddha Kembhavi
VLM
301
68
0
04 Feb 2022
Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation
Yi-Fan Zhang
Hanlin Zhang
Zachary Chase Lipton
Li Erran Li
Eric P. Xing
OODD
384
35
0
02 Feb 2022
Learning Super-Features for Image Retrieval
International Conference on Learning Representations (ICLR), 2022
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
SupR
VLM
224
56
0
31 Jan 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices
ACM Computing Surveys (ACM CSUR), 2022
Mikołaj Małkiński
Jacek Mańdziuk
492
54
0
28 Jan 2022
From data to functa: Your data point is a function and you can treat it like one
International Conference on Machine Learning (ICML), 2022
Emilien Dupont
Hyunjik Kim
S. M. Ali Eslami
Danilo Jimenez Rezende
Dan Rosenbaum
TDI
3DPC
587
186
0
28 Jan 2022
Density-Aware Hyper-Graph Neural Networks for Graph-based Semi-supervised Node Classification
Jianpeng Liao
Qian Tao
Jun Yan
GNN
176
3
0
27 Jan 2022
Omnivore: A Single Model for Many Visual Modalities
Computer Vision and Pattern Recognition (CVPR), 2022
Rohit Girdhar
Mannat Singh
Nikhil Ravi
Laurens van der Maaten
Armand Joulin
Ishan Misra
611
287
0
20 Jan 2022
Video Transformers: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
460
139
0
16 Jan 2022
Latency Adjustable Transformer Encoder for Language Understanding
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Sajjad Kachuee
M. Sharifkhani
590
1
0
10 Jan 2022
Vision Transformer with Deformable Attention
Computer Vision and Pattern Recognition (CVPR), 2022
Zhuofan Xia
Xuran Pan
Qing Xiao
Li Erran Li
Gao Huang
ViT
451
704
0
03 Jan 2022
SeMask: Semantically Masked Transformers for Semantic Segmentation
Jitesh Jain
Anukriti Singh
Nikita Orlov
Zilong Huang
Jiachen Li
Steven Walton
Humphrey Shi
ViT
283
121
0
23 Dec 2021
Learned Queries for Efficient Local Attention
Computer Vision and Pattern Recognition (CVPR), 2022
Moab Arar
Ariel Shamir
Amit H. Bermano
ViT
271
36
0
21 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Björn Ommer
DiffM
3.1K
21,434
0
20 Dec 2021
Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds
Ayush Jain
N. Gkanatsios
Ishita Mediratta
Katerina Fragkiadaki
ObjD
492
148
0
16 Dec 2021
Audio-Visual Synchronisation in the wild
Honglie Chen
Weidi Xie
Triantafyllos Afouras
Arsha Nagrani
Andrea Vedaldi
Andrew Zisserman
225
49
0
08 Dec 2021
Input-level Inductive Biases for 3D Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2022
Yifan Wang
Carl Doersch
Relja Arandjelović
João Carreira
Andrew Zisserman
3DV
370
32
0
06 Dec 2021
Hybrid Instance-aware Temporal Fusion for Online Video Instance Segmentation
Xiang Li
Jinglu Wang
Xiao Li
Yan Lu
201
20
0
03 Dec 2021
Efficient Self-Ensemble for Semantic Segmentation
British Machine Vision Conference (BMVC), 2021
Walid Bousselham
Guillaume Thibault
Lucas Pagano
Archana Machireddy
Joe W. Gray
Y. Chang
Xubo B. Song
ViT
292
33
0
26 Nov 2021
PolyViT: Co-training Vision Transformers on Images, Videos and Audio
Valerii Likhosherstov
Anurag Arnab
K. Choromanski
Mario Lucic
Yi Tay
Adrian Weller
Mostafa Dehghani
ViT
196
83
0
25 Nov 2021
Conditional Object-Centric Learning from Video
Thomas Kipf
Gamaleldin F. Elsayed
Aravindh Mahendran
Austin Stone
S. Sabour
G. Heigold
Rico Jonschkowski
Alexey Dosovitskiy
Klaus Greff
OCL
357
264
0
24 Nov 2021
Sparse Fusion for Multimodal Transformers
Yi Ding
Alex Rich
Mason Wang
Noah Stier
M. Turk
P. Sen
Tobias Höllerer
ViT
169
9
0
23 Nov 2021
Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture
Daria Bakshandaeva
Denis Dimitrov
V.Ya. Arkhipkin
Alex Shonenkov
M. Potanin
...
Mikhail Martynov
Anton Voronov
Vera Davydova
E. Tutubalina
Aleksandr Petiushko
381
0
0
22 Nov 2021
Rethinking Query, Key, and Value Embedding in Vision Transformer under Tiny Model Constraints
Jaesin Ahn
Jiuk Hong
Jeongwoo Ju
Heechul Jung
ViT
241
4
0
19 Nov 2021
Edge-Native Intelligence for 6G Communications Driven by Federated Learning: A Survey of Trends and Challenges
IEEE Transactions on Emerging Topics in Computational Intelligence (IEEE TETCI), 2021
Mohammad M. Al-Quraan
Lina S. Mohjazi
Lina Bariah
A. Centeno
A. Zoha
Sami Muhaidat
Mérouane Debbah
Muhammad Ali Imran
194
94
0
14 Nov 2021
Multi-Glimpse Network: A Robust and Efficient Classification Architecture based on Recurrent Downsampled Attention
British Machine Vision Conference (BMVC), 2021
S. Tan
Runpei Dong
Kaisheng Ma
344
2
0
03 Nov 2021
With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition
British Machine Vision Conference (BMVC), 2021
Evangelos Kazakos
Jaesung Huh
Arsha Nagrani
Andrew Zisserman
Dima Damen
EgoV
297
55
0
01 Nov 2021
Hyper-Representations: Self-Supervised Representation Learning on Neural Network Weights for Model Characteristic Prediction
Konstantin Schürholt
Dimche Kostadinov
Damian Borth
SSL
416
15
0
28 Oct 2021
SOFT: Softmax-free Transformer with Linear Complexity
Neural Information Processing Systems (NeurIPS), 2021
Jiachen Lu
Jinghan Yao
Junge Zhang
Martin Danelljan
Hang Xu
Weiguo Gao
Chunjing Xu
Thomas B. Schön
Li Zhang
241
193
0
22 Oct 2021
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Cyril Zhang
359
150
0
19 Oct 2021
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
173
4
0
18 Oct 2021