Perceiver: General Perception with Iterative Attention
International Conference on Machine Learning (ICML), 2021
4 March 2021
Andrew Jaegle
Felix Gimeno
Andrew Brock
Andrew Zisserman
Oriol Vinyals
João Carreira
VLM, ViT, MDE

Papers citing "Perceiver: General Perception with Iterative Attention"

50 / 790 papers shown
Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
Kai-Po Chang
Wei-Yuan Cheng
Chi-Pin Huang
Fu-En Yang
Yu-Jie Wang
259
1
0
04 Dec 2025
FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention
Zipeng Wang
Dan Xu
ViT
103
0
0
01 Dec 2025
OmniFD: A Unified Model for Versatile Face Forgery Detection
Haotian Liu
Haoyu Chen
Chenhui Pan
You Hu
Guoying Zhao
Xiaobai Li
CVBM
285
0
0
30 Nov 2025
From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition
Jingxi Chen
Yixiao Zhang
Xiaoye Qian
Zongxia Li
Cornelia Fermuller
Caren Chen
Yiannis Aloimonos
DiffM
372
0
0
26 Nov 2025
AdaPerceiver: Transformers with Adaptive Width, Depth, and Tokens
Purvish Jajal
Nick Eliopoulos
Benjamin Shiue-Hal Chou
George K. Thiruvathukal
Yung-Hsiang Lu
James C. Davis
109
0
0
22 Nov 2025
V2X-RECT: An Efficient V2X Trajectory Prediction Framework via Redundant Interaction Filtering and Tracking Error Correction
Xiangyan Kong
Xuecheng Wu
Xiongwei Zhao
X. Li
Yunyun Shi
Gang Wang
Dingkang Yang
Y. Liu
H. Chen
Y. Gao
79
0
0
22 Nov 2025
CAMS: Towards Compositional Zero-Shot Learning via Gated Cross-Attention and Multi-Space Disentanglement
Pan Yang
Cheng Deng
J. Yang
Han Zhao
Yun-Hai Liu
Yuling Chen
Xiaoli Ruan
Yanping Chen
CoGe
312
0
0
20 Nov 2025
N-GLARE: An Non-Generative Latent Representation-Efficient LLM Safety Evaluator
Zheyu Lin
Jirui Yang
Hengqi Guo
Yubing Bao
Yao Guan
138
0
0
18 Nov 2025
MergeDNA: Context-aware Genome Modeling with Dynamic Tokenization through Token Merging
Siyuan Li
Kai Yu
Anna Wang
Zicheng Liu
Chang Yu
Jingbo Zhou
Qirong Yang
Yucheng Guo
Xiaoming Zhang
Stan Z. Li
96
0
0
17 Nov 2025
Functional Mean Flow in Hilbert Space
Zhiqi Li
Yuchen Sun
Greg Turk
Bo Zhu
185
0
0
17 Nov 2025
CLAReSNet: When Convolution Meets Latent Attention for Hyperspectral Image Classification
Asmit Bandyopadhyay
Anindita Das Bhattacharjee
Rakesh Das
114
0
0
15 Nov 2025
Spatial Reasoning in Multimodal Large Language Models: A Survey of Tasks, Benchmarks and Methods
Weichen Liu
Qiyao Xue
Haoming Wang
Xiangyu Yin
Boyuan Yang
Wei Gao
114
1
0
14 Nov 2025
Batch Transformer Architecture: Case of Synthetic Image Generation for Emotion Expression Facial Recognition
Athens Journal of Sciences (JAS), 2025
Stanislav Selitskiy
ViT
150
0
0
13 Nov 2025
Integrating Temporal and Structural Context in Graph Transformers for Relational Deep Learning
Divyansha Lachi
Mahmoud Mohammadi
Joe Meyer
Vinam Arora
Tom Palczewski
Eva L. Dyer
157
0
0
06 Nov 2025
Scalable Single-Cell Gene Expression Generation with Latent Diffusion Models
Giovanni Palla
Sudarshan Babu
Payam Dibaeinia
James D. Pearce
Donghui Li
Aly A. Khan
Theofanis Karaletsos
Jakub M. Tomczak
177
0
0
04 Nov 2025
Context Engineering 2.0: The Context of Context Engineering
Qishuo Hua
Lyumanshan Ye
Dayuan Fu
Yang Xiao
Xiaojie Cai
Yunze Wu
Jifan Lin
Junfei Wang
Pengfei Liu
390
2
0
30 Oct 2025
BLM$_1$: A Boundless Large Model for Cross-Space, Cross-Task, and Cross-Embodiment Learning
Wentao Tan
Bowen Wang
Heng Zhi
Chenyu Liu
Z. Li
...
Chen Xu
Zhibin Wang
Tianshi Wang
Lei Zhu
Heng Tao Shen
LM&Ro
168
0
0
28 Oct 2025
Perception Learning: A Formal Separation of Sensory Representation Learning from Decision Learning
Suman Sanyal
SSL
318
0
0
28 Oct 2025
Energy-Efficient Domain-Specific Artificial Intelligence Models and Agents: Pathways and Paradigms
Abhijit Chatterjee
N. Jha
Jonathan D. Cohen
Thomas Griffiths
Hongjing Lu
Diana Marculescu
Ashiqur Rasul
Keshab K. Parhi
LLMAG, AI4CE
406
1
0
24 Oct 2025
Diffusion Autoencoders with Perceivers for Long, Irregular and Multimodal Astronomical Sequences
Yunyi Shen
Alexander T. Gagliano
DiffM
125
1
0
23 Oct 2025
A Scalable, Causal, and Energy Efficient Framework for Neural Decoding with Spiking Neural Networks
Georgios Mentzelopoulos
Ioannis Asmanis
Konrad Paul Kording
Eva L. Dyer
Kostas Daniilidis
Flavia Vitale
141
0
0
23 Oct 2025
LLavaCode: Compressed Code Representations for Retrieval-Augmented Code Generation
Daria Cherniuk
Nikita Sukhorukov
Nikita Sushko
Daniil Gusak
Danil Sivtsov
Elena Tutubalina
Evgeny Frolov
120
0
0
22 Oct 2025
AUGUSTUS: An LLM-Driven Multimodal Agent System with Contextualized User Memory
Jitesh Jain
Shubham Maheshwari
Ning Yu
Wen-mei W. Hwu
Humphrey Shi
RALM
145
1
0
17 Oct 2025
AB-UPT for Automotive and Aerospace Applications
Benedikt Alkin
Richard Kurle
Louis Serrano
Dennis Just
Johannes Brandstetter
AI4CE
103
2
0
17 Oct 2025
GOPLA: Generalizable Object Placement Learning via Synthetic Augmentation of Human Arrangement
Yao Zhong
Hanzhi Chen
Simon Schaefer
Anran Zhang
Stefan Leutenegger
252
0
0
16 Oct 2025
PAINT: Parallel-in-time Neural Twins for Dynamical System Reconstruction
Andreas Radler
Vincent Seyfried
Stefan Pirker
Johannes Brandstetter
Thomas Lichtenegger
132
1
0
14 Oct 2025
A Review of Longitudinal Radiology Report Generation: Dataset Composition, Methods, and Performance Evaluation
Shaoyang Zhou
Y. Li
Y. Liu
Lingqiao Liu
Lei Wang
Luping Zhou
123
0
0
14 Oct 2025
Inpainting the Neural Picture: Inferring Unrecorded Brain Area Dynamics from Multi-Animal Datasets
Ji Xia
Yizi Zhang
Shuqi Wang
Genevera I. Allen
Liam Paninski
Cole Hurwitz
Kenneth D. Miller
AI4CE
96
0
0
13 Oct 2025
Class Prototypes based Contrastive Learning for Classifying Multi-Label and Fine-Grained Educational Videos
Computer Vision and Pattern Recognition (CVPR), 2023
Rohit Gupta
Anirban Roy
Claire Christensen
Sujeong Kim
Sarah Gerard
Madeline Cincebeaux
Ajay Divakaran
Todd Grindal
M. Shah
155
21
0
13 Oct 2025
Placeit! A Framework for Learning Robot Object Placement Skills
Amina Ferrad
J. Huber
Francois Helenon
Julien Gleyze
Mahdi Khoramshahi
Stéphane Doncieux
119
1
0
10 Oct 2025
DM1: MeanFlow with Dispersive Regularization for 1-Step Robotic Manipulation
Guowei Zou
Haitao Wang
Hejun Wu
Yukun Qian
Yuhang Wang
Weibing Li
105
2
0
09 Oct 2025
Single layer tiny Co$^4$ outpaces GPT-2 and GPT-BERT
Noor Ul Zain
Mohsin Raza
Ahsan Adeel
MoE, ALM, ELM, VLM
183
0
0
09 Oct 2025
Looking to Learn: Token-wise Dynamic Gating for Low-Resource Vision-Language Modelling
Bianca-Mihaela Ganescu
Suchir Salhan
Andrew Caines
P. Buttery
VLM
136
1
0
09 Oct 2025
Lung Infection Severity Prediction Using Transformers with Conditional TransMix Augmentation and Cross-Attention
Bouthaina Slika
Fadi Dornaika
F. Bougourzi
K. Hammoudi
ViT, MedIm, LM&MA
199
0
0
08 Oct 2025
GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations
Fabian Paischer
Gianluca Galletti
William Hornsby
Paul Setinek
L. Zanisi
Naomi Carey
Stanislas Pamela
Johannes Brandstetter
205
3
0
08 Oct 2025
PhaseFormer: From Patches to Phases for Efficient and Effective Time Series Forecasting
Yiming Niu
Jinliang Deng
Yongxin Tong
AI4TS
121
0
0
05 Oct 2025
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
Chenhui Zhu
Yilu Wu
Shuai Wang
Gangshan Wu
Limin Wang
DiffM, VGen
125
1
0
30 Sep 2025
EntroPE: Entropy-Guided Dynamic Patch Encoder for Time Series Forecasting
Sachith Abeywickrama
Emadeldeen Eldele
Ruibing Jin
Xiaoli Li
Chau Yuen
AI4TS
137
0
0
30 Sep 2025
Indirect Attention: Turning Context Misalignment into a Feature
Bissmella Bahaduri
Hicham Talaoubrid
Fangchen Feng
Zuheng Ming
Anissa Mokraoui
111
0
0
30 Sep 2025
NeMo: Needle in a Montage for Video-Language Understanding
Zi-Yuan Hu
Shuo Liang
Duo Zheng
Yanyang Li
Yeyao Tao
...
Jianguang Yu
Jing-ling Huang
Meng Fang
Yin Li
Liwei Wang
170
2
0
29 Sep 2025
UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections
Zeyu Cai
Z. Li
Xiaoben Li
Boqian Li
Zeyu Wang
Zhenyu Zhang
Yuliang Xiu
3DH
180
2
0
29 Sep 2025
GeoFunFlow: Geometric Function Flow Matching for Inverse Operator Learning over Complex Geometries
Sifan Wang
Zhikai Wu
David van Dijk
Lu Lu
AI4CE
148
1
0
28 Sep 2025
Task-Adaptive Parameter-Efficient Fine-Tuning for Weather Foundation Models
S. Cao
Hehai Lin
Jiashun Cheng
Yang Liu
Guowen Li
...
Mengxuan Chen
Meng Jin
C. Qin
Hong Cheng
Haohuan Fu
129
1
0
26 Sep 2025
Contrastive Mutual Information Learning: Toward Robust Representations without Positive-Pair Augmentations
Micha Livne
SSL
135
0
0
25 Sep 2025
Diffusion-Based Impedance Learning for Contact-Rich Manipulation Tasks
Noah Geiger
Tamim Asfour
Neville Hogan
Johannes Lachner
182
0
0
24 Sep 2025
CompLLM: Compression for Long Context Q&A
Gabriele Berton
Jayakrishnan Unnikrishnan
Son Tran
Mubarak Shah
89
1
0
23 Sep 2025
Learning Attribute-Aware Hash Codes for Fine-Grained Image Retrieval via Query Optimization
Liang Luo
Yong Li
Lin Zhao
Xiu-Shen Wei
163
0
0
21 Sep 2025
MAST: Multi-Agent Spatial Transformer for Learning to Collaborate
Damian Owerko
Frederic Vatnsdal
S. Agarwal
Vijay Kumar
Alejandro Ribeiro
180
0
0
21 Sep 2025
DAFTED: Decoupled Asymmetric Fusion of Tabular and Echocardiographic Data for Cardiac Hypertension Diagnosis
Jérémie Stym-Popper
Nathan Painchaud
Clément Rambour
P. Courand
Nicolas Thome
Olivier Bernard
147
1
0
19 Sep 2025
Hunyuan3D Studio: End-to-End AI Pipeline for Game-Ready 3D Asset Generation
Biwen Lei
Yang Li
Xinhai Liu
Shuhui Yang
Lixin Xu
...
Y. Liu
Linus
Jie Jiang
Zhuo Chen
Chunchao Guo
VGen
204
4
0
16 Sep 2025