A Survey on Visual Transformer

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
23 December 2020
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
Zhenhua Liu
Yehui Tang
An Xiao
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
    ViT

Papers citing "A Survey on Visual Transformer"

50 / 564 papers shown
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Yuan Yao
Lixu Wang
Jiaqi Wu
Jin Song
Simin Chen
Zehua Wang
Zijian Tian
Wei Chen
Huixia Li
Xiaoxiao Li
FedML
199
0
0
30 Mar 2026
Temp-SCONE: A Novel Out-of-Distribution Detection and Domain Generalization Framework for Wild Data with Temporal Shift
Aditi Naiknaware
Sanchit Singh
Hajar Homayouni
Salimeh Sekeh
OOD
161
1
0
04 Dec 2025
Unrolled Networks are Conditional Probability Flows in MRI Reconstruction
Kehan Qi
Saumya Gupta
Qingqiao Hu
Weimin Lyu
Chao Chen
MedIm
360
0
0
02 Dec 2025
Benchmarking machine learning models for multi-class state recognition in double quantum dot data
Valeria Díaz Moreno
Ryan P Khalili
Daniel Schug
Patrick J. Walsh
Justyna P. Zwolak
176
1
0
27 Nov 2025
Collaborative Learning with Multiple Foundation Models for Source-Free Domain Adaptation
Huisoo Lee
Jisu Han
Hyunsouk Cho
Wonjun Hwang
TTA VLM AI4CE
314
0
0
24 Nov 2025
From Low-Rank Features to Encoding Mismatch: Rethinking Feature Distillation in Vision Transformers
Huiyuan Tian
Bonan Xu
Shijian Li
Xin Jin
165
1
0
19 Nov 2025
Naga: Vedic Encoding for Deep State Space Models
Melanie Schaller
Nick Janssen
Bodo Rosenhahn
AI4TS
241
0
0
17 Nov 2025
Intelligent Collaborative Optimization for Rubber Tyre Film Production Based on Multi-path Differentiated Clipping Proximal Policy Optimization
Yinghao Ruan
Wei Pang
Shuaihao Liu
Huili Yang
Leyi Han
Xinghui Dong
246
0
0
15 Nov 2025
TEDxTN: A Three-way Speech Translation Corpus for Code-Switched Tunisian Arabic - English
Fethi Bougares
Salima Mdhaffar
Haroun Elleuch
Yannick Esteve
66
0
0
13 Nov 2025
Densemarks: Learning Canonical Embeddings for Human Heads Images via Point Tracks
Dmitrii Pozdeev
Alexey Artemov
A. Bhattarai
Artem Sevastopolsky
3DH
297
0
0
04 Nov 2025
REASON: Probability map-guided dual-branch fusion framework for gastric content assessment
Nu-Fang Xiao
De-Xing Huang
Le-Tian Wang
Mei-Jiang Gui
Qi Fu
...
Shi-Qi Liu
Shuangyi Wang
Zeng-Guang Hou
Ying-Wei Wang
Xiao-Hu Zhou
194
0
0
03 Nov 2025
CoMViT: An Efficient Vision Backbone for Supervised Classification in Medical Imaging
Aon Safdar
Mohamed Saadeldin
ViT MedIm
121
0
0
31 Oct 2025
Integrating ConvNeXt and Vision Transformers for Enhancing Facial Age Estimation
Computer Vision and Image Understanding (CVIU), 2025
Gaby Maroun
Salah Eddine Bekhouche
Fadi Dornaika
ViT
154
1
0
31 Oct 2025
Mixture-of-Transformers Learn Faster: A Theoretical Study on Classification Problems
Hongbo Li
Qinhang Wu
Sen-Fon Lin
Yingbin Liang
Ness B. Shroff
MoE
213
0
0
30 Oct 2025
CLFSeg: A Fuzzy-Logic based Solution for Boundary Clarity and Uncertainty Reduction in Medical Image Segmentation
Anshul Kaushal
Kunal Jangid
Vinod K. Kurmi
118
0
0
28 Oct 2025
VESSA: Video-based objEct-centric Self-Supervised Adaptation for Visual Foundation Models
Jesimon Barreto
C. Caetano
A. Araújo
William Robson Schwartz
VLM
165
0
0
23 Oct 2025
BrainPuzzle: Hybrid Physics and Data-Driven Reconstruction for Transcranial Ultrasound Tomography
Shengyu Chen
Shihang Feng
Yi Luo
Xiaowei Jia
Youzuo Lin
165
0
0
22 Oct 2025
ArmFormer: Lightweight Transformer Architecture for Real-Time Multi-Class Weapon Segmentation and Classification
Akhila Kambhatla
Taminul Islam
Khaled R Ahmed
ViT
212
0
0
19 Oct 2025
Cross-Layer Feature Self-Attention Module for Multi-Scale Object Detection
Dingzhou Xie
Rushi Lan
Cheng Pang
Enhao Ning
Jiahao Zeng
Wei Zheng
194
0
0
16 Oct 2025
Minkowski-MambaNet: A Point Cloud Framework with Selective State Space Models for Forest Biomass Quantification
Jinxiang Tu
Dayong Ren
Fei Shi
Zhenhong Jia
Yahong Ren
Jiwei Qin
Fang He
Mamba
154
0
0
10 Oct 2025
Data driven approaches in nanophotonics: A review of AI-enabled metadevices
Huanshu Zhang
Lei Kang
Sawyer D. Campbell
Jacob T. Young
Douglas H. Werner
AI4CE
252
4
0
30 Sep 2025
Causally Guided Gaussian Perturbations for Out-Of-Distribution Generalization in Medical Imaging
Haoran Pei
Yuguang Yang
Kexin Liu
Baochang Zhang
OOD OODD CML MedIm
251
0
0
30 Sep 2025
When MLLMs Meet Compression Distortion: A Coding Paradigm Tailored to MLLMs
Jinming Liu
Zhaoyang Jia
J. Li
Bin Li
Xin Jin
Wenjun Zeng
Yan Lu
149
2
0
29 Sep 2025
OmniScene: Attention-Augmented Multimodal 4D Scene Understanding for Autonomous Driving
Pei Liu
Hongliang Lu
Haichao Liu
Haipeng Liu
Xin Liu
Ruoyu Yao
S. Li
Jun Ma
241
3
0
24 Sep 2025
Lightweight Vision Transformer with Window and Spatial Attention for Food Image Classification
Xinle Gao
Linghui Ye
Zhiyong Xiao
ViT
93
2
0
23 Sep 2025
Towards a Transparent and Interpretable AI Model for Medical Image Classifications
Cognitive Neurodynamics (Cogn Neurodyn), 2025
Binbin Wen
Yihang Wu
Tareef Daqqaq
Ahmad Chaddad
164
0
0
20 Sep 2025
Learning Hyperspectral Images with Curated Text Prompts for Efficient Multimodal Alignment
Abhiroop Chatterjee
Susmita K. Ghosh
VLM
128
1
0
20 Sep 2025
Sequential Token Merging: Revisiting Hidden States
Yan Wen
Peng Ye
Lin Zhang
Baopu Li
Jiakang Yuan
Yaoxin Yang
Tao Chen
Mamba
178
0
0
19 Sep 2025
Layout Stroke Imitation: A Layout Guided Handwriting Stroke Generation for Style Imitation with Diffusion Model
Sidra Hanif
Longin Jan Latecki
DiffM
263
0
0
19 Sep 2025
Which Direction to Choose? An Analysis on the Representation Power of Self-Supervised ViTs in Downstream Tasks
Yannis Kaltampanidis
Alexandros Doumanoglou
D. Zarpalas
226
1
0
18 Sep 2025
A Framework for Generating Artificial Datasets to Validate Absolute and Relative Position Concepts
George Correa de Araujo
H. Maia
Hélio Pedrini
196
0
0
17 Sep 2025
FusionMAE: large-scale pretrained model to optimize and simplify diagnostic and control of fusion plasma
Zongyu Yang
Zhenghao Yang
Wenjing Tian
Jiyuan Li
Xiang Sun
...
Zhe Gao
Wei Chen
Xiaoquan Ji
Min Xu
Wulyu Zhong
AI4CE
247
0
0
16 Sep 2025
Hierarchical MLANet: Multi-level Attention for 3D Face Reconstruction From Single Images
Danling Cao
CVBM 3DH 3DV
504
0
0
12 Sep 2025
Dynamic Structural Recovery Parameters Enhance Prediction of Visual Outcomes After Macular Hole Surgery
Yinzheng Zhao
Zhihao Zhao
Rundong Jiang
Louisa Sackewitz
Quanmin Liang
M. Maier
Daniel Zapp
Peter Charbel Issa
M. A. Nasseri
135
0
0
11 Sep 2025
E2E Learning Massive MIMO for Multimodal Semantic Non-Orthogonal Transmission and Fusion
Minghui Wu
Zhen Gao
242
0
0
09 Sep 2025
Comparative Analysis of Transformer Models in Disaster Tweet Classification for Public Safety
Sharif Noor Zisad
N. M. Istiak Chowdhury
Ragib Hasan
273
1
0
04 Sep 2025
Multimodal Feature Fusion Network with Text Difference Enhancement for Remote Sensing Change Detection
Yijun Zhou
Yikui Zhai
Z. Ying
Tingfeng Xian
Wenlve Zhou
Zhiheng Zhou
Xiaolin Tian
Xudong Jia
Hongsheng Zhang
C. L. Philip Chen
191
4
0
04 Sep 2025
SDiFL: Stable Diffusion-Driven Framework for Image Forgery Localization
Yang Su
Shunquan Tan
Jiwu Huang
DiffM
127
0
0
27 Aug 2025
A Lightweight Group Multiscale Bidirectional Interactive Network for Real-Time Steel Surface Defect Detection
Yong Zhang
C. Chen
Qiang Gao
Y. Wang
Bin Fang
226
0
0
22 Aug 2025
A Comprehensive Review of Agricultural Parcel and Boundary Delineation from Remote Sensing Images: Recent Progress and Future Perspectives
Lixian Zhang
Zi Ye
Yibin Wen
Jianxi Huang
Zhiwei Zhang
Qingmei Li
Qiong Hu
Baodong Xu
Lingyuan Zhao
Haohuan Fu
146
2
0
20 Aug 2025
CuMoLoS-MAE: A Masked Autoencoder for Remote Sensing Data Reconstruction
Anurup Naskar
Nathanael Zhixin Wong
Sara Shamekh
130
0
0
20 Aug 2025
Unifying Scale-Aware Depth Prediction and Perceptual Priors for Monocular Endoscope Pose Estimation and Tissue Reconstruction
Muzammil Khan
Enzo Kerkhof
Matteo Fusaglia
Koert Kuhlmann
T. Ruers
Françoise J. Siepel
MDE
128
0
0
15 Aug 2025
Edge General Intelligence Through World Models and Agentic AI: Fundamentals, Solutions, and Challenges
Changyuan Zhao
Guangyuan Liu
Ruichen Zhang
Yinqiu Liu
Jiacheng Wang
...
Shen
Zhu Han
Sumei Sun
Chau Yuen
Dong In Kim
260
11
0
13 Aug 2025
Automated Segmentation of Coronal Brain Tissue Slabs for 3D Neuropathology
Jonathan Williams Ramirez
Dina Zemlyanker
Lucas Jacob Deden Binder
Rogeny Herisse
Erendira Garcia Pallares
...
Derek H. Oakley
C. M. Donald
C. Dirk Keene
Bradley T. Hyman
Juan Eugenio Iglesias
131
0
0
13 Aug 2025
Aligning Effective Tokens with Video Anomaly in Large Language Models
Yingxian Chen
Jiahui Liu
Ruidi Fan
Yanwei Li
Chirui Chang
Shizhen Zhao
W. Fok
Xiaojuan Qi
Yik-Chung Wu
261
4
0
08 Aug 2025
Zero-shot Shape Classification of Nanoparticles in SEM Images using Vision Foundation Models
Freida Barnatan
Emunah Goldstein
Einav Kalimian
Orchen Madar
Avi Huri
David Zitoun
Yaákov Mandelbaum
Moshe Amitay
VLM
133
1
0
05 Aug 2025
SpectraLLM: Uncovering the Ability of LLMs for Molecule Structure Elucidation from Multi-Spectral
Yunyue Su
Jiahui Chen
Zao Jiang
Zhenyi Zhong
Liang Wang
Sihan Yang
Zhaoxiang Zhang
238
2
0
04 Aug 2025
Multimodal Large Language Models for End-to-End Affective Computing: Benchmarking and Boosting with Generative Knowledge Prompting
Miaosen Luo
Jiesen Long
Zequn Li
Yunying Yang
Yuncheng Jiang
Sijie Mai
285
3
0
04 Aug 2025
Large AI Model-Enabled Secure Communications in Low-Altitude Wireless Networks: Concepts, Perspectives and Case Study
Chuang Zhang
Geng Sun
Jiacheng Wang
Yijing Lin
Weijie Yuan
Sinem Coleri
191
1
0
01 Aug 2025
A Survey on Deep Multi-Task Learning in Connected Autonomous Vehicles
Jiayuan Wang
Farhad Pourpanah
Q. M. Jonathan Wu
Ning Zhang
177
1
0
29 Jul 2025