Visual Transformers: Token-based Image Representation and Processing for Computer Vision

5 June 2020
Bichen Wu
Chenfeng Xu
Xiaoliang Dai
Alvin Wan
Peizhao Zhang
Zhicheng Yan
M. Tomizuka
Joseph E. Gonzalez
Kurt Keutzer
Peter Vajda
    ViT

Papers citing "Visual Transformers: Token-based Image Representation and Processing for Computer Vision"

50 / 62 papers shown
GMAR: Gradient-Driven Multi-Head Attention Rollout for Vision Transformer Interpretability
Sehyeong Jo
Gangjae Jang
Haesol Park
32
0
0
28 Apr 2025
Topology-Aware Conformal Prediction for Stream Networks
Jifan Zhang
Fangxin Wang
Philip S. Yu
Kaize Ding
Shixiang Zhu
AI4TS
39
0
0
06 Mar 2025
Exploring Visual Embedding Spaces Induced by Vision Transformers for Online Auto Parts Marketplaces
Cameron Armijo
Pablo Rivas
34
0
0
09 Feb 2025
Dynamic Negative Guidance of Diffusion Models
Felix Koulischer
Johannes Deleu
G. Raya
T. Demeester
L. Ambrogioni
DiffM
49
2
0
03 Jan 2025
Cauchy activation function and XNet
Xin Li
Zhihong Xia
Hongkun Zhang
31
4
0
28 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
V. Papyan
VLM
43
1
0
20 Sep 2024
Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes
Nikita Kiselev
Andrey Grabovoy
36
1
0
18 Sep 2024
EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation
Nischal Khanal
Shivanand Venkanna Sheshappanavar
MDE
34
0
0
10 Sep 2024
Multi-Modal Multi-Granularity Tokenizer for Chu Bamboo Slip Scripts
Yingfa Chen
Chenlong Hu
Cong Feng
Chenyang Song
Shi Yu
Xu Han
Zhiyuan Liu
Maosong Sun
21
0
0
02 Sep 2024
SwinSF: Image Reconstruction from Spatial-Temporal Spike Streams
Liangyan Jiang
Chuang Zhu
Yanxu Chen
38
2
0
22 Jul 2024
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models
Jayneel Parekh
Quentin Bouniot
Pavlo Mozharovskyi
A. Newson
Florence d'Alché-Buc
SSL
53
1
0
01 Jul 2024
Restoring balance: principled under/oversampling of data for optimal classification
Emanuele Loffredo
Mauro Pastore
Simona Cocco
R. Monasson
35
9
0
15 May 2024
PhysMLE: Generalizable and Priors-Inclusive Multi-task Remote Physiological Measurement
Jiyao Wang
Hao Lu
Ange Wang
Xiao Yang
Ying Chen
Dengbo He
Kaishun Wu
21
3
0
10 May 2024
Sparse and Transferable Universal Singular Vectors Attack
Kseniia Kuvshinova
Olga Tsymboi
Ivan V. Oseledets
AAML
22
0
0
25 Jan 2024
Enhancing Context Through Contrast
Kshitij Ambilduke
Aneesh Shetye
Diksha Bagade
Rishika Bhagwatkar
Khurshed Fitter
P. Vagdargi
Shital S. Chiddarwar
19
0
0
06 Jan 2024
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
22
2
0
01 Nov 2023
Energy-Based Models for Cross-Modal Localization using Convolutional Transformers
Alan Wu
Michael S. Ryoo
21
3
0
06 Jun 2023
SwinFSR: Stereo Image Super-Resolution using SwinIR and Frequency Domain Knowledge
Ke-Jia Chen
Liangyan Li
Huan Liu
Yunzhe Li
Congling Tang
Jun Chen
26
13
0
25 Apr 2023
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado-Roigé
M. A. Pérez
16
13
0
20 Feb 2023
mSAM: Micro-Batch-Averaged Sharpness-Aware Minimization
Kayhan Behdin
Qingquan Song
Aman Gupta
S. Keerthi
Ayan Acharya
Borja Ocejo
Gregory Dexter
Rajiv Khanna
D. Durfee
Rahul Mazumder
AAML
13
7
0
19 Feb 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
63
0
0
18 Feb 2023
Explanation on Pretraining Bias of Finetuned Vision Transformer
Bumjin Park
Jaesik Choi
ViT
19
1
0
18 Nov 2022
Traffic Accident Risk Forecasting using Contextual Vision Transformers
Khaled Saleh
Artur Grigorev
Adriana-Simona Mihaita
ViT
18
10
0
20 Sep 2022
Transformer-CNN Cohort: Semi-supervised Semantic Segmentation by the Best of Both Students
Xueye Zheng
Yuan Luo
Hao Wang
Chong Fu
Lin Wang
ViT
36
17
0
06 Sep 2022
Open-Vocabulary 3D Detection via Image-level Class and Debiased Cross-modal Contrastive Learning
Yuheng Lu
Chenfeng Xu
Xi Wei
Xiaodong Xie
M. Tomizuka
Kurt Keutzer
Shanghang Zhang
3DPC
13
20
0
05 Jul 2022
Hub-Pathway: Transfer Learning from A Hub of Pre-trained Models
Yang Shu
Zhangjie Cao
Ziyang Zhang
Jianmin Wang
Mingsheng Long
13
4
0
08 Jun 2022
LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation
Ljubomir Rokvic
Panayiotis Danassis
Sai Praneeth Karimireddy
Boi Faltings
TDI
15
1
0
23 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
54
598
0
09 May 2022
Seeding Diversity into AI Art
Marvin Zammit
Antonios Liapis
Georgios N. Yannakakis
22
4
0
02 May 2022
MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction
Yuanhao Cai
Jing Lin
Zudi Lin
Haoqian Wang
Yulun Zhang
Hanspeter Pfister
Radu Timofte
Luc Van Gool
19
170
0
17 Apr 2022
Evolving Modular Soft Robots without Explicit Inter-Module Communication using Local Self-Attention
F. Pigozzi
Yujin Tang
Eric Medvet
David R Ha
32
22
0
13 Apr 2022
Transformer-Based Self-Supervised Learning for Emotion Recognition
Juan Vazquez-Rodriguez
G. Lefebvre
Julien Cumin
James L. Crowley
6
24
0
08 Apr 2022
Deep Transformers Thirst for Comprehensive-Frequency Data
R. Xia
Chao Xue
Boyu Deng
Fang Wang
Jingchao Wang
ViT
17
0
0
14 Mar 2022
EventFormer: AU Event Transformer for Facial Action Unit Event Detection
Yingjie Chen
Jiarui Zhang
Tao Wang
Yun Liang
ViT
17
0
0
12 Mar 2022
Region-Aware Face Swapping
Chao Xu
Jiangning Zhang
Miao Hua
Qian He
Zili Yi
Yong Liu
CVBM
14
48
0
09 Mar 2022
RFormer: Transformer-based Generative Adversarial Network for Real Fundus Image Restoration on A New Clinical Benchmark
Zhuo Deng
Yuanhao Cai
Lu Chen
Zheng Gong
Qiqi Bao
Xue Yao
D. Fang
Shaochong Zhang
Lan Ma
ViT
MedIm
18
53
0
03 Jan 2022
Vision Pair Learning: An Efficient Training Framework for Image Classification
Bei Tong
Xiaoyuan Yu
ViT
17
0
0
02 Dec 2021
CT-block: a novel local and global features extractor for point cloud
Shangwei Guo
Jun Li
Zhengchao Lai
Xiantong Meng
Shaokun Han
ViT
3DPC
16
2
0
30 Nov 2021
An Image Patch is a Wave: Phase-Aware Vision MLP
Yehui Tang
Kai Han
Jianyuan Guo
Chang Xu
Yanxi Li
Chao Xu
Yunhe Wang
11
133
0
24 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
66
330
0
11 Nov 2021
MEDUSA: Multi-scale Encoder-Decoder Self-Attention Deep Neural Network Architecture for Medical Image Analysis
Hossein Aboutalebi
Maya Pavlova
Hayden Gunraj
M. Shafiee
A. Sabri
Amer Alaref
Alexander Wong
15
17
0
12 Oct 2021
Pathologies in priors and inference for Bayesian transformers
Tristan Cinquin
Alexander Immer
Max Horn
Vincent Fortuin
UQCV
BDL
MedIm
20
9
0
08 Oct 2021
Token Pooling in Vision Transformers
D. Marin
Jen-Hao Rick Chang
Anurag Ranjan
Anish K. Prabhu
Mohammad Rastegari
Oncel Tuzel
ViT
65
66
0
08 Oct 2021
Subdimensional Expansion Using Attention-Based Learning For Multi-Agent Path Finding
Lakshay Virmani
Z. Ren
Sivakumar Rathinam
Howie Choset
11
3
0
29 Sep 2021
GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer
Shuaicheng Li
Qianggang Cao
Lingbo Liu
Kunlin Yang
Shinan Liu
Jun Hou
Shuai Yi
ViT
34
102
0
28 Aug 2021
Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer
Zhihe Lu
Sen He
Xiatian Zhu
Li Zhang
Yi-Zhe Song
Tao Xiang
ViT
164
172
0
06 Aug 2021
PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion
Yu Fu
Tianyang Xu
Xiaojun Wu
J. Kittler
ViT
17
37
0
29 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Philip H. S. Torr
50
27
0
13 Jul 2021
Co-advise: Cross Inductive Bias Distillation
Sucheng Ren
Zhengqi Gao
Tianyu Hua
Zihui Xue
Yonglong Tian
Shengfeng He
Hang Zhao
37
53
0
23 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
19
127
0
21 Jun 2021