ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1802.05751
  4. Cited By
Image Transformer

Image Transformer

15 February 2018
Niki Parmar
Ashish Vaswani
Jakob Uszkoreit
Lukasz Kaiser
Noam M. Shazeer
Alexander Ku
Dustin Tran
    ViT
ArXivPDFHTML

Papers citing "Image Transformer"

50 / 277 papers shown
Title
Deepfakes on Demand: the rise of accessible non-consensual deepfake image generators
Deepfakes on Demand: the rise of accessible non-consensual deepfake image generators
Will Hawkins
Chris Russell
Brent Mittelstadt
DiffM
102
0
0
06 May 2025
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Where's the liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content
Haoyue Bai
Yiyou Sun
Wei Cheng
Haifeng Chen
AAML
51
0
0
02 May 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
57
0
0
28 Apr 2025
Learning Streaming Video Representation via Multitask Training
Learning Streaming Video Representation via Multitask Training
Yibin Yan
Jilan Xu
Shangzhe Di
Yikun Liu
Yudi Shi
Qirui Chen
Zeqian Li
Yifei Huang
Weidi Xie
CLL
84
0
0
28 Apr 2025
Advancing Video Anomaly Detection: A Bi-Directional Hybrid Framework for Enhanced Single- and Multi-Task Approaches
Advancing Video Anomaly Detection: A Bi-Directional Hybrid Framework for Enhanced Single- and Multi-Task Approaches
Guodong Shen
Yuqi Ouyang
Junru Lu
Yixuan Yang
Victor Sanchez
33
1
0
20 Apr 2025
In the Blink of an Eye: Instant Game Map Editing using a Generative-AI Smart Brush
In the Blink of an Eye: Instant Game Map Editing using a Generative-AI Smart Brush
Vitaly Gnatyuk
Valeriia Koriukina
Ilya Levoshevich
Pavel Nurminskiy
Guenter Wallner
42
0
0
25 Mar 2025
Direction-Aware Diagonal Autoregressive Image Generation
Direction-Aware Diagonal Autoregressive Image Generation
Yijia Xu
Jianzhong Ju
Jian Luan
J. Cui
54
0
0
14 Mar 2025
Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
Yongqi Dong
Xingmin Lu
Ruohan Li
Wei Song
B. Arem
Haneen Farah
ViT
105
1
0
21 Feb 2025
Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models
Mamba-Shedder: Post-Transformer Compression for Efficient Selective Structured State Space Models
J. P. Muñoz
Jinjie Yuan
Nilesh Jain
Mamba
68
1
0
28 Jan 2025
Simplified and Generalized Masked Diffusion for Discrete Data
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi
Kehang Han
Z. Wang
Arnaud Doucet
Michalis K. Titsias
DiffM
74
62
0
17 Jan 2025
Likelihood Training of Cascaded Diffusion Models via Hierarchical Volume-preserving Maps
Likelihood Training of Cascaded Diffusion Models via Hierarchical Volume-preserving Maps
Henry Li
Ronen Basri
Y. Kluger
DiffM
54
2
0
13 Jan 2025
Parallelized Autoregressive Visual Generation
Parallelized Autoregressive Visual Generation
Y. Wang
Shuhuai Ren
Zhijie Lin
Yujin Han
Haoyuan Guo
Zhenheng Yang
Difan Zou
Jiashi Feng
Xihui Liu
VGen
90
11
0
19 Dec 2024
Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads
Siqi Kou
Jiachun Jin
Chang Liu
Ye Ma
Jian Jia
Quan Chen
Peng Jiang
Zhijie Deng
Zhijie Deng
DiffM
VGen
VLM
126
5
0
28 Nov 2024
Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism
  via Dual Diffusion Models and GPT Prompting
Synth-SONAR: Sonar Image Synthesis with Enhanced Diversity and Realism via Dual Diffusion Models and GPT Prompting
Purushothaman Natarajan
Kamal Basha
Athira Nambiar
DiffM
27
0
0
11 Oct 2024
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
Jiatao Gu
Yuyang Wang
Yizhe Zhang
Qihang Zhang
Dinghuai Zhang
Navdeep Jaitly
Josh Susskind
Shuangfei Zhai
DiffM
31
12
0
10 Oct 2024
Denoising with a Joint-Embedding Predictive Architecture
Denoising with a Joint-Embedding Predictive Architecture
Dengsheng Chen
Jie Hu
Xiaoming Wei
Enhua Wu
DiffM
52
2
0
02 Oct 2024
Diffusion Models to Enhance the Resolution of Microscopy Images: A
  Tutorial
Diffusion Models to Enhance the Resolution of Microscopy Images: A Tutorial
Harshith Bachimanchi
Giovanni Volpe
26
1
0
24 Sep 2024
Sparse Low-Ranked Self-Attention Transformer for Remaining Useful Lifetime Prediction of Optical Fiber Amplifiers
Sparse Low-Ranked Self-Attention Transformer for Remaining Useful Lifetime Prediction of Optical Fiber Amplifiers
Dominic Schneider
Lutz Rapp
27
0
0
22 Sep 2024
Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming
  Perception
Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception
Xiang Zhang
Yufei Cui
Chenchen Fu
Weiwei Wu
Zihao Wang
Yuyang Sun
Xue Liu
25
0
0
10 Sep 2024
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Zanlin Ni
Yulin Wang
Renping Zhou
Rui Lu
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Yuan Yao
Gao Huang
29
7
0
31 Aug 2024
ELASTIC: Efficient Linear Attention for Sequential Interest Compression
Jiaxin Deng
Shiyao Wang
Song Lu
Yinfeng Li
Xinchen Luo
Yuanjun Liu
Peixing Xu
Guorui Zhou
39
0
0
18 Aug 2024
T-CorresNet: Template Guided 3D Point Cloud Completion with
  Correspondence Pooling Query Generation Strategy
T-CorresNet: Template Guided 3D Point Cloud Completion with Correspondence Pooling Query Generation Strategy
Fan Duan
Jiahao Yu
Li Chen
3DPC
33
0
0
06 Jul 2024
Autoregressive Image Generation without Vector Quantization
Autoregressive Image Generation without Vector Quantization
Tianhong Li
Yonglong Tian
He Li
Mingyang Deng
Kaiming He
DiffM
45
171
0
17 Jun 2024
Predicting Genetic Mutation from Whole Slide Images via
  Biomedical-Linguistic Knowledge Enhanced Multi-label Classification
Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification
Gexin Huang
Chenfei Wu
Mingjie Li
Xiaojun Chang
Ling-Hao Chen
Ying Sun
Shen Zhao
Xiaodan Liang
Liang Lin
MedIm
29
0
0
05 Jun 2024
Large Convolutional Model Tuning via Filter Subspace
Large Convolutional Model Tuning via Filter Subspace
Wei Chen
Zichen Miao
Qiang Qiu
49
3
0
01 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
40
6
0
28 Feb 2024
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model
Tanzila Rahman
Shweta Mahajan
Hsin-Ying Lee
Jian Ren
Sergey Tulyakov
Leonid Sigal
88
4
0
18 Feb 2024
Data-efficient Large Vision Models through Sequential Autoregression
Data-efficient Large Vision Models through Sequential Autoregression
Jianyuan Guo
Zhiwei Hao
Chengcheng Wang
Yehui Tang
Han Wu
Han Hu
Kai Han
Chang Xu
VLM
36
10
0
07 Feb 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
49
5
0
22 Jan 2024
CrisisViT: A Robust Vision Transformer for Crisis Image Classification
CrisisViT: A Robust Vision Transformer for Crisis Image Classification
Zijun Long
R. McCreadie
Muhammad Imran
74
9
0
05 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Z. Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
123
233
0
05 Jan 2024
Integrating Human Vision Perception in Vision Transformers for
  Classifying Waste Items
Integrating Human Vision Perception in Vision Transformers for Classifying Waste Items
Akshat Shrivastava
Tapan K. Gandhi
19
1
0
19 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
28
0
0
01 Dec 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for
  Histopathology Whole Slide Image Analysis
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
27
4
0
21 Nov 2023
Improving Robustness for Vision Transformer with a Simple Dynamic
  Scanning Augmentation
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
22
2
0
01 Nov 2023
Grid Jigsaw Representation with CLIP: A New Perspective on Image Clustering
Grid Jigsaw Representation with CLIP: A New Perspective on Image Clustering
Zijie Song
Zhenzhen Hu
Richang Hong
SSL
38
0
0
27 Oct 2023
3D TransUNet: Advancing Medical Image Segmentation through Vision
  Transformers
3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers
Jieneng Chen
Jieru Mei
Xianhang Li
Yongyi Lu
Qihang Yu
...
M. Lungren
Lei Xing
Le Lu
Alan L. Yuille
Yuyin Zhou
MedIm
ViT
33
36
0
11 Oct 2023
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for
  Accurate Object Detection
Anchor-Intermediate Detector: Decoupling and Coupling Bounding Boxes for Accurate Object Detection
Yilong Lv
Min Li
Yujie He
Shaopeng Li
Zhuzhen He
Aitao Yang
21
1
0
09 Oct 2023
ADU-Depth: Attention-based Distillation with Uncertainty Modeling for
  Depth Estimation
ADU-Depth: Attention-based Distillation with Uncertainty Modeling for Depth Estimation
Zizhang Wu
Zhuozheng Li
Zhi-Gang Fan
Yunzhe Wu
Xiaoquan Wang
Rui Tang
Jian Pu
23
1
0
26 Sep 2023
RestoreFormer++: Towards Real-World Blind Face Restoration from
  Undegraded Key-Value Pairs
RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs
Zhouxia Wang
Jiawei Zhang
Tianshui Chen
Wenping Wang
Ping Luo
16
16
0
14 Aug 2023
Training-free Diffusion Model Adaptation for Variable-Sized
  Text-to-Image Synthesis
Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Zhiyu Jin
Xuli Shen
Bin Li
Xiangyang Xue
24
36
0
14 Jun 2023
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene
  Understanding
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
25
10
0
08 Jun 2023
Hierarchical Attention Encoder Decoder
Hierarchical Attention Encoder Decoder
Asier Mujika
BDL
22
3
0
01 Jun 2023
Visual Affordance Prediction for Guiding Robot Exploration
Visual Affordance Prediction for Guiding Robot Exploration
Homanga Bharadhwaj
Abhi Gupta
Shubham Tulsiani
37
12
0
28 May 2023
FIT: Far-reaching Interleaved Transformers
FIT: Far-reaching Interleaved Transformers
Ting-Li Chen
Lala Li
19
12
0
22 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
H. Chen
Jingkuan Song
Feng Zheng
ViT
10
0
0
17 May 2023
Improving Autoregressive NLP Tasks via Modular Linearized Attention
Improving Autoregressive NLP Tasks via Modular Linearized Attention
Victor Agostinelli
Lizhong Chen
22
1
0
17 Apr 2023
From Saliency to DINO: Saliency-guided Vision Transformer for Few-shot
  Keypoint Detection
From Saliency to DINO: Saliency-guided Vision Transformer for Few-shot Keypoint Detection
Changsheng Lu
Hao Zhu
Piotr Koniusz
42
11
0
06 Apr 2023
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for
  Action Segmentation
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation
Peiyao Wang
Haibin Ling
15
2
0
04 Apr 2023
P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data
P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data
Y. Ruan
Xiang Lan
Daniel J. Tan
H. Abdullah
Mengling Feng
LMTD
MedIm
40
1
0
30 Mar 2023
123456
Next