Review of Large Vision Models and Visual Prompt Engineering
arXiv:2307.00855

3 July 2023
Jiaqi Wang
Zheng Liu
Lin Zhao
Zihao Wu
Chong Ma
Sigang Yu
Haixing Dai
Qiushi Yang
Yi-Hsueh Liu
Songyao Zhang
Enze Shi
Yi Pan
Tuo Zhang
Dajiang Zhu
Xiang Li
Xi Jiang
Bao Ge
Yixuan Yuan
Dinggang Shen
Tianming Liu
Shu Zhang
    VLM
    LRM

Papers citing "Review of Large Vision Models and Visual Prompt Engineering"

36 / 36 papers shown
GBT-SAM: Adapting a Foundational Deep Learning Model for Generalizable Brain Tumor Segmentation via Efficient Integration of Multi-Parametric MRI Data
Cecilia Diana-Albelda
Roberto Alcover-Couso
Álvaro García-Martín
Jesús Bescós
Marcos Escudero-Viñolo
34
1
0
06 Mar 2025
Towards a Generative AI Design Dialogue
Aron E. Owen
Jonathan C. Roberts
18
0
0
19 Aug 2024
Enhanced Object Tracking by Self-Supervised Auxiliary Depth Estimation Learning
Zhenyu Wei
Yujie He
Zhanchuan Cai
MDE
30
0
0
23 May 2024
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning
Amir Ziai
Aneesh Vartakavi
VLM
VGen
19
0
0
09 Feb 2024
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang
Guikun Chen
Xiaodi Li
Wenguan Wang
Yi Yang
LM&Ro
LLMAG
39
35
0
16 Jan 2024
Adaptive Human Trajectory Prediction via Latent Corridors
Neerja Thakkar
K. Mangalam
Andrea V. Bajcsy
Jitendra Malik
13
4
0
11 Dec 2023
GeoGPT: Understanding and Processing Geospatial Tasks through An Autonomous GPT
Yifan Zhang
Cheng Wei
Shangyou Wu
Zhengting He
Wenhao Yu
17
25
0
16 Jul 2023
AD-AutoGPT: An Autonomous GPT for Alzheimer's Disease Infodemiology
Haixing Dai
Yiwei Li
Zheng Liu
Lin Zhao
Zihao Wu
...
Quanzheng Li
Zhuo Chen
D. Zhang
Gengchen Mai
Tianming Liu
LM&MA
39
28
0
16 Jun 2023
Segment Any Anomaly without Training via Hybrid Prompt Regularization
Yunkang Cao
Xiaohao Xu
Chen Sun
Y. Cheng
Zongwei Du
Liang Gao
Weiming Shen
VLM
24
69
0
18 May 2023
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
Teng Wang
Jinrui Zhang
Junjie Fei
Hao Zheng
Yunlong Tang
Zhe Li
Mingqi Gao
Shanshan Zhao
MLLM
102
81
0
04 May 2023
SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model
Di Wang
Jing Zhang
Bo Du
Minqiang Xu
Lin Liu
Dacheng Tao
L. Zhang
112
71
0
03 May 2023
Scalable Mask Annotation for Video Text Spotting
Haibin He
Jing Zhang
Mengyang Xu
Juhua Liu
Bo Du
Dacheng Tao
90
13
0
02 May 2023
Instruction-ViT: Multi-Modal Prompts for Instruction Learning in ViT
Zhe Xiao
Yuzhong Chen
Lu Zhang
Jun Yao
Zihao Wu
...
Yixuan Yuan
Dinggang Shen
Dajiang Zhu
Tianming Liu
Xi Jiang
VLM
MLLM
55
17
0
29 Apr 2023
SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective
An-Chi Wang
Mobarakol Islam
Mengya Xu
Yang Zhang
Hongliang Ren
AAML
VLM
74
36
0
28 Apr 2023
Prompt Engineering for Healthcare: Methodologies and Applications
Jiaqi Wang
Enze Shi
Sigang Yu
Zihao Wu
Chong Ma
...
Dajiang Zhu
Yixuan Yuan
Dinggang Shen
Tianming Liu
Shu Zhang
LM&MA
42
106
0
28 Apr 2023
Edit Everything: A Text-Guided Generative System for Images Editing
Defeng Xie
Ruichen Wang
Jiancang Ma
Chen Chen
H. Lu
D. Yang
Fobo Shi
Xiaodong Lin
DiffM
80
31
0
27 Apr 2023
Diversity-Aware Meta Visual Prompting
Qidong Huang
Xiaoyi Dong
Dongdong Chen
Weiming Zhang
Feifei Wang
Gang Hua
Neng H. Yu
VLM
VPVLM
36
52
0
14 Mar 2023
Mask-guided BERT for Few Shot Text Classification
Wenxiong Liao
Zheng Liu
Haixing Dai
Zihao Wu
Yiyang Zhang
...
Dajiang Zhu
Tianming Liu
Sheng R. Li
Xiang Li
Hongmin Cai
VLM
29
39
0
21 Feb 2023
GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis
Ming Tao
Bingkun Bao
Hao Tang
Changsheng Xu
DiffM
VLM
58
99
0
30 Jan 2023
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
186
521
0
06 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
51
80
0
03 Oct 2022
Diffusion Models in Vision: A Survey
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
DiffM
VLM
MedIm
186
1,098
0
10 Sep 2022
Rectify ViT Shortcut Learning by Visual Saliency
Chong Ma
Lin Zhao
Yuzhong Chen
David Liu
Xi Jiang
Tuo Zhang
Xintao Hu
Dinggang Shen
Dajiang Zhu
Tianming Liu
ViT
17
20
0
17 Jun 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
138
631
0
26 May 2022
Mask-guided Vision Transformer (MG-ViT) for Few-Shot Learning
Yuzhong Chen
Zhe Xiao
Lin Zhao
Lu Zhang
Haixing Dai
...
Tuo Zhang
Changying Li
Dajiang Zhu
Tianming Liu
Xi Jiang
36
18
0
20 May 2022
Discovering Dynamic Functional Brain Networks via Spatial and Channel-wise Attention
Yiheng Liu
Enjie Ge
Mengshen He
Zheng-Ning Liu
Shijie Zhao
Xintao Hu
Dajiang Zhu
Tianming Liu
Bao Ge
24
10
0
19 May 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
175
494
0
22 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
360
0
17 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
322
2,108
0
02 Sep 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
220
698
0
28 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
303
771
0
18 Apr 2021
Transformer in Transformer
Kai Han
An Xiao
Enhua Wu
Jianyuan Guo
Chunjing Xu
Yunhe Wang
ViT
282
1,490
0
27 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
222
2,404
0
04 Jan 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
241
1,898
0
31 Dec 2020