2112.01518
Cited By
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
2 December 2021
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM
CLIP
Papers citing
"DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting"
50 / 400 papers shown
Title
A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem
Kun Ding
Ying Wang
Gaofeng Meng
Shiming Xiang
VLM
29
0
0
15 Oct 2024
LOBG: Less Overfitting for Better Generalization in Vision-Language Model
Chenhao Ding
Xinyuan Gao
Songlin Dong
Yuhang He
Qiang Wang
Alex C. Kot
Yihong Gong
VLM
30
1
0
14 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
64
3
0
14 Oct 2024
Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation
Kun Ding
Qiang Yu
Haojian Zhang
Gaofeng Meng
Shiming Xiang
VLM
20
0
0
11 Oct 2024
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models
Rabin Adhikari
Safal Thapaliya
Manish Dhakal
Bishesh Khanal
MLLM
VLM
28
0
0
07 Oct 2024
Multimodal 3D Fusion and In-Situ Learning for Spatially Aware AI
Chengyuan Xu
Radha Kumaran
Noah Stier
Kangyou Yu
Tobias Höllerer
32
0
0
06 Oct 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng
Yangchao Wu
Hyoungseob Park
Daniel Wang
Fengyu Yang
Stefano Soatto
Dong Lao
Byung-Woo Hong
Alex Wong
MDE
16
7
0
03 Oct 2024
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
23
0
0
02 Oct 2024
PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection
Qihang Zhou
Jiangtao Yan
Shibo He
Wenchao Meng
Jiming Chen
3DPC
24
6
0
01 Oct 2024
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm
Bingqing Zhang
Zhuo Cao
Heming Du
Xin Yu
Xue Li
Jiajun Liu
Sen Wang
VGen
16
0
0
30 Sep 2024
MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation
Taha Koleilat
Hojat Asgariandehkordi
H. Rivaz
Yiming Xiao
MedIm
VLM
36
6
0
28 Sep 2024
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu
Weihao Yu
Xinchao Wang
VLM
30
6
0
25 Sep 2024
Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
Anil Osman Tur
Alessandro Conti
Cigdem Beyan
Davide Boscaini
Roberto Larcher
S. Messelodi
Fabio Poiesi
Elisa Ricci
VLM
29
0
0
23 Sep 2024
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt
Xingtao Lin
Heqian Qiu
Lanxiao Wang
Ruihang Wang
Linfeng Xu
Hongliang Li
VLM
21
0
0
20 Sep 2024
Resolving Inconsistent Semantics in Multi-Dataset Image Segmentation
Qilong Zhangli
Di Liu
Abhishek Aich
Dimitris Metaxas
S. Schulter
31
0
0
15 Sep 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
38
1
0
10 Sep 2024
Text-Enhanced Zero-Shot Action Recognition: A training-free approach
Massimo Bosetti
Shibingfeng Zhang
Benedetta Liberatori
Giacomo Zara
Elisa Ricci
Paolo Rota
VLM
36
0
0
29 Aug 2024
Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration
Xu Zhang
Jiaqi Ma
Guoli Wang
Q. Zhang
Huan Zhang
Lefei Zhang
VLM
91
5
0
28 Aug 2024
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling
Yubin Wang
Xinyang Jiang
De Cheng
Wenli Sun
Dongsheng Li
Cairong Zhao
VLM
35
0
0
27 Aug 2024
Nemesis: Normalizing the Soft-prompt Vectors of Vision-Language Models
Shuai Fu
Xiequn Wang
Qiushi Huang
Yu Zhang
VLM
37
2
0
26 Aug 2024
Towards Completeness: A Generalizable Action Proposal Generator for Zero-Shot Temporal Action Localization
Jia-Run Du
Kun-Yu Lin
Jingke Meng
Wei-Shi Zheng
26
0
0
25 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
40
4
0
23 Aug 2024
Narrowing the Gap between Vision and Action in Navigation
Yue Zhang
Parisa Kordjamshidi
24
2
0
19 Aug 2024
OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
Mushui Liu
Bozheng Li
Yunlong Yu
VLM
23
9
0
12 Aug 2024
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
Geuntaek Lim
Hyunwoo Kim
Joonsoo Kim
Yukyung Choi
20
0
0
12 Aug 2024
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
William Y. Zhu
Keren Ye
Junjie Ke
Jiahui Yu
Leonidas J. Guibas
P. Milanfar
Feng Yang
43
2
0
07 Aug 2024
Visual Grounding for Object-Level Generalization in Reinforcement Learning
Haobin Jiang
Zongqing Lu
LM&Ro
25
2
0
04 Aug 2024
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval
Gangyan Zeng
Yuan Zhang
Jin Wei
Dongbao Yang
Peng Zhang
Yiwen Gao
Xugong Qin
Yu Zhou
VLM
CLIP
13
0
0
01 Aug 2024
MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment
Anurag Das
Xinting Hu
Li Jiang
Bernt Schiele
VLM
31
3
0
31 Jul 2024
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang
Lechao Cheng
Weikai Chen
Pingping Zhang
Liang Lin
Fan Zhou
Guanbin Li
VLM
ObjD
26
1
0
31 Jul 2024
Autogenic Language Embedding for Coherent Point Tracking
Zikai Song
Ying Tang
Run Luo
Lintao Ma
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
39
4
0
30 Jul 2024
ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality
Guoliang Xu
Jianqin Yin
Feng Zhou
Yonghao Dang
VLM
36
0
0
29 Jul 2024
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang
Yuxi Wang
Shuai Li
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
33
2
0
18 Jul 2024
VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation
Zhen Qu
Xian Tao
Mukesh Prasad
Fei Shen
Zhengtao Zhang
Xinyi Gong
Guiguang Ding
VLM
24
10
0
17 Jul 2024
Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao
Xiaohan Ding
Juexiao Feng
Yuhong Yang
Hui Chen
Guiguang Ding
VLM
MQ
22
5
0
15 Jul 2024
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak
Byeongju Woo
Sunghwan Kim
Dae-Hwan Kim
Hoseong Kim
37
3
0
12 Jul 2024
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification
Wenshuo Peng
Kaipeng Zhang
Yue Yang
Hao Zhang
Yu Qiao
VLM
25
2
0
11 Jul 2024
Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization
Jinlong Li
Zequn Jie
Elisa Ricci
Lin Ma
N. Sebe
VLM
34
1
0
11 Jul 2024
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
Tong Shao
Zhuotao Tian
Hang Zhao
Jingyong Su
VLM
29
14
0
11 Jul 2024
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao
Bo Wan
Xu Jia
Yunzhi Zhuge
Ying Zhang
Huchuan Lu
Long Chen
VLM
37
4
0
10 Jul 2024
CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding
Wenhao Xu
Wenming Weng
Yueyi Zhang
Zhiwei Xiong
VLM
29
0
0
09 Jul 2024
Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification
Jiaying Shi
Xuetong Xue
Shenghui Xu
VLM
26
0
0
08 Jul 2024
CLIPVQA: Video Quality Assessment via CLIP
Fengchuang Xing
Mingjie Li
Yuan-Gen Wang
Guopu Zhu
Xiaochun Cao
CLIP
ViT
36
4
0
06 Jul 2024
Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners
Mushui Liu
Bozheng Li
Yunlong Yu
VLM
CLIP
21
2
0
04 Jul 2024
Do Generalised Classifiers really work on Human Drawn Sketches?
Hmrishav Bandyopadhyay
Pinaki Nath Chowdhury
Aneeshan Sain
Subhadeep Koley
Tao Xiang
A. Bhunia
Yi-Zhe Song
VLM
31
2
0
04 Jul 2024
SOWA: Adapting Hierarchical Frozen Window Self-Attention to Visual-Language Models for Better Anomaly Detection
Zongxiang Hu
Zhaosheng Zhang
VLM
22
1
0
04 Jul 2024
CLIP3D-AD: Extending CLIP for 3D Few-Shot Anomaly Detection with Multi-View Images Generation
Zuo Zuo
Jiahao Dong
Yao Wu
Yanyun Qu
Zongze Wu
27
3
0
27 Jun 2024
HGTDP-DTA: Hybrid Graph-Transformer with Dynamic Prompt for Drug-Target Binding Affinity Prediction
Xi Xiao
Wentao Wang
Jiacheng Xie
Lijing Zhu
Gaofei Chen
Zhengji Li
Tianyang Wang
Min Xu
21
4
0
25 Jun 2024
Unveiling Encoder-Free Vision-Language Models
Haiwen Diao
Yufeng Cui
Xiaotong Li
Yueze Wang
Huchuan Lu
Xinlong Wang
VLM
38
28
0
17 Jun 2024
Lightweight Model Pre-training via Language Guided Knowledge Distillation
Mingsheng Li
Lin Zhang
Mingzhen Zhu
Zilong Huang
Gang Yu
Jiayuan Fan
Tao Chen
29
0
0
17 Jun 2024