ResearchTrend.AI
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

2 December 2021
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
    VLM
    CLIP

Papers citing "DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting"

50 / 400 papers shown
A Survey of Low-shot Vision-Language Model Adaptation via Representer Theorem
Kun Ding
Ying Wang
Gaofeng Meng
Shiming Xiang
VLM
29
0
0
15 Oct 2024
LOBG:Less Overfitting for Better Generalization in Vision-Language Model
Chenhao Ding
Xinyuan Gao
Songlin Dong
Yuhang He
Qiang Wang
Alex C. Kot
Yihong Gong
VLM
30
1
0
14 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
64
3
0
14 Oct 2024
Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation
Kun Ding
Qiang Yu
Haojian Zhang
Gaofeng Meng
Shiming Xiang
VLM
20
0
0
11 Oct 2024
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models
Rabin Adhikari
Safal Thapaliya
Manish Dhakal
Bishesh Khanal
MLLM
VLM
28
0
0
07 Oct 2024
Multimodal 3D Fusion and In-Situ Learning for Spatially Aware AI
Chengyuan Xu
Radha Kumaran
Noah Stier
Kangyou Yu
Tobias Höllerer
32
0
0
06 Oct 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng
Yangchao Wu
Hyoungseob Park
Daniel Wang
Fengyu Yang
Stefano Soatto
Dong Lao
Byung-Woo Hong
Alex Wong
MDE
16
7
0
03 Oct 2024
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
23
0
0
02 Oct 2024
PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection
Qihang Zhou
Jiangtao Yan
Shibo He
Wenchao Meng
Jiming Chen
3DPC
24
6
0
01 Oct 2024
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm
Bingqing Zhang
Zhuo Cao
Heming Du
Xin Yu
Xue Li
Jiajun Liu
Sen Wang
VGen
16
0
0
30 Sep 2024
MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation
Taha Koleilat
Hojat Asgariandehkordi
H. Rivaz
Yiming Xiao
MedIm
VLM
36
6
0
28 Sep 2024
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu
Weihao Yu
Xinchao Wang
VLM
30
6
0
25 Sep 2024
Exploring Fine-grained Retail Product Discrimination with Zero-shot Object Classification Using Vision-Language Models
Anil Osman Tur
Alessandro Conti
Cigdem Beyan
Davide Boscaini
Roberto Larcher
S. Messelodi
Fabio Poiesi
Elisa Ricci
VLM
29
0
0
23 Sep 2024
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt
Xingtao Lin
Heqian Qiu
Lanxiao Wang
Ruihang Wang
Linfeng Xu
Hongliang Li
VLM
21
0
0
20 Sep 2024
Resolving Inconsistent Semantics in Multi-Dataset Image Segmentation
Qilong Zhangli
Di Liu
Abhishek Aich
Dimitris Metaxas
S. Schulter
31
0
0
15 Sep 2024
Revisiting Prompt Pretraining of Vision-Language Models
Zhenyuan Chen
Lingfeng Yang
Shuo Chen
Zhaowei Chen
Jiajun Liang
Xiang Li
MLLM
VPVLM
VLM
38
1
0
10 Sep 2024
Text-Enhanced Zero-Shot Action Recognition: A training-free approach
Massimo Bosetti
Shibingfeng Zhang
Benedetta Liberatori
Giacomo Zara
Elisa Ricci
Paolo Rota
VLM
36
0
0
29 Aug 2024
Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration
Xu Zhang
Jiaqi Ma
Guoli Wang
Q. Zhang
Huan Zhang
Lefei Zhang
VLM
91
5
0
28 Aug 2024
HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling
Yubin Wang
Xinyang Jiang
De Cheng
Wenli Sun
Dongsheng Li
Cairong Zhao
VLM
35
0
0
27 Aug 2024
Nemesis: Normalizing the Soft-prompt Vectors of Vision-Language Models
Shuai Fu
Xiequn Wang
Qiushi Huang
Yu Zhang
VLM
37
2
0
26 Aug 2024
Towards Completeness: A Generalizable Action Proposal Generator for Zero-Shot Temporal Action Localization
Jia-Run Du
Kun-Yu Lin
Jingke Meng
Wei-Shi Zheng
26
0
0
25 Aug 2024
Image Segmentation in Foundation Model Era: A Survey
Tianfei Zhou
Fei Zhang
Boyu Chang
Wenguan Wang
Ye Yuan
E. Konukoglu
Daniel Cremers
VLM
40
4
0
23 Aug 2024
Narrowing the Gap between Vision and Action in Navigation
Yue Zhang
Parisa Kordjamshidi
24
2
0
19 Aug 2024
OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning
Mushui Liu
Bozheng Li
Yunlong Yu
VLM
23
9
0
12 Aug 2024
Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization
Geuntaek Lim
Hyunwoo Kim
Joonsoo Kim
Yukyung Choi
20
0
0
12 Aug 2024
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
William Y. Zhu
Keren Ye
Junjie Ke
Jiahui Yu
Leonidas J. Guibas
P. Milanfar
Feng Yang
43
2
0
07 Aug 2024
Visual Grounding for Object-Level Generalization in Reinforcement Learning
Haobin Jiang
Zongqing Lu
LM&Ro
25
2
0
04 Aug 2024
Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval
Gangyan Zeng
Yuan Zhang
Jin Wei
Dongbao Yang
Peng Zhang
Yiwen Gao
Xugong Qin
Yu Zhou
VLM
CLIP
13
0
0
01 Aug 2024
MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment
Anurag Das
Xinting Hu
Li Jiang
Bernt Schiele
VLM
31
3
0
31 Jul 2024
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection
Kuo Wang
Lechao Cheng
Weikai Chen
Pingping Zhang
Liang Lin
Fan Zhou
Guanbin Li
VLM
ObjD
26
1
0
31 Jul 2024
Autogenic Language Embedding for Coherent Point Tracking
Zikai Song
Ying Tang
Run Luo
Lintao Ma
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
39
4
0
30 Jul 2024
ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality
Guoliang Xu
Jianqin Yin
Feng Zhou
Yonghao Dang
VLM
36
0
0
29 Jul 2024
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang
Yuxi Wang
Shuai Li
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
33
2
0
18 Jul 2024
VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation
Zhen Qu
Xian Tao
Mukesh Prasad
Fei Shen
Zhengtao Zhang
Xinyi Gong
Guiguang Ding
VLM
24
10
0
17 Jul 2024
Quantized Prompt for Efficient Generalization of Vision-Language Models
Tianxiang Hao
Xiaohan Ding
Juexiao Feng
Yuhong Yang
Hui Chen
Guiguang Ding
VLM
MQ
22
5
0
15 Jul 2024
Textual Query-Driven Mask Transformer for Domain Generalized Segmentation
Byeonghyun Pak
Byeongju Woo
Sunghwan Kim
Dae-Hwan Kim
Hoseong Kim
37
3
0
12 Jul 2024
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification
Wenshuo Peng
Kaipeng Zhang
Yue Yang
Hao Zhang
Yu Qiao
VLM
25
2
0
11 Jul 2024
Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Cross-Regularization
Jinlong Li
Zequn Jie
Elisa Ricci
Lin Ma
N. Sebe
VLM
34
1
0
11 Jul 2024
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
Tong Shao
Zhuotao Tian
Hang Zhao
Jingyong Su
VLM
29
14
0
11 Jul 2024
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
Haiwen Diao
Bo Wan
Xu Jia
Yunzhi Zhuge
Ying Zhang
Huchuan Lu
Long Chen
VLM
37
4
0
10 Jul 2024
CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding
Wenhao Xu
Wenming Weng
Yueyi Zhang
Zhiwei Xiong
VLM
29
0
0
09 Jul 2024
Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification
Jiaying Shi
Xuetong Xue
Shenghui Xu
VLM
26
0
0
08 Jul 2024
CLIPVQA:Video Quality Assessment via CLIP
Fengchuang Xing
Mingjie Li
Yuan-Gen Wang
Guopu Zhu
Xiaochun Cao
CLIP
ViT
36
4
0
06 Jul 2024
Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners
Mushui Liu
Bozheng Li
Yunlong Yu
VLM
CLIP
21
2
0
04 Jul 2024
Do Generalised Classifiers really work on Human Drawn Sketches?
Hmrishav Bandyopadhyay
Pinaki Nath Chowdhury
Aneeshan Sain
Subhadeep Koley
Tao Xiang
A. Bhunia
Yi-Zhe Song
VLM
31
2
0
04 Jul 2024
SOWA: Adapting Hierarchical Frozen Window Self-Attention to Visual-Language Models for Better Anomaly Detection
Zongxiang Hu
Zhaosheng Zhang
VLM
22
1
0
04 Jul 2024
CLIP3D-AD: Extending CLIP for 3D Few-Shot Anomaly Detection with Multi-View Images Generation
Zuo Zuo
Jiahao Dong
Yao Wu
Yanyun Qu
Zongze Wu
27
3
0
27 Jun 2024
HGTDP-DTA: Hybrid Graph-Transformer with Dynamic Prompt for Drug-Target Binding Affinity Prediction
Xi Xiao
Wentao Wang
Jiacheng Xie
Lijing Zhu
Gaofei Chen
Zhengji Li
Tianyang Wang
Min Xu
21
4
0
25 Jun 2024
Unveiling Encoder-Free Vision-Language Models
Haiwen Diao
Yufeng Cui
Xiaotong Li
Yueze Wang
Huchuan Lu
Xinlong Wang
VLM
38
28
0
17 Jun 2024
Lightweight Model Pre-training via Language Guided Knowledge Distillation
Mingsheng Li
Lin Zhang
Mingzhen Zhu
Zilong Huang
Gang Yu
Jiayuan Fan
Tao Chen
29
0
0
17 Jun 2024