2112.01518
DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
2 December 2021
Yongming Rao
Wenliang Zhao
Guangyi Chen
Yansong Tang
Zheng Zhu
Guan Huang
Jie Zhou
Jiwen Lu
VLM
CLIP
Papers citing
"DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting"
50 / 400 papers shown
Title
IFSeg: Image-free Semantic Segmentation via Vision-Language Model
Sukmin Yun
S. Park
Paul Hongsuck Seo
Jinwoo Shin
VLM
MLLM
49
14
0
25 Mar 2023
CLIP for All Things Zero-Shot Sketch-Based Image Retrieval, Fine-Grained or Not
Aneeshan Sain
A. Bhunia
Pinaki Nath Chowdhury
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
VLM
26
78
0
23 Mar 2023
Visual-Language Prompt Tuning with Knowledge-guided Context Optimization
Hantao Yao
Rui Zhang
Changsheng Xu
VLM
VPVLM
122
200
0
23 Mar 2023
Natural Language-Assisted Sign Language Recognition
Ronglai Zuo
Fangyun Wei
Brian Mak
SLR
13
37
0
21 Mar 2023
Detecting Everything in the Open World: Towards Universal Object Detection
Zhenyu Wang
Yali Li
Xi Chen
Ser-Nam Lim
Antonio Torralba
Hengshuang Zhao
Shengjin Wang
ObjD
VLM
24
76
0
21 Mar 2023
Implicit Neural Representation for Cooperative Low-light Image Enhancement
Shuzhou Yang
Moxuan Ding
Yanmin Wu
Zihan Li
Jian Zhang
23
79
0
21 Mar 2023
GridCLIP: One-Stage Object Detection by Grid-Level CLIP Representation Learning
Jiaying Lin
S. Gong
VLM
CLIP
ObjD
17
22
0
16 Mar 2023
Unified Visual Relationship Detection with Vision and Language Models
Long Zhao
Liangzhe Yuan
Boqing Gong
Yin Cui
Florian Schroff
Ming Yang
Hartwig Adam
Ting Liu
ObjD
25
9
0
16 Mar 2023
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
Wei Lin
Leonid Karlinsky
Nina Shvetsova
Horst Possegger
Mateusz Koziński
Rameswar Panda
Rogerio Feris
Hilde Kuehne
Horst Bischof
VLM
100
38
0
15 Mar 2023
A Simple Framework for Open-Vocabulary Segmentation and Detection
Hao Zhang
Feng Li
Xueyan Zou
Siyi Liu
Chun-yue Li
Jianfeng Gao
Jianwei Yang
Lei Zhang
ObjD
VLM
6
149
0
14 Mar 2023
Towards Universal Vision-language Omni-supervised Segmentation
Bowen Dong
Jiaxi Gu
Jianhua Han
Hang Xu
W. Zuo
VLM
23
1
0
12 Mar 2023
DeltaEdit: Exploring Text-free Training for Text-Driven Image Manipulation
Yueming Lyu
Tianwei Lin
Fu Li
Dongliang He
Jing Dong
Tieniu Tan
33
38
0
11 Mar 2023
Iterative Few-shot Semantic Segmentation from Image Label Text
Haohan Wang
L. Liu
Wuhao Zhang
Jiangning Zhang
Zhenye Gan
Yabiao Wang
Chengjie Wang
Haoqian Wang
VLM
8
16
0
10 Mar 2023
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Shijie Geng
Jianbo Yuan
Yu Tian
Yuxiao Chen
Yongfeng Zhang
CLIP
VLM
41
44
0
06 Mar 2023
CLIP-guided Prototype Modulating for Few-shot Action Recognition
Xiang Wang
Shiwei Zhang
Jun Cen
Changxin Gao
Yingya Zhang
Deli Zhao
Nong Sang
VLM
6
53
0
06 Mar 2023
Improving Audio-Visual Video Parsing with Pseudo Visual Labels
Jinxing Zhou
Dan Guo
Yiran Zhong
Meng Wang
VLM
31
11
0
04 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception
Wenliang Zhao
Yongming Rao
Zuyan Liu
Benlin Liu
Jie Zhou
Jiwen Lu
ObjD
VLM
MDE
158
213
0
03 Mar 2023
Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Renrui Zhang
Xiangfei Hu
Bohao Li
Siyuan Huang
Hanqiu Deng
Hongsheng Li
Yu Qiao
Peng Gao
VLM
MLLM
25
170
0
03 Mar 2023
Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving
Xiwen Liang
Minzhe Niu
Jianhua Han
Hang Xu
Chunjing Xu
Xiaodan Liang
VLM
21
13
0
03 Mar 2023
DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction
Shubhankar Borse
Debasmit Das
Hyojin Park
H. Cai
Risheek Garrepalli
Fatih Porikli
23
9
0
02 Mar 2023
Nearest Neighbors Meet Deep Neural Networks for Point Cloud Analysis
Renrui Zhang
Liuhui Wang
Ziyu Guo
Jianbo Shi
3DPC
32
10
0
01 Mar 2023
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo
Jiabo Huang
S. Gong
Hailin Jin
Yang Liu
VGen
21
28
0
28 Feb 2023
Turning a CLIP Model into a Scene Text Detector
Wenwen Yu
Yuliang Liu
Wei Hua
Deqiang Jiang
Bo Ren
Xiang Bai
VLM
CLIP
MLLM
25
53
0
28 Feb 2023
Aligning Bag of Regions for Open-Vocabulary Object Detection
Size Wu
Wenwei Zhang
Sheng Jin
Wentao Liu
Chen Change Loy
VLM
ObjD
34
108
0
27 Feb 2023
LMSeg: Language-guided Multi-dataset Segmentation
Qiang-feng Zhou
Yuang Liu
Chaohui Yu
Jingliang Li
Zhibin Wang
Fan Wang
VLM
13
18
0
27 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
29
3
0
13 Feb 2023
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation
Yash J. Patel
Yusheng Xie
Yi Zhu
Srikar Appalaraju
R. Manmatha
19
4
0
07 Feb 2023
ZegOT: Zero-shot Segmentation Through Optimal Transport of Text Prompts
Kwanyoung Kim
Y. Oh
Jong Chul Ye
VLM
OT
CLIP
19
19
0
28 Jan 2023
Joint Representation Learning for Text and 3D Point Cloud
Rui Huang
Xuran Pan
Henry Zheng
Haojun Jiang
Zhifeng Xie
S. Song
Gao Huang
13
19
0
18 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
27
11
0
17 Jan 2023
CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP
Runnan Chen
Youquan Liu
Lingdong Kong
Xinge Zhu
Yuexin Ma
Yikang Li
Yuenan Hou
Yu Qiao
Wenping Wang
CLIP
3DPC
6
138
0
12 Jan 2023
CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
Jie Liu
Yixiao Zhang
Jieneng Chen
Junfei Xiao
Yongyi Lu
Bennett A. Landman
Yixuan Yuan
Alan Yuille
Yucheng Tang
Zongwei Zhou
VLM
MedIm
27
188
0
02 Jan 2023
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou
Zi-Yi Dou
Jianwei Yang
Zhe Gan
Linjie Li
...
Lu Yuan
Nanyun Peng
Lijuan Wang
Yong Jae Lee
Jianfeng Gao
VLM
MLLM
ObjD
13
238
0
21 Dec 2022
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Chen Ju
Kunhao Zheng
Jinxian Liu
Peisen Zhao
Ya-Qin Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian
13
11
0
19 Dec 2022
CREPE: Can Vision-Language Foundation Models Reason Compositionally?
Zixian Ma
Jerry Hong
Mustafa Omer Gul
Mona Gandhi
Irena Gao
Ranjay Krishna
CoGe
18
125
0
13 Dec 2022
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition
Xin Ni
Yong Liu
Hao Wen
Yatai Ji
Jing Xiao
Yujiu Yang
21
9
0
09 Dec 2022
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqi Zhou
Bowen Zhang
Yinjie Lei
Lingqiao Liu
Yifan Liu
VLM
19
167
0
07 Dec 2022
Fine-tuned CLIP Models are Efficient Video Learners
H. Rasheed
Muhammad Uzair Khattak
Muhammad Maaz
Salman Khan
F. Khan
CLIP
VLM
17
148
0
06 Dec 2022
Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Representation
En Yu
Songtao Liu
Zhuoling Li
Jinrong Yang
Zeming Li
Shoudong Han
Wenbing Tao
18
12
0
03 Dec 2022
PartSLIP: Low-Shot Part Segmentation for 3D Point Clouds via Pretrained Image-Language Models
Minghua Liu
Yinhao Zhu
H. Cai
Shizhong Han
Z. Ling
Fatih Porikli
Hao Su
3DPC
21
68
0
03 Dec 2022
OpenScene: 3D Scene Understanding with Open Vocabularies
Songyou Peng
Kyle Genova
Chiyu "Max" Jiang
Andrea Tagliasacchi
Marc Pollefeys
Thomas Funkhouser
3DPC
VLM
20
343
0
28 Nov 2022
Learning Object-Language Alignments for Open-Vocabulary Object Detection
Chuang Lin
Pei Sun
Yi-Xin Jiang
Ping Luo
Lizhen Qu
Gholamreza Haffari
Zehuan Yuan
Jianfei Cai
VLM
ObjD
8
95
0
27 Nov 2022
SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation
Huaishao Luo
Junwei Bao
Youzheng Wu
Xiaodong He
Tianrui Li
VLM
24
144
0
27 Nov 2022
CLIP-ReID: Exploiting Vision-Language Model for Image Re-Identification without Concrete Text Labels
Siyuan Li
Li Sun
Qingli Li
VLM
28
148
0
25 Nov 2022
Multitask Vision-Language Prompt Tuning
Sheng Shen
Shijia Yang
Tianjun Zhang
Bohan Zhai
Joseph E. Gonzalez
Kurt Keutzer
Trevor Darrell
VLM
VPVLM
17
49
0
21 Nov 2022
ClipCrop: Conditioned Cropping Driven by Vision-Language Model
Zhihang Zhong
Mingxi Cheng
Zhirong Wu
Yuhui Yuan
Yinqiang Zheng
Ji Li
Han Hu
Stephen Lin
Yoichi Sato
Imari Sato
VLM
CLIP
22
3
0
21 Nov 2022
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
Yue Yang
Artemis Panagopoulou
Shenghao Zhou
Daniel Jin
Chris Callison-Burch
Mark Yatskar
17
211
0
21 Nov 2022
CASA: Category-agnostic Skeletal Animal Reconstruction
Yuefan Wu
Ze-Yin Chen
Shao-Wei Liu
Zhongzheng Ren
Shenlong Wang
15
29
0
04 Nov 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya-Qin Zhang
Weidi Xie
VLM
21
48
0
27 Oct 2022
Prompting through Prototype: A Prototype-based Prompt Learning on Pretrained Vision-Language Models
Yue Zhang
Hongliang Fei
Dingcheng Li
Tan Yu
Ping Li
VPVLM
VLM
15
9
0
19 Oct 2022