Image-to-Image Matching via Foundation Models: A New Perspective for
Open-Vocabulary Semantic Segmentation

Image-to-Image Matching via Foundation Models: A New Perspective for Open-Vocabulary Semantic Segmentation

30 March 2024

Papers citing "Image-to-Image Matching via Foundation Models: A New Perspective for Open-Vocabulary Semantic Segmentation"

13 / 13 papers shown

Title
InvSeg: Test-Time Prompt Inversion for Semantic Segmentation Jiayi Lin Jiabo Huang Jian Hu S. Gong DiffM VLM 21 0 0 15 Oct 2024
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs Venkata Naren Devarakonda Raktim Gautam Goswami Ali Umut Kaypak Naman Patel Rooholla Khorrambakht P. Krishnamurthy Farshad Khorrami LM&Ro 19 3 0 08 Oct 2024
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching Yang Liu Muzhi Zhu Hengtao Li Hao Chen Xinlong Wang Chunhua Shen VLM MLLM 77 82 0 22 May 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models Jiarui Xu Sifei Liu Arash Vahdat Wonmin Byeon Xiaolong Wang Shalini De Mello VLM 198 318 0 08 Mar 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception Wenliang Zhao Yongming Rao Zuyan Liu Benlin Liu Jie Zhou Jiwen Lu ObjD VLM MDE 147 213 0 03 Mar 2023
Semantic Segmentation by Early Region Proxy Yifan Zhang Bo Pang Cewu Lu ViT 32 23 0 26 Mar 2022
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 301 11,730 0 04 Mar 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision Jiarui Xu Shalini De Mello Sifei Liu Wonmin Byeon Thomas Breuel Jan Kautz X. Wang ViT VLM 173 494 0 22 Feb 2022
Masked Autoencoders Are Scalable Vision Learners Kaiming He Xinlei Chen Saining Xie Yanghao Li Piotr Dollár Ross B. Girshick ViT TPM 255 7,337 0 11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers Mathilde Caron Hugo Touvron Ishan Misra Hervé Jégou Julien Mairal Piotr Bojanowski Armand Joulin 283 5,723 0 29 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision Chao Jia Yinfei Yang Ye Xia Yi-Ting Chen Zarana Parekh Hieu H. Pham Quoc V. Le Yun-hsuan Sung Zhen Li Tom Duerig VLM CLIP 293 2,875 0 11 Feb 2021
Semantic Understanding of Scenes through the ADE20K Dataset Bolei Zhou Hang Zhao Xavier Puig Tete Xiao Sanja Fidler Adela Barriuso Antonio Torralba SSeg 243 1,817 0 18 Aug 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation Olaf Ronneberger Philipp Fischer Thomas Brox SSeg 3DV 226 74,467 0 18 May 2015