ResearchTrend.AI
Probing the 3D Awareness of Visual Foundation Models

12 April 2024
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas J. Guibas
Justin Johnson
Varun Jampani

Papers citing "Probing the 3D Awareness of Visual Foundation Models"

21 / 21 papers shown
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation
Volodymyr Havrylov
Haiwen Huang
Dan Zhang
Andreas Geiger
34
0
0
04 May 2025
SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models
Wufei Ma
Luoxin Ye
Nessa McWeeney
Celso M de Melo
A. Yuille
Jieneng Chen
LRM
57
1
0
01 May 2025
Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation
Victoria Yue Chen
Daoye Wang
Stephan Garbin
Jan Bednarík
Sebastian Winberg
Timo Bolkart
Thabo Beeler
3DH
3DPC
32
0
0
28 Apr 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
61
0
0
20 Mar 2025
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering
Yanpeng Zhao
Yiwei Hao
Siyu Gao
Yunbo Wang
Xiaokang Yang
OCL
111
1
0
17 Feb 2025
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
51
7
0
21 Oct 2024
TIPS: Text-Image Pretraining with Spatial awareness
Kevis-Kokitsi Maninis
Kaifeng Chen
Soham Ghosh
Arjun Karpur
Koert Chen
...
Jan Dlabal
Dan Gnanapragasam
Mojtaba Seyedhosseini
Howard Zhou
Andre Araujo
VLM
30
3
0
21 Oct 2024
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Haoyi Zhu
Honghui Yang
Yating Wang
Jiange Yang
Limin Wang
Tong He
3DH
43
5
0
10 Oct 2024
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation
Mike Ranzinger
Jon Barker
Greg Heinrich
Pavlo Molchanov
Bryan Catanzaro
Andrew Tao
25
4
0
02 Oct 2024
SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image
Dimitrije Antić
Sai Kumar Dwivedi
Shashank Tripathi
Theo Gevers
Dimitrios Tzionas
42
2
0
24 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-xiong Wang
70
15
0
05 Sep 2024
Odd-One-Out: Anomaly Detection by Comparing with Neighbors
A. Bhunia
Changjian Li
Hakan Bilen
34
0
0
28 Jun 2024
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Han-Hung Lee
Yiming Zhang
Angel X. Chang
3DPC
34
3
0
17 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
37
6
0
06 Jun 2024
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One
Michael Ranzinger
Greg Heinrich
Jan Kautz
Pavlo Molchanov
VLM
20
42
0
10 Dec 2023
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks
Micah Goldblum
Hossein Souri
Renkun Ni
Manli Shu
Viraj Prabhu
...
Adrien Bardes
Judy Hoffman
Ramalingam Chellappa
Andrew Gordon Wilson
Tom Goldstein
VLM
68
62
0
30 Oct 2023
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
Wei Yin
Chi Zhang
Hao Chen
Zhipeng Cai
Gang Yu
Kaixuan Wang
Xiaozhi Chen
Chunhua Shen
MDE
126
169
0
20 Jul 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
198
318
0
08 Mar 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
515
0
02 Jan 2023
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021