Challenges of Multi-Modal Coreset Selection for Depth Prediction
Abstract
Coreset selection methods are effective in accelerating training and reducing memory requirements but remain largely unexplored in applied multimodal settings. We adapt a state-of-the-art (SoTA) coreset selection technique for multimodal data, focusing on the depth prediction task. Our experiments with embedding aggregation and dimensionality reduction approaches reveal the challenges of extending unimodal algorithms to multimodal scenarios, highlighting the need for specialized methods to better capture inter-modal relationships.
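The pipeline the abstract describes (aggregate per-modality embeddings, reduce their dimensionality, then apply a unimodal selection rule) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which aggregation, reduction, or selection method the authors use, so the sketch substitutes concatenation of L2-normalized embeddings, PCA via SVD, and k-center greedy selection. The names `emb_a`, `emb_b`, `aggregate`, `pca_reduce`, and `k_center_greedy` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm so neither modality dominates by magnitude."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def aggregate(emb_a, emb_b):
    """Assumed aggregation: concatenate the normalized embeddings of two modalities."""
    return np.concatenate([l2_normalize(emb_a), l2_normalize(emb_b)], axis=1)


def pca_reduce(x, n_components):
    """Assumed reduction: project centered data onto its top principal directions."""
    centered = x - x.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def k_center_greedy(x, budget, seed=0):
    """Stand-in selection rule: repeatedly add the point farthest from the current set."""
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(x)))
    selected = [first]
    # Distance from every point to its nearest selected point.
    min_dist = np.linalg.norm(x - x[first], axis=1)
    while len(selected) < budget:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(x - x[nxt], axis=1))
    return selected


# Synthetic stand-ins for per-sample embeddings of two modalities.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(200, 64))
emb_b = rng.normal(size=(200, 32))

joint = pca_reduce(aggregate(emb_a, emb_b), n_components=16)
coreset = k_center_greedy(joint, budget=20)
```

Concatenation handles each modality independently before the two are joined, so a distance computed on the joint vector has no term that models how the modalities relate to each other. That is one concrete way the inter-modal gap named in the abstract can show up in a pipeline of this shape.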
Main: 2 pages
Bibliography: 2 pages
Appendix: 1 page
Tables: 1
