Title |
---|
Siamese Vision Transformers are Scalable Audio-visual Learners. Yan-Bo Lin, Gedas Bertasius |
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm. Haoyi Zhu, Honghui Yang, Xiaoyang Wu, Di Huang, Sha Zhang, ... Hengshuang Zhao, Chunhua Shen, Yu Qiao, Tong He, Wanli Ouyang |