Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.05709
Cited By
OmniVec: Learning robust representations with cross modal sharing
7 November 2023
Siddharth Srivastava
Gaurav Sharma
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"OmniVec: Learning robust representations with cross modal sharing"
46 / 46 papers shown
Title
Balancing Accuracy, Calibration, and Efficiency in Active Learning with Vision Transformers Under Label Noise
Moseli Motsóehli
Hope Mogale
Kyungim Baek
30
0
0
07 May 2025
FSSUAVL: A Discriminative Framework using Vision Models for Federated Self-Supervised Audio and Image Understanding
Yasar Abbas Ur Rehman
Kin Wai Lau
Yuyang Xie
Ma Lan
Jiajun Shen
29
0
0
13 Apr 2025
UniViTAR: Unified Vision Transformer with Native Resolution
Limeng Qiao
Yiyang Gan
Bairui Wang
Jie Qin
Shuang Xu
Siqi Yang
Lin Ma
44
0
0
02 Apr 2025
Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
Paul Koch
Jörg Krüger
Ankit Chowdhury
O. Heimann
MDE
48
0
0
25 Mar 2025
DUNIA: Pixel-Sized Embeddings via Cross-Modal Alignment for Earth Observation Applications
Ibrahim Fayad
Max Zimmer
Martin Schwartz
P. Ciais
Fabian Gieseke
Gabriel Belouze
Sarah Brood
A. D. Truchis
Alexandre d’Aspremont
AI4TS
35
0
0
24 Feb 2025
MVIP -- A Dataset and Methods for Application Oriented Multi-View and Multi-Modal Industrial Part Recognition
Paul Koch
Marian Schluter
Jörg Krüger
59
0
0
24 Feb 2025
Learning Priors of Human Motion With Vision Transformers
Placido Falqueto
Alberto Sanfeliu
Luigi Palopoli
Daniele Fontanelli
ViT
140
0
0
30 Jan 2025
First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria
Stefan Schoder
32
0
0
31 Dec 2024
TaxaBind: A Unified Embedding Space for Ecological Applications
S. Sastry
Subash Khanal
A. Dhakal
Adeel Ahmad
Nathan Jacobs
48
6
0
01 Nov 2024
MARs: Multi-view Attention Regularizations for Patch-based Feature Recognition of Space Terrain
Timothy Chase Jr
Karthik Dantu
14
0
0
07 Oct 2024
Grokking at the Edge of Linear Separability
Alon Beck
Noam Levi
Yohai Bar-Sinai
21
0
0
06 Oct 2024
GenRec: Unifying Video Generation and Recognition with Diffusion Models
Zejia Weng
Xitong Yang
Zhen Xing
Zuxuan Wu
Yu-Gang Jiang
VGen
DiffM
22
5
0
27 Aug 2024
FungiTastic: A multi-modal dataset and benchmark for image categorization
Lukás Picek
Klara Janouskova
Milan Šulc
Jirí Matas
65
1
0
24 Aug 2024
A Survey and Evaluation of Adversarial Attacks for Object Detection
Khoi Nguyen Tiet Nguyen
Wenyu Zhang
Kangkang Lu
Yuhuan Wu
Xingjian Zheng
Hui Li Tan
Liangli Zhen
AAML
24
0
0
04 Aug 2024
This Probably Looks Exactly Like That: An Invertible Prototypical Network
Zachariah Carmichael
Timothy Redgrave
Daniel Gonzalez Cedre
Walter J. Scheirer
BDL
12
2
0
16 Jul 2024
MMIS: Multimodal Dataset for Interior Scene Visual Generation and Recognition
Hozaifa Kassab
Ahmed Mahmoud
Mohamed Bahaa
Ammar Mohamed
Ali Hamdi
VLM
23
0
0
08 Jul 2024
ADAPT: Multimodal Learning for Detecting Physiological Changes under Missing Modalities
Julie Mordacq
Léo Milecki
Maria Vakalopoulou
Steve Oudot
Vicky Kalogeiton
OffRL
MedIm
25
3
0
04 Jul 2024
Multi-modal Transfer Learning between Biological Foundation Models
Juan Jose Garau-Luis
Patrick Bordes
Liam Gonzalez
Masa Roller
Bernardo P. de Almeida
...
Stefan Laurent
Jan Grzegorzewski
Maren Lang
Thomas Pierrot
Guillaume Richard
AI4CE
25
1
0
20 Jun 2024
Enhancing Domain Adaptation through Prompt Gradient Alignment
Hoang Phan
Lam C. Tran
Quyen Tran
Trung Le
45
0
0
13 Jun 2024
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
Ding Jia
Jianyuan Guo
Kai Han
Han Wu
Chao Zhang
Chang Xu
Xinghao Chen
ViT
35
15
0
03 Jun 2024
Optimizing Foundation Model Inference on a Many-tiny-core Open-source RISC-V Platform
Viviane Potocnik
Luca Colagrande
Tim Fischer
L. Bertaccini
Daniele Jahier Pagliari
Alessio Burrello
Luca Benini
15
0
0
29 May 2024
On the Foundations of Earth and Climate Foundation Models
Xiao Xiang Zhu
Zhitong Xiong
Yi Wang
Adam J. Stewart
Konrad Heidler
Yuanyuan Wang
Zhenghang Yuan
Thomas Dujardin
Qingsong Xu
Yilei Shi
AI4Cl
AI4CE
20
20
0
07 May 2024
RankCLIP: Ranking-Consistent Language-Image Pretraining
Yiming Zhang
Zhuokai Zhao
Zhaorun Chen
Zhili Feng
Zenghui Ding
Yining Sun
SSL
VLM
34
7
0
15 Apr 2024
OmniSat: Self-Supervised Modality Fusion for Earth Observation
Guillaume Astruc
Nicolas Gonthier
Clement Mallet
Loic Landrieu
20
23
0
12 Apr 2024
HAPNet: Toward Superior RGB-Thermal Scene Parsing via Hybrid, Asymmetric, and Progressive Heterogeneous Feature Fusion
Jiahang Li
Peng Yun
Qijun Chen
Rui Fan
25
3
0
04 Apr 2024
LSKNet: A Foundation Lightweight Backbone for Remote Sensing
Yuxuan Li
Xiang Li
Yimain Dai
Qibin Hou
Li Liu
Yongxiang Liu
Ming-Ming Cheng
Jian Yang
29
29
0
18 Mar 2024
SUPClust: Active Learning at the Boundaries
Yuta Ono
Till Aczél
Benjamin Estermann
Roger Wattenhofer
26
1
0
06 Mar 2024
StochCA: A Novel Approach for Exploiting Pretrained Models with Cross-Attention
SeungWon Seo
Suho Lee
Sangheum Hwang
19
0
0
25 Feb 2024
Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?
Tiantian Feng
Daniel Yang
Digbalay Bose
Shrikanth Narayanan
24
4
0
14 Feb 2024
Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation
Ruiping Liu
Jiaming Zhang
Kunyu Peng
Yufan Chen
Ke Cao
Junwei Zheng
M. Sarfraz
Kailun Yang
Rainer Stiefelhagen
VLM
32
7
0
30 Jan 2024
VI-PANN: Harnessing Transfer Learning and Uncertainty-Aware Variational Inference for Improved Generalization in Audio Pattern Recognition
John Fischer
Marko Orescanin
Eric Eckstrand
UQCV
BDL
10
4
0
10 Jan 2024
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye
Dan Xu
ViT
8
10
0
08 Jun 2023
Can We Solve 3D Vision Tasks Starting from A 2D Vision Transformer?
Yi Wang
Zhiwen Fan
Tianlong Chen
Hehe Fan
Zhangyang Wang
ViT
26
9
0
15 Sep 2022
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
87
93
0
04 Jul 2022
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
91
110
0
23 Jun 2022
HiP: Hierarchical Perceiver
João Carreira
Skanda Koppula
Daniel Zoran
Adrià Recasens
Catalin Ionescu
...
M. Botvinick
Oriol Vinyals
Karen Simonyan
Andrew Zisserman
Andrew Jaegle
VLM
15
14
0
22 Feb 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
111
262
0
02 Feb 2022
Benchmarking Robustness of 3D Point Cloud Recognition Against Common Corruptions
Jiachen Sun
Qingzhao Zhang
B. Kailkhura
Zhiding Yu
Chaowei Xiao
Z. Morley Mao
3DPC
31
83
0
28 Jan 2022
Transformers in Medical Imaging: A Survey
Fahad Shamshad
Salman Khan
Syed Waqas Zamir
Muhammad Haris Khan
Munawar Hayat
F. Khan
H. Fu
ViT
LM&MA
MedIm
100
653
0
24 Jan 2022
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
L. V. D. van der Maaten
Armand Joulin
Ishan Misra
209
222
0
20 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
242
554
0
28 Sep 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
229
573
0
22 Apr 2021
Regularization Strategy for Point Cloud via Rigidly Mixed Sample
Dogyoon Lee
Jaeha Lee
Junhyeop Lee
Hyeongmin Lee
Minhyeok Lee
Sungmin Woo
Sangyoun Lee
3DPC
128
72
0
03 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
219
2,404
0
04 Jan 2021
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
192
204
0
23 Jan 2020
1