1910.08732
Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos
19 October 2019
Kranti K. Parida
Neeraj Matiyali
T. Guha
Gaurav Sharma
VLM
Papers citing
"Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos"
9 / 9 papers shown
Adapting to the Unknown: Training-Free Audio-Visual Event Perception with Dynamic Thresholds
E. Shaar
Ariel Shaulov
Gal Chechik
Lior Wolf
VLM
41
0
0
17 Mar 2025
Towards Open-Vocabulary Audio-Visual Event Localization
Jinxing Zhou
D. Guo
Ruohao Guo
Yuxin Mao
Jingjing Hu
Yiran Zhong
Xiaojun Chang
M. Wang
VLM
46
4
0
18 Nov 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
26
2
0
09 Apr 2024
Lift-Attend-Splat: Bird's-eye-view camera-lidar fusion using transformers
James Gunn
Zygmunt Lenyk
Anuj Sharma
Andrea Donati
Alexandru Buburuzan
John Redford
Romain Mueller
MDE
35
8
0
22 Dec 2023
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
32
25
0
20 Jul 2022
Learning Speaker-specific Lip-to-Speech Generation
Munender Varshney
Ravindra Yadav
Vinay P. Namboodiri
R. Hegde
16
7
0
04 Jun 2022
Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
31
20
0
15 Nov 2021
CrossATNet - A Novel Cross-Attention Based Framework for Sketch-Based Image Retrieval
Ushasi Chaudhuri
Biplab Banerjee
A. Bhattacharya
Mihai Datcu
23
29
0
20 Apr 2021
Beyond Image to Depth: Improving Depth Prediction using Echoes
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
33
37
0
15 Mar 2021