arXiv:2002.05070
AlignNet: A Unifying Approach to Audio-Visual Alignment
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
12 February 2020
Jianren Wang
Zhaoyuan Fang
Hang Zhao
Papers citing "AlignNet: A Unifying Approach to Audio-Visual Alignment"
16 / 16 papers shown
Effectively obtaining acoustic, visual and textual data from videos
Jorge E. León
Miguel Carrasco
VGen
178
2
0
06 Sep 2025
ESG-Net: Event-Aware Semantic Guided Network for Dense Audio-Visual Event Localization
Huilai Li
Yonghao Dang
Ying Xing
Yiming Wang
Jianqin Yin
232
0
0
14 Jul 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
Weiming Dong
Changsheng Xu
437
2
0
17 Apr 2025
Enhancing Explainability with Multimodal Context Representations for Smarter Robots
Anargh Viswanath
Lokesh Veeramacheneni
Hendrik Buschmeier
193
1
0
28 Feb 2025
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
267
4
0
29 Oct 2024
PEAVS: Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers' Opinion Scores
European Conference on Computer Vision (ECCV), 2024
Lucas Goncalves
Prashant Mathur
Chandrashekhar Lavania
Metehan Cekic
Marcello Federico
Kyu J. Han
223
9
0
10 Apr 2024
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
IEEE Transactions on Multimedia (IEEE TMM), 2023
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
495
13
0
10 Oct 2023
Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment
IEEE International Conference on Computer Vision (ICCV), 2023
Sarah Ibrahimi
Xiaohang Sun
Pichao Wang
Amanmeet Garg
Ashutosh Sanan
Mohamed Omar
341
37
0
24 Jul 2023
Video-to-Music Recommendation using Temporal Alignment of Segments
IEEE Transactions on Multimedia (IEEE TMM), 2023
Laure Prétet
G. Richard
Clement Souchier
Geoffroy Peeters
AI4TS
213
20
0
12 Jun 2023
Long-Term Rhythmic Video Soundtracker
International Conference on Machine Learning (ICML), 2023
Jiashuo Yu
Yaohui Wang
Xinyuan Chen
Xiao Sun
Yu Qiao
DiffM
393
21
0
02 May 2023
MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Mu Yuan
Lan Zhang
Zimu Zheng
Yi-Nan Zhang
Xiang-Yang Li
383
3
0
28 Sep 2022
Learning Music-Dance Representations through Explicit-Implicit Rhythm Synchronization
IEEE Transactions on Multimedia (IEEE TMM), 2022
Jiashuo Yu
Junfu Pu
Ying Cheng
Rui Feng
Ying Shan
316
7
0
07 Jul 2022
Audio-Visual Fusion Layers for Event Type Aware Video Recognition
Arda Senocak
Junsik Kim
Tae-Hyun Oh
H. Ryu
Dingzeyu Li
In So Kweon
189
1
0
12 Feb 2022
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman
Mengyu Yang
Leonid Sigal
ViT
165
8
0
26 Oct 2021
Visual Speech Enhancement Without A Real Visual Stream
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Sindhu B. Hegde
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
DiffM
202
21
0
20 Dec 2020
Motion Prediction in Visual Object Tracking
Jianren Wang
Yihui He
175
8
0
01 Jul 2020