AlignNet: A Unifying Approach to Audio-Visual Alignment

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
12 February 2020
Jianren Wang
Zhaoyuan Fang
Hang Zhao
arXiv: 2002.05070

Papers citing "AlignNet: A Unifying Approach to Audio-Visual Alignment"

16 / 16 papers shown
Effectively obtaining acoustic, visual and textual data from videos
Jorge E. León
Miguel Carrasco
VGen
178
2
0
06 Sep 2025
ESG-Net: Event-Aware Semantic Guided Network for Dense Audio-Visual Event Localization
Huilai Li
Yonghao Dang
Ying Xing
Yiming Wang
Jianqin Yin
232
0
0
14 Jul 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
Weiming Dong
Changsheng Xu
437
2
0
17 Apr 2025
Enhancing Explainability with Multimodal Context Representations for Smarter Robots
Anargh Viswanath
Lokesh Veeramacheneni
Hendrik Buschmeier
193
1
0
28 Feb 2025
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
267
4
0
29 Oct 2024
PEAVS: Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers' Opinion Scores
European Conference on Computer Vision (ECCV), 2024
Lucas Goncalves
Prashant Mathur
Chandrashekhar Lavania
Metehan Cekic
Marcello Federico
Kyu J. Han
223
9
0
10 Apr 2024
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
IEEE Transactions on Multimedia (IEEE TMM), 2023
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
495
13
0
10 Oct 2023
Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment
IEEE International Conference on Computer Vision (ICCV), 2023
Sarah Ibrahimi
Xiaohang Sun
Pichao Wang
Amanmeet Garg
Ashutosh Sanan
Mohamed Omar
341
37
0
24 Jul 2023
Video-to-Music Recommendation using Temporal Alignment of Segments
IEEE Transactions on Multimedia (IEEE TMM), 2023
Laure Prétet
G. Richard
Clement Souchier
Geoffroy Peeters
AI4TS
213
20
0
12 Jun 2023
Long-Term Rhythmic Video Soundtracker
International Conference on Machine Learning (ICML), 2023
Jiashuo Yu
Yaohui Wang
Xinyuan Chen
Xiao Sun
Yu Qiao
DiffM
393
21
0
02 May 2023
MLink: Linking Black-Box Models from Multiple Domains for Collaborative Inference
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Mu Yuan
Lan Zhang
Zimu Zheng
Yi-Nan Zhang
Xiang-Yang Li
383
3
0
28 Sep 2022
Learning Music-Dance Representations through Explicit-Implicit Rhythm Synchronization
IEEE Transactions on Multimedia (IEEE TMM), 2022
Jiashuo Yu
Junfu Pu
Ying Cheng
Rui Feng
Ying Shan
316
7
0
07 Jul 2022
Audio-Visual Fusion Layers for Event Type Aware Video Recognition
Arda Senocak
Junsik Kim
Tae-Hyun Oh
H. Ryu
Dingzeyu Li
In So Kweon
189
1
0
12 Feb 2022
TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation
Tanzila Rahman
Mengyu Yang
Leonid Sigal
ViT
165
8
0
26 Oct 2021
Visual Speech Enhancement Without A Real Visual Stream
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Sindhu B. Hegde
Prajwal K R
Rudrabha Mukhopadhyay
Vinay P. Namboodiri
C. V. Jawahar
DiffM
202
21
0
20 Dec 2020
Motion Prediction in Visual Object Tracking
Jianren Wang
Yihui He
175
8
0
01 Jul 2020