ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2010.05466
  4. Cited By
Discriminative Sounding Objects Localization via Self-supervised
  Audiovisual Matching

Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching

12 October 2020
Di Hu
Rui Qian
Minyue Jiang
Xiao Tan
Shilei Wen
Errui Ding
Weiyao Lin
Dejing Dou
ArXivPDFHTML

Papers citing "Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching"

18 / 18 papers shown
Title
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio
Xavier Juanola
Gloria Haro
Magdalena Fuentes
28
2
0
01 Oct 2024
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Ling Xing
Hongyu Qu
Rui Yan
Xiangbo Shu
Jinhui Tang
45
0
0
12 Sep 2024
CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual
  Question Answering
CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering
Yuanyuan Jiang
Jianqin Yin
38
1
0
13 May 2024
Deep Video Inpainting Guided by Audio-Visual Self-Supervision
Deep Video Inpainting Guided by Audio-Visual Self-Supervision
Kyuyeon Kim
Junsik Jung
Woo Jae Kim
Sung-eui Yoon
SSL
23
1
0
11 Oct 2023
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
Cross-modal Cognitive Consensus guided Audio-Visual Segmentation
Zhaofeng Shi
Qingbo Wu
Fanman Meng
Linfeng Xu
Hongliang Li
VOS
25
3
0
10 Oct 2023
Sound Source Localization is All about Cross-Modal Alignment
Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
19
18
0
19 Sep 2023
RealImpact: A Dataset of Impact Sound Fields for Real Objects
RealImpact: A Dataset of Impact Sound Fields for Real Objects
Samuel Clarke
Ruohan Gao
Mason Wang
M. Rau
Julia Xu
Jui-Hsien Wang
Doug L. James
Jiajun Wu
27
9
0
16 Jun 2023
Video-to-Music Recommendation using Temporal Alignment of Segments
Video-to-Music Recommendation using Temporal Alignment of Segments
Laure Prétet
G. Richard
Clement Souchier
Geoffroy Peeters
AI4TS
19
13
0
12 Jun 2023
Motion and Context-Aware Audio-Visual Conditioned Video Prediction
Motion and Context-Aware Audio-Visual Conditioned Video Prediction
Yating Xu
Conghui Hu
G. Lee
VGen
35
0
0
09 Dec 2022
Audio-Visual Segmentation
Audio-Visual Segmentation
Jinxing Zhou
Jianyuan Wang
J. Zhang
Weixuan Sun
Jing Zhang
Stan Birchfield
Dan Guo
Lingpeng Kong
Meng Wang
Yiran Zhong
VOS
25
110
0
11 Jul 2022
Learning Music-Dance Representations through Explicit-Implicit Rhythm
  Synchronization
Learning Music-Dance Representations through Explicit-Implicit Rhythm Synchronization
Jiashuo Yu
Junfu Pu
Ying Cheng
Rui Feng
Ying Shan
14
5
0
07 Jul 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander William Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
31
79
0
16 Jun 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
135
0
26 Mar 2022
Visual Acoustic Matching
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
11
55
0
14 Feb 2022
Multi-Modal Perception Attention Network with Self-Supervised Learning
  for Audio-Visual Speaker Tracking
Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking
Yidi Li
Hong Liu
Hao Tang
10
20
0
14 Dec 2021
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from
  Video
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
Rishabh Garg
Ruohan Gao
Kristen Grauman
13
27
0
21 Nov 2021
Beyond Image to Depth: Improving Depth Prediction using Echoes
Beyond Image to Depth: Improving Depth Prediction using Echoes
Kranti K. Parida
Siddharth Srivastava
Gaurav Sharma
MDE
26
37
0
15 Mar 2021
There is More than Meets the Eye: Self-Supervised Multi-Object Detection
  and Tracking with Sound by Distilling Multimodal Knowledge
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Francisco Rivera Valverde
Juana Valeria Hurtado
Abhinav Valada
26
72
0
01 Mar 2021
1