Recursive Visual Sound Separation Using Minus-Plus Net

30 August 2019
Xudong Xu
Bo Dai
Dahua Lin
arXiv: 1908.11602

Papers citing "Recursive Visual Sound Separation Using Minus-Plus Net"

50 / 61 papers shown
Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi
Triantafyllos Afouras
Andrew Zisserman
40
28
0
02 Jan 2025
Continual Audio-Visual Sound Separation
Weiguo Pian
Yiyang Nan
Shijian Deng
Shentong Mo
Yunhui Guo
Yapeng Tian
VLM
CLL
43
0
0
05 Nov 2024
Aligning Audio-Visual Joint Representations with an Agentic Workflow
Shentong Mo
Yibing Song
23
0
0
30 Oct 2024
RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues
Tianrui Pan
Jie Liu
Bohan Wang
Jie Tang
Gangshan Wu
32
2
0
27 Jul 2024
Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
36
3
0
18 Jul 2024
Semantic Grouping Network for Audio Source Separation
Shentong Mo
Yapeng Tian
34
4
0
04 Jul 2024
MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers
Tanvir Mahmud
Shentong Mo
Yapeng Tian
Diana Marculescu
34
4
0
07 Jun 2024
Robust Active Speaker Detection in Noisy Environments
Siva Sai Nagender Vasireddy
Chenxu Zhang
Xiaohu Guo
Yapeng Tian
32
0
0
27 Mar 2024
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
Shentong Mo
Pedro Morgado
19
13
0
02 Dec 2023
Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables
Bandhav Veluri
Malek Itani
Justin Chan
Takuya Yoshioka
Shyamnath Gollakota
23
15
0
01 Nov 2023
LAVSS: Location-Guided Audio-Visual Spatial Audio Separation
Yuxin Ye
Wenming Yang
Yapeng Tian
26
10
0
31 Oct 2023
Separating Invisible Sounds Toward Universal Audiovisual Scene-Aware Sound Separation
Yiyang Su
A. Vosoughi
Shijian Deng
Yapeng Tian
Chenliang Xu
26
4
0
18 Oct 2023
Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak
H. Ryu
Junsik Kim
Tae-Hyun Oh
Hanspeter Pfister
Joon Son Chung
21
18
0
19 Sep 2023
AdVerb: Visually Guided Audio Dereverberation
Sanjoy Chowdhury
Sreyan Ghosh
Subhrajyoti Dasgupta
Anton Ratnarajah
Utkarsh Tyagi
Dinesh Manocha
22
11
0
23 Aug 2023
Visually-Guided Sound Source Separation with Audio-Visual Predictive Coding
Zengjie Song
Zhaoxiang Zhang
19
1
0
19 Jun 2023
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
Shentong Mo
Pedro Morgado
30
21
0
30 May 2023
Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser
Yun-hsuan Lai
Yen-Chun Chen
Y. Wang
18
10
0
27 May 2023
DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
Fa-Ting Hong
Li Shen
Dan Xu
3DH
CVBM
21
15
0
10 May 2023
Audio-Visual Grouping Network for Sound Localization from Mixtures
Shentong Mo
Yapeng Tian
37
42
0
29 Mar 2023
Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos
Kun Su
Kaizhi Qian
Eli Shlizerman
Antonio Torralba
Chuang Gan
VGen
AI4CE
35
20
0
29 Mar 2023
iQuery: Instruments as Queries for Audio-Visual Sound Separation
Jiaben Chen
Renrui Zhang
Dongze Lian
Jiaqi Yang
Ziyao Zeng
Jianbo Shi
16
26
0
07 Dec 2022
Real-Time Target Sound Extraction
Bandhav Veluri
Justin Chan
Malek Itani
Tuochao Chen
Takuya Yoshioka
Shyamnath Gollakota
36
30
0
04 Nov 2022
AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization
Tanvir Mahmud
Diana Marculescu
CLIP
11
31
0
11 Oct 2022
Learning in Audio-visual Context: A Review, Analysis, and New Perspective
Yake Wei
Di Hu
Yapeng Tian
Xuelong Li
46
55
0
20 Aug 2022
ConceptBeam: Concept Driven Target Speech Extraction
Yasunori Ohishi
Marc Delcroix
Tsubasa Ochiai
S. Araki
Daiki Takeuchi
Daisuke Niizumi
Akisato Kimura
N. Harada
K. Kashino
33
18
0
25 Jul 2022
AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation
Efthymios Tzinis
Scott Wisdom
Tal Remez
J. Hershey
31
29
0
20 Jul 2022
Multi-Modal Multi-Correlation Learning for Audio-Visual Speech Separation
Xiaoyu Wang
Xiangyu Kong
Xiulian Peng
Yan Lu
17
6
0
04 Jul 2022
Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation
Jinxian Liu
Chen Ju
Weidi Xie
Ya-Qin Zhang
23
38
0
26 Jun 2022
Text-Driven Separation of Arbitrary Sounds
Kevin Kilgour
Beat Gfeller
Qingqing Huang
A. Jansen
Scott Wisdom
Marco Tagliasacchi
22
30
0
12 Apr 2022
Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis
Karren D. Yang
Dejan Marković
Steven Krenn
Vasu Agrawal
Alexander Richard
VGen
16
32
0
31 Mar 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
136
0
26 Mar 2022
Audio-Visual Fusion Layers for Event Type Aware Video Recognition
Arda Senocak
Junsik Kim
Tae-Hyun Oh
H. Ryu
Dingzeyu Li
In So Kweon
13
1
0
12 Feb 2022
Active Audio-Visual Separation of Dynamic Sound Sources
Sagnik Majumder
Kristen Grauman
19
21
0
02 Feb 2022
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
Rishabh Garg
Ruohan Gao
Kristen Grauman
15
27
0
21 Nov 2021
Visual Scene Graphs for Audio Source Separation
Moitreya Chatterjee
Jonathan Le Roux
N. Ahuja
A. Cherian
10
35
0
24 Sep 2021
V-SlowFast Network for Efficient Visual Sound Separation
Lingyu Zhu
Esa Rahtu
44
10
0
18 Sep 2021
Move2Hear: Active Audio-Visual Source Separation
Sagnik Majumder
Ziad Al-Halah
Kristen Grauman
11
44
0
15 May 2021
Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation
Yan-Bo Lin
Y. Wang
48
21
0
03 May 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou
Yasheng Sun
Wayne Wu
Chen Change Loy
Xiaogang Wang
Ziwei Liu
CVBM
26
360
0
22 Apr 2021
A cappella: Audio-visual Singing Voice Separation
Juan F. Montesinos
V. S. Kadandale
G. Haro
38
16
0
20 Apr 2021
Detector-Free Weakly Supervised Grounding by Separation
Assaf Arbelle
Sivan Doveh
Amit Alfassy
J. Shtok
Guy Lev
...
Kate Saenko
S. Ullman
Raja Giryes
Rogerio Feris
Leonid Karlinsky
35
23
0
20 Apr 2021
Visually Guided Sound Source Separation and Localization using Self-Supervised Motion Representations
Lingyu Zhu
Esa Rahtu
10
25
0
17 Apr 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
13
53
0
13 Apr 2021
Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation
Yapeng Tian
Di Hu
Chenliang Xu
ObjD
13
86
0
05 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Yapeng Tian
Chenliang Xu
AAML
23
37
0
05 Apr 2021
Unsupervised Sound Localization via Iterative Contrastive Learning
Yan-Bo Lin
Hung-Yu Tseng
Hsin-Ying Lee
Yen-Yu Lin
Ming-Hsuan Yang
SSL
19
34
0
01 Apr 2021
Weakly-supervised Audio-visual Sound Source Detection and Separation
Tanzila Rahman
Leonid Sigal
16
7
0
25 Mar 2021
Music source separation conditioned on 3D point clouds
Francesc Lluís
V. Chatziioannou
A. Hofmann
3DPC
24
5
0
03 Feb 2021
VisualVoice: Audio-Visual Speech Separation with Cross-Modal Consistency
Ruohan Gao
Kristen Grauman
CVBM
185
198
0
08 Jan 2021
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
Efthymios Tzinis
Scott Wisdom
A. Jansen
Shawn Hershey
Tal Remez
D. Ellis
J. Hershey
26
68
0
02 Nov 2020