1910.10997
Cited By
Vision-Infused Deep Audio Inpainting
24 October 2019
Hang Zhou
Ziwei Liu
Lingfeng Guo
Ping Luo
Dahua Lin
Papers citing "Vision-Infused Deep Audio Inpainting"
24 / 24 papers shown
Title
SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding
Mingfei Chen
I. D. Gebru
Ishwarya Ananthabhotla
Christian Richardt
Dejan Marković
Jake Sandakly
Steven Krenn
Todd Keebler
Eli Shlizerman
Alexander Richard
21
0
0
08 Apr 2025
MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing
Yu-Fen Huang
Nikki Moran
Simon Coleman
Jon Kelly
Shun-Hwa Wei
...
Chih-Hsuan Li
Da-Yu Huang
Hsuan-Kai Kao
Ting-Wei Lin
Li Su
24
1
0
10 Jun 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
26
2
0
09 Apr 2024
Enhancing Gappy Speech Audio Signals with Generative Adversarial Networks
Deniss Strods
A. Smeaton
6
1
0
09 May 2023
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
30
25
0
20 Jul 2022
Show Me Your Face, And I'll Tell You How You Speak
Christen Millerdurai
L. A. Khaliq
Timon Ulrich
CVBM
60
0
0
28 Jun 2022
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Changan Chen
Carl Schissler
Sanchit Garg
Philip Kobernik
Alexander William Clegg
P. Calamia
Dhruv Batra
Philip Robinson
Kristen Grauman
3DGS
31
79
0
16 Jun 2022
Learning Speaker-specific Lip-to-Speech Generation
Munender Varshney
Ravindra Yadav
Vinay P. Namboodiri
R. Hegde
16
7
0
04 Jun 2022
INTERSPEECH 2022 Audio Deep Packet Loss Concealment Challenge
Lorenz Diener
Sten Sootla
Solomiya Branets
Ando Saabas
R. Aichner
Ross Cutler
18
40
0
11 Apr 2022
tPLCnet: Real-time Deep Packet Loss Concealment in the Time Domain Using a Short Temporal Context
Nils L. Westhausen
B. Meyer
16
7
0
04 Apr 2022
Learning to Answer Questions in Dynamic Audio-Visual Scenarios
Guangyao Li
Yake Wei
Yapeng Tian
Chenliang Xu
Ji-Rong Wen
Di Hu
29
136
0
26 Mar 2022
A Neural Vocoder Based Packet Loss Concealment Algorithm
Yaofeng Zhou
C. Bao
18
2
0
26 Mar 2022
Visual Acoustic Matching
Changan Chen
Ruohan Gao
P. Calamia
Kristen Grauman
16
55
0
14 Feb 2022
Visual Sound Localization in the Wild by Cross-Modal Interference Erasing
Xian Liu
Rui Qian
Hang Zhou
Di Hu
Weiyao Lin
Ziwei Liu
Bolei Zhou
Xiaowei Zhou
6
25
0
13 Feb 2022
Sound and Visual Representation Learning with Multiple Pretraining Tasks
A. Vasudevan
Dengxin Dai
Luc Van Gool
SSL
31
6
0
04 Jan 2022
Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
Rishabh Garg
Ruohan Gao
Kristen Grauman
15
27
0
21 Nov 2021
FoleyGAN: Visually Guided Generative Adversarial Network-Based Synchronous Sound Generation in Silent Videos
Sanchita Ghose
John J. Prevost
GAN
16
26
0
20 Jul 2021
Exploiting Explanations for Model Inversion Attacks
Xu Zhao
Wencan Zhang
Xiao Xiao
Brian Y. Lim
MIACV
21
82
0
26 Apr 2021
Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation
Hang Zhou
Yasheng Sun
Wayne Wu
Chen Change Loy
Xiaogang Wang
Ziwei Liu
CVBM
26
360
0
22 Apr 2021
Visually Informed Binaural Audio Generation without Binaural Audios
Xudong Xu
Hang Zhou
Ziwei Liu
Bo Dai
Xiaogang Wang
Dahua Lin
DiffM
13
53
0
13 Apr 2021
Online Multi-modal Person Search in Videos
J. Xia
Anyi Rao
Qingqiu Huang
Linning Xu
Jiangtao Wen
Dahua Lin
23
28
0
08 Aug 2020
Rotate-and-Render: Unsupervised Photorealistic Face Rotation from Single-View Images
Hang Zhou
Jihao Liu
Ziwei Liu
Yu Liu
Xiaogang Wang
CVBM
3DH
13
103
0
18 Mar 2020
Listen to Look: Action Recognition by Previewing Audio
Ruohan Gao
Tae-Hyun Oh
Kristen Grauman
Lorenzo Torresani
VLM
27
251
0
10 Dec 2019
Lip Reading Sentences in the Wild
Joon Son Chung
A. Senior
Oriol Vinyals
Andrew Zisserman
162
784
0
16 Nov 2016