Learning to Separate Object Sounds by Watching Unlabeled Video

5 April 2018

Papers citing "Learning to Separate Object Sounds by Watching Unlabeled Video"

28 / 78 papers shown

Telling Left from Right: Learning Spatial Correspondence of Sight and Sound Karren D. Yang Bryan C. Russell Justin Salamon SSL 24 75 0 11 Jun 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network Lingyu Zhu Esa Rahtu 22 23 0 04 Jun 2020
VisualEchoes: Spatial Image Representation Learning through Echolocation Ruohan Gao Changan Chen Ziad Al-Halah Carl Schissler Kristen Grauman MDE SSL 171 84 0 04 May 2020
Conditioned Source Separation for Music Instrument Performances Olga Slizovskaia G. Haro E. Gómez 30 38 0 08 Apr 2020
The State of Lifelong Learning in Service Robots: Current Bottlenecks in Object Perception and Manipulation S. Kasaei J. Melsen Floris van Beers Christiaan Steenkist K. Vončina 29 12 0 18 Mar 2020
Audiovisual SlowFast Networks for Video Recognition Fanyi Xiao Yong Jae Lee Kristen Grauman Jitendra Malik Christoph Feichtenhofer 197 207 0 23 Jan 2020
Deep Audio-Visual Learning: A Survey Hao Zhu Mandi Luo Rui Wang A. Zheng Ran He 31 156 0 14 Jan 2020
Look, Listen, and Act: Towards Audio-Visual Embodied Navigation Chuang Gan Yiwei Zhang Jiajun Wu Boqing Gong J. Tenenbaum 24 137 0 25 Dec 2019
Listen to Look: Action Recognition by Previewing Audio Ruohan Gao Tae-Hyun Oh Kristen Grauman Lorenzo Torresani VLM 29 251 0 10 Dec 2019
ClusterFit: Improving Generalization of Visual Representations Xueting Yan Ishan Misra Abhinav Gupta Deepti Ghadiyaram D. Mahajan SSL VLM 27 132 0 06 Dec 2019
Learning to Localize Sound Sources in Visual Scenes: Analysis and Applications Arda Senocak Tae-Hyun Oh Junsik Kim Ming-Hsuan Yang In So Kweon SSL 33 52 0 20 Nov 2019
Vision-Infused Deep Audio Inpainting Hang Zhou Ziwei Liu Lingfeng Guo Ping Luo Dahua Lin 35 88 0 24 Oct 2019
Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zeroshot Classification and Retrieval of Videos Kranti K. Parida Neeraj Matiyali T. Guha Gaurav Sharma VLM 35 41 0 19 Oct 2019
Learning to Have an Ear for Face Super-Resolution Givi Meishvili Simon Jenni Paolo Favaro SupR CVBM 33 23 0 27 Sep 2019
Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning Tanzila Rahman Bicheng Xu Leonid Sigal 30 78 0 22 Sep 2019
Recursive Visual Sound Separation Using Minus-Plus Net Xudong Xu Bo Dai Dahua Lin 35 91 0 30 Aug 2019
Self-supervised audio representation learning for mobile devices Marco Tagliasacchi Beat Gfeller Félix de Chaumont Quitry Dominik Roblek SSL AI4TS 6 46 0 24 May 2019
Scaling and Benchmarking Self-Supervised Visual Representation Learning Priya Goyal D. Mahajan Abhinav Gupta Ishan Misra SSL 26 396 0 03 May 2019
Co-Separating Sounds of Visual Objects Ruohan Gao Kristen Grauman 33 206 0 16 Apr 2019
The Sound of Motions Hang Zhao Chuang Gan Wei-Chiu Ma Antonio Torralba 17 251 0 11 Apr 2019
A Simple Baseline for Audio-Visual Scene-Aware Dialog Idan Schwartz Alex Schwing Tamir Hazan 27 69 0 11 Apr 2019
2.5D Visual Sound Ruohan Gao Kristen Grauman VGen 27 130 0 11 Dec 2018
An Attempt towards Interpretable Audio-Visual Video Captioning Yapeng Tian Chenxiao Guan Justin Goodman Marc Moore Chenliang Xu 36 20 0 07 Dec 2018
The Visual Centrifuge: Model-Free Layered Video Representations Jean-Baptiste Alayrac João Carreira Andrew Zisserman 23 48 0 04 Dec 2018
Uncertainty aware audiovisual activity recognition using deep Bayesian variational inference Mahesh Subedar R. Krishnan P. López-Meyer Omesh Tickoo Jonathan Huang BDL EDL UQCV 29 0 0 27 Nov 2018
Identify, locate and separate: Audio-visual object extraction in large video collections using weak supervision Sanjeel Parekh A. Ozerov S. Essid Ngoc Q. K. Duong P. Pérez G. Richard 28 16 0 09 Nov 2018
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features Andrew Owens Alexei A. Efros SSL 51 745 0 10 Apr 2018
Audio-Visual Event Localization in Unconstrained Videos Yapeng Tian Jing Shi Bochen Li Zhiyao Duan Chenliang Xu 53 426 0 23 Mar 2018