SoundNet: Learning Sound Representations from Unlabeled Video

27 October 2016

Y. Aytar

Carl Vondrick

Antonio Torralba

SSL

Papers citing "SoundNet: Learning Sound Representations from Unlabeled Video"

20 / 120 papers shown

| Title | Authors | Tags | Counts | Date |
|---|---|---|---|---|
| A Simple Baseline for Audio-Visual Scene-Aware Dialog | Idan Schwartz, A. Schwing, Tamir Hazan | | 19 69 0 | 11 Apr 2019 |
| DistInit: Learning Video Representations Without a Single Labeled Video | Rohit Girdhar, Du Tran, Lorenzo Torresani, Deva Ramanan | | 19 54 0 | 26 Jan 2019 |
| Deep Learning for Human Affect Recognition: Insights and New Developments | Philipp V. Rouast, M. Adam, R. Chiong | | 24 167 0 | 09 Jan 2019 |
| Emotion Recognition in Speech using Cross-Modal Transfer in the Wild | Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman | CVBM | 19 270 0 | 16 Aug 2018 |
| End-to-End Audio Visual Scene-Aware Dialog using Multimodal Attention-Based Video Features | Chiori Hori, Huda AlAmri, Jue Wang, G. Wichern, Takaaki Hori, ..., Raphael Gontijo-Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh | VGen | 16 125 0 | 21 Jun 2018 |
| Weakly-supervised Visual Instrument-playing Action Detection in Videos | Jen-Yu Liu, Yi-Hsuan Yang, Shyh-Kang Jeng | | 19 13 0 | 05 May 2018 |
| Multimodal Emotion Recognition for One-Minute-Gradual Emotion Challenge | Ziqi Zheng, Chenjie Cao, Xingwei Chen, Guoqiang Xu | | 24 19 0 | 03 May 2018 |
| Learnable PINs: Cross-Modal Embeddings for Person Identity | Arsha Nagrani, Samuel Albanie, Andrew Zisserman | SSL | 13 140 0 | 02 May 2018 |
| A Bimodal Learning Approach to Assist Multi-sensory Effects Synchronization | R. Abreu, J. Santos, Eduardo Bezerra | | 13 8 0 | 28 Apr 2018 |
| The Sound of Pixels | Hang Zhao, Chuang Gan, Andrew Rouditchenko, Carl Vondrick, Josh H. McDermott, Antonio Torralba | VLM | 22 527 0 | 09 Apr 2018 |
| Audio-Visual Event Localization in Unconstrained Videos | Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, Chenliang Xu | | 14 422 0 | 23 Mar 2018 |
| Moments in Time Dataset: one million videos for event understanding | Mathew Monfort, A. Andonian, Bolei Zhou, K. Ramakrishnan, Sarah Adel Bargal, ..., L. Brown, Quanfu Fan, Dan Gutfreund, Carl Vondrick, A. Oliva | | 22 538 0 | 09 Jan 2018 |
| Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning | Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, Antonio Torralba | SSL | 22 177 0 | 20 Dec 2017 |
| Objects that Sound | Relja Arandjelović, Andrew Zisserman | ObjD, VOS | 19 528 0 | 18 Dec 2017 |
| Semantic speech retrieval with a visually grounded model of untranscribed speech | Herman Kamper, Gregory Shakhnarovich, Karen Livescu | | 13 53 0 | 05 Oct 2017 |
| Audio Super Resolution using Neural Networks | Volodymyr Kuleshov, S. Enam, Stefano Ermon | SupR | 16 126 0 | 02 Aug 2017 |
| Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks | M. Huzaifah | AI4TS | 20 148 0 | 22 Jun 2017 |
| Multimodal Machine Learning: A Survey and Taxonomy | T. Baltrušaitis, Chaitanya Ahuja, Louis-Philippe Morency | | 13 2,855 0 | 26 May 2017 |
| Generating Videos with Scene Dynamics | Carl Vondrick, Hamed Pirsiavash, Antonio Torralba | GAN, VGen | 66 1,460 0 | 08 Sep 2016 |
| Acoustic Scene Classification | D. Barchiesi, D. Giannoulis, D. Stowell, Mark D. Plumbley | | 98 405 0 | 13 Nov 2014 |