Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds

2 November 2020
Efthymios Tzinis, Scott Wisdom, A. Jansen, Shawn Hershey, Tal Remez, D. Ellis, J. Hershey

Papers citing "Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds"

10 papers shown

Reading to Listen at the Cocktail Party: Multi-Modal Speech Separation
Akam Rahimi, Triantafyllos Afouras, Andrew Zisserman
02 Jan 2025

Sound Source Localization is All about Cross-Modal Alignment
Arda Senocak, H. Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung
19 Sep 2023

Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi M. Kalayeh
12 Apr 2023

Audio Self-supervised Learning: A Survey
Shuo Liu, Adria Mallol-Ragolta, Emilia Parada-Cabeleiro, Kun Qian, Xingshuo Jing, Alexander Kathan, Bin Hu, Bjoern W. Schuller
SSL
02 Mar 2022

Geometry-Aware Multi-Task Learning for Binaural Audio Generation from Video
Rishabh Garg, Ruohan Gao, Kristen Grauman
21 Nov 2021

Cross-Modality Fusion Transformer for Multispectral Object Detection
Q. Fang, D. Han, Zhaokui Wang
ViT
30 Oct 2021

Wav2CLIP: Learning Robust Audio Representations From CLIP
Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, J. P. Bello
CLIP, VLM
21 Oct 2021

Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani, Shan Yang, Anurag Arnab, A. Jansen, Cordelia Schmid, Chen Sun
30 Jun 2021

DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement
Yuma Koizumi, Shigeki Karita, Scott Wisdom, Hakan Erdogan, J. Hershey, Llion Jones, M. Bacchiani
30 Jun 2021

Source separation with weakly labelled data: An approach to computational auditory scene analysis
Qiuqiang Kong, Yuxuan Wang, Xuchen Song, Yin Cao, Wenwu Wang, Mark D. Plumbley
06 Feb 2020