arXiv:2010.06007

The Cone of Silence: Speech Separation by Localization

12 October 2020
Teerapat Jenrungrot
V. Jayaram
S. M. Seitz
Ira Kemelmacher-Shlizerman
Abstract

Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers. At the core of our method is a deep network, in the waveform domain, which isolates sources within an angular region $\theta \pm w/2$, given an angle of interest $\theta$ and angular window size $w$. By exponentially decreasing $w$, we can perform a binary search to localize and separate all sources in logarithmic time. Our algorithm allows for an arbitrary number of potentially moving speakers at test time, including more speakers than seen during training. Experiments demonstrate state-of-the-art performance for both source separation and source localization, particularly in high levels of background noise.
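
The binary search over angular windows described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function `mock_has_source` is a hypothetical stand-in for the separation network (it simply checks whether a known source angle falls inside $\theta \pm w/2$), and the names `localize`, `start_width`, and `min_width`, as well as the starting width of 90 degrees, are assumptions chosen for illustration. The sketch only shows how halving $w$ and discarding empty windows narrows down the source directions.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def mock_has_source(theta, w, true_angles):
    """Hypothetical stand-in for the network: is any source inside theta +/- w/2?"""
    return any(angular_distance(theta, s) <= w / 2 for s in true_angles)


def localize(true_angles, start_width=90.0, min_width=2.0):
    """Coarse-to-fine search: keep windows that contain a source, then halve w."""
    w = start_width
    # Initial windows tile the full circle, e.g. centers 45, 135, 225, 315.
    candidates = [w / 2 + i * w for i in range(int(360 / w))]
    while True:
        active = [t for t in candidates if mock_has_source(t, w, true_angles)]
        if w <= min_width:
            return active, w
        w /= 2
        # Each surviving window splits into two half-width children.
        candidates = [c for t in active for c in (t - w / 2, t + w / 2)]


sources = [30.0, 200.0]  # assumed ground-truth directions for the demo
angles, final_w = localize(sources)
print(angles, final_w)
```

Because empty windows are dropped at every level, the number of window queries in this sketch grows with the number of sources times the number of halvings, rather than with the number of fine-grained angular bins. A source lying exactly on a window boundary would be reported by both neighbouring windows, so a real system would need some form of duplicate suppression, which this sketch omits.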
