Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis

3 November 2020

Papers citing "Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis"

FlexIO: Flexible Single- and Multi-Channel Speech Separation and Enhancement

194

24 Oct 2025

Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams

142

04 Oct 2025

Data-independent Beamforming for End-to-end Multichannel Multi-speaker ASR

190

12 Sep 2025

Error Analysis in a Modular Meeting Transcription System

225

12 Sep 2025

Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder

156

28 Aug 2025

Advances in Speech Separation: Techniques, Challenges, and Future Trends

...

234

14 Aug 2025

Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition

Yufeng Yang

H. Taherian

Vahid Ahmadi Kalkhorani

DeLiang Wang

233

23 Mar 2025

Target Speaker ASR with Whisper

640

17 Jan 2025

DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition

312

03 Jan 2025

MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Thai-Binh Nguyen

Alexander Waibel

303

27 Nov 2024

USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024

Bang Zeng

Ming Li

491

04 Sep 2024

Enhanced Reverberation as Supervision for Unsupervised Speech Separation

Interspeech, 2024

Kohei Saijo

Gordon Wichern

François G. Germain

Zexu Pan

Jonathan Le Roux

255

06 Aug 2024

RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues

ACM Multimedia (MM), 2024

Gangshan Wu

329

27 Jul 2024

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR

Linhao Dong

242

04 Mar 2024

On Speaker Attribution with SURT

The Speaker and Language Recognition Workshop (Odyssey), 2024

Desh Raj

Sanjeev Khudanpur

Matthew Maciejewski

Leibny Paola García-Perera

Daniel Povey

290

28 Jan 2024

NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

Interspeech, 2024

...

274

16 Jan 2024

EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings

287

11 Dec 2023

Multi-channel Conversational Speaker Separation via Neural Diarization

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

H. Taherian

DeLiang Wang

266

15 Nov 2023

Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction

IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2023

Kay Chen Tan

435

11 Oct 2023

One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Shinji Watanabe

298

02 Oct 2023

Meeting Recognition with Continuous Speech Separation and Transcription-Supported Diarization

334

28 Sep 2023

t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Zhuo Chen

211

15 Sep 2023

Combining TF-GridNet and Mixture Encoder for Continuous Speech Separation for Meeting Transcription

Spoken Language Technology Workshop (SLT), 2023

316

15 Sep 2023

Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Zhengyang Chen

Bing Han

Shuai Wang

Yan-min Qian

281

13 Sep 2023

LibriWASN: A Data Set for Meeting Separation, Diarization, and Recognition with Asynchronous Recording Devices

Joerg Schmalenstroeer

Tobias Gburrek

Reinhold Haeb-Umbach

186

21 Aug 2023

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023

Wangyou Zhang

259

23 Jul 2023

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

...

Sanjeev Khudanpur

285

23 Jun 2023

SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Desh Raj

Daniel Povey

Sanjeev Khudanpur

399

18 Jun 2023

A Teacher-Student approach for extracting informative speaker embeddings from speech mixtures

Interspeech, 2023

374

01 Jun 2023

On Data Sampling Strategies for Training Neural Network Speech Separation Models

European Signal Processing Conference (EUSIPCO), 2023

William Ravenscroft

Stefan Goetze

Thomas Hain

227

14 Apr 2023

End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations

Speech Communication (Speech Commun.), 2023

371

21 Mar 2023

TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Christoph Boeddeker

Aswin Shanmugam Subramanian

Gordon Wichern

Reinhold Haeb-Umbach

Jonathan Le Roux

360

07 Mar 2023

Multi-resolution location-based training for multi-channel continuous speech separation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

H. Taherian

DeLiang Wang

235

16 Jan 2023

GPU-accelerated Guided Source Separation for Meeting Transcription

Interspeech, 2022

Desh Raj

Daniel Povey

Sanjeev Khudanpur

402

10 Dec 2022

On Word Error Rate Definitions and their Efficient Computation for Multi-Speaker Speech Recognition Systems

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

239

29 Nov 2022

Reverberation as Supervision for Speech Separation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

R. Aralikatti

Christoph Boeddeker

Gordon Wichern

Aswin Shanmugam Subramanian

Jonathan Le Roux

224

15 Nov 2022

Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts

248

11 Nov 2022

Simulating realistic speech overlaps improves multi-talker ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

361

27 Oct 2022

CasNet: Investigating Channel Robustness for Speech Separation

Fan Wang

Yao-Fei Cheng

Hung-Shin Lee

Yu Tsao

Hsin-Min Wang

152

27 Oct 2022

Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting

Interspeech, 2022

202

24 Sep 2022

VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

374

12 Sep 2022

Analysis of impact of emotions on target speech extraction and speech separation

International Workshop on Acoustic Signal Enhancement (IWAENC), 2022

200

15 Aug 2022

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

Interspeech, 2022

Xianrui Zheng

Chuxu Zhang

P. Woodland

197

08 Jul 2022

A Meeting Transcription System for an Ad-Hoc Acoustic Sensor Network

Joerg Schmalenstroeer

Reinhold Haeb-Umbach

153

02 May 2022

Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation

Interspeech, 2022

224

07 Apr 2022

An Initialization Scheme for Meeting Separation with Spatial Mixture Models

Interspeech, 2022

249

04 Apr 2022

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Spoken Language Technology Workshop (SLT), 2022

326

31 Mar 2022

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

Interspeech, 2022

282

30 Mar 2022

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

Interspeech, 2022

Fan Wang

Hung-Shin Lee

Yu Tsao

Hsin-Min Wang

294

30 Mar 2022

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

...

Kong Aik Lee

Zhijie Yan

B. Ma

Xin Xu

Hui Bu

240

08 Feb 2022