CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

20 April 2020

Jon Barker

Sanjeev Khudanpur

Aswin Shanmugam Subramanian

Papers citing "CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings"

50 / 195 papers shown

LibriConvo: Simulating Conversations from Read Literature for ASR and Diarization

Máté Gedeon

Péter Mihajlik

27 Oct 2025

A Cocktail-Party Benchmark: Multi-Modal dataset and Comparative Evaluation Results

115

27 Oct 2025

M3-SLU: Evaluating Speaker-Attributed Reasoning in Multimodal Large Language Models

243

22 Oct 2025

Hallucination Benchmark for Speech Foundation Models

Alkis Koudounas

Moreno La Quatra

Manuel Giollo

Sabato Marco Siniscalchi

Elena Baralis

HILM

319

18 Oct 2025

Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation

161

11 Oct 2025

Target speaker anonymization in multi-speaker recordings

117

10 Oct 2025

LOTUSDIS: A Thai far-field meeting corpus for robust conversational ASR

Pattara Tipaksorn

Sumonmas Thatphithakkul

Vataya Chunwijitra

Kwanchiva Thangthai

23 Sep 2025

AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs

Sathwik Tejaswi Madhusudhan

AuLLM ELM

218

09 Sep 2025

Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder

149

28 Aug 2025

A dataset and model for auditory scene recognition for hearing devices: AHEAD-DS and OpenYAMNet

245

14 Aug 2025

MSU-Bench: Towards Understanding the Conversational Multi-talker Scenarios

227

11 Aug 2025

SPGISpeech 2.0: Transcribed multi-speaker financial audio for speaker-tagged transcription

07 Aug 2025

Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?

199

12 Jul 2025

The Impact of Automatic Speech Transcription on Speaker Attribution

Cristina Aggazzotti

Matthew Wiesner

Elizabeth Allyn Smith

Nicholas Andrews

290

11 Jul 2025

Omni-Router: Sharing Routing Decisions in Sparse Mixture-of-Experts for Speech Recognition

299

08 Jul 2025

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR

243

27 Jun 2025

Exploring Speaker Diarization with Mixture of Experts

206

17 Jun 2025

Speaker-Distinguishable CTC: Learning Speaker Distinction Using CTC for Multi-Talker Speech Recognition

153

09 Jun 2025

Pseudo Labels-based Neural Speech Enhancement for the AVSR Task in the MISP-Meeting Challenge

183

30 May 2025

SuPseudo: A Pseudo-supervised Learning Method for Neural Speech Enhancement in Far-field Speech Recognition

Longjie Luo

Lin Li

Q. Hong

225

30 May 2025

The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition

Sabato Marco Siniscalchi

O. Scharenborg

355

20 May 2025

Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio

Xinlu He

Jacob Whitehill

320

16 May 2025

BERSting at the Screams: A Benchmark for Distanced, Emotional and Shouted Speech Recognition

Computer Speech and Language (CSL), 2025

395

30 Apr 2025

Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition

Yufeng Yang

H. Taherian

Vahid Ahmadi Kalkhorani

DeLiang Wang

228

23 Mar 2025

Adopting Whisper for Confidence Estimation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

304

20 Feb 2025

Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Lian Remme

Kevin Tang

317

18 Feb 2025

On the Robust Approximation of ASR Metrics

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

345

18 Feb 2025

SCDiar: a streaming diarization system based on speaker change detection and speech recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

189

28 Jan 2025

Summary of the NOTSOFAR-1 Challenge: Highlights and Learnings

Computer Speech and Language (CSL), 2025

360

28 Jan 2025

SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

295

14 Jan 2025

Microphone Array Signal Processing and Deep Learning for Speech Enhancement

IEEE Signal Processing Magazine (IEEE Signal Process. Mag.), 2024

277

13 Jan 2025

Guided Speaker Embedding

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

349

03 Jan 2025

DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition

305

03 Jan 2025

MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Thai-Binh Nguyen

Alexander Waibel

291

27 Nov 2024

Joint Beamforming and Speaker-Attributed ASR for Real Distant-Microphone Meeting Transcription

377

29 Oct 2024

STCON System for the CHiME-8 Challenge

...

Dmitriy Miroshnichenko

253

17 Oct 2024

SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios

International Conference on Learning Representations (ICLR), 2024

Kai Li

333

02 Oct 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Marc Delcroix

259

30 Sep 2024

Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Jia Pan

330

25 Sep 2024

META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR

Jinhan Wang

Weiqing Wang

Kunal Dhawan

Taejin Park

Myungjong Kim

Ivan Medennikov

He Huang

Nithin Koluguri

Jagadeesh Balam

Boris Ginsburg

345

18 Sep 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Spoken Language Technology Workshop (SLT), 2024

Chao-Han Huck Yang

Taejin Park

Yuan Gong

Yuanchao Li

Zhehuai Chen

...

Peter Bell

Shinji Watanabe

356

15 Sep 2024

Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems

349

10 Sep 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR

Spoken Language Technology Workshop (SLT), 2024

Weiqing Wang

Kunal Dhawan

Taejin Park

Jagadeesh Balam

Boris Ginsburg

261

02 Sep 2024

LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization

Interspeech, 2024

Zengrui Jin

Mohan Shi

...

Yong Xu

Shi-Xiong Zhang

Daniel Povey

223

01 Sep 2024

Advancing Multi-talker ASR Performance with Large Language Models

Spoken Language Technology Workshop (SLT), 2024

Mohan Shi

Zengrui Jin

Yaoxun Xu

Yong Xu

Shi-Xiong Zhang

Kun Wei

Yiwen Shao

Chunlei Zhang

Dong Yu

240

30 Aug 2024

Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization

Luyao Cheng

Hui Wang

Siqi Zheng

Yafeng Chen

Rongjie Huang

Qinglin Zhang

Qian Chen

Xihao Li

258

22 Aug 2024

Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition

Samuele Cornell

Jordan Darefsky

Zhiyao Duan

Shinji Watanabe

SyDa

288

17 Aug 2024

ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement

Journal of the Acoustical Society of America (JASA), 2024

Zhong-Qiu Wang

270

28 Jul 2024

The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization

Shinji Watanabe

234

23 Jul 2024

Self-Train Before You Transcribe

Robert Flynn

Anton Ragni

297

17 Jun 2024