Demographic Attributes Prediction from Speech Using WavLM Embeddings. Annual Conference on Information Sciences and Systems (CISS), 2025
SCDiar: a streaming diarization system based on speaker change detection and speech recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
USED: Universal Speaker Extraction and Diarization. IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction. IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs. International Society for Music Information Retrieval Conference (ISMIR), 2024
Leveraging Self-Supervised Learning for Speaker Diarization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Unified Audio Event Detection. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Focus Agent: LLM-Powered Virtual Focus Group. International Conference on Intelligent Virtual Agents (IVA), 2024
Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR. Weiqing Wang, Kunal Dhawan, Taejin Park, Krishna Puvvada, Ivan Medennikov, Somshubra Majumdar, He Huang, Jagadeesh Balam, Boris Ginsburg. Spoken Language Technology Workshop (SLT), 2024
The VoxCeleb Speaker Recognition Challenge: A Retrospective. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings. IEEE International Joint Conference on Neural Networks (IJCNN), 2024
EEND-M2F: Masked-attention mask transformers for speaker diarization. Interspeech, 2024
Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Powerset multi-class cross entropy loss for neural speaker diarization. Interspeech, 2023
SA-Paraformer: Non-autoregressive End-to-End Speaker-Attributed ASR. Automatic Speech Recognition & Understanding (ASRU), 2023
Discriminative Training of VBx Diarization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Profile-Error-Tolerant Target-Speaker Voice Activity Detection. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
DiaCorrect: Error Correction Back-end For Speaker Diarization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Attention-based Encoder-Decoder End-to-End Neural Diarization with Embedding Enhancer. IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2023