FlowSep: Language-Queried Sound Separation with Rectified Flow MatchingIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 |
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with
Adversarial Conditional Diffusion DistillationInterspeech (Interspeech), 2024 |
URGENT Challenge: Universality, Robustness, and Generalizability For
Speech EnhancementInterspeech (Interspeech), 2024 Wangyou Zhang Robin Scheibler Kohei Saijo Samuele Cornell Chenda Li ...Jan Pirklbauer Marvin Sach Shinji Watanabe Tim Fingscheidt Yanmin Qian |
Beyond Performance Plateaus: A Comprehensive Study on Scalability in
Speech EnhancementInterspeech (Interspeech), 2024 |
Denoising Diffusion Bridge ModelsInternational Conference on Learning Representations (ICLR), 2023 |
Music Source Separation Based on a Lightweight Deep Learning Framework
(DTTNET: DUAL-PATH TFC-TDF UNET)IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 |
SingFake: Singing Voice Deepfake DetectionIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 |
Music Source Separation with Band-Split RoPE TransformerIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 |
The Sound Demixing Challenge 2023 Music Demixing TrackTransactions of the International Society for Music Information Retrieval (TISMIR), 2023 |
Let's Verify Step by StepInternational Conference on Learning Representations (ICLR), 2023 |
Multi-Source Diffusion Models for Simultaneous Music Generation and
SeparationInternational Conference on Learning Representations (ICLR), 2023 |
Diffusion-based Generative Speech Source SeparationIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 |
Scaling Laws for Reward Model OveroptimizationInternational Conference on Machine Learning (ICML), 2022 |
Flow Matching for Generative ModelingInternational Conference on Learning Representations (ICLR), 2022 |
Music Source Separation with Band-split RNNIEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 |
Flow Straight and Fast: Learning to Generate and Transfer Data with
Rectified FlowInternational Conference on Learning Representations (ICLR), 2022 |
Speech Enhancement and Dereverberation with Diffusion-based Generative ModelsIEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022 |
UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022Interspeech (Interspeech), 2022 |
Improving Source Separation by Explicitly Modeling Dependencies Between
SourcesIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022 |
Self-Consistency Improves Chain of Thought Reasoning in Language ModelsInternational Conference on Learning Representations (ICLR), 2022 |
Chain-of-Thought Prompting Elicits Reasoning in Large Language ModelsNeural Information Processing Systems (NeurIPS), 2022 |
DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to
evaluate Noise SuppressorsIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020 |
Attention is All You Need in Speech SeparationIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020 |
WHAMR!: Noisy and Reverberant Single-Channel Speech SeparationIEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019 |
A scalable noisy speech dataset and online subjective test frameworkInterspeech (Interspeech), 2019 |