ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1910.06379
  4. Cited By
Dual-path RNN: efficient long sequence modeling for time-domain
  single-channel speech separation

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

14 October 2019
Yi Luo
Zhuo Chen
Takuya Yoshioka
    AI4TS
ArXivPDFHTML

Papers citing "Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation"

50 / 107 papers shown
Title
Listen to Extract: Onset-Prompted Target Speaker Extraction
Listen to Extract: Onset-Prompted Target Speaker Extraction
Pengjie Shen
Kangrui Chen
Shulin He
Pengru Chen
Shuqi Yuan
He Kong
Xueliang Zhang
Z. Wang
48
0
0
08 May 2025
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
Kohei Saijo
Tetsuji Ogawa
52
1
0
28 Apr 2025
Speech Enhancement with Overlapped-Frame Information Fusion and Causal Self-Attention
Speech Enhancement with Overlapped-Frame Information Fusion and Causal Self-Attention
Yuewei Zhang
Huanbin Zou
Jie Zhu
41
0
0
21 Jan 2025
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing
David Perera
Victor Letzelter
Théo Mariotte
Adrien Cortés
Mickaël Chen
S. Essid
Ga¨el Richard
74
2
0
20 Jan 2025
USED: Universal Speaker Extraction and Diarization
USED: Universal Speaker Extraction and Diarization
Junyi Ao
Mehmet Sinan Yildirim
Ruijie Tao
Mengyao Ge
Shuai Wang
Yan-min Qian
Haizhou Li
38
5
0
17 Jan 2025
Beyond Speaker Identity: Text Guided Target Speech Extraction
Beyond Speaker Identity: Text Guided Target Speech Extraction
Mingyue Huo
Abhinav Jain
Cong Phuoc Huynh
Fanjie Kong
Pichao Wang
Zhu Liu
Vimal Bhat
51
0
0
17 Jan 2025
Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement
Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement
Longbiao Cheng
Ashutosh Pandey
Buye Xu
T. Delbruck
V. Ithapu
Shih-Chii Liu
37
0
0
04 Nov 2024
Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
Han Yin
Yang Xiao
Jisheng Bai
Rohan Kumar Das
31
0
0
02 Nov 2024
CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning
CleanUMamba: A Compact Mamba Network for Speech Denoising using Channel Pruning
Sjoerd Groot
Qinyu Chen
Jan C. van Gemert
Chang Gao
Mamba
132
0
0
14 Oct 2024
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation
Mohan Xu
Kai Li
Guo Chen
Xiaolin Hu
43
0
0
02 Oct 2024
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li
Wendi Sang
Chang Zeng
Runxuan Yang
Guo Chen
Xiaolin Hu
28
2
0
02 Oct 2024
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables
Artem Dementyev
Chandan K. A. Reddy
Scott Wisdom
Navin Chatlani
J. Hershey
R. Lyon
18
0
0
26 Sep 2024
Exploring Text-Queried Sound Event Detection with Audio Source Separation
Exploring Text-Queried Sound Event Detection with Audio Source Separation
Han Yin
Jisheng Bai
Yang Xiao
Hui Wang
Siqi Zheng
Yafeng Chen
Rohan Kumar Das
Chong Deng
Jianfeng Chen
32
3
0
20 Sep 2024
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
Bang Zeng
Ming Li
31
2
0
04 Sep 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in
  Speech Enhancement
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement
Wangyou Zhang
Kohei Saijo
Jee-weon Jung
Chenda Li
Shinji Watanabe
Yanmin Qian
32
4
0
06 Jun 2024
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by
  Magnitude Conditioning
ConSep: a Noise- and Reverberation-Robust Speech Separation Framework by Magnitude Conditioning
Kuan-Hsun Ho
J. Hung
Berlin Chen
34
0
0
04 Mar 2024
Sound Source Separation Using Latent Variational Block-Wise
  Disentanglement
Sound Source Separation Using Latent Variational Block-Wise Disentanglement
Karim Helwani
M. Togami
Paris Smaragdis
Michael M. Goodwin
BDL
DRL
21
1
0
08 Feb 2024
Proactive Detection of Voice Cloning with Localized Watermarking
Proactive Detection of Voice Cloning with Localized Watermarking
Robin San Roman
Pierre Fernandez
Alexandre Défossez
Teddy Furon
Tuan Tran
Hady ElSahar
49
40
0
30 Jan 2024
On Speaker Attribution with SURT
On Speaker Attribution with SURT
Desh Raj
Matthew Wiesner
Matthew Maciejewski
Leibny Paola García-Perera
Daniel Povey
Sanjeev Khudanpur
29
3
0
28 Jan 2024
Combined Generative and Predictive Modeling for Speech Super-resolution
Combined Generative and Predictive Modeling for Speech Super-resolution
Heming Wang
Eric W. Healy
DeLiang Wang
DiffM
27
0
0
25 Jan 2024
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down
  Fusion
TDFNet: An Efficient Audio-Visual Speech Separation Model with Top-down Fusion
Samuel Pegg
Kai Li
Xiaolin Hu
32
1
0
25 Jan 2024
A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech
  Enhancement
A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech Enhancement
Yuewei Zhang
Huanbin Zou
Jie Zhu
21
3
0
19 Jan 2024
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for
  Enhanced Time-Domain Monaural Speech Separation
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation
Shengkui Zhao
Yukun Ma
Chongjia Ni
Chong Zhang
Hao Wang
Trung Hieu Nguyen
Kun Zhou
J. Yip
Dianwen Ng
Bin Ma
13
21
0
19 Dec 2023
Investigating the Design Space of Diffusion Models for Speech
  Enhancement
Investigating the Design Space of Diffusion Models for Speech Enhancement
Philippe Gonzalez
Zheng-Hua Tan
Jan Østergaard
Jesper Jensen
T. S. Alstrøm
Tobias May
DiffM
27
6
0
07 Dec 2023
FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for
  Distortion-Invariant Robust Speech Recognition
FAT-HuBERT: Front-end Adaptive Training of Hidden-unit BERT for Distortion-Invariant Robust Speech Recognition
Dongning Yang
Wei Wang
Yanmin Qian
13
3
0
29 Nov 2023
On Time Domain Conformer Models for Monaural Speech Separation in Noisy
  Reverberant Acoustic Environments
On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments
William Ravenscroft
Stefan Goetze
Thomas Hain
28
7
0
09 Oct 2023
Speech enhancement with frequency domain auto-regressive modeling
Speech enhancement with frequency domain auto-regressive modeling
Anurenjan Purushothaman
Debottam Dutta
Rohit Kumar
Sriram Ganapathy
17
2
0
24 Sep 2023
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker
  Extraction
AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction
Jiuxin Lin
X. Cai
Heinrich Dinkel
Jun Chen
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Zhiyong Wu
Yujun Wang
Helen M. Meng
22
21
0
25 Jun 2023
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
SURT 2.0: Advances in Transducer-based Multi-talker Speech Recognition
Desh Raj
Daniel Povey
Sanjeev Khudanpur
VLM
26
9
0
18 Jun 2023
Multi-Loss Convolutional Network with Time-Frequency Attention for
  Speech Enhancement
Multi-Loss Convolutional Network with Time-Frequency Attention for Speech Enhancement
Liang Wan
Hongqing Liu
Yi Zhou
Jie Ji
25
2
0
15 Jun 2023
An Efficient Speech Separation Network Based on Recurrent Fusion Dilated
  Convolution and Channel Attention
An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention
Junyu Wang
22
1
0
09 Jun 2023
An Experimental Review of Speaker Diarization methods with application
  to Two-Speaker Conversational Telephone Speech recordings
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
A. Brutti
S. Squartini
39
9
0
29 May 2023
A Neural State-Space Model Approach to Efficient Speech Separation
A Neural State-Space Model Approach to Efficient Speech Separation
Chen Chen
Chao-Han Huck Yang
Kai Li
Yuchen Hu
Pin-Jui Ku
Chng Eng Siong
31
11
0
26 May 2023
Multi-Dimensional and Multi-Scale Modeling for Speech Separation
  Optimized by Discriminative Learning
Multi-Dimensional and Multi-Scale Modeling for Speech Separation Optimized by Discriminative Learning
Zhaoxi Mu
Xinyu Yang
Wenjing Zhu
16
5
0
07 Mar 2023
DFSNet: A Steerable Neural Beamformer Invariant to Microphone Array
  Configuration for Real-Time, Low-Latency Speech Enhancement
DFSNet: A Steerable Neural Beamformer Invariant to Microphone Array Configuration for Real-Time, Low-Latency Speech Enhancement
A. Kovalyov
Kashyap Patel
Issa Panahi
18
3
0
26 Feb 2023
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker
  Embeddings
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings
Kai Liu
Xucheng Wan
Z.C. Du
Huan Zhou
VLM
27
1
0
16 Jan 2023
Towards Unified All-Neural Beamforming for Time and Frequency Domain
  Speech Separation
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation
Rongzhi Gu
Shi-Xiong Zhang
Yuexian Zou
Dong Yu
AI4TS
22
24
0
16 Dec 2022
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech
  Enhancement
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement
Dongheon Lee
Jung-Woo Choi
24
25
0
15 Dec 2022
Multi-Scale Feature Fusion Transformer Network for End-to-End Single
  Channel Speech Separation
Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation
Yinhao Xu
Jian Zhou
L. Tao
H. Kwan
27
0
0
14 Dec 2022
GPU-accelerated Guided Source Separation for Meeting Transcription
GPU-accelerated Guided Source Separation for Meeting Transcription
Desh Raj
Daniel Povey
Sanjeev Khudanpur
20
34
0
10 Dec 2022
Deep neural network techniques for monaural speech enhancement: state of
  the art analysis
Deep neural network techniques for monaural speech enhancement: state of the art analysis
P. Ochieng
28
21
0
01 Dec 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech
  Separation
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
Zhongqiu Wang
Samuele Cornell
Shukjae Choi
Younglo Lee
Byeonghak Kim
Shinji Watanabe
29
119
0
22 Nov 2022
Hybrid Transformers for Music Source Separation
Hybrid Transformers for Music Source Separation
Simon Rouard
Francisco Massa
Alexandre Défossez
16
128
0
15 Nov 2022
Speech separation with large-scale self-supervised learning
Speech separation with large-scale self-supervised learning
Zhuo Chen
Naoyuki Kanda
Jian Wu
Yu-Huan Wu
Xiaofei Wang
Takuya Yoshioka
Jinyu Li
S. Sivasankaran
Sefik Emre Eskimez
17
14
0
09 Nov 2022
Improving performance of real-time full-band blind packet-loss
  concealment with predictive network
Improving performance of real-time full-band blind packet-loss concealment with predictive network
Viet-Anh Nguyen
Anh H. T. Nguyen
Andy W. H. Khong
22
7
0
08 Nov 2022
Speech Enhancement with Perceptually-motivated Optimization and Dual
  Transformations
Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations
Xucheng Wan
Kai Liu
Z.C. Du
Huan Zhou
8
0
0
24 Sep 2022
Inference skipping for more efficient real-time speech enhancement with
  parallel RNNs
Inference skipping for more efficient real-time speech enhancement with parallel RNNs
Xiaohuai Le
Tong Lei
Kai-Jyun Chen
Jing Lu
30
20
0
22 Jul 2022
NESC: Robust Neural End-2-End Speech Coding with GANs
NESC: Robust Neural End-2-End Speech Coding with GANs
N. Pia
Kishan Gupta
Srikanth Korse
M. Multrus
Guillaume Fuchs
33
15
0
07 Jul 2022
Semi-supervised Time Domain Target Speaker Extraction with Attention
Semi-supervised Time Domain Target Speaker Extraction with Attention
Zhepei Wang
Ritwik Giri
Shrikant Venkataramani
Umut Isik
J. Valin
Paris Smaragdis
Mike Goodwin
A. Krishnaswamy
16
7
0
18 Jun 2022
SepIt: Approaching a Single Channel Speech Separation Bound
SepIt: Approaching a Single Channel Speech Separation Bound
Shahar Lutati
Eliya Nachmani
Lior Wolf
VLM
43
27
0
24 May 2022
123
Next