L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
21 February 2022
E. Guizzo
Christian Marinoni
Marco Pennese
Xinlei Ren
Xiguang Zheng
Chen Zhang
Bruno Masiero
A. Uncini
Danilo Comminiello

Papers citing "L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment"

23 / 23 papers shown
HDA-SELD: Hierarchical Cross-Modal Distillation with Multi-Level Data Augmentation for Low-Resource Audio-Visual Sound Event Localization and Detection
Qing Wang
Ya Jiang
Hang Chen
Sabato Marco Siniscalchi
Jun Du
J. Gao
VLM
138
0
0
17 Aug 2025
SoundLoc3D: Invisible 3D Sound Source Localization and Classification Using a Multimodal RGB-D Acoustic Camera
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Yuhang He
Sangyun Shin
Anoop Cherian
Niki Trigoni
Andrew Markham
392
0
0
31 Dec 2024
PSELDNets: Pre-trained Neural Networks on a Large-scale Synthetic Dataset for Sound Event Localization and Detection
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2024
Jinbo Hu
Yin Cao
Ming Wu
Fang Kang
Feiran Yang
Wenwu Wang
Mark D. Plumbley
J. Yang
281
5
0
10 Nov 2024
Can Large Language Models Understand Spatial Audio?
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
...
Jun Zhang
Lu Lu
Zejun Ma
Yuxuan Wang
Chao Zhang
293
18
0
12 Jun 2024
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement
Interspeech, 2024
Wangyou Zhang
Robin Scheibler
Kohei Saijo
Samuele Cornell
Chenda Li
...
Jan Pirklbauer
Marvin Sach
Shinji Watanabe
Tim Fingscheidt
Yanmin Qian
VLM
207
45
0
07 Jun 2024
Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
Arthur N. dos Santos
Bruno S. Masiero
Túlio C. L. Mateus
139
0
0
22 Apr 2024
Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality
Christian Marinoni
R. F. Gramaccioni
Changan Chen
A. Uncini
Danilo Comminiello
136
7
0
14 Feb 2024
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Zhisheng Zheng
Puyuan Peng
Ziyang Ma
Xie Chen
Eunsol Choi
David Harwath
LRM
341
38
0
02 Feb 2024
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Davide Berghi
Peipei Wu
Jinzheng Zhao
Wenwu Wang
Philip J. B. Jackson
238
23
0
14 Dec 2023
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
IEEE Access, 2023
Orlem Lima dos Santos
Karen Rosero
R. Lotufo
SSL
126
10
0
12 Dec 2023
D4AM: A General Denoising Framework for Downstream Acoustic Models
International Conference on Learning Representations (ICLR), 2023
H. Wang
Yu Tsao
Hsin-Min Wang
Chu-Song Chen
138
5
0
28 Nov 2023
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch
Automatic Speech Recognition & Understanding (ASRU), 2023
Jeff Hwang
Moto Hira
Caroline Chen
Xiaohui Zhang
Zhaoheng Ni
...
Yumeng Tao
Robin Scheibler
Samuele Cornell
Sean Kim
Stavros Petridis
208
33
0
27 Oct 2023
Dual input neural networks for positional sound source localization
EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process), 2023
Eric Grinstein
Vincent W. Neo
Patrick A. Naylor
127
6
0
08 Aug 2023
A General Unfolding Speech Enhancement Method Motivated by Taylor's Theorem
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Andong Li
Guochen Yu
C. Zheng
Wenzhe Liu
Xiaodong Li
248
21
0
30 Nov 2022
TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Zhongqiu Wang
Samuele Cornell
Shukjae Choi
Younglo Lee
Byeonghak Kim
Shinji Watanabe
246
202
0
22 Nov 2022
TaylorBeamixer: Learning Taylor-Inspired All-Neural Multi-Channel Speech Enhancement from Beam-Space Dictionary Perspective
Interspeech, 2022
Andong Li
Guochen Yu
Wenzhe Liu
Xiaodong Li
C. Zheng
174
2
0
22 Nov 2022
Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement
Spoken Language Technology Workshop (SLT), 2022
Shubo Lv
Yihui Fu
Yukai Jv
Linfu Xie
Weixin Zhu
Wei Rao
Yannan Wang
148
11
0
17 Oct 2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding
Interspeech, 2022
Yen-Ju Lu
Xuankai Chang
Chenda Li
Wangyou Zhang
Samuele Cornell
...
Robin Scheibler
Zhong-Qiu Wang
Yu Tsao
Y. Qian
Shinji Watanabe
VLM
195
35
0
19 Jul 2022
Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation
Pattern Recognition Letters (PR), 2022
Eleonora Grassucci
Gioia Mancini
Christian Brignone
A. Uncini
Danilo Comminiello
152
18
0
04 Apr 2022
A Track-Wise Ensemble Event Independent Network for Polyphonic Sound Event Localization and Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jinbo Hu
Yin Cao
Ming Wu
Qiuqiang Kong
Feiran Yang
Mark D. Plumbley
J. Yang
172
29
0
19 Mar 2022
Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Yen-Ju Lu
Samuele Cornell
Xuankai Chang
Wangyou Zhang
Chenda Li
Zhaoheng Ni
Zhong-Qiu Wang
Shinji Watanabe
127
32
0
24 Feb 2022
The PCG-AIID System for L3DAS22 Challenge: MIMO and MISO convolutional recurrent Network for Multi Channel Speech Enhancement and Speech Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Jingdong Li
Yuanyuan Zhu
Dawei Luo
Yun Liu
Guohui Cui
Zhaoxia Li
183
15
0
21 Feb 2022
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2021
Qing Wang
Jun Du
Hua-Xin Wu
Jia Pan
Feng Ma
Chin-Hui Lee
138
115
0
08 Jan 2021