Exploring the Potential of SSL Models for Sound Event Detection

17 May 2025
Hanfang Cui
Longfei Song
Li Li
Dongxing Xu
Yanhua Long
arXiv: 2505.11889

Papers citing "Exploring the Potential of SSL Models for Sound Event Detection"

20 / 20 papers shown
MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection
Pengfei Cai
Yan Song
Kang Li
Haoyu Song
Ian Mcloughlin
76
6
0
16 Aug 2024
Scaling up masked audio encoder learning for general audio classification
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
96
7
0
11 Jun 2024
Sound Event Bounding Boxes
Janek Ebbers
François Germain
Gordon Wichern
Jonathan Le Roux
76
13
0
06 Jun 2024
Fine-tune the pretrained ATST model for sound event detection
Nian Shao
Xian Li
Xiaofei Li
70
27
0
15 Sep 2023
Semi-supervsied Learning-based Sound Event Detection using Freuqency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4
Ji Woon Kim
Sangho Son
Yoon-Gue Song
Hyeongju Kim
Ilhyeon Song
Jeong Eun Lim
42
21
0
10 Jun 2023
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
Xian Li
Nian Shao
Xiaofei Li
ViT, CLIP
103
28
0
07 Jun 2023
Description and Discussion on DCASE 2023 Challenge Task 2: First-Shot Unsupervised Anomalous Sound Detection for Machine Condition Monitoring
Kota Dohi
Keisuke Imoto
Noboru Harada
Daisuke Niizumi
Yuma Koizumi
Tomoya Nishida
Harsh Purohit
Ryo Tanabe
Takashi Endo
Yohei Kawaguchi
63
42
0
13 May 2023
BEATs: Audio Pre-Training with Acoustic Tokenizers
Sanyuan Chen
Yu-Huan Wu
Chengyi Wang
Shujie Liu
Daniel C. Tompkins
Zhuo Chen
Furu Wei
124
299
0
18 Dec 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
135
108
0
06 Mar 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT, TPM
507
7,865
0
11 Nov 2021
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
294
1,911
0
26 Oct 2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu
Benjamin Bolte
Yao-Hung Hubert Tsai
Kushal Lakhotia
Ruslan Salakhutdinov
Abdel-rahman Mohamed
SSL
190
3,013
0
14 Jun 2021
AST: Audio Spectrogram Transformer
Yuan Gong
Yu-An Chung
James R. Glass
ViT
200
887
0
05 Apr 2021
Joint framework with deep feature distillation and adaptive focal loss for weakly supervised audio tagging and acoustic event detection
Yunhao Liang
Yanhua Long
Yijie Li
Jiaen Liang
Yuping Wang
32
9
0
23 Mar 2021
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Alexei Baevski
Henry Zhou
Abdel-rahman Mohamed
Michael Auli
SSL
323
5,868
0
20 Jun 2020
PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
Qiuqiang Kong
Yin Cao
Turab Iqbal
Yuxuan Wang
Wenwu Wang
Mark D. Plumbley
VLM, SSL
246
1,090
0
21 Dec 2019
A Framework for the Robust Evaluation of Sound Event Detection
Cagdas Bilen
Giacomo Ferroni
Francesco Tuveri
Juan Azcarreta
Sacha Krstulović
95
165
0
18 Oct 2019
Representation Learning with Contrastive Predictive Coding
Aaron van den Oord
Yazhe Li
Oriol Vinyals
DRL, SSL
360
10,385
0
10 Jul 2018
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
939
133,201
0
12 Jun 2017
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
Baoguang Shi
X. Bai
Cong Yao
VLM
253
2,499
0
21 Jul 2015