2404.06095
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
9 April 2024
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
Papers citing
"Masked Modeling Duo: Towards a Universal Audio Pre-training Framework"
11 / 11 papers shown
Title
Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks
Christos Plachouras
Julien Guinot
George Fazekas
Elio Quinton
Emmanouil Benetos
Johan Pauwels
38
1
0
09 May 2025
Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis
Daisuke Niizumi
Daiki Takeuchi
Masahiro Yasuda
Binh Thien Nguyen
Yasunori Ohishi
N. Harada
27
0
0
25 Apr 2025
Myna: Masking-Based Contrastive Learning of Musical Representations
Ori Yonay
Tracy Hammond
Tianbao Yang
AAML
51
0
0
20 Feb 2025
Effective Pre-Training of Audio Transformers for Sound Event Detection
Florian Schmid
T. Morocutti
Francesco Foscarin
Jan Schlüter
Paul Primus
Gerhard Widmer
ViT
23
1
0
14 Sep 2024
SONICS: Synthetic Or Not -- Identifying Counterfeit Songs
Md Awsafur Rahman
Zaber Ibn Abdul Hakim
Najibul Haque Sarker
Bishmoy Paul
S. Fattah
36
6
0
26 Aug 2024
Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters
Junyi Peng
Themos Stafylakis
Rongzhi Gu
Oldřich Plchot
Ladislav Mošner
Lukáš Burget
Jan "Honza" Černocký
29
22
0
28 Oct 2022
Pretraining Respiratory Sound Representations using Metadata and Contrastive Learning
Ilyass Moummad
Nicolas Farrugia
19
17
0
27 Oct 2022
Self-Distillation for Further Pre-training of Transformers
Seanie Lee
Minki Kang
Juho Lee
Sung Ju Hwang
Kenji Kawaguchi
45
8
0
30 Sep 2022
Understanding Collapse in Non-Contrastive Siamese Representation Learning
Alexander C. Li
Alexei A. Efros
Deepak Pathak
SSL
40
33
0
29 Sep 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
114
262
0
02 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021