2204.07402
Cited By
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
15 April 2022
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
Papers citing "BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations"
41 / 41 papers shown
Title
Assessing the Utility of Audio Foundation Models for Heart and Respiratory Sound Analysis
Daisuke Niizumi
Daiki Takeuchi
Masahiro Yasuda
Binh Thien Nguyen
Yasunori Ohishi
N. Harada
27
0
0
25 Apr 2025
Parameter-Efficient Continual Fine-Tuning: A Survey
Eric Nuertey Coleman
Luigi Quarantiello
Ziyue Liu
Qinwen Yang
Samrat Mukherjee
J. Hurtado
Vincenzo Lomonaco
CLL
27
0
0
18 Apr 2025
Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning
Aurian Quélennec
Pierre Chouteau
Geoffroy Peeters
S. Essid
SSL
52
0
0
17 Feb 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Jakob Poncelet
Hugo Van hamme
67
0
0
05 Feb 2025
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection
Pengfei Cai
Yan Song
Nan Jiang
Qing Gu
Ian Mcloughlin
30
2
0
26 Sep 2024
SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model
Carlos Hernandez-Olivan
Marc Delcroix
Tsubasa Ochiai
Daisuke Niizumi
Naohiro Tawara
Tomohiro Nakatani
Shoko Araki
29
2
0
19 Sep 2024
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation
Alain Riou
Stefan Lattner
Gaëtan Hadjeres
Michael Anslow
Geoffroy Peeters
26
2
0
05 Aug 2024
Self-Supervised Embeddings for Detecting Individual Symptoms of Depression
Sri Harsha Dumpala
Katerina Dikaios
Abraham Nunes
Frank Rudzicz
Rudolf Uher
Sageev Oore
SSL
36
1
0
25 Jun 2024
Scaling up masked audio encoder learning for general audio classification
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
Bin Wang
22
2
0
11 Jun 2024
M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
Masahiro Yasuda
Shunsuke Tsubaki
Keisuke Imoto
VLM
31
5
0
04 Jun 2024
Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning
Alain Riou
Stefan Lattner
Gaëtan Hadjeres
Geoffroy Peeters
21
2
0
14 May 2024
Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion
Syed Hammad Ahmed
M. Khan
G. Sukthankar
21
0
0
09 May 2024
Benchmarking Representations for Speech, Music, and Acoustic Events
Moreno La Quatra
Alkis Koudounas
Lorenzo Vaiani
Elena Baralis
Luca Cagliero
Paolo Garza
Sabato Marco Siniscalchi
24
10
0
02 May 2024
Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
MedIm
19
2
0
26 Apr 2024
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
29
10
0
09 Apr 2024
On the Effect of Data-Augmentation on Local Embedding Properties in the Contrastive Learning of Music Audio Representations
Matthew C. McCallum
Matthew E. P. Davies
Florian Henkel
Jaehun Kim
Samuel E. Sandberg
33
6
0
17 Jan 2024
Singer Identity Representation Learning using Self-Supervised Techniques
Bernardo Torres
Stefan Lattner
Gaël Richard
SSL
27
8
0
10 Jan 2024
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild
Zhi-Song Liu
Robin Courant
Vicky Kalogeiton
25
6
0
08 Jan 2024
Self-Supervised Learning for Few-Shot Bird Sound Classification
Ilyass Moummad
Romain Serizel
Nicolas Farrugia
SSL
11
9
0
25 Dec 2023
On the choice of the optimal temporal support for audio classification with Pre-trained embeddings
Aurian Quélennec
Michel Olvera
Geoffroy Peeters
S. Essid
17
2
0
21 Dec 2023
Self-Supervised Learning for Anomalous Sound Detection
Kevin Wilkinghoff
29
11
0
15 Dec 2023
Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer
Bing Yang
Xiaofei Li
SSL
17
3
0
01 Dec 2023
Semi-supervised Sound Event Detection with Local and Global Consistency Regularization
Yiming Li
Xiangdong Wang
Hong Liu
Rui Tao
Long Yan
Kazushige Ouchi
13
3
0
15 Sep 2023
PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective
Alain Riou
Stefan Lattner
Gaëtan Hadjeres
Geoffroy Peeters
11
12
0
05 Sep 2023
Pretraining Representations for Bioacoustic Few-shot Detection using Supervised Contrastive Learning
Ilyass Moummad
Romain Serizel
Nicolas Farrugia
15
2
0
02 Sep 2023
How to Scale Your EMA
Dan Busbridge
Jason Ramapuram
Pierre Ablin
Tatiana Likhomanenko
Eeshan Gunesh Dhekane
Xavier Suau
Russ Webb
25
17
0
25 Jul 2023
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
Xian Li
Nian Shao
Xiaofei Li
ViT
CLIP
8
24
0
07 Jun 2023
Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
37
3
0
23 May 2023
Environmental sound synthesis from vocal imitations and sound event labels
Yuki Okamoto
Keisuke Imoto
Shinnosuke Takamichi
Ryotaro Nagase
Takahiro Fukumori
Y. Yamashita
13
0
0
29 Apr 2023
Dexterity from Touch: Self-Supervised Pre-Training of Tactile Representations with Robotic Play
Irmak Güzey
Ben Evans
Soumith Chintala
Lerrel Pinto
54
64
0
21 Mar 2023
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation
Bac Nguyen
Stefan Uhlich
Fabien Cardinaux
SSL
26
3
0
07 Mar 2023
Training one model to detect heart and lung sound events from single point auscultations
Leander Melms
Robert R. Ilesan
Ulrich Köhler
O. Hildebrandt
R. Conradt
...
Jürgen R. Schaefer
Tobias Müller
J. Obergassel
Nadine Schlicker
M. Hirsch
15
2
0
15 Jan 2023
XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning
Pritam Sarkar
Ali Etemad
19
20
0
25 Nov 2022
Self-Supervised Learning for Speech Enhancement through Synthesis
Bryce Irvin
Marko Stamenovic
M. Kegler
Li-Chia Yang
27
18
0
04 Nov 2022
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
18
29
0
26 Oct 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Bjorn W. Schuller
BDL
SSL
19
8
0
28 Sep 2022
Representation Learning for the Automatic Indexing of Sound Effects Libraries
Alison B. Ma
Alexander Lerch
21
0
0
18 Aug 2022
Multimodal Self-Supervised Learning of General Audio Representations
Luyu Wang
Pauline Luc
Adrià Recasens
Jean-Baptiste Alayrac
Aaron van den Oord
SSL
70
41
0
26 Apr 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
99
144
0
02 Feb 2021
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
171
288
0
25 Jan 2020
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie
Ross B. Girshick
Piotr Dollár
Z. Tu
Kaiming He
261
10,196
0
16 Nov 2016