ResearchTrend.AI
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation

11 March 2021
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
    SSL

Papers citing "BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation"

50 / 103 papers shown
Title
Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation
Yulin Pan
Xiangteng He
Biao Gong
Yuxin Peng
Yiliang Lv
SSL
14
0
0
15 Mar 2023
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data
Xuenan Xu
Zhiling Zhang
Zelin Zhou
Pingyue Zhang
Zeyu Xie
Mengyue Wu
Ke Zhu
CLIP
58
14
0
14 Mar 2023
UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
19
0
0
10 Mar 2023
Low-Complexity Audio Embedding Extractors
Florian Schmid
Khaled Koutini
Gerhard Widmer
11
4
0
03 Mar 2023
Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning
Huimin Wu
Chenyang Lei
Xiao Sun
Pengju Wang
Qifeng Chen
Kwang-Ting Cheng
Stephen Lin
Zhirong Wu
MQ
19
5
0
19 Dec 2022
BEATs: Audio Pre-Training with Acoustic Tokenizers
Sanyuan Chen
Yu-Huan Wu
Chengyi Wang
Shujie Liu
Daniel C. Tompkins
Zhuo Chen
Furu Wei
22
253
0
18 Dec 2022
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
Sara Atito
Muhammad Awais
Wenwu Wang
Mark D. Plumbley
J. Kittler
ViT
11
9
0
23 Nov 2022
Efficient Speech Quality Assessment using Self-supervised Framewise Embeddings
Karl El Hajal
Zihan Wu
Neil Scheidwasser
Gasser Elbanna
Milos Cernak
18
9
0
12 Nov 2022
SLICER: Learning universal audio representations using low-resource self-supervised pre-training
Ashish Seth
Sreyan Ghosh
S. Umesh
Dinesh Manocha
SSL
17
2
0
02 Nov 2022
MAST: Multiscale Audio Spectrogram Transformers
Sreyan Ghosh
Ashish Seth
S. Umesh
Dinesh Manocha
22
3
0
02 Nov 2022
Improving generalizability of distilled self-supervised speech processing models under distorted settings
Kuan-Po Huang
Yu-Kuan Fu
Tsung-Yuan Hsu
Fabian Ritter Gutierrez
Fan Wang
Liang-Hsuan Tseng
Yu Zhang
Hung-yi Lee
11
13
0
14 Oct 2022
Supervised and Unsupervised Learning of Audio Representations for Music Understanding
Matthew C. McCallum
Filip Korzeniowski
Sergio Oramas
F. Gouyon
Andreas F. Ehmann
SSL
76
36
0
07 Oct 2022
That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation
Abitha Thankaraj
Lerrel Pinto
33
13
0
03 Oct 2022
An empirical study of weakly supervised audio tagging embeddings for general audio representations
Heinrich Dinkel
Zhiyong Yan
Yongqing Wang
Junbo Zhang
Yujun Wang
14
1
0
30 Sep 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Bjorn W. Schuller
BDL
SSL
21
8
0
28 Sep 2022
Equivariant Self-Supervision for Musical Tempo Estimation
Elio Quinton
28
9
0
03 Sep 2022
Contrastive Audio-Language Learning for Music
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
25
44
0
25 Aug 2022
SampleMatch: Drum Sample Retrieval by Musical Context
Stefan Lattner
16
7
0
01 Aug 2022
SimCURL: Simple Contrastive User Representation Learning from Command Sequences
Hang Chu
Amir Hosein Khasahmadi
Karl D. D. Willis
Fraser Anderson
Yaoli Mao
Linh-Tam Tran
Justin Matejka
Jo Vermeulen
SSL
17
2
0
29 Jul 2022
Semi-supervised cross-lingual speech emotion recognition
Mirko Agarla
Simone Bianco
Luigi Celona
Paolo Napoletano
A. Petrovsky
Flavio Piccoli
Raimondo Schettini
I. Shanin
11
14
0
14 Jul 2022
BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping
Gasser Elbanna
Neil Scheidwasser
M. Kegler
P. Beckmann
Karl El Hajal
Milos Cernak
SSL
24
21
0
24 Jun 2022
Learning Behavior Representations Through Multi-Timescale Bootstrapping
Mehdi Azabou
Michael J. Mendelson
Maks Sorokin
S. Thakoor
Nauman Ahad
Carolina Urzay
Eva L. Dyer
AI4CE
21
6
0
14 Jun 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
8
5
0
17 May 2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning
Zhepei Wang
Cem Subakan
Xilin Jiang
Junkai Wu
Efthymios Tzinis
Mirco Ravanelli
Paris Smaragdis
CLL
SSL
57
19
0
15 May 2022
AVCAffe: A Large Scale Audio-Visual Dataset of Cognitive Load and Affect for Remote Work
Pritam Sarkar
A. Posen
Ali Etemad
32
9
0
13 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
19
34
0
12 May 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
13
65
0
26 Apr 2022
ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li
Xiaofei Li
ViT
12
20
0
26 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
29
53
0
15 Apr 2022
Self-supervised learning for robust voice cloning
Konstantinos Klapsas
Nikolaos Ellinas
Karolos Nikitaras
G. Vamvoukakis
Panos Kakoulidis
...
S. Raptis
June Sig Sung
Gunu Jho
Aimilios Chalamandaris
Pirros Tsiakoulis
SSL
17
6
0
07 Apr 2022
Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load
Gasser Elbanna
A. Biryukov
Neil Scheidwasser
Lara Orlandic
Pablo Mainar
M. Kegler
P. Beckmann
Milos Cernak
4
11
0
30 Mar 2022
Learning neural audio features without supervision
Sarthak Yadav
Neil Zeghidour
SSL
30
4
0
29 Mar 2022
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning
Sreyan Ghosh
Ashish Seth
Deepak Mittal
Maneesh Singh
S. Umesh
SSL
17
6
0
25 Mar 2022
Contrastive Learning with Positive-Negative Frame Mask for Music Representation
D. Yao
Zhou Zhao
Shengyu Zhang
Jieming Zhu
Yudong Zhu
Rui Zhang
Xiuqiang He
12
21
0
17 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
23
99
0
06 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
22
106
0
02 Mar 2022
Similarity learning for wells based on logging data
Evgenia Romanenkova
Alina Rogulina
A. Shakirov
N. Stulov
Alexey Zaytsev
L. Ismailova
D. Kovalev
Klemens Katterbauer
A. Alshehri
9
16
0
11 Feb 2022
Self-supervised Graphs for Audio Representation Learning with Limited Labeled Data
A. Shirian
Krishna Somandepalli
T. Guha
SSL
38
10
0
31 Jan 2022
Leveraging Real Talking Faces via Self-Supervision for Robust Forgery Detection
A. Haliassos
Rodrigo Mira
Stavros Petridis
M. Pantic
CVBM
22
123
0
18 Jan 2022
Augmented Contrastive Self-Supervised Learning for Audio Invariant Representations
M Motavali Emami
Dung T. Tran
K. Koishida
SSL
11
2
0
21 Dec 2021
Towards Learning Universal Audio Representations
Luyu Wang
Pauline Luc
Yan Wu
Adrià Recasens
Lucas Smaira
...
Andrew Jaegle
Jean-Baptiste Alayrac
Sander Dieleman
João Carreira
Aaron van den Oord
SSL
16
68
0
23 Nov 2021
Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity
Pritam Sarkar
Ali Etemad
SSL
18
11
0
09 Nov 2021
SSAST: Self-Supervised Audio Spectrogram Transformer
Yuan Gong
Cheng-I Jeff Lai
Yu-An Chung
James R. Glass
ViT
19
266
0
19 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio Representations
Sreyan Ghosh
Sandesh V Katta
Ashish Seth
S. Umesh
SSL
23
12
0
17 Oct 2021
SERAB: A multi-lingual benchmark for speech emotion recognition
Neil Scheidwasser
M. Kegler
P. Beckmann
Milos Cernak
16
44
0
07 Oct 2021
Cross-domain Semi-Supervised Audio Event Classification Using Contrastive Regularization
Donmoon Lee
Kyogu Lee
13
3
0
29 Sep 2021
Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition
Felix Wu
Kwangyoun Kim
Jing Pan
Kyu Jeong Han
Kilian Q. Weinberger
Yoav Artzi
14
71
0
14 Sep 2021
Computer Vision Self-supervised Learning Methods on Time Series
Daesoo Lee
Technology
AI4TS
23
4
0
02 Sep 2021
One Billion Audio Sounds from GPU-enabled Modular Synthesis
Joseph P. Turian
Jordie Shier
George Tzanetakis
K. McNally
Max Henry
17
22
0
27 Apr 2021
Adversarially learning disentangled speech representations for robust multi-factor voice conversion
Jie Wang
Jingbei Li
Xintao Zhao
Zhiyong Wu
Shiyin Kang
H. Meng
DRL
21
29
0
30 Jan 2021