2002.12764
Cited By
Towards Learning a Universal Non-Semantic Representation of Speech
25 February 2020
Joel Shor
A. Jansen
Ronnie Maor
Oran Lang
Omry Tuval
Félix de Chaumont Quitry
Marco Tagliasacchi
Ira Shavitt
Dotan Emanuel
Yinnon A. Haviv
SSL
Papers citing
"Towards Learning a Universal Non-Semantic Representation of Speech"
50 / 105 papers shown
Title
Active Learning of Non-semantic Speech Tasks with Pretrained Models
Harlin Lee
Aaqib Saeed
Andrea L. Bertozzi
VLM
14
2
0
31 Oct 2022
Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context
L. D. Pham
Dusan Salovic
Anahid N. Jalali
Alexander Schindler
Khoa Tran
H. Vu
Phu X. Nguyen
21
5
0
16 Oct 2022
On the Utility of Self-supervised Models for Prosody-related Tasks
Guan-Ting Lin
Chiyu Feng
Wei-Ping Huang
Yuan Tseng
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Nigel G. Ward
23
47
0
13 Oct 2022
MeWEHV: Mel and Wave Embeddings for Human Voice Tasks
Andrés Vasco-Carofilis
Laura Fernández-Robles
Enrique Alegre
Eduardo Fidalgo
40
1
0
28 Sep 2022
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices
Harlin Lee
Aaqib Saeed
19
2
0
12 Jul 2022
BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping
Gasser Elbanna
Neil Scheidwasser
M. Kegler
P. Beckmann
Karl El Hajal
Milos Cernak
SSL
31
21
0
24 Jun 2022
Analysis of Self-Supervised Learning and Dimensionality Reduction Methods in Clustering-Based Active Learning for Speech Emotion Recognition
Einari Vaaras
Manu Airaksinen
Okko Räsänen
17
5
0
21 Jun 2022
Multi-instrument Music Synthesis with Spectrogram Diffusion
Curtis Hawthorne
Ian Simon
Adam Roberts
Neil Zeghidour
Josh Gardner
Ethan Manilow
Jesse Engel
DiffM
21
48
0
11 Jun 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
128
349
0
21 May 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
22
5
0
17 May 2022
Bias and Fairness on Multimodal Emotion Detection Algorithms
Matheus Schmitz
Rehan Ahmed
Jim Cao
FaML
47
12
0
11 May 2022
Masked Spectrogram Modeling using Masked Autoencoders for Learning General-purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
29
65
0
26 Apr 2022
ATST: Audio Representation Learning with Teacher-Student Transformer
Xian Li
Xiaofei Li
ViT
23
20
0
26 Apr 2022
BYOL for Audio: Exploring Pre-trained General-purpose Audio Representations
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
36
53
0
15 Apr 2022
Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning
Salah Zaiem
Titouan Parcollet
S. Essid
SSL
12
6
0
08 Apr 2022
Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load
Gasser Elbanna
A. Biryukov
Neil Scheidwasser
Lara Orlandic
Pablo Mainar
M. Kegler
P. Beckmann
Milos Cernak
17
11
0
30 Mar 2022
DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning
Sreyan Ghosh
Ashish Seth
Deepak Mittal
Maneesh Singh
S. Umesh
SSL
27
6
0
25 Mar 2022
Federated Self-Supervised Learning for Acoustic Event Classification
Meng Feng
Chieh-Chi Kao
Qingming Tang
Ming Sun
Viktor Rozgic
Spyros Matsoukas
Chao Wang
41
11
0
22 Mar 2022
HEAR: Holistic Evaluation of Audio Representations
Joseph P. Turian
Jordie Shier
H. Khan
Bhiksha Raj
Björn W. Schuller
...
P. Esling
Pranay Manocha
Shinji Watanabe
Zeyu Jin
Yonatan Bisk
33
99
0
06 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Bjoern W. Schuller
SSL
35
106
0
02 Mar 2022
Towards a Common Speech Analysis Engine
Hagai Aronowitz
Itai Gat
E. Morais
Weizhong Zhu
R. Hoory
18
3
0
01 Mar 2022
TRILLsson: Distilled Universal Paralinguistic Speech Representations
Joel Shor
Subhashini Venugopalan
17
37
0
01 Mar 2022
Audio-Based Deep Learning Frameworks for Detecting COVID-19
Dat Ngo
L. D. Pham
Hoang Van Truong
Ş. Kolozali
D. Jarchi
23
4
0
10 Feb 2022
A Pre-trained Audio-Visual Transformer for Emotion Recognition
Minh Tran
M. Soleymani
58
25
0
23 Jan 2022
Sound-Dr: Reliable Sound Dataset and Baseline Artificial Intelligence System for Respiratory Illnesses
Hoang Van Truong
Quang-Huy Nguyen
Cuong Q. Nguyen
Phong X. Nguyen
Hoang-Dung Nguyen
21
2
0
12 Jan 2022
Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning
Utku Evci
Vincent Dumoulin
Hugo Larochelle
Michael C. Mozer
25
83
0
10 Jan 2022
Chimpanzee voice prints? Insights from transfer learning experiments from human voices
Maël Leroux
Orestes Uxio Gutierrez Al-Khudhairy
N. Perony
S. Townsend
14
7
0
15 Dec 2021
Transfer Learning with Jukebox for Music Source Separation
W. Z. E. Amri
Oliver Tautz
Helge J. Ritter
Andrew Melnik
68
7
0
28 Nov 2021
Towards Learning Universal Audio Representations
Luyu Wang
Pauline Luc
Yan Wu
Adrià Recasens
Lucas Smaira
...
Andrew Jaegle
Jean-Baptiste Alayrac
Sander Dieleman
João Carreira
Aaron van den Oord
SSL
26
68
0
23 Nov 2021
Detecting Dementia from Speech and Transcripts using Transformers
Loukas Ilias
D. Askounis
J. Psarras
11
32
0
27 Oct 2021
DECAR: Deep Clustering for learning general-purpose Audio Representations
Sreyan Ghosh
Sandesh V Katta
Ashish Seth
S. Umesh
SSL
36
12
0
17 Oct 2021
Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
Joel Shor
A. Jansen
Wei Han
Daniel S. Park
Yu Zhang
SSL
AI4TS
33
54
0
09 Oct 2021
SERAB: A multi-lingual benchmark for speech emotion recognition
Neil Scheidwasser
M. Kegler
P. Beckmann
Milos Cernak
32
44
0
07 Oct 2021
A Cough-based deep learning framework for detecting COVID-19
Hoang Van Truong
L. D. Pham
Dat Ngo
Hoang-Dung Nguyen
29
7
0
07 Oct 2021
VoxCeleb Enrichment for Age and Gender Recognition
Khaled Hechmi
Trung Ngo Trong
Ville Hautamäki
Tomi Kinnunen
14
29
0
28 Sep 2021
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang
Daniel S. Park
Wei Han
James Qin
Anmol Gulati
...
Zhifeng Chen
Quoc V. Le
Chung-Cheng Chiu
Ruoming Pang
Yonghui Wu
SSL
19
175
0
27 Sep 2021
Scenario Aware Speech Recognition: Advancements for Apollo Fearless Steps & CHiME-4 Corpora
Szu-Jui Chen
Wei Xia
John H. L. Hansen
30
9
0
23 Sep 2021
A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis
Dimitris Gkoumas
Bo Wang
Adam Tsakalidis
M. Wolters
A. Zubiaga
Matthew Purver
M. Liakata
19
8
0
03 Sep 2021
Learning De-identified Representations of Prosody from Raw Audio
J. Weston
R. Lenain
U. Meepegama
E. Fristed
SSL
24
15
0
17 Jul 2021
Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases
Subhashini Venugopalan
Joel Shor
Manoj Plakal
Jimmy Tobin
Katrin Tomanek
Jordan R. Green
Michael P. Brenner
27
12
0
08 Jul 2021
Pretext Tasks selection for multitask self-supervised speech representation learning
Salah Zaiem
Titouan Parcollet
S. Essid
Abdel Heba
SSL
14
12
0
01 Jul 2021
Temporal Convolution Networks with Positional Encoding for Evoked Expression Estimation
V. Huynh
Gueesang Lee
Hyung-Jeong Yang
Soohyung Kim
20
1
0
16 Jun 2021
Teaching keyword spotters to spot new keywords with limited examples
Abhijeet Awasthi
Kevin Kilgour
H. Rom
18
17
0
04 Jun 2021
Self-Supervised Learning from Automatically Separated Sound Scenes
Eduardo Fonseca
A. Jansen
D. Ellis
Scott Wisdom
Marco Tagliasacchi
J. Hershey
Manoj Plakal
Shawn Hershey
R. C. Moore
Xavier Serra
SSL
31
13
0
05 May 2021
SUPERB: Speech processing Universal PERformance Benchmark
Shu-Wen Yang
Po-Han Chi
Yung-Sung Chuang
Cheng-I Jeff Lai
Kushal Lakhotia
...
Shuyan Dong
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
SSL
39
887
0
03 May 2021
Multimodal Self-Supervised Learning of General Audio Representations
Luyu Wang
Pauline Luc
Adrià Recasens
Jean-Baptiste Alayrac
Aaron van den Oord
SSL
78
41
0
26 Apr 2021
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
SSL
38
175
0
11 Mar 2021
PSLA: Improving Audio Tagging with Pretraining, Sampling, Labeling, and Aggregation
Yuan Gong
Yu-An Chung
James R. Glass
VLM
104
144
0
02 Feb 2021
Learning Efficient Representations for Keyword Spotting with Triplet Loss
R. Vygon
N. Mikhaylovskiy
DML
SSL
60
64
0
12 Jan 2021
A Multi-modal Deep Learning Model for Video Thumbnail Selection
Zhifeng Yu
Nanchun Shi
ViT
9
3
0
31 Dec 2020