Towards Learning a Universal Non-Semantic Representation of Speech

Interspeech (Interspeech), 2020

25 February 2020

Félix de Chaumont Quitry

Papers citing "Towards Learning a Universal Non-Semantic Representation of Speech"

50 / 107 papers shown

Generalizable Audio Spoofing Detection using Non-Semantic Representations

233

29 Aug 2025

Audio Generation Through Score-Based Generative Modeling: Design Principles and Implementation

327

10 Jun 2025

Self-supervised learning method using multiple sampling strategies for general-purpose audio representation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Ibuki Kuroyanagi

Tatsuya Komatsu

SSL

197

25 May 2025

TS-SUPERB: A Target Speech Processing Benchmark for Speech Self-Supervised Learning Models

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

327

10 May 2025

The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification

Birger Moëll

Fredrik Sand Aronsson

Per Östberg

Jonas Beskow

204

03 Mar 2025

Evaluation of Deep Audio Representations for Hearables

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

Fabian Gröger

Pascal Baumann

Ludovic Amruthalingam

Laurent Simon

Ruksana Giurda

Simone Lionetti

431

10 Feb 2025

Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling

Jakob Poncelet

Hugo Van hamme

487

05 Feb 2025

The Unreliability of Acoustic Systems in Alzheimer's Speech Datasets with Heterogeneous Recording Conditions

262

11 Sep 2024

STAB: Speech Tokenizer Assessment Benchmark

Chulayuth Asawaroengchai

Kartik Audhkhasi

Andrew Rosenberg

Ankur Bapna

Bhuvana Ramabhadran

250

04 Sep 2024

ICSD: An Open-source Dataset for Infant Cry and Snoring Detection

381

20 Aug 2024

Predicting Heart Activity from Speech using Data-driven and Knowledge-based features

263

10 Jun 2024

MAD Speech: Measures of Acoustic Diversity of Speech

420

16 Apr 2024

Exploring the Task-agnostic Trait of Self-supervised Learning in the Context of Detecting Mental Disorders

Rohan Kumar Gupta

Rohit Sinha

314

22 Mar 2024

Predicting Generalization of AI Colonoscopy Models to Unseen Data

...

260

14 Mar 2024

HeAR -- Health Acoustic Representations

...

334

04 Mar 2024

Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data

441

07 Feb 2024

Relationship between auditory and semantic entrainment using Deep Neural Networks (DNN)

Jay Kejriwal

Štefan Beňuš

232

27 Dec 2023

The unreasonable effectiveness of AI CADe polyp detectors to generalize to new countries

...

242

11 Dec 2023

Reformulating NLP tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia

Dimitris Gkoumas

Matthew Purver

Maria Liakata

238

15 Oct 2023

A Digital Language Coherence Marker for Monitoring Dementia

Dimitris Gkoumas

Adam Tsakalidis

Maria Liakata

201

14 Oct 2023

Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

222

21 Sep 2023

Beyond Accuracy: Measuring Representation Capacity of Embeddings to Preserve Structural and Contextual Information

Sarwan Ali

206

20 Sep 2023

Crowdotic: A Privacy-Preserving Hospital Waiting Room Crowd Density Estimation with Non-speech Audio

Workshop on Mobile Computing Systems and Applications (HotMobile), 2023

Mohammad Arif Ul Alam

Tauhidur Rahman

165

19 Sep 2023

EnCodecMAE: Leveraging neural codecs for universal audio representation learning

L. Pepino

Pablo Riera

Luciana Ferrer

297

14 Sep 2023

Optimizing Audio Augmentations for Contrastive Learning of Health-Related Acoustic Signals

Yossi Matias

252

11 Sep 2023

PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords

Interspeech (Interspeech), 2023

Yong-Hyeok Lee

Namhyun Cho

266

31 Aug 2023

MASR: Multi-label Aware Speech Representation

Automatic Speech Recognition & Understanding (ASRU), 2023

Anjali Raj

Shikhar Bharadwaj

Sriram Ganapathy

Min Ma

Shikhar Vashishth

SSL

216

20 Jul 2023

Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Varun Krishna

T. Sai

Sriram Ganapathy

SSL

205

14 Jul 2023

Speech-based Age and Gender Prediction with Transformers

Björn Schuller

163

29 Jun 2023

Female mosquito detection by means of AI techniques inside release containers in the context of a Sterile Insect Technique program

European Signal Processing Conference (EUSIPCO), 2023

Javier Naranjo-Alcazar

Jordi Grau-Haro

D. Almenar

P. Zuccarello

129

19 Jun 2023

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

Interspeech (Interspeech), 2023

292

14 Jun 2023

Label Aware Speech Representation Learning For Language Identification

Interspeech (Interspeech), 2023

Shikhar Vashishth

Shikhar Bharadwaj

Sriram Ganapathy

197

07 Jun 2023

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

442

07 Jun 2023

Automatic Data Augmentation for Domain Adapted Fine-Tuning of Self-Supervised Speech Representations

Interspeech (Interspeech), 2023

Salah Zaiem

Titouan Parcollet

S. Essid

235

01 Jun 2023

The Tunnel Effect: Building Data Representations in Deep Neural Networks

Neural Information Processing Systems (NeurIPS), 2023

414

31 May 2023

Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition

Interspeech (Interspeech), 2023

Wangyou Zhang

Y. Qian

292

25 May 2023

Happy or Evil Laughter? Analysing a Database of Natural Audio Samples

Aljoscha Dusterhoft

Felix Burkhardt

Björn W. Schuller

134

23 May 2023

Pengi: An Audio Language Model for Audio Tasks

Neural Information Processing Systems (NeurIPS), 2023

528

268

19 May 2023

Self-supervised Neural Factor Analysis for Disentangling Utterance-level Speech Representations

International Conference on Machine Learning (ICML), 2023

231

14 May 2023

V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

AAAI Conference on Artificial Intelligence (AAAI), 2023

Kun Su

Judith Yue Li

Qingqing Huang

Dima Kuzmin

Joonseok Lee

...

249

11 May 2023

Emolysis: A Multimodal Open-Source Group Emotion Analysis and Visualization Toolkit

Shreya Ghosh

Zhixi Cai

Parul Gupta

Garima Sharma

Abhinav Dhall

Munawar Hayat

Tom Gedeon

215

09 May 2023

Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

465

12 Apr 2023

Designing and Evaluating Speech Emotion Recognition Systems: A reality check case study with IEMOCAP

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Nikolaos Antoniou

Athanasios Katsamanis

Theodoros Giannakopoulos

Shrikanth Narayanan

223

03 Apr 2023

Transformers in Speech Processing: A Survey

515

21 Mar 2023

Enhancing Unsupervised Audio Representation Learning via Adversarial Sample Generation

Yuxin Peng

153

15 Mar 2023

Speech Intelligibility Classifiers from 550k Disordered Speech Samples

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Subhashini Venugopalan

304

13 Mar 2023

Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings

Clinical Natural Language Processing Workshop (ClinicalNLP), 2023

Joel Shor

R. Bi

Subhashini Venugopalan

304

10 Mar 2023

Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

243

07 Mar 2023

Noise2Music: Text-conditioned Music Generation with Diffusion Models

...

492

253

08 Feb 2023

MusicLM: Generating Music From Text

...

1.1K

647

26 Jan 2023