
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation

Interspeech (Interspeech), 2020

27 July 2020

Yossi Adi

Papers citing "Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation"

44 / 44 papers shown

Late Fusion and Multi-Level Fission Amplify Cross-Modal Transfer in Text-Speech LMs

400

08 Mar 2025

Unsupervised Speech Segmentation: A General Approach Using Speech Language Models

Avishai Elmakies

Omri Abend

Yossi Adi

370

08 Jan 2025

A Simple HMM with Self-Supervised Representations for Phone Segmentation

Spoken Language Technology Workshop (SLT), 2024

Gene-Ping Yang

Hao Tang

SSL

292

15 Sep 2024

Speaker- and Text-Independent Estimation of Articulatory Movements and Phoneme Alignments from Speech

Tobias Weise

P. Klumpp

Kubilay Can Demir

Paula Andrea Pérez-Toro

Maria Schuster

E. Noeth

Bjoern Heismann

Andreas Maier

Seung Hee Yang

225

03 Jul 2024

Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling

Injune Hwang

Kyogu Lee

214

01 Apr 2024

R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

Heng-Jui Chang

James R. Glass

360

15 Nov 2023

The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

291

14 Nov 2023

Towards Matching Phones and Speech Representations

Automatic Speech Recognition & Understanding (ASRU), 2023

Gene-Ping Yang

Hao Tang

SSL

243

26 Oct 2023

Generative Spoken Language Model based on continuous word-sized audio tokens

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yossi Adi

303

08 Oct 2023

Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

341

03 Oct 2023

Compensating Removed Frequency Components: Thwarting Voice Spectrum Reduction Attacks

Network and Distributed System Security Symposium (NDSS), 2023

Shu Wang

Kun Sun

Qi Li

AAML

223

18 Aug 2023

What Do Self-Supervised Speech Models Know About Words?

Transactions of the Association for Computational Linguistics (TACL), 2023

624

30 Jun 2023

In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis

N. Prabhu

N. Lehmann-Willenbrock

Timo Gerkmann

231

02 Jun 2023

Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling

Interspeech (Interspeech), 2023

Theodoros Kouzelis

Georgios Paraskevopoulos

Athanasios Katsamanis

Vassilis Katsouros

347

30 May 2023

Unsupervised Word Segmentation Using Temporal Gradient Pseudo-Labels

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

T. Fuchs

Yedid Hoshen

230

30 Mar 2023

Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

202

28 Mar 2023

ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations

Findings (Findings), 2023

441

01 Mar 2023

Analysing Discrete Self Supervised Speech Representation for Spoken Language Modeling

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Amitay Sicherman

Yossi Adi

343

02 Jan 2023

Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022

Hyeongju Kim

Hyeong-Seok Choi

163

13 Dec 2022

Efficient Transformers with Dynamic Token Pooling

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

322

17 Nov 2022

Phoneme Segmentation Using Self-Supervised Speech Models

Spoken Language Technology Workshop (SLT), 2022

Luke Strgar

David Harwath

SSL

242

02 Nov 2022

AudioGen: Textually Guided Audio Generation

International Conference on Learning Representations (ICLR), 2022

Devi Parikh

Yossi Adi

516

426

30 Sep 2022

Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

197

29 Jul 2022

Unsupervised Symbolic Music Segmentation using Ensemble Temporal Prediction Errors

Interspeech (Interspeech), 2022

Shahaf Bassan

Yossi Adi

J. Rosenschein

201

02 Jul 2022

DDKtor: Automatic Diadochokinetic Speech Analysis

Interspeech (Interspeech), 2022

136

29 Jun 2022

Variable-rate hierarchical CPC leads to acoustic unit discovery in speech

Neural Information Processing Systems (NeurIPS), 2022

329

05 Jun 2022

Unsupervised Word Segmentation using K Nearest Neighbors

Interspeech (Interspeech), 2022

195

27 Apr 2022

Self-supervised Speaker Diarization

Interspeech (Interspeech), 2022

Yehoshua Dissen

Felix Kreuk

Joseph Keshet

234

08 Apr 2022

Towards End-to-end Unsupervised Speech Recognition

Spoken Language Technology Workshop (SLT), 2022

266

05 Apr 2022

A Brief Overview of Unsupervised Neural Speech Representation Learning

Lasse Borgholt

Jakob Drachmann Havtorn

266

01 Mar 2022

Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Herman Kamper

350

24 Feb 2022

Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices

ACM Computing Surveys (ACM CSUR), 2022

Mikołaj Małkiński

Jacek Mańdziuk

549

28 Jan 2022

Phone-to-audio alignment without text: A Semi-supervised Approach

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

Jian Zhu

Cong Zhang

David Jurgens

307

08 Oct 2021

Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding

Saurabhchand Bhati

Jesús Villalba

Piotr Żelasko

Laureano Moro-Velazquez

Najim Dehak

SSL

402

05 Oct 2021

Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language

C. Jacobs

Herman Kamper

300

24 Jun 2021

Unsupervised Automatic Speech Recognition: A Review

Speech Communication (Speech Commun.), 2021

197

09 Jun 2021

Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation

Interspeech (Interspeech), 2021

Saurabhchand Bhati

Jesús Villalba

Piotr Żelasko

Laureano Moro-Velazquez

Najim Dehak

SSL

237

03 Jun 2021

Unsupervised Speech Recognition

Neural Information Processing Systems (NeurIPS), 2021

478

295

24 May 2021

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

Interspeech (Interspeech), 2021

Yossi Adi

511

381

01 Apr 2021

Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation

Spoken Language Technology Workshop (SLT), 2021

C. Jacobs

Yevgen Matusevych

Herman Kamper

335

19 Mar 2021

Double Articulation Analyzer with Prosody for Unsupervised Word and Phoneme Discovery

IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2021

Yasuaki Okuda

Ryo Ozaki

T. Taniguchi

344

15 Mar 2021

CDPAM: Contrastive learning for perceptual audio similarity

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

283

09 Feb 2021

Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks

Interspeech (Interspeech), 2020

Herman Kamper

Benjamin van Niekerk

SSL MQ

349

14 Dec 2020

Similarity Analysis of Self-Supervised Speech Representations

428

22 Oct 2020