Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation

Interspeech (Interspeech), 2020
27 July 2020
Felix Kreuk
Joseph Keshet
Yossi Adi
    SSL

Papers citing "Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation"

44 / 44 papers shown
Late Fusion and Multi-Level Fission Amplify Cross-Modal Transfer in Text-Speech LMs
Santiago Cuervo
Adel Moumen
Yanis Labrak
Sameer Khurana
Antoine Laurent
Mickael Rouvier
Phil Woodland
R. Marxer
382
1
0
08 Mar 2025
Unsupervised Speech Segmentation: A General Approach Using Speech Language Models
Avishai Elmakies
Omri Abend
Yossi Adi
369
2
0
08 Jan 2025
A Simple HMM with Self-Supervised Representations for Phone Segmentation
Spoken Language Technology Workshop (SLT), 2024
Gene-Ping Yang
Hao Tang
SSL
292
1
0
15 Sep 2024
Speaker- and Text-Independent Estimation of Articulatory Movements and Phoneme Alignments from Speech
Tobias Weise
P. Klumpp
Kubilay Can Demir
Paula Andrea Pérez-Toro
Maria Schuster
E. Noeth
Bjoern Heismann
Andreas Maier
Seung Hee Yang
221
1
0
03 Jul 2024
Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling
Injune Hwang
Kyogu Lee
214
1
0
01 Apr 2024
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Heng-Jui Chang
James R. Glass
357
8
0
15 Nov 2023
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jian Zhu
Changbing Yang
Farhan Samir
Jahurul Islam
289
20
0
14 Nov 2023
Towards Matching Phones and Speech Representations
Automatic Speech Recognition & Understanding (ASRU), 2023
Gene-Ping Yang
Hao Tang
SSL
240
1
0
26 Oct 2023
Generative Spoken Language Model based on continuous word-sized audio tokens
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Robin Algayres
Yossi Adi
Tu Nguyen
Jade Copet
Gabriel Synnaeve
Benoît Sagot
Emmanuel Dupoux
AuLLM
300
22
0
08 Oct 2023
Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Liming Wang
M. Hasegawa-Johnson
Chang D. Yoo
SSL
338
5
0
03 Oct 2023
Compensating Removed Frequency Components: Thwarting Voice Spectrum Reduction Attacks
Network and Distributed System Security Symposium (NDSS), 2023
Shu Wang
Kun Sun
Qi Li
AAML
217
1
0
18 Aug 2023
What Do Self-Supervised Speech Models Know About Words?
Transactions of the Association for Computational Linguistics (TACL), 2023
Ankita Pasad
C. Chien
Shane Settle
Karen Livescu
SSL
610
63
0
30 Jun 2023
In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis
N. Prabhu
N. Lehmann-Willenbrock
Timo Gerkmann
231
4
0
02 Jun 2023
Weakly-supervised forced alignment of disfluent speech using phoneme-level modeling
Interspeech (Interspeech), 2023
Theodoros Kouzelis
Georgios Paraskevopoulos
Athanasios Katsamanis
Vassilis Katsouros
344
9
0
30 May 2023
Unsupervised Word Segmentation Using Temporal Gradient Pseudo-Labels
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
T. Fuchs
Yedid Hoshen
228
7
0
30 Mar 2023
Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Seong-Hyun Park
Myungseo Song
Bohyung Kim
Tae-Hyun Oh
198
2
0
28 Mar 2023
ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations
Findings (Findings), 2023
N. Shah
Saiteja Kosgi
Vishal Tambrahalli
Neha Sahipjohn
Anil Nelakanti
Vineet Gandhi
439
11
0
01 Mar 2023
Analysing Discrete Self Supervised Speech Representation for Spoken Language Modeling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Amitay Sicherman
Yossi Adi
330
56
0
02 Jan 2023
Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Hyeongju Kim
Hyeong-Seok Choi
160
2
0
13 Dec 2022
Efficient Transformers with Dynamic Token Pooling
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Piotr Nawrot
J. Chorowski
Adrian Łańcucki
Edoardo Ponti
313
77
0
17 Nov 2022
Phoneme Segmentation Using Self-Supervised Speech Models
Spoken Language Technology Workshop (SLT), 2022
Luke Strgar
David Harwath
SSL
236
13
0
02 Nov 2022
AudioGen: Textually Guided Audio Generation
International Conference on Learning Representations (ICLR), 2022
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
511
422
0
30 Sep 2022
Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Da-Rong Liu
Po-Chun Hsu
Yi-Chen Chen
Sung-Feng Huang
Shun-Po Chuang
Da-Yi Wu
Hung-yi Lee
GAN
197
8
0
29 Jul 2022
Unsupervised Symbolic Music Segmentation using Ensemble Temporal Prediction Errors
Interspeech (Interspeech), 2022
Shahaf Bassan
Yossi Adi
J. Rosenschein
196
7
0
02 Jul 2022
DDKtor: Automatic Diadochokinetic Speech Analysis
Interspeech (Interspeech), 2022
Yael Segal
Kasia Hitczenko
Matthew A. Goldrick
Adam Buchwald
A. Roberts
Joseph Keshet
126
2
0
29 Jun 2022
Variable-rate hierarchical CPC leads to acoustic unit discovery in speech
Neural Information Processing Systems (NeurIPS), 2022
Santiago Cuervo
Adrian Łańcucki
R. Marxer
Paweł Rychlikowski
J. Chorowski
SSL
323
18
0
05 Jun 2022
Unsupervised Word Segmentation using K Nearest Neighbors
Interspeech (Interspeech), 2022
T. Fuchs
Yedid Hoshen
Joseph Keshet
SSL
187
6
0
27 Apr 2022
Self-supervised Speaker Diarization
Interspeech (Interspeech), 2022
Yehoshua Dissen
Felix Kreuk
Joseph Keshet
228
6
0
08 Apr 2022
Towards End-to-end Unsupervised Speech Recognition
Spoken Language Technology Workshop (SLT), 2022
Alexander H. Liu
Wei-Ning Hsu
Michael Auli
Alexei Baevski
SSL
264
85
0
05 Apr 2022
A Brief Overview of Unsupervised Neural Speech Representation Learning
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
Lars Maaløe
Christian Igel
BDL AI4TS SSL
265
13
0
01 Mar 2022
Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Herman Kamper
348
32
0
24 Feb 2022
Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices
ACM Computing Surveys (ACM CSUR), 2022
Mikołaj Małkiński
Jacek Mańdziuk
543
56
0
28 Jan 2022
Phone-to-audio alignment without text: A Semi-supervised Approach
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Jian Zhu
Cong Zhang
David Jurgens
307
51
0
08 Oct 2021
Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Laureano Moro-Velazquez
Najim Dehak
SSL
397
28
0
05 Oct 2021
Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language
C. Jacobs
Herman Kamper
293
12
0
24 Jun 2021
Unsupervised Automatic Speech Recognition: A Review
Speech Communication (Speech Commun.), 2021
Hanan Aldarmaki
Asad Ullah
Nazar Zaki
VLM SSL
194
70
0
09 Jun 2021
Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation
Interspeech (Interspeech), 2021
Saurabhchand Bhati
Jesús Villalba
Piotr Żelasko
Laureano Moro-Velazquez
Najim Dehak
SSL
227
44
0
03 Jun 2021
Unsupervised Speech Recognition
Neural Information Processing Systems (NeurIPS), 2021
Alexei Baevski
Wei-Ning Hsu
Alexis Conneau
Michael Auli
SSL
478
295
0
24 May 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
Interspeech (Interspeech), 2021
Adam Polyak
Yossi Adi
Jade Copet
Eugene Kharitonov
Kushal Lakhotia
Wei-Ning Hsu
Abdel-rahman Mohamed
Emmanuel Dupoux
505
381
0
01 Apr 2021
Acoustic word embeddings for zero-resource languages using self-supervised contrastive learning and multilingual adaptation
Spoken Language Technology Workshop (SLT), 2021
C. Jacobs
Yevgen Matusevych
Herman Kamper
331
25
0
19 Mar 2021
Double Articulation Analyzer with Prosody for Unsupervised Word and Phoneme Discovery
IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2021
Yasuaki Okuda
Ryo Ozaki
T. Taniguchi
335
7
0
15 Mar 2021
CDPAM: Contrastive learning for perceptual audio similarity
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Pranay Manocha
Zeyu Jin
Richard Y. Zhang
Adam Finkelstein
281
79
0
09 Feb 2021
Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks
Interspeech (Interspeech), 2020
Herman Kamper
Benjamin van Niekerk
SSL MQ
342
38
0
14 Dec 2020
Similarity Analysis of Self-Supervised Speech Representations
Yu-An Chung
Yonatan Belinkov
James R. Glass
SSL
425
45
0
22 Oct 2020