TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context

8 October 2021

Boris Ginsburg

Papers citing "TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context"

24 / 24 papers shown

Title
Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness Erfan Loweimi Mengjie Qian Kate Knill Mark J. F. Gales 46 0 0 26 Apr 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance Shehzeen Samarah Hussain Paarth Neekhara Xuesong Yang Edresson Casanova Subhankar Ghosh Mikyas T. Desta Roy Fejgin Rafael Valle Jason Chun Lok Li 59 2 0 07 Feb 2025
People are poorly equipped to detect AI-powered voice clones Sarah Barrington Emily A. Cooper Hany Farid 56 6 0 28 Jan 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models Thai-Binh Nguyen Alexander Waibel 74 1 0 27 Nov 2024
The First VoicePrivacy Attacker Challenge Evaluation Plan N. Tomashenko Xiaoxiao Miao Emmanuel Vincent Junichi Yamagishi 117 2 0 09 Oct 2024
TBDM-Net: Bidirectional Dense Networks with Gender Information for Speech Emotion Recognition Vlad Striletchi Cosmin Striletchi Adriana Stan 38 0 0 16 Sep 2024
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization Samuele Cornell Taejin Park Steve Huang Christoph Boeddeker Xuankai Chang Matthew Maciejewski Matthew Wiesner Paola García Shinji Watanabe 29 9 0 23 Jul 2024
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors J. Hauret Malo Olivier Thomas Joubaud C. Langrenne Sarah Poirée V. Zimpfer Éric Bavu 75 1 0 16 Jul 2024
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System T. Park He Huang Ante Jukić Kunal Dhawan Krishna C. Puvvada Nithin Rao Koluguri Nikolay Karpov A. Laptev Jagadeesh Balam Boris Ginsburg 19 6 0 18 Oct 2023
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition Krishna C. Puvvada Nithin Rao Koluguri Kunal Dhawan Jagadeesh Balam Boris Ginsburg 24 12 0 19 Sep 2023
Voice Morphing: Two Identities in One Voice Sushant Pani Anurag Chowdhury Morgan Sandler Arun Ross 11 1 0 05 Sep 2023
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio Yang Zhang Krishna C. Puvvada Vitaly Lavrukhin Boris Ginsburg 24 14 0 09 Aug 2023
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings L. Serafini Samuele Cornell Giovanni Morrone Enrico Zovato A. Brutti S. Squartini 37 9 0 29 May 2023
Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios Morgan Sandler Arun Ross CVBM 18 0 0 13 May 2023
Residual Information in Deep Speaker Embedding Architectures Adriana Stan 32 5 0 06 Feb 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset J. Peymanfard Samin Heydarian Ali Lashini Hossein Zeinali Mohammad Reza Mohammadi N. Mozayani 21 10 0 21 Jan 2023
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers Cheng-Ping Hsieh Subhankar Ghosh Boris Ginsburg 41 18 0 01 Nov 2022
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification Fei Jia Nithin Rao Koluguri Jagadeesh Balam Boris Ginsburg 10 3 0 27 Oct 2022
Multi-scale Speaker Diarization with Dynamic Scale Weighting Tae Jin Park Nithin Rao Koluguri Jagadeesh Balam Boris Ginsburg 11 19 0 30 Mar 2022
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks Federico Landini Jan Profant Mireia Díez L. Burget 208 198 0 29 Dec 2020
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap Tae Jin Park Kyu Jeong Han Manoj Kumar Shrikanth Narayanan 128 116 0 05 Mar 2020
pyannote.audio: neural building blocks for speaker diarization H. Bredin Ruiqing Yin Juan Manuel Coria G. Gelly Pavel Korshunov Marvin Lavechin D. Fustes Hadrien Titeux Wassim Bouaziz Marie-Philippe Gill 188 312 0 04 Nov 2019
NeMo: a toolkit for building AI applications using Neural Modules Oleksii Kuchaiev Jason Chun Lok Li Huyen Nguyen Oleksii Hrinchuk Ryan Leary ... Jack Cook P. Castonguay Mariya Popova Jocelyn Huang Jonathan M. Cohen 194 291 0 14 Sep 2019
VoxCeleb2: Deep Speaker Recognition Joon Son Chung Arsha Nagrani Andrew Zisserman 216 2,233 0 14 Jun 2018