ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2110.04410
  4. Cited By
TitaNet: Neural Model for speaker representation with 1D Depth-wise
  separable convolutions and global context

TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context

8 October 2021
Nithin Rao Koluguri
Taejin Park
Boris Ginsburg
    ViT
ArXivPDFHTML

Papers citing "TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context"

24 / 24 papers shown
Title
Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
Speaker Retrieval in the Wild: Challenges, Effectiveness and Robustness
Erfan Loweimi
Mengjie Qian
Kate Knill
Mark J. F. Gales
46
0
0
26 Apr 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance
Shehzeen Samarah Hussain
Paarth Neekhara
Xuesong Yang
Edresson Casanova
Subhankar Ghosh
Mikyas T. Desta
Roy Fejgin
Rafael Valle
Jason Chun Lok Li
59
2
0
07 Feb 2025
People are poorly equipped to detect AI-powered voice clones
People are poorly equipped to detect AI-powered voice clones
Sarah Barrington
Emily A. Cooper
Hany Farid
56
6
0
28 Jan 2025
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
Thai-Binh Nguyen
Alexander Waibel
74
1
0
27 Nov 2024
The First VoicePrivacy Attacker Challenge Evaluation Plan
The First VoicePrivacy Attacker Challenge Evaluation Plan
N. Tomashenko
Xiaoxiao Miao
Emmanuel Vincent
Junichi Yamagishi
117
2
0
09 Oct 2024
TBDM-Net: Bidirectional Dense Networks with Gender Information for
  Speech Emotion Recognition
TBDM-Net: Bidirectional Dense Networks with Gender Information for Speech Emotion Recognition
Vlad Striletchi
Cosmin Striletchi
Adriana Stan
38
0
0
16 Sep 2024
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant
  Automatic Speech Recognition and Diarization
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization
Samuele Cornell
Taejin Park
Steve Huang
Christoph Boeddeker
Xuankai Chang
Matthew Maciejewski
Matthew Wiesner
Paola García
Shinji Watanabe
29
9
0
23 Jul 2024
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors
J. Hauret
Malo Olivier
Thomas Joubaud
C. Langrenne
Sarah Poirée
V. Zimpfer
Éric Bavu
75
1
0
16 Jul 2024
The CHiME-7 Challenge: System Description and Performance of NeMo Team's
  DASR System
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System
T. Park
He Huang
Ante Jukić
Kunal Dhawan
Krishna C. Puvvada
Nithin Rao Koluguri
Nikolay Karpov
A. Laptev
Jagadeesh Balam
Boris Ginsburg
19
6
0
18 Oct 2023
Discrete Audio Representation as an Alternative to Mel-Spectrograms for
  Speaker and Speech Recognition
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition
Krishna C. Puvvada
Nithin Rao Koluguri
Kunal Dhawan
Jagadeesh Balam
Boris Ginsburg
24
12
0
19 Sep 2023
Voice Morphing: Two Identities in One Voice
Voice Morphing: Two Identities in One Voice
Sushant Pani
Anurag Chowdhury
Morgan Sandler
Arun Ross
11
1
0
05 Sep 2023
Conformer-based Target-Speaker Automatic Speech Recognition for
  Single-Channel Audio
Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio
Yang Zhang
Krishna C. Puvvada
Vitaly Lavrukhin
Boris Ginsburg
24
14
0
09 Aug 2023
An Experimental Review of Speaker Diarization methods with application
  to Two-Speaker Conversational Telephone Speech recordings
An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings
L. Serafini
Samuele Cornell
Giovanni Morrone
Enrico Zovato
A. Brutti
S. Squartini
37
9
0
29 May 2023
Vocal Style Factorization for Effective Speaker Recognition in Affective
  Scenarios
Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios
Morgan Sandler
Arun Ross
CVBM
18
0
0
13 May 2023
Residual Information in Deep Speaker Embedding Architectures
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
32
5
0
06 Feb 2023
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech
  Recognition: the Arman-AV Dataset
A Multi-Purpose Audio-Visual Corpus for Multi-Modal Persian Speech Recognition: the Arman-AV Dataset
J. Peymanfard
Samin Heydarian
Ali Lashini
Hossein Zeinali
Mohammad Reza Mohammadi
N. Mozayani
21
10
0
21 Jan 2023
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New
  Speakers
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers
Cheng-Ping Hsieh
Subhankar Ghosh
Boris Ginsburg
41
18
0
01 Nov 2022
A Compact End-to-End Model with Local and Global Context for Spoken
  Language Identification
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification
Fei Jia
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
10
3
0
27 Oct 2022
Multi-scale Speaker Diarization with Dynamic Scale Weighting
Multi-scale Speaker Diarization with Dynamic Scale Weighting
Tae Jin Park
Nithin Rao Koluguri
Jagadeesh Balam
Boris Ginsburg
11
19
0
30 Mar 2022
Bayesian HMM clustering of x-vector sequences (VBx) in speaker
  diarization: theory, implementation and analysis on standard tasks
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
Federico Landini
Jan Profant
Mireia Díez
L. Burget
208
198
0
29 Dec 2020
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized
  Maximum Eigengap
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap
Tae Jin Park
Kyu Jeong Han
Manoj Kumar
Shrikanth Narayanan
128
116
0
05 Mar 2020
pyannote.audio: neural building blocks for speaker diarization
pyannote.audio: neural building blocks for speaker diarization
H. Bredin
Ruiqing Yin
Juan Manuel Coria
G. Gelly
Pavel Korshunov
Marvin Lavechin
D. Fustes
Hadrien Titeux
Wassim Bouaziz
Marie-Philippe Gill
188
312
0
04 Nov 2019
NeMo: a toolkit for building AI applications using Neural Modules
NeMo: a toolkit for building AI applications using Neural Modules
Oleksii Kuchaiev
Jason Chun Lok Li
Huyen Nguyen
Oleksii Hrinchuk
Ryan Leary
...
Jack Cook
P. Castonguay
Mariya Popova
Jocelyn Huang
Jonathan M. Cohen
194
291
0
14 Sep 2019
VoxCeleb2: Deep Speaker Recognition
VoxCeleb2: Deep Speaker Recognition
Joon Son Chung
Arsha Nagrani
Andrew Zisserman
216
2,233
0
14 Jun 2018
1