Speaker Recognition from Raw Waveform with SincNet

29 July 2018

Mirco Ravanelli

Papers citing "Speaker Recognition from Raw Waveform with SincNet"

50 / 259 papers shown

Title
Representation based meta-learning for few-shot spoken intent recognition | Ashish R. Mittal, Samarth Bharadwaj, Shreya Khare, Saneem A. Chemmengath, Karthik Sankaranarayanan, Brian Kingsbury | - | 11 11 0 | 29 Jun 2021
SoundDet: Polyphonic Moving Sound Event Detection and Localization from Raw Waveform | Yuhang He, A. Trigoni, Andrew Markham | - | 19 19 0 | 13 Jun 2021
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild | Okan Kopuklu, Maja Taseska, Gerhard Rigoll | 3DV | 11 45 0 | 07 Jun 2021
PF-Net: Personalized Filter for Speaker Recognition from Raw Waveform | Wencheng Li, Zhenhua Tan, Jingyu Ning, Zhenche Xia, Danke Wu | - | 8 1 0 | 31 May 2021
EEG-based Cross-Subject Driver Drowsiness Recognition with an Interpretable Convolutional Neural Network | Jian Cui, Zirui Lan, O. Sourina, W. Müller-Wittig | - | 11 101 0 | 30 May 2021
A Modulation Front-End for Music Audio Tagging | Cyrus Vahidi, C. Saitis, Gyorgy Fazekas | - | 6 2 0 | 25 May 2021
BeamLearning: an end-to-end Deep Learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data | Hadrien Pujol, Éric Bavu, Alexandre Garcia | - | 35 22 0 | 27 Apr 2021
Voice2Mesh: Cross-Modal 3D Face Model Generation from Voices | Cho-Ying Wu, Ke Xu, Chin-Cheng Hsu, Ulrich Neumann | CVBM 3DH | 27 4 0 | 21 Apr 2021
End-to-end Keyword Spotting using Neural Architecture Search and Quantization | David Peter, Wolfgang Roth, Franz Pernkopf | MQ | 12 14 0 | 14 Apr 2021
Learning Metrics from Mean Teacher: A Supervised Learning Method for Improving the Generalization of Speaker Verification System | Ju-ho Kim, Hye-jin Shim, Jee-weon Jung, Ha-Jin Yu | - | 15 1 0 | 14 Apr 2021
On Architectures and Training for Raw Waveform Feature Extraction in ASR | Peter Vieting, Christoph Luscher, Wilfried Michel, Ralf Schluter, Hermann Ney | - | 17 9 0 | 09 Apr 2021
End-to-end speaker segmentation for overlap-aware resegmentation | H. Bredin, Antoine Laurent | VLM | 207 161 0 | 08 Apr 2021
Graph Attention Networks for Anti-Spoofing | Hemlata Tak, Jee-weon Jung, J. Patino, Massimiliano Todisco, Nicholas W. D. Evans | - | 31 65 0 | 08 Apr 2021
Partially-Connected Differentiable Architecture Search for Deepfake and Spoofing Detection | W. Ge, Michele Panariello, J. Patino, Massimiliano Todisco, Nicholas W. D. Evans | 3DPC | 21 30 0 | 07 Apr 2021
Learning spectro-temporal representations of complex sounds with parameterized neural networks | Rachid Riad, Julien Karadayi, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux | - | 10 7 0 | 12 Mar 2021
Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect | Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu | - | 4 6 0 | 02 Mar 2021
Contrastive Separative Coding for Self-supervised Representation Learning | Jun Wang, Max W. Y. Lam, Dan Su, Dong Yu | SSL | 9 3 0 | 01 Mar 2021
Learnable MFCCs for Speaker Verification | Xuechen Liu, Md. Sahidullah, Tomi Kinnunen | - | 20 17 0 | 20 Feb 2021
U-vectors: Generating clusterable speaker embedding from unlabeled data | M. Mridha, Abu Quwsar Ohi, M. Monowar, Md. Abdul Hamid, Md. Rashedul Islam, Yutaka Watanobe | SSL | 14 6 0 | 07 Feb 2021
Time-Domain Speech Extraction with Spatial Information and Multi Speaker Conditioning Mechanism | Jisi Zhang, Catalin Zorila, R. Doddipatla, Jon Barker | - | 8 13 0 | 07 Feb 2021
Multi-Task Self-Supervised Pre-Training for Music Classification | Ho-Hsiang Wu, Chieh-Chi Kao, Qingming Tang, Ming Sun, Brian McFee, J. P. Bello, Chao Wang | SSL | 21 36 0 | 05 Feb 2021
The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap | Shota Horiguchi, Nelson Yalta, Leibny Paola García-Perera, Yuki Takashima, Yawen Xue, Desh Raj, Zili Huang, Yusuke Fujita, Shinji Watanabe, Sanjeev Khudanpur | BDL | 11 36 0 | 02 Feb 2021
Curriculum Learning: A Survey | Petru Soviany, Radu Tudor Ionescu, Paolo Rota, N. Sebe | ODL | 63 337 0 | 25 Jan 2021
LEAF: A Learnable Frontend for Audio Classification | Neil Zeghidour, O. Teboul, Félix de Chaumont Quitry, Marco Tagliasacchi | VLM AAML | 74 140 0 | 21 Jan 2021
MAAS: Multi-modal Assignation for Active Speaker Detection | Juan Carlos León Alcázar, Fabian Caba Heilbron, Ali K. Thabet, Bernard Ghanem | - | 57 51 0 | 11 Jan 2021
Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps | Tri Dao, N. Sohoni, Albert Gu, Matthew Eichhorn, Amit Blonder, Megan Leszczynski, Atri Rudra, Christopher Ré | - | 17 46 0 | 29 Dec 2020
A Study of Few-Shot Audio Classification | Piper Wolters, Chris Careaga, Brian Hutchinson, Lauren A. Phillips | - | 6 10 0 | 02 Dec 2020
A comparison of handcrafted, parameterized, and learnable features for speech separation | Wenbo Zhu, Mou Wang, Xiao-Lei Zhang, S. Rahardja | - | 12 4 0 | 29 Nov 2020
Speech Command Recognition in Computationally Constrained Environments with a Quadratic Self-organized Operational Layer | M. Soltanian, Junaid Malik, Jenni Raitoharju, Alexandros Iosifidis, S. Kiranyaz | - | 6 11 0 | 23 Nov 2020
Deep Learning in EEG: Advance of the Last Ten-Year Critical Period | Shu Gong, Kaibo Xing, A. Cichocki, Junhua Li | VLM | 17 64 0 | 22 Nov 2020
Recognizing More Emotions with Less Data Using Self-supervised Transfer Learning | Jonathan Boigne, Biman Liyanage, Ted Östrem | - | 14 20 0 | 11 Nov 2020
A Comparison Study on Infant-Parent Voice Diarization | Junzhe Zhu, M. Hasegawa-Johnson, Nancy L. McElwain | - | 12 1 0 | 05 Nov 2020
Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews | Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, A. Sliwinski, J. Bagnou, Xuan-Nga Cao, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux | - | 8 0 0 | 30 Oct 2020
The ins and outs of speaker recognition: lessons from VoxSRC 2020 | Yoohwan Kwon, Hee-Soo Heo, Bong-Jin Lee, Joon Son Chung | - | 13 59 0 | 29 Oct 2020
Y-Vector: Multiscale Waveform Encoder for Speaker Embedding | Ge Zhu, Fei Jiang, Z. Duan | - | 6 25 0 | 24 Oct 2020
ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding | Minjeong Kim, Gyuwan Kim, Sang-Woo Lee, Jung-Woo Ha | VLM | 16 34 0 | 23 Oct 2020
Perceptual Loss based Speech Denoising with an ensemble of Audio Pattern Recognition and Self-Supervised Models | Saurabh Kataria, Jesús Villalba, Najim Dehak | VLM SSL | 6 34 0 | 22 Oct 2020
Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers | Zeqian Li, Jacob Whitehill | - | 12 10 0 | 22 Oct 2020
Graph Attention Networks for Speaker Verification | Jee-weon Jung, Hee-Soo Heo, Ha-Jin Yu, Joon Son Chung | - | 10 26 0 | 22 Oct 2020
Dataset artefacts in anti-spoofing systems: a case study on the ASVspoof 2017 benchmark | Bhusan Chettri, Emmanouil Benetos, Bob L. T. Sturm | - | 16 27 0 | 15 Oct 2020
Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions | Ludwig Kurzinger, Nicolas Lindae, Palle Klewitz, Gerhard Rigoll | - | 11 5 0 | 15 Oct 2020
A Lightweight Speaker Recognition System Using Timbre Properties | Abu Quwsar Ohi, M. Mridha, Md. Abdul Hamid, M. Monowar, Dongsu Lee, Jinsul Kim | - | 11 2 0 | 12 Oct 2020
Attention Driven Fusion for Multi-Modal Emotion Recognition | Darshana Priyasad, Tharindu Fernando, Simon Denman, Clinton Fookes, S. Sridharan | - | 6 67 0 | 23 Sep 2020
TRIER: Template-Guided Neural Networks for Robust and Interpretable Sleep Stage Identification from EEG Recordings | Taeheon Lee, Jeonghwan Hwang, Honggu Lee | - | 12 7 0 | 10 Sep 2020
DeepVOX: Discovering Features from Raw Audio for Speaker Recognition in Non-ideal Audio Signals | Anurag Chowdhury, Arun Ross | - | 8 2 0 | 26 Aug 2020
FEARLESS STEPS Challenge (FS-2): Supervised Learning with Massive Naturalistic Apollo Data | Aditya Sunil Joglekar, John H. L. Hansen, M. C. Shekhar, A. Sangwan | - | 9 24 0 | 15 Aug 2020
End-to-End Neural Transformer Based Spoken Language Understanding | Martin H. Radfar, Athanasios Mouchtaris, Siegfried Kunzmann | - | 23 61 0 | 12 Aug 2020
A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings | Xuechen Liu, Md. Sahidullah, Tomi Kinnunen | - | 17 9 0 | 30 Jul 2020
Double Multi-Head Attention for Speaker Verification | Miquel India, Pooyan Safari, Javier Hernando | - | 17 18 0 | 26 Jul 2020
End-to-end spoofing detection with raw waveform CLDNNs | Heinrich Dinkel, Nanxin Chen, Y. Qian, Kai Yu | - | 32 78 0 | 26 Jul 2020