Speaker Recognition from Raw Waveform with SincNet


29 July 2018
Mirco Ravanelli
Yoshua Bengio

Papers citing "Speaker Recognition from Raw Waveform with SincNet"

50 / 259 papers shown
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks
Zeyang Song
Jibin Wu
Malu Zhang
Mike Zheng Shou
Haizhou Li
35
4
0
18 Sep 2023
Instabilities in Convnets for Raw Audio
Daniel Haider
Vincent Lostanlen
Martin Ehler
Péter Balázs
11
2
0
11 Sep 2023
Audio Deepfake Detection: A Survey
Jiangyan Yi
Chenglong Wang
J. Tao
Xiaohui Zhang
Chu Yuan Zhang
Yan Zhao
24
41
0
29 Aug 2023
StofNet: Super-resolution Time of Flight Network
Christopher Hahne
Michael Hayoz
Raphael Sznitman
11
0
0
23 Aug 2023
Complex-valued neural networks for voice anti-spoofing
Nicolas M. Müller
Philip Sperl
Konstantin Böttinger
17
14
0
22 Aug 2023
Neural Architectures Learning Fourier Transforms, Signal Processing and Much More....
Prateek Verma
14
0
0
20 Aug 2023
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
Ryandhimas E. Zezario
B. Bai
C. Fuh
Hsin-Min Wang
Yu Tsao
11
3
0
18 Aug 2023
Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers
Lukas Rauch
Raphael Schwinger
Moritz Wirth
Bernhard Sick
Sven Tomforde
Christoph Scholz
22
4
0
14 Aug 2023
Comparative Analysis of the wav2vec 2.0 Feature Extractor
Peter Vieting
Ralf Schlüter
Hermann Ney
15
2
0
08 Aug 2023
The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features
Liao Qu
X. Zou
Xiang Li
Yandong Wen
Rita Singh
Bhiksha Raj
CVBM
6
6
0
26 Jul 2023
Rethinking Voice-Face Correlation: A Geometry View
Xiang Li
Yandong Wen
Muqiao Yang
Jinglu Wang
Rita Singh
Bhiksha Raj
CVBM
3DH
9
6
0
26 Jul 2023
Fitting Auditory Filterbanks with Multiresolution Neural Networks
Vincent Lostanlen
Daniel Haider
Han Han
Mathieu Lagrange
Péter Balázs
Martin Ehler
8
3
0
25 Jul 2023
Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion
Zhe Ye
Terui Mao
Li Dong
Diqun Yan
AAML
6
7
0
28 Jun 2023
Exploring Isolated Musical Notes as Pre-training Data for Predominant Instrument Recognition in Polyphonic Music
Lifan Zhong
Erica Cooper
Junichi Yamagishi
N. Minematsu
14
1
0
15 Jun 2023
HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders
Doyeon Kim
Soo-Whan Chung
Hyewon Han
Youna Ji
Hong-Goo Kang
16
7
0
02 Jun 2023
Domain knowledge-informed Synthetic fault sample generation with Health Data Map for cross-domain Planetary Gearbox Fault Diagnosis
Jong Moon Ha
Olga Fink
12
13
0
31 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
14
2
0
24 May 2023
Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios
Morgan Sandler
Arun Ross
CVBM
8
0
0
13 May 2023
HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition
Soumya Dutta
Sriram Ganapathy
14
15
0
14 Apr 2023
Time-frequency Network for Robust Speaker Recognition
Jiguo Li
Tianzi Zhang
Xiaobin Liu
Lirong Zheng
11
0
0
05 Mar 2023
Defending against Adversarial Audio via Diffusion Model
Shutong Wu
Jiong Wang
Wei Ping
Weili Nie
Chaowei Xiao
DiffM
19
24
0
02 Mar 2023
Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health
Apiwat Ditthapron
E. Agu
A. Lammert
16
0
0
08 Feb 2023
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
25
5
0
06 Feb 2023
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Jielin Qiu
William Jongwon Han
Jiacheng Zhu
Mengdi Xu
Michael A. Rosenberg
Emerson Liu
Douglas Weber
Ding Zhao
10
21
0
21 Jan 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
19
14
0
19 Jan 2023
Introducing Model Inversion Attacks on Automatic Speaker Recognition
Karla Pizzi
Franziska Boenisch
U. Sahin
Konstantin Böttinger
13
3
0
09 Jan 2023
Source Tracing: Detecting Voice Spoofing
Tinglong Zhu
Xingming Wang
Xiaoyi Qin
Ming Li
20
10
0
16 Dec 2022
Learnable Front Ends Based on Temporal Modulation for Music Tagging
Yi Ma
R. Stern
14
0
0
28 Nov 2022
Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting
Iván López-Espejo
R. Shekar
Z. Tan
Jesper Jensen
John H. L. Hansen
13
2
0
19 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
19
13
0
17 Nov 2022
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning
John H. L. Hansen
Zhenyu Wang
27
15
0
17 Nov 2022
Disentangled representation learning for multilingual speaker recognition
Kihyun Nam
You-kyong. Kim
Jaesung Huh
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
40
6
0
01 Nov 2022
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation
Marvin Lavechin
Marianne Métais
Hadrien Titeux
Alodie Boissonnet
Jade Copet
M. Rivière
Elika Bergelson
Alejandrina Cristià
Emmanuel Dupoux
H. Bredin
26
24
0
24 Oct 2022
Discriminatory and orthogonal feature learning for noise robust keyword spotting
Donghyeon Kim
Kyungdeuk Ko
D. Han
Hanseok Ko
14
3
0
20 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification
Haohe Liu
Xubo Liu
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
32
7
0
04 Oct 2022
Simple Pooling Front-ends For Efficient Audio Classification
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Mark D. Plumbley
Wenwu Wang
35
16
0
03 Oct 2022
Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting
Jie Wang
Yuji Liu
Binling Wang
Yiming Zhi
Song Li
Shipeng Xia
Jiayang Zhang
Feng Tong
Lin Li
Q. Hong
15
6
0
24 Sep 2022
Joint Speech Activity and Overlap Detection with Multi-Exit Architecture
Ziqing Du
Kai Liu
Xucheng Wan
Huan Zhou
11
0
0
24 Sep 2022
Dynamic Time-Alignment of Dimensional Annotations of Emotion using Recurrent Neural Networks
Sina Alisamir
F. Ringeval
François Portet
19
0
0
21 Sep 2022
Overlapped speech and gender detection with WavLM pre-trained features
Martin Lebourdais
Marie Tahon
Antoine Laurent
S. Meignier
27
17
0
09 Sep 2022
Low-Level Physiological Implications of End-to-End Learning of Speech Recognition
Louise Coppieters de Gibson
Philip N. Garner
13
1
0
22 Aug 2022
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
Junghun Kim
Yoojin An
Jihie Kim
14
13
0
21 Aug 2022
Extending GCC-PHAT using Shift Equivariant Neural Networks
Axel Berg
Mark O'Connor
Kalle Åström
Magnus Oskarsson
11
10
0
09 Aug 2022
Decision SincNet: Neurocognitive models of decision making that predict cognitive processes from neural signals
Qi Sun
Khuong Vo
K. Lui
Michael D. Nunez
J. Vandekerckhove
R. Srinivasan
13
0
0
04 Aug 2022
GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
Jiacheng Zhu
Jielin Qiu
Zhuolin Yang
Douglas Weber
M. Rosenberg
Emerson Liu
Bo-wen Li
Ding Zhao
OOD
15
13
0
02 Aug 2022
Generative Extraction of Audio Classifiers for Speaker Identification
Tejumade Afonja
Lucas Bourtoule
Varun Chandrasekaran
Sageev Oore
Nicolas Papernot
AAML
11
1
0
26 Jul 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Kyle Min
Sourya Roy
Subarna Tripathi
T. Guha
Somdeb Majumdar
19
36
0
15 Jul 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use
Jan Schlüter
Gerald Gutenbrunner
VLM
23
12
0
12 Jul 2022
Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach
Rohit Arora
11
0
0
27 Jun 2022
Robust Time Series Denoising with Learnable Wavelet Packet Transform
Gaetan Frusque
Olga Fink
OOD
AI4TS
27
23
0
13 Jun 2022