arXiv:1808.00158
Speaker Recognition from Raw Waveform with SincNet
29 July 2018
Mirco Ravanelli
Yoshua Bengio
Papers citing
"Speaker Recognition from Raw Waveform with SincNet"
50 / 259 papers shown
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks
Zeyang Song
Jibin Wu
Malu Zhang
Mike Zheng Shou
Haizhou Li
35
4
0
18 Sep 2023
Instabilities in Convnets for Raw Audio
Daniel Haider
Vincent Lostanlen
Martin Ehler
Péter Balázs
11
2
0
11 Sep 2023
Audio Deepfake Detection: A Survey
Jiangyan Yi
Chenglong Wang
J. Tao
Xiaohui Zhang
Chu Yuan Zhang
Yan Zhao
24
41
0
29 Aug 2023
StofNet: Super-resolution Time of Flight Network
Christopher Hahne
Michael Hayoz
Raphael Sznitman
11
0
0
23 Aug 2023
Complex-valued neural networks for voice anti-spoofing
Nicolas M. Müller
Philip Sperl
Konstantin Böttinger
17
14
0
22 Aug 2023
Neural Architectures Learning Fourier Transforms, Signal Processing and Much More....
Prateek Verma
14
0
0
20 Aug 2023
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
Ryandhimas E. Zezario
B. Bai
C. Fuh
Hsin-Min Wang
Yu Tsao
11
3
0
18 Aug 2023
Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers
Lukas Rauch
Raphael Schwinger
Moritz Wirth
Bernhard Sick
Sven Tomforde
Christoph Scholz
22
4
0
14 Aug 2023
Comparative Analysis of the wav2vec 2.0 Feature Extractor
Peter Vieting
Ralf Schlüter
Hermann Ney
15
2
0
08 Aug 2023
The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features
Liao Qu
X. Zou
Xiang Li
Yandong Wen
Rita Singh
Bhiksha Raj
CVBM
6
6
0
26 Jul 2023
Rethinking Voice-Face Correlation: A Geometry View
Xiang Li
Yandong Wen
Muqiao Yang
Jinglu Wang
Rita Singh
Bhiksha Raj
CVBM
3DH
9
6
0
26 Jul 2023
Fitting Auditory Filterbanks with Multiresolution Neural Networks
Vincent Lostanlen
Daniel Haider
Han Han
Mathieu Lagrange
Péter Balázs
Martin Ehler
8
3
0
25 Jul 2023
Fake the Real: Backdoor Attack on Deep Speech Classification via Voice Conversion
Zhe Ye
Terui Mao
Li Dong
Diqun Yan
AAML
6
7
0
28 Jun 2023
Exploring Isolated Musical Notes as Pre-training Data for Predominant Instrument Recognition in Polyphonic Music
Lifan Zhong
Erica Cooper
Junichi Yamagishi
N. Minematsu
14
1
0
15 Jun 2023
HD-DEMUCS: General Speech Restoration with Heterogeneous Decoders
Doyeon Kim
Soo-Whan Chung
Hyewon Han
Youna Ji
Hong-Goo Kang
16
7
0
02 Jun 2023
Domain knowledge-informed Synthetic fault sample generation with Health Data Map for cross-domain Planetary Gearbox Fault Diagnosis
Jong Moon Ha
Olga Fink
12
13
0
31 May 2023
Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model
Aoi Ito
Shota Horiguchi
SSL
14
2
0
24 May 2023
Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios
Morgan Sandler
Arun Ross
CVBM
8
0
0
13 May 2023
HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition
Soumya Dutta
Sriram Ganapathy
14
15
0
14 Apr 2023
Time-frequency Network for Robust Speaker Recognition
Jiguo Li
Tianzi Zhang
Xiaobin Liu
Lirong Zheng
11
0
0
05 Mar 2023
Defending against Adversarial Audio via Diffusion Model
Shutong Wu
Jiong Wang
Wei Ping
Weili Nie
Chaowei Xiao
DiffM
19
24
0
02 Mar 2023
Masking Kernel for Learning Energy-Efficient Representations for Speaker Recognition and Mobile Health
Apiwat Ditthapron
E. Agu
A. Lammert
16
0
0
08 Feb 2023
Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
25
5
0
06 Feb 2023
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Jielin Qiu
William Jongwon Han
Jiacheng Zhu
Mengdi Xu
Michael A. Rosenberg
Emerson Liu
Douglas Weber
Ding Zhao
10
21
0
21 Jan 2023
LoCoNet: Long-Short Context Network for Active Speaker Detection
Xizi Wang
Feng Cheng
Gedas Bertasius
David J. Crandall
19
14
0
19 Jan 2023
Introducing Model Inversion Attacks on Automatic Speaker Recognition
Karla Pizzi
Franziska Boenisch
U. Sahin
Konstantin Böttinger
13
3
0
09 Jan 2023
Source Tracing: Detecting Voice Spoofing
Tinglong Zhu
Xingming Wang
Xiaoyi Qin
Ming Li
20
10
0
16 Dec 2022
Learnable Front Ends Based on Temporal Modulation for Music Tagging
Yi Ma
R. Stern
14
0
0
28 Nov 2022
Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting
Iván López-Espejo
R. Shekar
Z. Tan
Jesper Jensen
John H. L. Hansen
13
2
0
19 Nov 2022
MelHuBERT: A simplified HuBERT on Mel spectrograms
Tzu-Quan Lin
Hung-yi Lee
Hao Tang
SSL
19
13
0
17 Nov 2022
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning
John H. L. Hansen
Zhenyu Wang
27
15
0
17 Nov 2022
Disentangled representation learning for multilingual speaker recognition
Kihyun Nam
You-kyong. Kim
Jaesung Huh
Hee-Soo Heo
Jee-weon Jung
Joon Son Chung
40
6
0
01 Nov 2022
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation
Marvin Lavechin
Marianne Métais
Hadrien Titeux
Alodie Boissonnet
Jade Copet
M. Rivière
Elika Bergelson
Alejandrina Cristià
Emmanuel Dupoux
H. Bredin
26
24
0
24 Oct 2022
Discriminatory and orthogonal feature learning for noise robust keyword spotting
Donghyeon Kim
Kyungdeuk Ko
D. Han
Hanseok Ko
14
3
0
20 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification
Haohe Liu
Xubo Liu
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
32
7
0
04 Oct 2022
Simple Pooling Front-ends For Efficient Audio Classification
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Mark D. Plumbley
Wenwu Wang
35
16
0
03 Oct 2022
Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting
Jie Wang
Yuji Liu
Binling Wang
Yiming Zhi
Song Li
Shipeng Xia
Jiayang Zhang
Feng Tong
Lin Li
Q. Hong
15
6
0
24 Sep 2022
Joint Speech Activity and Overlap Detection with Multi-Exit Architecture
Ziqing Du
Kai Liu
Xucheng Wan
Huan Zhou
11
0
0
24 Sep 2022
Dynamic Time-Alignment of Dimensional Annotations of Emotion using Recurrent Neural Networks
Sina Alisamir
F. Ringeval
François Portet
19
0
0
21 Sep 2022
Overlapped speech and gender detection with WavLM pre-trained features
Martin Lebourdais
Marie Tahon
Antoine Laurent
S. Meignier
27
17
0
09 Sep 2022
Low-Level Physiological Implications of End-to-End Learning of Speech Recognition
Louise Coppieters de Gibson
Philip N. Garner
13
1
0
22 Aug 2022
Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms
Junghun Kim
Yoojin An
Jihie Kim
14
13
0
21 Aug 2022
Extending GCC-PHAT using Shift Equivariant Neural Networks
Axel Berg
Mark O'Connor
Kalle Åström
Magnus Oskarsson
11
10
0
09 Aug 2022
Decision SincNet: Neurocognitive models of decision making that predict cognitive processes from neural signals
Qi Sun
Khuong Vo
K. Lui
Michael D. Nunez
J. Vandekerckhove
R. Srinivasan
13
0
0
04 Aug 2022
GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction
Jiacheng Zhu
Jielin Qiu
Zhuolin Yang
Douglas Weber
M. Rosenberg
Emerson Liu
Bo-wen Li
Ding Zhao
OOD
15
13
0
02 Aug 2022
Generative Extraction of Audio Classifiers for Speaker Identification
Tejumade Afonja
Lucas Bourtoule
Varun Chandrasekaran
Sageev Oore
Nicolas Papernot
AAML
11
1
0
26 Jul 2022
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection
Kyle Min
Sourya Roy
Subarna Tripathi
T. Guha
Somdeb Majumdar
19
36
0
15 Jul 2022
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use
Jan Schlüter
Gerald Gutenbrunner
VLM
23
12
0
12 Jul 2022
Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach
Rohit Arora
11
0
0
27 Jun 2022
Robust Time Series Denoising with Learnable Wavelet Packet Transform
Gaetan Frusque
Olga Fink
OOD
AI4TS
27
23
0
13 Jun 2022