1808.00158
Speaker Recognition from Raw Waveform with SincNet
29 July 2018
Mirco Ravanelli
Yoshua Bengio
Papers citing
"Speaker Recognition from Raw Waveform with SincNet"
50 / 259 papers shown
Title
ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration
Daniel Haider
Felix Perfler
Péter Balázs
Clara Hollomey
Nicki Holighaus
28
0
0
12 May 2025
Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception
Yuankun Xie
Ruibo Fu
Z. Wang
Xiaopeng Wang
Songjun Cao
Long Ma
Haonan Cheng
Long Ye
23
0
0
09 Apr 2025
A Practical Synthesis of Detecting AI-Generated Textual, Visual, and Audio Content
Lele Cao
DeLMO
42
0
0
02 Apr 2025
SincPD: An Explainable Method based on Sinc Filters to Diagnose Parkinson's Disease Severity by Gait Cycle Analysis
Armin Salimi-Badr
Mahan Veisi
Sadra Berangi
35
0
0
10 Feb 2025
Adaptive Central Frequencies Locally Competitive Algorithm for Speech
Soufiyan Bahadi
É. Plourde
Jean Rouat
53
0
0
10 Feb 2025
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond
Mardhiyah Sanni
Tassallah Abdullahi
Devendra D. Kayande
Emmanuel Ayodele
Naome A. Etori
...
Chibuzor Okocha
L. Ismaila
Folafunmi Omofoye
Boluwatife A. Adewale
Tobi Olatunji
85
1
0
06 Feb 2025
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
Yassine El Kheir
Youness Samih
Suraj Maharjan
Tim Polzehl
Sebastian Möller
67
1
0
05 Feb 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Jakob Poncelet
Hugo Van hamme
67
0
0
05 Feb 2025
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
Tian-Hao Zhang
Jiawei Zhang
J. Wang
Xinyuan Qian
Xu-cheng Yin
CVBM
45
0
0
02 Jan 2025
Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
Kazi Nazmul Haque
R. Rana
Tasnim Jarin
Bjorn W. Schuller Jr
60
0
0
30 Nov 2024
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
Zirui Zhang
Wei Hao
Aroon Sankoh
William Lin
Emanuel Mendiola-Ortiz
Junfeng Yang
Chengzhi Mao
AAML
26
2
0
31 Oct 2024
Reverb: Open-Source ASR and Diarization from Rev
Nishchal Bhandari
Danny Chen
Miguel Ángel del Río Fernández
Natalie Delworth
Jennifer Drexler Fox
...
Ondrej Novotný
Jan Profant
Nan Qin
Martin Ratajczak
Jean-Philippe Robichaud
VLM
31
1
0
04 Oct 2024
Freeze and Learn: Continual Learning with Selective Freezing for Speech Deepfake Detection
Davide Salvi
Viola Negroni
Luca Bondi
Paolo Bestagini
Stefano Tubaro
29
1
0
26 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
46
3
0
23 Sep 2024
RoboVox Far Field Speaker Recognition: A Novel Data Augmentation Approach with Pretrained Models
Muhammad Sudipto Siam Dip
Md Anik Hasan
Sapnil Sarker Bipro
Md Abdur Raiyan
M. A. Motin
29
0
0
16 Sep 2024
Leveraging Self-Supervised Learning for Speaker Diarization
Jiangyu Han
Federico Landini
Johan Rohdin
Anna Silnova
Mireia Díez
Lukas Burget
33
1
0
14 Sep 2024
Biomimetic Frontend for Differentiable Audio Processing
Ruolan Leslie Famularo
D. Zotkin
S. Shamma
R. Duraiswami
AI4TS
31
0
0
13 Sep 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Sung Won Han
SLR
43
0
0
12 Sep 2024
AASIST3: KAN-Enhanced AASIST Speech Deepfake Detection using SSL Features and Additional Regularization for the ASVspoof 2024 Challenge
Kirill Borodin
Vasiliy Kudryavtsev
Dmitrii Korzh
Alexey Efimenko
Grach Mkrtchian
Mikhail Gorodnichev
Oleg Y. Rogov
41
1
0
30 Aug 2024
Learning Multi-Target TDOA Features for Sound Event Localization and Detection
Axel Berg
Johanna Engman
Jens Gulin
Karl Åström
Magnus Oskarsson
27
1
0
30 Aug 2024
EmoAttack: Utilizing Emotional Voice Conversion for Speech Backdoor Attacks on Deep Speech Classification Models
Wenhan Yao
Zedong Xing
Xiarun Chen
Jia Liu
Yongqiang He
Weiping Wen
AAML
36
0
0
28 Aug 2024
Sample-Independent Federated Learning Backdoor Attack in Speaker Recognition
Weida Xu
Yang Xu
Sicong Zhang
FedML
AAML
36
0
0
25 Aug 2024
Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
Zhenyu Wang
John H. L. Hansen
AAML
30
1
0
23 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
26
4
0
21 Jul 2024
Towards Enhanced Classification of Abnormal Lung sound in Multi-breath: A Light Weight Multi-label and Multi-head Attention Classification Method
Yi-Wei Chua
Yun-Chien Cheng
24
0
0
15 Jul 2024
SincVAE: a New Approach to Improve Anomaly Detection on EEG Data Using SincNet and Variational Autoencoder
A. Pollastro
Francesco Isgrò
R. Prevete
31
2
0
25 Jun 2024
Modulated Differentiable STFT and Balanced Spectrum Metric for Freight Train Wheelset Bearing Cross-machine Transfer Fault Diagnosis under Speed Fluctuations
Chao He
Hongmei Shi
Ruixin Li
Jianbo Li
Zujun Yu
30
32
0
17 Jun 2024
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms
Seung-bin Kim
Chan-yeong Lim
Jungwoo Heo
Ju-ho Kim
Hyun-Seo Shin
Kyo-Won Koo
Ha-Jin Yu
31
0
0
11 Jun 2024
Towards Signal Processing In Large Language Models
Prateek Verma
Mert Pilanci
31
3
0
10 Jun 2024
RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection
Yujie Chen
Jiangyan Yi
Jun Xue
Chenglong Wang
Xiaohui Zhang
Shunbo Dong
Siding Zeng
Jianhua Tao
Lv Zhao
Cunhang Fan
Mamba
38
15
0
10 Jun 2024
Non-autoregressive real-time Accent Conversion model with voice cloning
Vladimir Nechaev
Sergey Kosyakov
29
1
0
21 May 2024
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions
Hanyu Meng
V. Sethu
E. Ambikairajah
24
2
0
10 Apr 2024
Exploring the Task-agnostic Trait of Self-supervised Learning in the Context of Detecting Mental Disorders
Rohan Kumar Gupta
Rohit Sinha
33
0
0
22 Mar 2024
sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks
Qu Yang
Qianhui Liu
Nan Li
Meng Ge
Zeyang Song
Haizhou Li
32
4
0
09 Mar 2024
A robust audio deepfake detection system via multi-view feature
Yujie Yang
Haochen Qin
Hang Zhou
Chengcheng Wang
Tianyu Guo
Kai Han
Yunhe Wang
38
26
0
04 Mar 2024
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Chao Xu
Yang Liu
Jiazheng Xing
Weida Wang
Mingze Sun
...
Tianxin Huang
Siyuan Li
Zhi-Qi Cheng
Ying Tai
Baigui Sun
CVBM
46
11
0
04 Mar 2024
What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution
Kuan-Hsun Ho
J. Hung
Berlin Chen
26
1
0
04 Mar 2024
Real-time Low-latency Music Source Separation using Hybrid Spectrogram-TasNet
Satvik Venkatesh
Arthur Benilov
Philip Coleman
Frederic Roskam
26
5
0
27 Feb 2024
Experimental Study: Enhancing Voice Spoofing Detection Models with wav2vec 2.0
Taein Kang
Soyul Han
Sunmook Choi
Jaejin Seo
Sanghyeok Chung
Seungeun Lee
Seungsang Oh
Il-Youp Kwak
41
8
0
27 Feb 2024
Multimodal Emotion Recognition from Raw Audio with Sinc-convolution
Xiaohui Zhang
Wenjie Fu
Mangui Liang
37
6
0
19 Feb 2024
Listening Between the Lines: Synthetic Speech Detection Disregarding Verbal Content
Davide Salvi
Temesgen Semu Balcha
Paolo Bestagini
Stefano Tubaro
35
6
0
08 Feb 2024
Explainable Predictive Maintenance: A Survey of Current Methods, Challenges and Opportunities
Logan Cummins
Alexander Sommers
Somayeh Bakhtiari Ramezani
Sudip Mittal
Joseph E. Jabour
Maria Seale
Shahram Rahimi
26
21
0
15 Jan 2024
SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network
Yuhang He
Zhuangzhuang Dai
Long Chen
Niki Trigoni
Andrew Markham
17
0
0
26 Dec 2023
Speech Understanding on Tiny Devices with A Learning Cache
A. Benazir
Zhiming Xu
Felix Xiaozhu Lin
15
0
0
30 Nov 2023
TACNET: Temporal Audio Source Counting Network
Amirreza Ahmadnejad
Ahmad Mahmmodian Darviishani
Mohmmad Mehrdad Asadi
Sajjad Saffariyeh
Pedram Yousef
Emad Fatemizadeh
24
2
0
04 Nov 2023
Powerset multi-class cross entropy loss for neural speaker diarization
Alexis Plaquet
H. Bredin
99
91
0
19 Oct 2023
Blind estimation of audio effects using an auto-encoder approach and differentiable digital signal processing
Come Peladeau
Geoffroy Peeters
24
5
0
18 Oct 2023
A Study on Incorporating Whisper for Robust Speech Assessment
Ryandhimas E. Zezario
Yu-Wen Chen
Szu-Wei Fu
Yu Tsao
H. Wang
C. Fuh
27
10
0
22 Sep 2023
The Impact of Silence on Speech Anti-Spoofing
Yuxiang Zhang
Zhuo Li
Jingze Lu
Hua Hua
Wenchao Wang
Pengyuan Zhang
24
19
0
21 Sep 2023
Spoofing attack augmentation: can differently-trained attack models improve generalisation?
W. Ge
Xin Wang
Junichi Yamagishi
Massimiliano Todisco
Nicholas W. D. Evans
AAML
22
8
0
18 Sep 2023