Speaker Recognition from Raw Waveform with SincNet

29 July 2018
Mirco Ravanelli
Yoshua Bengio
arXiv:1808.00158

Papers citing "Speaker Recognition from Raw Waveform with SincNet"

50 / 259 papers shown
ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration
Daniel Haider
Felix Perfler
Péter Balázs
Clara Hollomey
Nicki Holighaus
21
0
0
12 May 2025
Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception
Yuankun Xie
Ruibo Fu
Z. Wang
Xiaopeng Wang
Songjun Cao
Long Ma
Haonan Cheng
Long Ye
23
0
0
09 Apr 2025
A Practical Synthesis of Detecting AI-Generated Textual, Visual, and Audio Content
Lele Cao
DeLMO
42
0
0
02 Apr 2025
SincPD: An Explainable Method based on Sinc Filters to Diagnose Parkinson's Disease Severity by Gait Cycle Analysis
Armin Salimi-Badr
Mahan Veisi
Sadra Berangi
33
0
0
10 Feb 2025
Adaptive Central Frequencies Locally Competitive Algorithm for Speech
Soufiyan Bahadi
É. Plourde
Jean Rouat
53
0
0
10 Feb 2025
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond
Mardhiyah Sanni
Tassallah Abdullahi
Devendra D. Kayande
Emmanuel Ayodele
Naome A. Etori
...
Chibuzor Okocha
L. Ismaila
Folafunmi Omofoye
Boluwatife A. Adewale
Tobi Olatunji
85
1
0
06 Feb 2025
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
Yassine El Kheir
Youness Samih
Suraj Maharjan
Tim Polzehl
Sebastian Möller
67
1
0
05 Feb 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Jakob Poncelet
Hugo Van hamme
67
0
0
05 Feb 2025
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
Tian-Hao Zhang
Jiawei Zhang
J. Wang
Xinyuan Qian
Xu-cheng Yin
CVBM
45
0
0
02 Jan 2025
Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
Kazi Nazmul Haque
R. Rana
Tasnim Jarin
Bjorn W. Schuller Jr
60
0
0
30 Nov 2024
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
Zirui Zhang
Wei Hao
Aroon Sankoh
William Lin
Emanuel Mendiola-Ortiz
Junfeng Yang
Chengzhi Mao
AAML
26
2
0
31 Oct 2024
Reverb: Open-Source ASR and Diarization from Rev
Nishchal Bhandari
Danny Chen
Miguel Ángel del Río Fernández
Natalie Delworth
Jennifer Drexler Fox
...
Ondrej Novotný
Jan Profant
Nan Qin
Martin Ratajczak
Jean-Philippe Robichaud
VLM
31
1
0
04 Oct 2024
Freeze and Learn: Continual Learning with Selective Freezing for Speech Deepfake Detection
Davide Salvi
Viola Negroni
Luca Bondi
Paolo Bestagini
Stefano Tubaro
29
1
0
26 Sep 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
46
3
0
23 Sep 2024
RoboVox Far Field Speaker Recognition: A Novel Data Augmentation Approach with Pretrained Models
Muhammad Sudipto Siam Dip
Md Anik Hasan
Sapnil Sarker Bipro
Md Abdur Raiyan
M. A. Motin
29
0
0
16 Sep 2024
Leveraging Self-Supervised Learning for Speaker Diarization
Jiangyu Han
Federico Landini
Johan Rohdin
Anna Silnova
Mireia Díez
Lukas Burget
33
1
0
14 Sep 2024
Biomimetic Frontend for Differentiable Audio Processing
Ruolan Leslie Famularo
D. Zotkin
S. Shamma
R. Duraiswami
AI4TS
31
0
0
13 Sep 2024
Universal Pooling Method of Multi-layer Features from Pretrained Models for Speaker Verification
Jin Sob Kim
Hyun Joon Park
Wooseok Shin
Sung Won Han
SLR
43
0
0
12 Sep 2024
AASIST3: KAN-Enhanced AASIST Speech Deepfake Detection using SSL Features and Additional Regularization for the ASVspoof 2024 Challenge
Kirill Borodin
Vasiliy Kudryavtsev
Dmitrii Korzh
Alexey Efimenko
Grach Mkrtchian
Mikhail Gorodnichev
Oleg Y. Rogov
41
1
0
30 Aug 2024
Learning Multi-Target TDOA Features for Sound Event Localization and Detection
Axel Berg
Johanna Engman
Jens Gulin
Karl Åström
Magnus Oskarsson
27
1
0
30 Aug 2024
EmoAttack: Utilizing Emotional Voice Conversion for Speech Backdoor Attacks on Deep Speech Classification Models
Wenhan Yao
Zedong Xing
Xiarun Chen
Jia Liu
Yongqiang He
Weiping Wen
AAML
36
0
0
28 Aug 2024
Sample-Independent Federated Learning Backdoor Attack in Speaker Recognition
Weida Xu
Yang Xu
Sicong Zhang
FedML
AAML
36
0
0
25 Aug 2024
Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
Zhenyu Wang
John H. L. Hansen
AAML
30
1
0
23 Aug 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
Shuai Wang
Zheng-Shou Chen
Kong Aik Lee
Yan-min Qian
Haizhou Li
26
4
0
21 Jul 2024
Towards Enhanced Classification of Abnormal Lung sound in Multi-breath: A Light Weight Multi-label and Multi-head Attention Classification Method
Yi-Wei Chua
Yun-Chien Cheng
24
0
0
15 Jul 2024
SincVAE: a New Approach to Improve Anomaly Detection on EEG Data Using SincNet and Variational Autoencoder
A. Pollastro
Francesco Isgrò
R. Prevete
29
2
0
25 Jun 2024
Modulated Differentiable STFT and Balanced Spectrum Metric for Freight Train Wheelset Bearing Cross-machine Transfer Fault Diagnosis under Speed Fluctuations
Chao He
Hongmei Shi
Ruixin Li
Jianbo Li
Zujun Yu
30
32
0
17 Jun 2024
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms
Seung-bin Kim
Chan-yeong Lim
Jungwoo Heo
Ju-ho Kim
Hyun-Seo Shin
Kyo-Won Koo
Ha-Jin Yu
31
0
0
11 Jun 2024
Towards Signal Processing In Large Language Models
Prateek Verma
Mert Pilanci
31
3
0
10 Jun 2024
RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection
Yujie Chen
Jiangyan Yi
Jun Xue
Chenglong Wang
Xiaohui Zhang
Shunbo Dong
Siding Zeng
Jianhua Tao
Lv Zhao
Cunhang Fan
Mamba
38
15
0
10 Jun 2024
Non-autoregressive real-time Accent Conversion model with voice cloning
Vladimir Nechaev
Sergey Kosyakov
27
1
0
21 May 2024
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions
Hanyu Meng
V. Sethu
E. Ambikairajah
24
2
0
10 Apr 2024
Exploring the Task-agnostic Trait of Self-supervised Learning in the Context of Detecting Mental Disorders
Rohan Kumar Gupta
Rohit Sinha
33
0
0
22 Mar 2024
sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks
Qu Yang
Qianhui Liu
Nan Li
Meng Ge
Zeyang Song
Haizhou Li
32
4
0
09 Mar 2024
A robust audio deepfake detection system via multi-view feature
Yujie Yang
Haochen Qin
Hang Zhou
Chengcheng Wang
Tianyu Guo
Kai Han
Yunhe Wang
38
24
0
04 Mar 2024
FaceChain-ImagineID: Freely Crafting High-Fidelity Diverse Talking Faces from Disentangled Audio
Chao Xu
Yang Liu
Jiazheng Xing
Weida Wang
Mingze Sun
...
Tianxin Huang
Siyuan Li
Zhi-Qi Cheng
Ying Tai
Baigui Sun
CVBM
43
11
0
04 Mar 2024
What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution
Kuan-Hsun Ho
J. Hung
Berlin Chen
26
1
0
04 Mar 2024
Real-time Low-latency Music Source Separation using Hybrid Spectrogram-TasNet
Satvik Venkatesh
Arthur Benilov
Philip Coleman
Frederic Roskam
24
5
0
27 Feb 2024
Experimental Study: Enhancing Voice Spoofing Detection Models with wav2vec 2.0
Taein Kang
Soyul Han
Sunmook Choi
Jaejin Seo
Sanghyeok Chung
Seungeun Lee
Seungsang Oh
Il-Youp Kwak
41
7
0
27 Feb 2024
Multimodal Emotion Recognition from Raw Audio with Sinc-convolution
Xiaohui Zhang
Wenjie Fu
Mangui Liang
37
6
0
19 Feb 2024
Listening Between the Lines: Synthetic Speech Detection Disregarding Verbal Content
Davide Salvi
Temesgen Semu Balcha
Paolo Bestagini
Stefano Tubaro
33
6
0
08 Feb 2024
Explainable Predictive Maintenance: A Survey of Current Methods, Challenges and Opportunities
Logan Cummins
Alexander Sommers
Somayeh Bakhtiari Ramezani
Sudip Mittal
Joseph E. Jabour
Maria Seale
Shahram Rahimi
24
21
0
15 Jan 2024
SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network
Yuhang He
Zhuangzhuang Dai
Long Chen
Niki Trigoni
Andrew Markham
15
0
0
26 Dec 2023
Speech Understanding on Tiny Devices with A Learning Cache
A. Benazir
Zhiming Xu
Felix Xiaozhu Lin
13
0
0
30 Nov 2023
TACNET: Temporal Audio Source Counting Network
Amirreza Ahmadnejad
Ahmad Mahmmodian Darviishani
Mohmmad Mehrdad Asadi
Sajjad Saffariyeh
Pedram Yousef
Emad Fatemizadeh
24
2
0
04 Nov 2023
Powerset multi-class cross entropy loss for neural speaker diarization
Alexis Plaquet
H. Bredin
99
91
0
19 Oct 2023
Blind estimation of audio effects using an auto-encoder approach and differentiable digital signal processing
Come Peladeau
Geoffroy Peeters
22
5
0
18 Oct 2023
A Study on Incorporating Whisper for Robust Speech Assessment
Ryandhimas E. Zezario
Yu-Wen Chen
Szu-Wei Fu
Yu Tsao
H. Wang
C. Fuh
27
10
0
22 Sep 2023
The Impact of Silence on Speech Anti-Spoofing
Yuxiang Zhang
Zhuo Li
Jingze Lu
Hua Hua
Wenchao Wang
Pengyuan Zhang
24
19
0
21 Sep 2023
Spoofing attack augmentation: can differently-trained attack models improve generalisation?
W. Ge
Xin Wang
Junichi Yamagishi
Massimiliano Todisco
Nicholas W. D. Evans
AAML
22
7
0
18 Sep 2023