ResearchTrend.AI
Speaker Recognition from Raw Waveform with SincNet

29 July 2018
Mirco Ravanelli
Yoshua Bengio

Papers citing "Speaker Recognition from Raw Waveform with SincNet"

50 / 259 papers shown
Title
AS2T: Arbitrary Source-To-Target Adversarial Attack on Speaker Recognition Systems
Guangke Chen
Zhe Zhao
Fu Song
Sen Chen
Lingling Fan
Yang Liu
AAML
17
18
0
07 Jun 2022
Radar Image Reconstruction from Raw ADC Data using Parametric Variational Autoencoder with Domain Adaptation
Michael Stephan
Thomas Stadelmayer
Avik Santra
Georg Fischer
R. Weigel
F. Lurz
8
10
0
30 May 2022
Adversarial attacks and defenses in Speaker Recognition Systems: A survey
Jiahe Lan
Rui Zhang
Zheng Yan
Jie Wang
Yu Chen
Ronghui Hou
AAML
9
23
0
27 May 2022
Trainable Wavelet Neural Network for Non-Stationary Signals
Jason Stock
Chuck Anderson
9
3
0
06 May 2022
Dictionary Attacks on Speaker Verification
Mirko Marras
Pawel Korus
Anubhav Jain
N. Memon
AAML
13
9
0
24 Apr 2022
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids
Ryandhimas E. Zezario
Fei Chen
C. Fuh
Hsin-Min Wang
Yu Tsao
16
16
0
07 Apr 2022
HiFi-VC: High Quality ASR-Based Voice Conversion
A. Kashkin
I. Karpukhin
S. Shishkin
14
5
0
31 Mar 2022
Does Audio Deepfake Detection Generalize?
Nicolas M. Müller
Pavel Czempin
Franziska Dieckmann
Adam Froghyar
Konstantin Böttinger
25
136
0
30 Mar 2022
Combination of Time-domain, Frequency-domain, and Cepstral-domain Acoustic Features for Speech Commands Classification
Yikang Wang
Hiromitsu Nishizaki
17
1
0
30 Mar 2022
Learning neural audio features without supervision
Sarthak Yadav
Neil Zeghidour
SSL
30
4
0
29 Mar 2022
Cross-Modal Perceptionist: Can Face Geometry be Gleaned from Voices?
Cho-Ying Wu
Chin-Cheng Hsu
Ulrich Neumann
CVBM
4
14
0
18 Mar 2022
Pushing the limits of raw waveform speaker recognition
Jee-weon Jung
You Jin Kim
Hee-Soo Heo
Bong-Jin Lee
Youngki Kwon
Joon Son Chung
23
87
0
16 Mar 2022
Audio Self-supervised Learning: A Survey
Shuo Liu
Adria Mallol-Ragolta
Emilia Parada-Cabaleiro
Kun Qian
Xingshuo Jing
Alexander Kathan
Bin Hu
Björn W. Schuller
SSL
22
106
0
02 Mar 2022
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
Hemlata Tak
Massimiliano Todisco
Xin Wang
Jee-weon Jung
Junichi Yamagishi
Nicholas W. D. Evans
19
151
0
24 Feb 2022
Partially Fake Audio Detection by Self-attention-based Fake Span Discovery
Haibin Wu
Heng-Cheng Kuo
Naijun Zheng
Kuo-Hsuan Hung
Hung-yi Lee
Yu Tsao
Hsin-Min Wang
H. Meng
14
36
0
14 Feb 2022
The xmuspeech system for multi-channel multi-party meeting transcription challenge
Jie Wang
Yuji Liu
Binling Wang
Yiming Zhi
Song Li
Shipeng Xia
Jiayang Zhang
Lin Li
Q. Hong
Feng Tong
11
0
0
11 Feb 2022
Learnable Nonlinear Compression for Robust Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
17
2
0
10 Feb 2022
CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations
Vin Sachidananda
Shao-Yen Tseng
Erik Marchi
S. Kajarekar
P. Georgiou
21
8
0
08 Feb 2022
Learnable Wavelet Packet Transform for Data-Adapted Spectrograms
Gaetan Frusque
Olga Fink
9
13
0
26 Jan 2022
Real-Time Seizure Detection using EEG: A Comprehensive Comparison of Recent Approaches under a Realistic Setting
Kwanhyung Lee
Hyewon Jeong
Seyun Kim
Donghwa Yang
Hoon-Chul Kang
E. Choi
OOD
4
12
0
21 Jan 2022
A Practical Guide to Logical Access Voice Presentation Attack Detection
Xin Wang
Junichi Yamagishi
AAML
11
10
0
10 Jan 2022
Towards Relatable Explainable AI with the Perceptual Process
Wencan Zhang
Brian Y. Lim
AAML
XAI
9
61
0
28 Dec 2021
Deep Spoken Keyword Spotting: An Overview
Iván López-Espejo
Z. Tan
John H. L. Hansen
Jesper Jensen
11
99
0
20 Nov 2021
Investigating self-supervised front ends for speech spoofing countermeasures
Xin Wang
Junichi Yamagishi
AAML
17
123
0
15 Nov 2021
Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features
Ryandhimas E. Zezario
Szu-Wei Fu
Fei Chen
C. Fuh
Hsin-Min Wang
Yu Tsao
DiffM
19
75
0
03 Nov 2021
A Comparative Study of Speaker Role Identification in Air Traffic Communication Using Deep Learning Approaches
Dongyue Guo
Jianwei Zhang
Bo Yang
Yi Lin
17
10
0
03 Nov 2021
FANS: Fusing ASR and NLU for on-device SLU
Martin H. Radfar
Athanasios Mouchtaris
Siegfried Kunzmann
Ariya Rastrow
17
12
0
31 Oct 2021
Deep Learning For Prominence Detection In Children's Read Speech
Mithilesh Vaidya
Kamini Sabu
Preeti Rao
6
6
0
27 Oct 2021
Optimizing Multi-Taper Features for Deep Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
16
1
0
21 Oct 2021
EEGminer: Discovering Interpretable Features of Brain Activity with Learnable Filters
Siegfried Ludwig
Stylianos Bakas
D. Adamos
N. Laskaris
Yannis Panagakis
S. Zafeiriou
8
6
0
19 Oct 2021
Multistage linguistic conditioning of convolutional layers for speech emotion recognition
Andreas Triantafyllopoulos
U. Reichel
Shuo Liu
Simon Huber
F. Eyben
Björn W. Schuller
25
9
0
13 Oct 2021
Large-scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification
Zhengyang Chen
Sanyuan Chen
Yu-Huan Wu
Yao Qian
Chengyi Wang
Shujie Liu
Y. Qian
Michael Zeng
SSL
15
124
0
12 Oct 2021
A study of the robustness of raw waveform based speaker embeddings under mismatched conditions
Ge Zhu
Frank Cwitkowitz
Z. Duan
22
2
0
08 Oct 2021
AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks
Jee-weon Jung
Hee-Soo Heo
Hemlata Tak
Hye-jin Shim
Joon Son Chung
Bong-Jin Lee
Ha-Jin Yu
Nicholas W. D. Evans
121
279
0
04 Oct 2021
Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification
Xuechen Liu
Md. Sahidullah
Tomi Kinnunen
18
6
0
24 Sep 2021
MS-SincResNet: Joint learning of 1D and 2D kernels using multi-scale SincNet and ResNet for music genre classification
Pei-Chun Chang
Yonghao Chen
Chang-Hsing Lee
15
21
0
18 Sep 2021
Behavior of Keyword Spotting Networks Under Noisy Conditions
Anwesh Mohanty
Adrian Frischknecht
Christoph Gerum
Oliver Bringmann
6
1
0
15 Sep 2021
Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation
Juan Manuel Coria
H. Bredin
Sahar Ghannay
Sophie Rosset
36
30
0
14 Sep 2021
Complementing Handcrafted Features with Raw Waveform Using a Light-weight Auxiliary Model
Zhongwei Teng
Quchen Fu
Jules White
Maria E. Powell
Douglas C. Schmidt
8
5
0
06 Sep 2021
Learning Sparse Analytic Filters for Piano Transcription
Frank Cwitkowitz
M. Heydari
Z. Duan
19
2
0
23 Aug 2021
Using Large Pre-Trained Models with Cross-Modal Attention for Multi-Modal Emotion Recognition
Krishna D N (Freshworks)
14
11
0
22 Aug 2021
Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space
Labib Chowdhury
M. Kamal
Najia Hasan
Nabeel Mohammed
11
3
0
21 Aug 2021
On the Exploitability of Audio Machine Learning Pipelines to Surreptitious Adversarial Examples
Adelin Travers
Lorna Licollari
Guanghan Wang
Varun Chandrasekaran
Adam Dziedzic
David Lie
Nicolas Papernot
AAML
13
3
0
03 Aug 2021
A Multi-Head Relevance Weighting Framework For Learning Raw Waveform Audio Representations
Debottam Dutta
Purvi Agrawal
Sriram Ganapathy
6
2
0
30 Jul 2021
End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection
Hemlata Tak
Jee-weon Jung
J. Patino
Madhu R. Kamble
Massimiliano Todisco
Nicholas W. D. Evans
11
157
0
27 Jul 2021
Use of speaker recognition approaches for learning and evaluating embedding representations of musical instrument sounds
Xuan Shi
Erica Cooper
Junichi Yamagishi
24
7
0
24 Jul 2021
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model
Cheng-Hung Hu
Yu-Huai Peng
Junichi Yamagishi
Yu Tsao
Hsin-Min Wang
13
5
0
20 Jul 2021
Human Perception of Audio Deepfakes
Nicolas M. Müller
Karla Markert
Konstantin Böttinger
11
49
0
20 Jul 2021
PERSA+: A Deep Learning Front-End for Context-Agnostic Audio Classification
Lazaros Vrysis
Iordanis Thoidis
Charalampos A. Dimoulas
G. Papanikolaou
VLM
17
0
0
20 Jul 2021
Interpretable SincNet-based Deep Learning for Emotion Recognition from EEG brain activity
J. M. M. Torres
Mirco Ravanelli
Sara E. Medina-DeVilliers
M. Lerner
Giuseppe Riccardi
11
21
0
18 Jul 2021