arXiv:2101.08596
LEAF: A Learnable Frontend for Audio Classification

International Conference on Learning Representations (ICLR), 2021
21 January 2021
Neil Zeghidour
O. Teboul
Félix de Chaumont Quitry
Marco Tagliasacchi
VLM, AAML

Papers citing "LEAF: A Learnable Frontend for Audio Classification"

50 / 78 papers shown
AaPE: Aliasing-aware Patch Embedding for Self-Supervised Audio Representation Learning
Kohei Yamamoto
Kosuke Okusa
101
1
0
03 Dec 2025
Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing
Hanyu Meng
V. Sethu
E. Ambikairajah
Qiquan Zhang
Haizhou Li
159
0
0
21 Oct 2025
Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
Jiani Ding
Qiyang Sun
Alican Akman
Björn Schuller
202
1
0
26 Sep 2025
Thinking While Listening: Simple Test Time Scaling For Audio Classification
Prateek Verma
Mert Pilanci
LRM
123
0
0
24 Sep 2025
Unified Learnable 2D Convolutional Feature Extraction for ASR
Peter Vieting
Benedikt Hilmes
Ralf Schluter
Hermann Ney
SSL
234
0
0
12 Sep 2025
MAPSS: Manifold-based Assessment of Perceptual Source Separation
Amir Ivry
Samuele Cornell
Shinji Watanabe
170
0
0
11 Sep 2025
Regularizing Learnable Feature Extraction for Automatic Speech Recognition
Peter Vieting
Maximilian Kannen
Benedikt Hilmes
Ralf Schluter
Hermann Ney
AAML
260
1
0
11 Jun 2025
RPRA-ADD: Forgery Trace Enhancement-Driven Audio Deepfake Detection
Ruibo Fu
Xiaopeng Wang
Zhengqi Wen
Jianhua Tao
Yuankun Xie
...
Chunyu Qiang
Zhengqi Wen
Cunhang Fan
Chenxing Li
Guanjun Li
323
1
0
31 May 2025
Large Language Models Implicitly Learn to See and Hear Just By Reading
Prateek Verma
Mert Pilanci
465
1
0
20 May 2025
ALLM4ADD: Unlocking the Capabilities of Audio Large Language Models for Audio Deepfake Detection
Hao Gu
Jiangyan Yi
Chenglong Wang
Jianhua Tao
Zheng Lian
Jiayi He
Yong Ren
Yujie Chen
Zhengqi Wen
484
0
0
16 May 2025
ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration
International Conference on Sampling Theory and Applications (SampTA), 2025
Daniel Haider
Felix Perfler
Péter Balázs
Clara Hollomey
Nicki Holighaus
321
0
0
12 May 2025
Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling
Jakob Poncelet
Hugo Van hamme
487
1
0
05 Feb 2025
autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks
Simon Rampp
Andreas Triantafyllopoulos
M. Milling
Björn Schuller
588
2
0
16 Dec 2024
Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
Kazi Nazmul Haque
R. Rana
Tasnim Jarin
Bjorn W. Schuller Jr
353
0
0
30 Nov 2024
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
Computer Science Review (CSR), 2024
Lam Pham
Phat Lam
Dat Tran
Hieu Tang
Tin Nguyen
Alexander Schindler
Canh Vu
Alexander Polonsky
Canh Vu
720
19
0
23 Sep 2024
Biomimetic Frontend for Differentiable Audio Processing
Ruolan Leslie Famularo
D. Zotkin
S. Shamma
R. Duraiswami
AI4TS
310
0
0
13 Sep 2024
VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Yan-Bo Lin
Yu Tian
L. Yang
Gedas Bertasius
Heng Wang
VGen
293
18
0
11 Sep 2024
AASIST3: KAN-Enhanced AASIST Speech Deepfake Detection using SSL Features and Additional Regularization for the ASVspoof 2024 Challenge
Kirill Borodin
Vasiliy Kudryavtsev
Dmitrii Korzh
Alexey Efimenko
Grach Mkrtchian
Mikhail Gorodnichev
Oleg Y. Rogov
351
25
0
30 Aug 2024
Utilizing Speaker Profiles for Impersonation Audio Detection
ACM Multimedia (MM), 2024
Hao Gu
JiangYan Yi
Chenglong Wang
Yong Ren
Jianhua Tao
Xinrui Yan
Yujie Chen
Xiaohui Zhang
191
2
0
30 Aug 2024
Computer Audition: From Task-Specific Machine Learning to Foundation Models
Andreas Triantafyllopoulos
Iosif Tsangko
Alexander Gebhard
A. Mesaros
Maria Sandsten
B. Schuller
474
11
0
22 Jul 2024
Towards Enhanced Classification of Abnormal Lung sound in Multi-breath: A Light Weight Multi-label and Multi-head Attention Classification Method
Yi-Wei Chua
Yun-Chien Cheng
229
3
0
15 Jul 2024
Towards Attention-based Contrastive Learning for Audio Spoof Detection
C. Goel
Surya Koppisetti
Ben Colman
Ali Shahriyari
Gaurav Bharaj
402
9
0
03 Jul 2024
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations
Kunal Dhawan
Nithin Rao Koluguri
Ante Jukić
Ryan Langman
Jagadeesh Balam
Boris Ginsburg
290
15
0
03 Jul 2024
Pre-training Feature Guided Diffusion Model for Speech Enhancement
Yiyuan Yang
Niki Trigoni
Andrew Markham
460
10
0
11 Jun 2024
An automatic analysis of ultrasound vocalisations for the prediction of interaction context in captive Egyptian fruit bats
Andreas Triantafyllopoulos
Alexander Gebhard
M. Milling
Simon Rampp
Björn Schuller
184
0
0
10 Jun 2024
A Survey on Speech Deepfake Detection
Menglu Li
Yasaman Ahmadiadli
Xiao-Ping Zhang
502
25
0
22 Apr 2024
Efficient infusion of self-supervised representations in Automatic Speech Recognition
Darshan Prabhu
Sai Ganesh Mirishkar
Pankaj Wasnik
133
0
0
19 Apr 2024
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions
Interspeech (Interspeech), 2023
Hanyu Meng
V. Sethu
E. Ambikairajah
282
4
0
10 Apr 2024
A robust audio deepfake detection system via multi-view feature
Yujie Yang
Haochen Qin
Hang Zhou
Chengcheng Wang
Tianyu Guo
Kai Han
Yunhe Wang
270
57
0
04 Mar 2024
Can Synthetic Data Boost the Training of Deep Acoustic Vehicle Counting Networks?
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Stefano Damiano
Luca Bondi
Shabnam Ghaffarzadegan
Andre Guntoro
Toon van Waterschoot
156
12
0
17 Jan 2024
Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection
Lian Huang
Chi-Man Pun
220
14
0
11 Jan 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
579
16
0
31 Dec 2023
SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network
Yuhang He
Zhuangzhuang Dai
Long Chen
Niki Trigoni
Andrew Markham
246
3
0
26 Dec 2023
Free-Space Optical Spiking Neural Network
Reyhane Ahmadi
Amirreza Ahmadnejad
S. Koohi
204
4
0
08 Nov 2023
TACNET: Temporal Audio Source Counting Network
Amirreza Ahmadnejad
Ahmad Mahmmodian Darviishani
Mohmmad Mehrdad Asadi
Sajjad Saffariyeh
Pedram Yousef
Emad Fatemizadeh
209
4
0
04 Nov 2023
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Zeyang Song
Jibin Wu
Malu Zhang
Mike Zheng Shou
Haizhou Li
340
11
0
18 Sep 2023
SSL-Net: A Synergistic Spectral and Learning-based Network for Efficient Bird Sound Classification
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Yiyuan Yang
Kaichen Zhou
Niki Trigoni
Andrew Markham
236
7
0
15 Sep 2023
Instabilities in Convnets for Raw Audio
IEEE Signal Processing Letters (IEEE SPL), 2023
Daniel Haider
Vincent Lostanlen
Martin Ehler
Péter Balázs
374
3
0
11 Sep 2023
Audio Deepfake Detection: A Survey
Jiangyan Yi
Chenglong Wang
Jianhua Tao
Xiaohui Zhang
Chu Yuan Zhang
Yan Zhao
457
111
0
29 Aug 2023
Neural Architectures Learning Fourier Transforms, Signal Processing and Much More....
Prateek Verma
141
0
0
20 Aug 2023
Comparative Analysis of the wav2vec 2.0 Feature Extractor
Peter Vieting
Ralf Schluter
Hermann Ney
284
5
0
08 Aug 2023
Fitting Auditory Filterbanks with Multiresolution Neural Networks
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023
Vincent Lostanlen
Daniel Haider
Han Han
Mathieu Lagrange
Péter Balázs
Martin Ehler
259
8
0
25 Jul 2023
Brain2Music: Reconstructing Music from Human Brain Activity
Timo I. Denk
Yu Takagi
Takuya Matsuyama
A. Agostinelli
Tomoya Nakai
Christian Frank
Shinji Nishimoto
257
18
0
20 Jul 2023
V2Meow: Meowing to the Visual Beat via Video-to-Music Generation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kun Su
Judith Yue Li
Qingqing Huang
Dima Kuzmin
Joonseok Lee
...
Fei Sha
A. Jansen
Yu Wang
Mauro Verzetti
Timo I. Denk
VGen
248
26
0
11 May 2023
HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition
Soumya Dutta
Sriram Ganapathy
465
25
0
14 Apr 2023
Speech Intelligibility Classifiers from 550k Disordered Speech Samples
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Subhashini Venugopalan
Jimmy Tobin
Samuel J. Yang
Katie Seaver
Richard Cave
P. Jiang
Neil Zeghidour
Rus Heywood
Jordan R. Green
Michael P. Brenner
304
18
0
13 Mar 2023
Onsets and Velocities: Affordable Real-Time Piano Transcription Using Convolutional Neural Networks
European Signal Processing Conference (EUSIPCO), 2023
Andres Fernandez
349
7
0
08 Mar 2023
Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session
Laurie M. Heller
Benjamin Elizalde
Bhiksha Raj
Soham Deshmukh
165
11
0
20 Feb 2023
In Search for a Generalizable Method for Source Free Domain Adaptation
International Conference on Machine Learning (ICML), 2023
Malik Boudiaf
Tom Denton
B. V. Merrienboer
Vincent Dumoulin
Eleni Triantafillou
TTA
316
24
0
13 Feb 2023
MusicLM: Generating Music From Text
A. Agostinelli
Timo I. Denk
Zalan Borsos
Jesse Engel
Mauro Verzetti
...
Adam Roberts
Marco Tagliasacchi
Matthew Sharifi
Neil Zeghidour
Christian Frank
MGen
1.1K
647
0
26 Jan 2023