LEAF: A Learnable Frontend for Audio Classification

International Conference on Learning Representations (ICLR), 2021

21 January 2021

Neil Zeghidour

O. Teboul

Félix de Chaumont Quitry

Papers citing "LEAF: A Learnable Frontend for Audio Classification"

50 / 78 papers shown

AaPE: Aliasing-aware Patch Embedding for Self-Supervised Audio Representation Learning

Kohei Yamamoto

Kosuke Okusa

101

03 Dec 2025

Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing

159

21 Oct 2025

Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation

202

26 Sep 2025

Thinking While Listening: Simple Test Time Scaling For Audio Classification

Prateek Verma

Mert Pilanci

LRM

123

24 Sep 2025

Unified Learnable 2D Convolutional Feature Extraction for ASR

234

12 Sep 2025

MAPSS: Manifold-based Assessment of Perceptual Source Separation

Amir Ivry

Samuele Cornell

Shinji Watanabe

170

11 Sep 2025

Regularizing Learnable Feature Extraction for Automatic Speech Recognition

260

11 Jun 2025

RPRA-ADD: Forgery Trace Enhancement-Driven Audio Deepfake Detection

...

323

31 May 2025

Large Language Models Implicitly Learn to See and Hear Just By Reading

Prateek Verma

Mert Pilanci

465

20 May 2025

ALLM4ADD: Unlocking the Capabilities of Audio Large Language Models for Audio Deepfake Detection

484

16 May 2025

ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration

International Conference on Sampling Theory and Applications (SampTA), 2025

321

12 May 2025

Leveraging Broadcast Media Subtitle Transcripts for Automatic Speech Recognition and Subtitling

Jakob Poncelet

Hugo Van hamme

487

05 Feb 2025

autrainer: A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks

Simon Rampp

Andreas Triantafyllopoulos

M. Milling

Björn Schuller

588

16 Dec 2024

Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)

353

30 Nov 2024

A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection

Computer Science Review (CSR), 2024

Lam Pham

Phat Lam

Dat Tran

Hieu Tang

Tin Nguyen

Alexander Schindler

Canh Vu

Alexander Polonsky

720

23 Sep 2024

Biomimetic Frontend for Differentiable Audio Processing

Ruolan Leslie Famularo

310

13 Sep 2024

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Heng Wang

293

11 Sep 2024

AASIST3: KAN-Enhanced AASIST Speech Deepfake Detection using SSL Features and Additional Regularization for the ASVspoof 2024 Challenge

351

30 Aug 2024

Utilizing Speaker Profiles for Impersonation Audio Detection

ACM Multimedia (MM), 2024

JiangYan Yi

Jianhua Tao

Xiaohui Zhang

191

30 Aug 2024

Computer Audition: From Task-Specific Machine Learning to Foundation Models

Andreas Triantafyllopoulos

474

22 Jul 2024

Towards Enhanced Classification of Abnormal Lung sound in Multi-breath: A Light Weight Multi-label and Multi-head Attention Classification Method

Yi-Wei Chua

Yun-Chien Cheng

229

15 Jul 2024

Towards Attention-based Contrastive Learning for Audio Spoof Detection

402

03 Jul 2024

Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations

Kunal Dhawan

Nithin Rao Koluguri

Ante Jukić

Ryan Langman

Jagadeesh Balam

Boris Ginsburg

290

03 Jul 2024

Pre-training Feature Guided Diffusion Model for Speech Enhancement

Yiyuan Yang

Niki Trigoni

Andrew Markham

460

11 Jun 2024

An automatic analysis of ultrasound vocalisations for the prediction of interaction context in captive Egyptian fruit bats

Andreas Triantafyllopoulos

Alexander Gebhard

M. Milling

Simon Rampp

Björn Schuller

184

10 Jun 2024

A Survey on Speech Deepfake Detection

Menglu Li

Yasaman Ahmadiadli

Xiao-Ping Zhang

502

22 Apr 2024

Efficient infusion of self-supervised representations in Automatic Speech Recognition

Darshan Prabhu

Sai Ganesh Mirishkar

Pankaj Wasnik

133

19 Apr 2024

What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions

Interspeech (Interspeech), 2023

Hanyu Meng

V. Sethu

E. Ambikairajah

282

10 Apr 2024

A robust audio deepfake detection system via multi-view feature

270

04 Mar 2024

Can Synthetic Data Boost the Training of Deep Acoustic Vehicle Counting Networks?

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Stefano Damiano

Luca Bondi

Shabnam Ghaffarzadegan

Andre Guntoro

Toon van Waterschoot

156

17 Jan 2024

Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection

Lian Huang

Chi-Man Pun

220

11 Jan 2024

Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy

IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023

Weijian Mai

Jian Zhang

Pengfei Fang

Zhijun Zhang

579

31 Dec 2023

SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network

246

26 Dec 2023

Free-Space Optical Spiking Neural Network

Reyhane Ahmadi

Amirreza Ahmadnejad

S. Koohi

204

08 Nov 2023

TACNET: Temporal Audio Source Counting Network

Amirreza Ahmadnejad

Ahmad Mahmmodian Darviishani

Mohmmad Mehrdad Asadi

Sajjad Saffariyeh

Pedram Yousef

Emad Fatemizadeh

209

04 Nov 2023

Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Haizhou Li

340

18 Sep 2023

SSL-Net: A Synergistic Spectral and Learning-based Network for Efficient Bird Sound Classification

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

236

15 Sep 2023

Instabilities in Convnets for Raw Audio

IEEE Signal Processing Letters (IEEE SPL), 2023

374

11 Sep 2023

Audio Deepfake Detection: A Survey

Jiangyan Yi

457

111

29 Aug 2023

Neural Architectures Learning Fourier Transforms, Signal Processing and Much More....

Prateek Verma

141

20 Aug 2023

Comparative Analysis of the wav2vec 2.0 Feature Extractor

Peter Vieting

Ralf Schlüter

Hermann Ney

284

08 Aug 2023

Fitting Auditory Filterbanks with Multiresolution Neural Networks

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2023

259

25 Jul 2023

Brain2Music: Reconstructing Music from Human Brain Activity

257

20 Jul 2023

V2Meow: Meowing to the Visual Beat via Video-to-Music Generation

AAAI Conference on Artificial Intelligence (AAAI), 2023

Kun Su

Judith Yue Li

Qingqing Huang

Dima Kuzmin

Joonseok Lee

...

248

11 May 2023

HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition

Soumya Dutta

Sriram Ganapathy

465

14 Apr 2023

Speech Intelligibility Classifiers from 550k Disordered Speech Samples

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Subhashini Venugopalan

304

13 Mar 2023

Onsets and Velocities: Affordable Real-Time Piano Transcription Using Convolutional Neural Networks

European Signal Processing Conference (EUSIPCO), 2023

Andres Fernandez

349

08 Mar 2023

Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session

Laurie M. Heller

Benjamin Elizalde

Bhiksha Raj

Soham Deshmukh

165

20 Feb 2023

In Search for a Generalizable Method for Source Free Domain Adaptation

International Conference on Machine Learning (ICML), 2023

316

13 Feb 2023

MusicLM: Generating Music From Text

...

1.1K

647

26 Jan 2023