EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

7 January 2024

Xie Chen

Papers citing "EAT: Self-Supervised Pre-Training with Efficient Audio Transformer"

Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning

115

27 Mar 2026

AaPE: Aliasing-aware Patch Embedding for Self-Supervised Audio Representation Learning

Kohei Yamamoto

Kosuke Okusa

03 Dec 2025

AMAuT: A Flexible and Efficient Multiview Audio Transformer Framework Trained from Scratch

Weichuang Shao

I. Liao

Tomas Henrique Bode Maul

T. Chandesa

173

22 Oct 2025

When Audio Generators Become Good Listeners: Generative Features for Understanding Tasks

204

29 Sep 2025

Unmute the Patch Tokens: Rethinking Probing in Multi-Label Audio Classification

344

29 Sep 2025

WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms

228

27 Sep 2025

FakeSound2: A Benchmark for Explainable and Generalizable Deepfake Sound Detection

247

21 Sep 2025

AISTAT lab system for DCASE2025 Task6: Language-based audio retrieval

Hyun Jun Kim

Hyeong Yong Choi

Changwon Lim

137

20 Sep 2025

SAM: A Mamba-2 State-Space Audio-Language Model

202

19 Sep 2025

Improving Anomalous Sound Detection with Attribute-aware Representation from Domain-adaptive Pre-training

157

16 Sep 2025

Local Density-Based Anomaly Score Normalization for Domain Generalization

335

13 Sep 2025

The AudioMOS Challenge 2025

163

01 Sep 2025

An Enhanced Audio Feature Tailored for Anomalous Sound Detection Based on Pre-trained Models

International Conference on Artificial Neural Networks (ICANN), 2025

187

21 Aug 2025

ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signals

225

20 Aug 2025

What Matters for Bioacoustic Encoding

Marius Miron

David Robinson

Milad Alizadeh

Ellen Gilsenan-McMahon

...

195

15 Aug 2025

Foundation Models for Bioacoustics -- a Comparative Review

247

02 Aug 2025

FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation

...

264

22 Jul 2025

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning

...

244

17 Jul 2025

USAD: Universal Speech and Audio Representation via Distillation

430

23 Jun 2025

SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

International Conference on Learning Representations (ICLR), 2025

238

13 Jun 2025

AC/DC: LLM-based Audio Comprehension via Dialogue Continuation

340

12 Jun 2025

Can Masked Autoencoders Also Listen to Birds?

698

17 Apr 2025

TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis

342

08 Apr 2025

Token Pruning in Audio Transformers: Optimizing Performance and Decoding Patch Importance

Taehan Lee

Hyukjun Lee

ViT VLM

394

02 Apr 2025

Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders

...

553

21 Feb 2025

DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Ziyang Ma

348

12 Oct 2024

AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection

Wei-Qiang Zhang

Xie Chen

Yanmin Qian

Jia Liu

Pingyi Fan

292

17 Jun 2024

FakeSound: Deepfake General Audio Detection

260

12 Jun 2024

MuPT: A Generative Symbolic Music Pretrained Transformer

International Conference on Learning Representations (ICLR), 2024

...

Xu Tan

Stephen W. Huang

Lei Ma

Jie Fu

Ge Zhang

345

09 Apr 2024

BAT: Learning to Reason about Spatial Sounds with Large Language Models

536

02 Feb 2024