EAT: Self-Supervised Pre-Training with Efficient Audio Transformer

7 January 2024
Wenxi Chen
Yuzhe Liang
Ziyang Ma
Zhisheng Zheng
Xie Chen
    ViT
ArXiv (abs) | PDF | HTML | HuggingFace (1 upvote) | GitHub (224★)

Papers citing "EAT: Self-Supervised Pre-Training with Efficient Audio Transformer"

30 / 30 papers shown
Cleaning the Pool: Progressive Filtering of Unlabeled Pools in Deep Active Learning
Denis Huseljic
M. Herde
Lukas Rauch
Paul Hahn
Bernhard Sick
CLL
115
0
0
27 Mar 2026
AaPE: Aliasing-aware Patch Embedding for Self-Supervised Audio Representation Learning
Kohei Yamamoto
Kosuke Okusa
98
1
0
03 Dec 2025
AMAuT: A Flexible and Efficient Multiview Audio Transformer Framework Trained from Scratch
Weichuang Shao
I. Liao
Tomas Henrique Bode Maul
T. Chandesa
173
1
0
22 Oct 2025
When Audio Generators Become Good Listeners: Generative Features for Understanding Tasks
Zeyu Xie
Chenxing Li
Xuenan Xu
Mengyue Wu
Wenfu Wang
Ruibo Fu
Meng Yu
Dong Yu
Yuexian Zou
204
0
0
29 Sep 2025
Unmute the Patch Tokens: Rethinking Probing in Multi-Label Audio Classification
Lukas Rauch
René Heinrich
Houtan Ghaffari
Lukas Miklautz
Ilyass Moummad
Bernhard Sick
Christoph Scholz
344
4
0
29 Sep 2025
WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms
Goksenin Yuksel
Pierre Guetschel
Michael Tangermann
Marcel van Gerven
Kiki van der Heijden
AI4TS
228
2
0
27 Sep 2025
FakeSound2: A Benchmark for Explainable and Generalizable Deepfake Sound Detection
Zeyu Xie
Yaoyun Zhang
Xuenan Xu
Yongkang Yin
Chenxing Li
Mengyue Wu
Yuexian Zou
247
0
0
21 Sep 2025
AISTAT lab system for DCASE2025 Task6: Language-based audio retrieval
Hyun Jun Kim
Hyeong Yong Choi
Changwon Lim
137
0
0
20 Sep 2025
SAM: A Mamba-2 State-Space Audio-Language Model
Taehan Lee
Jaehan Jung
Hyukjun Lee
Mamba
AuLLM
202
0
0
19 Sep 2025
Improving Anomalous Sound Detection with Attribute-aware Representation from Domain-adaptive Pre-training
Xin Fang
Guirui Zhong
Qing Wang
Fan Chu
Lei Wang
Mengui Qian
Mingqi Cai
Jiangzhao Wu
J. Gao
Jun Du
157
0
0
16 Sep 2025
Local Density-Based Anomaly Score Normalization for Domain Generalization
Kevin Wilkinghoff
Haici Yang
Janek Ebbers
François Germain
Gordon Wichern
Jonathan Le Roux
335
4
0
13 Sep 2025
The AudioMOS Challenge 2025
Wen-Chin Huang
Hui Wang
Cheng Liu
Yi-Chiao Wu
Andros Tjandra
Wei-Ning Hsu
Erica Cooper
Yong Qin
Tomoki Toda
163
8
0
01 Sep 2025
An Enhanced Audio Feature Tailored for Anomalous Sound Detection Based on Pre-trained Models
International Conference on Artificial Neural Networks (ICANN), 2025
Guirui Zhong
Qing Wang
Jun Du
Lei Wang
Mingqi Cai
Xin Fang
187
1
0
21 Aug 2025
ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signals
Yucong Zhang
Juan Liu
Ming Li
VLM
225
2
0
20 Aug 2025
What Matters for Bioacoustic Encoding
Marius Miron
David Robinson
Milad Alizadeh
Ellen Gilsenan-McMahon
Gagan Narula
...
Jane Lawton
Jen-Yu Liu
Aza Raskin
Olivier Pietquin
Matthieu Geist
195
3
0
15 Aug 2025
Foundation Models for Bioacoustics -- a Comparative Review
Raphael Schwinger
Paria Vali Zadeh
Lukas Rauch
Mats Kurz
Tom Hauschild
Sam Lapp
Sven Tomforde
247
3
0
02 Aug 2025
FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation
Pingyi Fan
Anbai Jiang
Shuwei Zhang
Zhiqiang Lv
Bing Han
...
Wei Zhang
Yanmin Qian
Xie Chen
Cheng Lu
Jia Liu
264
4
0
22 Jul 2025
AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning
Yiming Ren
Zhiqiang Lin
Yu Li
Gao Meng
Weiyun Wang
...
Zicheng Lin
Jifeng Dai
Yujiu Yang
Wenhai Wang
Ruihang Chu
244
3
0
17 Jul 2025
USAD: Universal Speech and Audio Representation via Distillation
Heng-Jui Chang
Saurabhchand Bhati
James R. Glass
Alexander H. Liu
430
4
0
23 Jun 2025
SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
International Conference on Learning Representations (ICLR), 2025
Tony Alex
S. Ahmed
A. Mustafa
Muhammad Awais
Philip J. B. Jackson
238
15
0
13 Jun 2025
AC/DC: LLM-based Audio Comprehension via Dialogue Continuation
Yusuke Fujita
Tomoya Mizumoto
Atsushi Kojima
Lianbo Liu
Yui Sudo
AuLLM
340
1
0
12 Jun 2025
Can Masked Autoencoders Also Listen to Birds?
Lukas Rauch
Ilyass Moummad
René Heinrich
Alexis Joly
Bernhard Sick
Christoph Scholz
698
14
0
17 Apr 2025
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
342
5
0
08 Apr 2025
Token Pruning in Audio Transformers: Optimizing Performance and Decoding Patch Importance
Taehan Lee
Hyukjun Lee
ViT
VLM
394
8
0
02 Apr 2025
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders
Weiqiao Shan
Yongqian Li
Yuhao Zhang
Yingfeng Luo
Chen Xu
...
Yaojie Lu
Hao Fei
Hao Yang
Tong Xiao
Jingbo Zhu
AuLLM
553
3
0
21 Feb 2025
DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Xiquan Li
Wenxi Chen
Ziyang Ma
Xuenan Xu
Yuzhe Liang
Zhisheng Zheng
Qiuqiang Kong
Xie Chen
VLM
348
18
0
12 Oct 2024
AnoPatch: Towards Better Consistency in Machine Anomalous Sound Detection
Anbai Jiang
Bing Han
Zhiqiang Lv
Yufeng Deng
Wei-Qiang Zhang
Xie Chen
Yanmin Qian
Jia Liu
Pingyi Fan
292
23
0
17 Jun 2024
FakeSound: Deepfake General Audio Detection
Zeyu Xie
Baihan Li
Xuenan Xu
Zheng Liang
Kai Yu
Mengyue Wu
260
12
0
12 Jun 2024
MuPT: A Generative Symbolic Music Pretrained Transformer
International Conference on Learning Representations (ICLR), 2024
Xingwei Qu
Yuelin Bai
Yi Ma
Ziya Zhou
Ka Man Lo
...
Xu Tan
Stephen W. Huang
Lei Ma
Jie Fu
Ge Zhang
345
30
0
09 Apr 2024
BAT: Learning to Reason about Spatial Sounds with Large Language Models
Zhisheng Zheng
Puyuan Peng
Ziyang Ma
Xie Chen
Eunsol Choi
David Harwath
LRM
536
41
0
02 Feb 2024