ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification



IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2022

23 November 2022

Mark D. Plumbley


Papers citing "ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification"


Unmute the Patch Tokens: Rethinking Probing in Multi-Label Audio Classification
Lukas Rauch, René Heinrich, Houtan Ghaffari, Lukas Miklautz, Ilyass Moummad, Bernhard Sick, Christoph Scholz
29 Sep 2025

Temporally Heterogeneous Graph Contrastive Learning for Multimodal Acoustic event Classification
Yuanjian Chen, Yang Xiao, Jinjie Huang
18 Sep 2025

MATPAC++: Enhanced Masked Latent Prediction for Self-Supervised Audio Representation Learning
Aurian Quélennec, Pierre Chouteau, Geoffroy Peeters, S. Essid
18 Aug 2025

SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes
International Conference on Learning Representations (ICLR), 2025
Tony Alex, S. Ahmed, A. Mustafa, Muhammad Awais, Philip J. B. Jackson
13 Jun 2025

TACOS: Temporally-aligned Audio CaptiOnS for Language-Audio Pretraining
Paul Primus, Florian Schmid, Gerhard Widmer
Tags: CLIP, AI4TS, VLM
12 May 2025

Effective Pre-Training of Audio Transformers for Sound Event Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Florian Schmid, T. Morocutti, Francesco Foscarin, Jan Schluter, Paul Primus, Gerhard Widmer
Tags: ViT
14 Sep 2024

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Xian Li, Nian Shao, Xiaofei Li
Tags: ViT, CLIP
07 Jun 2023