2306.00561
Cited By
Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners
1 June 2023
Sarthak Yadav
Sergios Theodoridis
Lars Kai Hansen
Zheng-Hua Tan
Papers citing
"Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners"
9 / 9 papers shown
Structured-Noise Masked Modeling for Video, Audio and Beyond
Aritra Bhowmik
Fida Mohammad Thoker
Carlos Hinojosa
Bernard Ghanem
Cees G. M. Snoek
VGen
59
0
0
20 Mar 2025
How not to Stitch Representations to Measure Similarity: Task Loss Matching versus Direct Matching
András Balogh
Márk Jelasity
85
0
0
15 Dec 2024
BiSSL: A Bilevel Optimization Framework for Enhancing the Alignment Between Self-Supervised Pre-Training and Downstream Fine-Tuning
Gustav Wagner Zakarias
Lars Kai Hansen
Zheng-Hua Tan
32
0
0
03 Oct 2024
Audio xLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
Sarthak Yadav
Sergios Theodoridis
Zheng-Hua Tan
45
2
0
29 Aug 2024
Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
Sarthak Yadav
Zheng-Hua Tan
Mamba
42
10
0
04 Jun 2024
Masked World Models for Visual Control
Younggyo Seo
Danijar Hafner
Hao Liu
Fangchen Liu
Stephen James
Kimin Lee
Pieter Abbeel
OffRL
87
146
0
28 Jun 2022
BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping
Gasser Elbanna
Neil Scheidwasser
M. Kegler
P. Beckmann
Karl El Hajal
Milos Cernak
SSL
33
21
0
24 Jun 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
121
264
0
02 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021