
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

International Conference on Machine Learning (ICML), 2022

7 February 2022

Papers citing "data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language"

50 / 609 papers shown

Point2Vec for Self-Supervised Representation Learning on Point Clouds

207

29 Mar 2023

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

IEEE International Conference on Computer Vision (ICCV), 2023

Yi Wang

Yu Qiao

536

238

28 Mar 2023

On the Stepwise Nature of Self-Supervised Learning

International Conference on Machine Learning (ICML), 2023

301

27 Mar 2023

Decoupled Multimodal Distilling for Emotion Recognition

Computer Vision and Pattern Recognition (CVPR), 2023

Yong Li

Yuan-Zheng Wang

Zhen Cui

185

167

24 Mar 2023

Transformers in Speech Processing: A Survey

463

21 Mar 2023

GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding

IEEE International Conference on Computer Vision (ICCV), 2023

247

20 Mar 2023

Cocktail HuBERT: Generalized Self-Supervised Pre-training for Mixture and Single-Source Speech

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Maryam Fazel-Zarandi

Wei-Ning Hsu

SSL

142

20 Mar 2023

Right the docs: Characterising voice dataset documentation practices used in machine learning

Australasian Language Technology Association Workshop (ALTA), 2023

Kathy Reid

Elizabeth T. Williams

178

19 Mar 2023

OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav

321

14 Mar 2023

AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+

Guo-Jun Qi

178

14 Mar 2023

CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Wei Liu

231

13 Mar 2023

Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking

International Journal of Computer Vision (IJCV), 2023

180

09 Mar 2023

Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Xie Chen

158

09 Mar 2023

Masked Image Modeling with Local Multi-Scale Reconstruction

Computer Vision and Pattern Recognition (CVPR), 2023

205

09 Mar 2023

Centroid-centered Modeling for Efficient Vision Transformer Pre-training

Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023

Bo Du

146

08 Mar 2023

Self-supervised speech representation learning for keyword-spotting with light-weight transformers

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

173

07 Mar 2023

Applying Plain Transformers to Real-World Point Clouds

Lanxiao Li

M. Heizmann

3DPC ViT

370

28 Feb 2023

Generic-to-Specific Distillation of Masked Autoencoders

Computer Vision and Pattern Recognition (CVPR), 2023

282

28 Feb 2023

Efficient Masked Autoencoders with Self-Consistency

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Ming Tang

267

28 Feb 2023

Phone and speaker spatial organization in self-supervised speech representations

237

24 Feb 2023

Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Xie Chen

162

18 Feb 2023

Gaussian-smoothed Imbalance Data Improves Speech Emotion Recognition

170

17 Feb 2023

A Comprehensive Review and a Taxonomy of Edge Machine Learning: Requirements, Paradigms, and Techniques

Applied Informatics (AI), 2023

Wenbin Li

Hakim Hacid

Ebtesam Almazrouei

Merouane Debbah

333

16 Feb 2023

Speech Enhancement with Multi-granularity Vector Quantization

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2023

Xiaokang Zhao

Qiu-shi Zhu

Jie Zhang

163

16 Feb 2023

Multi-modal Machine Learning in Engineering Design: A Review and Future Directions

Journal of Computing and Information Science in Engineering (JCISE), 2023

356

14 Feb 2023

AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations

Automatic Speech Recognition & Understanding (ASRU), 2023

397

10 Feb 2023

Representation Deficiency in Masked Language Modeling

International Conference on Learning Representations (ICLR), 2023

Sinong Wang

Han Fang

Luke Zettlemoyer

229

04 Feb 2023

ANTM: An Aligned Neural Topic Model for Exploring Evolving Topics

354

03 Feb 2023

SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling

Neural Information Processing Systems (NeurIPS), 2023

Li Zhang

485

143

02 Feb 2023

Image-Based Vehicle Classification by Synergizing Features from Supervised and Self-Supervised Learning Paradigms

S. Ma

Jidong J. Yang

SSL

01 Feb 2023

Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications

Muhammad Arslan Manzoor

342

01 Feb 2023

Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

International Conference on Machine Learning (ICML), 2023

Rongjie Huang

Dongchao Yang

Zhou Zhao

405

432

30 Jan 2023

Aerial Image Object Detection With Vision Transformer Detector (ViTDet)

IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2023

Liya Wang

A. Tien

414

28 Jan 2023

Open Problems in Applied Deep Learning

M. Raissi

AI4CE

234

26 Jan 2023

Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture

Computer Vision and Pattern Recognition (CVPR), 2023

Pascal Vincent

471

596

19 Jan 2023

Vision Learners Meet Web Image-Text Pairs

203

17 Jan 2023

RILS: Masked Visual Reconstruction in Language Semantic Space

Computer Vision and Pattern Recognition (CVPR), 2023

Shusheng Yang

Ying Shan

194

17 Jan 2023

A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

579

366

13 Jan 2023

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

IEEE International Conference on Computer Vision (ICCV), 2023

331

05 Jan 2023

Trace Encoding in Process Mining: a survey and benchmarking

Engineering Applications of Artificial Intelligence (Eng. Appl. Artif. Intell.), 2023

250

05 Jan 2023

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

Computer Vision and Pattern Recognition (CVPR), 2023

321

03 Jan 2023

Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling

IEEE Transactions on Multimedia (IEEE TMM), 2022

Chunyu Xie

351

31 Dec 2022

SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

274

20 Dec 2022

Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models

Changli Tang

Yujin Wang

Xie Chen

Weiqiang Zhang

125

20 Dec 2022

Randomized Quantization: A Generic Augmentation for Data Agnostic Self-supervised Learning

IEEE International Conference on Computer Vision (ICCV), 2022

277

19 Dec 2022

BEATs: Audio Pre-Training with Acoustic Tokenizers

International Conference on Machine Learning (ICML), 2022

400

483

18 Dec 2022

MAViL: Masked Audio-Video Learners

Neural Information Processing Systems (NeurIPS), 2022

Po-Yao (Bernie) Huang

Christoph Feichtenhofer

337

15 Dec 2022

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

International Conference on Machine Learning (ICML), 2022

364

123

14 Dec 2022

Disentangling Prosody Representations with Unsupervised Speech Reconstruction

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

Leyuan Qu

Taiha Li

C. Weber

Theresa Pekarek-Rosin

F. Ren

S. Wermter

242

14 Dec 2022

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

Computer Vision and Pattern Recognition (CVPR), 2022

Yu Qiao

288

184

13 Dec 2022