Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Neural Information Processing Systems (NeurIPS), 2019
28 November 2019
Humam Alwassel
D. Mahajan
Bruno Korbar
Lorenzo Torresani
Guohao Li
Du Tran
    SSL

Papers citing "Self-Supervised Learning by Cross-Modal Audio-Video Clustering"

30 / 280 papers shown
Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Xavier Favory
Konstantinos Drossos
Maria Sandsten
Xavier Serra
199
16
0
27 Oct 2020
Self-supervised Co-training for Video Representation Learning
Neural Information Processing Systems (NeurIPS), 2020
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
459
361
0
19 Oct 2020
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching
Di Hu
Rui Qian
Minyue Jiang
Xiao Tan
Shilei Wen
Errui Ding
Weiyao Lin
Dejing Dou
173
149
0
12 Oct 2020
Support-set bottlenecks for video-text representation learning
Mandela Patrick
Po-Yao (Bernie) Huang
Yuki M. Asano
Florian Metze
Alexander G. Hauptmann
João Henriques
Andrea Vedaldi
250
260
0
06 Oct 2020
Hard Negative Mixing for Contrastive Learning
Neural Information Processing Systems (NeurIPS), 2020
Yannis Kalantidis
Mert Bulent Sariyildiz
Noé Pion
Philippe Weinzaepfel
Diane Larlus
SSL
443
710
0
02 Oct 2020
Understanding Self-supervised Learning with Dual Deep Networks
Yuandong Tian
Lantao Yu
Xinlei Chen
Surya Ganguli
SSL
446
86
0
01 Oct 2020
SEMI: Self-supervised Exploration via Multisensory Incongruity
IEEE International Conference on Robotics and Automation (ICRA), 2020
Jianren Wang
Ziwen Zhuang
Hang Zhao
SSL
139
1
0
26 Sep 2020
Active Contrastive Learning of Audio-Visual Video Representations
Shuang Ma
Zhaoyang Zeng
Daniel J. McDuff
Yale Song
VLM, SSL
160
9
0
31 Aug 2020
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Jiangliu Wang
Jianbo Jiao
Linchao Bao
Shengfeng He
Wei Liu
Yunhui Liu
SSL, AI4TS
192
58
0
31 Aug 2020
Delving into Inter-Image Invariance for Unsupervised Visual Representations
International Journal of Computer Vision (IJCV), 2020
Jiahao Xie
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
SSL, VLM
179
61
0
26 Aug 2020
Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang
Jianbo Jiao
Yunhui Liu
SSL, AI4TS
203
251
0
13 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
221
117
0
13 Aug 2020
What Should Not Be Contrastive in Contrastive Learning
International Conference on Learning Representations (ICLR), 2020
Tete Xiao
Xiaolong Wang
Alexei A. Efros
Trevor Darrell
SSL, DRL
277
330
0
13 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2020
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Jian Shu
Serge J. Belongie
Huayu Chen
SSL, AI4TS
373
543
0
09 Aug 2020
Memory-augmented Dense Predictive Coding for Video Representation Learning
Tengda Han
Weidi Xie
Andrew Zisserman
SSL
293
254
0
03 Aug 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Gaowen Liu
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
218
50
0
29 Jul 2020
Leveraging Category Information for Single-Frame Visual Sound Source Separation
Xiangjie Sui
Esa Rahtu
129
9
0
15 Jul 2020
Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision
Abhinav Shukla
Stavros Petridis
Maja Pantic
SSL
129
16
0
08 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
369
395
0
29 Jun 2020
Video Representation Learning with Visual Tempo Consistency
Ceyuan Yang
Yinghao Xu
Bo Dai
Bolei Zhou
146
94
0
28 Jun 2020
Labelling unlabelled videos from scratch with multi-modal self-supervision
Neural Information Processing Systems (NeurIPS), 2020
Yuki M. Asano
Mandela Patrick
Christian Rupprecht
Andrea Vedaldi
SSL
253
161
0
24 Jun 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
202
142
0
16 Jun 2020
Video Understanding as Machine Translation
Bruno Korbar
Fabio Petroni
Rohit Girdhar
Lorenzo Torresani
SSL
195
29
0
12 Jun 2020
Are we done with ImageNet?
Lucas Beyer
Olivier J. Hénaff
Alexander Kolesnikov
Xiaohua Zhai
Aaron van den Oord
VLM
311
454
0
12 Jun 2020
Visually Guided Sound Source Separation using Cascaded Opponent Filter Network
Xiangjie Sui
Esa Rahtu
195
25
0
04 Jun 2020
Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Haytham M. Fayek
Anurag Kumar
176
37
0
29 May 2020
Does Visual Self-Supervision Improve Learning of Speech Representations for Emotion Recognition?
IEEE Transactions on Affective Computing (IEEE TAC), 2020
Abhinav Shukla
Stavros Petridis
Maja Pantic
SSL
377
33
0
04 May 2020
Audio-Visual Instance Discrimination with Cross-Modal Agreement
Computer Vision and Pattern Recognition (CVPR), 2020
Pedro Morgado
Nuno Vasconcelos
Ishan Misra
SSL
274
293
0
27 Apr 2020
On Compositions of Transformations in Contrastive Self-Supervised Learning
IEEE International Conference on Computer Vision (ICCV), 2020
Mandela Patrick
Yuki M. Asano
Polina Kuznetsova
Ruth C. Fong
João F. Henriques
Geoffrey Zweig
Andrea Vedaldi
184
53
0
09 Mar 2020
Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey
Longlong Jing
Yingli Tian
SSL
388
1,881
0
16 Feb 2019