Self-Supervised MultiModal Versatile Networks

29 June 2020

Jean-Baptiste Alayrac

Adrià Recasens

Relja Arandjelović

Jason Ramapuram

Sander Dieleman

Andrew Zisserman

Papers citing "Self-Supervised MultiModal Versatile Networks"

16 / 266 papers shown

Title | Authors | Tags | Counts | Date
Robust Audio-Visual Instance Discrimination | Pedro Morgado, Ishan Misra, Nuno Vasconcelos | SSL | 6, 102, 0 | 29 Mar 2021
Reading Isn't Believing: Adversarial Attacks On Multi-Modal Neurons | David A. Noever, S. M. Noever | AAML, VLM | 18, 33, 0 | 18 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning | Mandela Patrick, Yuki M. Asano, Bernie Huang, Ishan Misra, Florian Metze, Joao Henriques, Andrea Vedaldi | AI4TS | 14, 32, 0 | 18 Mar 2021
On Semantic Similarity in Video Retrieval | Michael Wray, Hazel Doughty, Dima Damen | - | 8, 58, 0 | 18 Mar 2021
Multi-Format Contrastive Learning of Audio Representations | Luyu Wang, Aaron van den Oord | - | 21, 59, 0 | 11 Mar 2021
Perceiver: General Perception with Iterative Attention | Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, João Carreira | VLM, ViT, MDE | 13, 970, 0 | 04 Mar 2021
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge | Francisco Rivera Valverde, Juana Valeria Hurtado, Abhinav Valada | - | 15, 72, 0 | 01 Mar 2021
Learning Transferable Visual Models From Natural Language Supervision | Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya A. Ramesh, Gabriel Goh, ..., Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever | CLIP, VLM | 12, 18,768, 0 | 26 Feb 2021
Learning rich touch representations through cross-modal self-supervision | Martina Zambelli, Y. Aytar, Francesco Visin, Yuxiang Zhou, R. Hadsell | SSL | 18, 16, 0 | 21 Jan 2021
A Comprehensive Study of Deep Video Action Recognition | Yi Zhu, Xinyu Li, Chunhui Liu, Mohammadreza Zolfaghari, Yuanjun Xiong, Chongruo Wu, Zhi-Li Zhang, Joseph Tighe, R. Manmatha, Mu Li | VLM, AI4TS | 19, 162, 0 | 11 Dec 2020
Game Plan: What AI can do for Football, and What Football can do for AI | K. Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome T. Connor, ..., Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, T. Graepel, Demis Hassabis | - | 20, 62, 0 | 18 Nov 2020
Support-set bottlenecks for video-text representation learning | Mandela Patrick, Po-Yao (Bernie) Huang, Yuki M. Asano, Florian Metze, Alexander G. Hauptmann, João Henriques, Andrea Vedaldi | - | 12, 227, 0 | 06 Oct 2020
Spatiotemporal Contrastive Video Representation Learning | Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, H. Wang, Serge J. Belongie, Yin Cui | SSL, AI4TS | 14, 486, 0 | 09 Aug 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos | Andrew Rouditchenko, Angie Boggust, David F. Harwath, Brian Chen, D. Joshi, ..., Rogerio Feris, Brian Kingsbury, M. Picheny, Antonio Torralba, James R. Glass | SSL | 22, 141, 0 | 16 Jun 2020
On Compositions of Transformations in Contrastive Self-Supervised Learning | Mandela Patrick, Yuki M. Asano, Polina Kuznetsova, Ruth C. Fong, João F. Henriques, Geoffrey Zweig, Andrea Vedaldi | - | 8, 49, 0 | 09 Mar 2020
A Multi-View Embedding Space for Modeling Internet Images, Tags, and their Semantics | Yunchao Gong, Qifa Ke, Michael Isard, Svetlana Lazebnik | 3DV | 58, 583, 0 | 18 Dec 2012