1707.09074
Cited By
Learning from Video and Text via Large-Scale Discriminative Clustering
27 July 2017
Antoine Miech
Jean-Baptiste Alayrac
Piotr Bojanowski
Ivan Laptev
Josef Sivic
Papers citing
"Learning from Video and Text via Large-Scale Discriminative Clustering"
27 / 27 papers shown
Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models
Sivan Doveh
Assaf Arbelle
Sivan Harary
Roei Herzig
Donghyun Kim
...
Rameswar Panda
Raja Giryes
Rogerio Feris
S. Ullman
Leonid Karlinsky
VLM
CoGe
31
52
0
31 May 2023
Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Aleksandr Beznosikov
David Dobre
Gauthier Gidel
25
5
0
23 Apr 2023
FETA: Towards Specializing Foundation Models for Expert Task Applications
Amit Alfassy
Assaf Arbelle
Oshri Halimi
Sivan Harary
Roei Herzig
...
Christoph Auer
Kate Saenko
Peter W. J. Staar
Rogerio Feris
Leonid Karlinsky
21
19
0
08 Sep 2022
Creating Multimedia Summaries Using Tweets and Videos
Anietie U Andy
Siyi Liu
Daphne Ippolito
Reno Kriz
Chris Callison-Burch
Derry Wijaya
11
0
0
16 Mar 2022
A multimodal deep learning framework for scalable content based visual media retrieval
Ambareesh Ravi
Amith Nandakumar
11
3
0
18 May 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
16
33
0
18 Mar 2021
Rethinking movie genre classification with fine-grained semantic clustering
Edward Fish
Jon Weinbren
Andrew Gilbert
VLM
34
7
0
04 Dec 2020
Identity-Aware Multi-Sentence Video Description
J. S. Park
Trevor Darrell
Anna Rohrbach
10
17
0
22 Aug 2020
Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval
Xun Yang
Jianfeng Dong
Yixin Cao
Xun Wang
Meng Wang
Tat-Seng Chua
12
137
0
06 Jul 2020
Clustering based Contrastive Learning for Improving Face Representations
Vivek Sharma
Makarand Tapaswi
M. Sarfraz
Rainer Stiefelhagen
CVBM
SSL
11
46
0
05 Apr 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
20
54
0
30 Mar 2020
Global Convergence of Frank Wolfe on One Hidden Layer Networks
Alexandre d’Aspremont
Mert Pilanci
9
4
0
06 Feb 2020
End-to-End Learning of Visual Representations from Uncurated Instructional Videos
Antoine Miech
Jean-Baptiste Alayrac
Lucas Smaira
Ivan Laptev
Josef Sivic
Andrew Zisserman
VGen
SSL
17
700
0
13 Dec 2019
Finding Moments in Video Collections Using Natural Language
Victor Escorcia
Mattia Soldan
Josef Sivic
Bernard Ghanem
Bryan C. Russell
20
6
0
30 Jul 2019
Domain-Specific Priors and Meta Learning for Few-Shot First-Person Action Recognition
Huseyin Coskun
Zeeshan Zia
Bugra Tekin
Federica Bogo
Nassir Navab
Federico Tombari
H. Sawhney
12
27
0
22 Jul 2019
Deep Discriminative Clustering Analysis
Jianlong Chang
Yiwen Guo
Lingfeng Wang
Gaofeng Meng
Shiming Xiang
Chunhong Pan
20
49
0
05 May 2019
RefineLoc: Iterative Refinement for Weakly-Supervised Action Localization
Alejandro Pardo
Humam Alwassel
Fabian Caba Heilbron
Ali K. Thabet
Bernard Ghanem
32
52
0
30 Mar 2019
M-VAD Names: a Dataset for Video Captioning with Naming
S. Pini
Marcella Cornia
Federico Bolelli
Lorenzo Baraldi
Rita Cucchiara
9
29
0
04 Mar 2019
Self-Supervised Learning of Face Representations for Video Face Clustering
Vivek Sharma
Makarand Tapaswi
M. Sarfraz
Rainer Stiefelhagen
SSL
CVBM
9
49
0
03 Mar 2019
Foreground Clustering for Joint Segmentation and Localization in Videos and Images
Abhishek Sharma
17
4
0
26 Nov 2018
A Joint Sequence Fusion Model for Video Question Answering and Retrieval
Youngjae Yu
Jongseok Kim
Gunhee Kim
29
339
0
07 Aug 2018
A flexible model for training action localization with varying levels of supervision
Guilhem Chéron
Jean-Baptiste Alayrac
Ivan Laptev
Cordelia Schmid
14
41
0
29 Jun 2018
Learning Multimodal Representations for Unseen Activities
A. Piergiovanni
Michael S. Ryoo
SSL
9
4
0
21 Jun 2018
Learning a Text-Video Embedding from Incomplete and Heterogeneous Data
Antoine Miech
Ivan Laptev
Josef Sivic
11
233
0
07 Apr 2018
Classification from Pairwise Similarity and Unlabeled Data
Han Bao
Gang Niu
Masashi Sugiyama
165
87
0
12 Feb 2018
Multimodal Visual Concept Learning with Weakly Supervised Techniques
Giorgos Bouritsas
Petros Koutras
Athanasia Zlatintsi
Petros Maragos
14
7
0
03 Dec 2017
Weakly-supervised learning of visual relations
Julia Peyre
Ivan Laptev
Cordelia Schmid
Josef Sivic
11
193
0
29 Jul 2017