1907.13487
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
31 July 2019
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
Papers citing "Use What You Have: Video Retrieval Using Representations From Collaborative Experts"
50 / 214 papers shown
Title
VIOLET : End-to-End Video-Language Transformers with Masked Visual-token Modeling
Tsu-jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
W. Wang
Lijuan Wang
Zicheng Liu
VLM
34
216
0
24 Nov 2021
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions
Hongwei Xue
Tiankai Hang
Yanhong Zeng
Yuchong Sun
Bei Liu
Huan Yang
Jianlong Fu
B. Guo
AI4TS
VLM
27
189
0
19 Nov 2021
Masking Modalities for Cross-modal Video Retrieval
Valentin Gabeur
Arsha Nagrani
Chen Sun
Alahari Karteek
Cordelia Schmid
11
29
0
01 Nov 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
22
10
0
29 Oct 2021
Domain Adaptation in Multi-View Embedding for Cross-Modal Video Retrieval
Jonathan Munro
Michael Wray
Diane Larlus
G. Csurka
Dima Damen
15
6
0
25 Oct 2021
Video and Text Matching with Conditioned Embeddings
Ameen Ali
Idan Schwartz
Tamir Hazan
Lior Wolf
83
13
0
21 Oct 2021
TEAM-Net: Multi-modal Learning for Video Action Recognition with Partial Decoding
Zhengwei Wang
Qi She
A. Smolic
16
9
0
17 Oct 2021
CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations
Mohammadreza Zolfaghari
Yi Zhu
Peter V. Gehler
Thomas Brox
125
127
0
30 Sep 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
245
558
0
28 Sep 2021
CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval
Zhijian Hou
Chong-Wah Ngo
W. Chan
11
38
0
21 Sep 2021
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss
Xingyi Cheng
Hezheng Lin
Xiangyu Wu
Fan Yang
Dong Shen
6
147
0
09 Sep 2021
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Jianwei Yang
Yonatan Bisk
Jianfeng Gao
19
136
0
23 Aug 2021
Learning to Cut by Watching Movies
Alejandro Pardo
Fabian Caba Heilbron
Juan Carlos León Alcázar
Ali K. Thabet
Bernard Ghanem
VGen
43
20
0
09 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
27
42
0
04 Aug 2021
HANet: Hierarchical Alignment Networks for Video-Text Retrieval
Peng Wu
Xiangteng He
Mingqian Tang
Yiliang Lv
Jing Liu
23
52
0
26 Jul 2021
Transcript to Video: Efficient Clip Sequencing from Texts
Yu Xiong
Fabian Caba Heilbron
Dahua Lin
CLIP
20
10
0
25 Jul 2021
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP
Han Fang
Pengfei Xiong
Luhui Xu
Yu Chen
CLIP
VLM
13
291
0
21 Jun 2021
Cross-Modal Discrete Representation Learning
Alexander H. Liu
SouYoung Jin
Cheng-I Jeff Lai
Andrew Rouditchenko
A. Oliva
James R. Glass
SSL
22
40
0
10 Jun 2021
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Mohit Bansal
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
24
100
0
08 Jun 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Rameswar Panda
Chun-Fu Chen
Quanfu Fan
Ximeng Sun
Kate Saenko
A. Oliva
Rogerio Feris
28
47
0
11 May 2021
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
19
77
0
05 May 2021
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
Brian Chen
Andrew Rouditchenko
Kevin Duarte
Hilde Kuehne
Samuel Thomas
...
Rogerio Feris
David F. Harwath
James R. Glass
M. Picheny
Shih-Fu Chang
SSL
30
89
0
26 Apr 2021
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Yanbei Chen
Yongqin Xian
A. Sophia Koepke
Ying Shan
Zeynep Akata
78
80
0
22 Apr 2021
T2VLAD: Global-Local Sequence Alignment for Text-Video Retrieval
Xiaohan Wang
Linchao Zhu
Yi Yang
151
169
0
20 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
16
82
0
19 Apr 2021
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
309
780
0
18 Apr 2021
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru
Simion-Vlad Bogolin
Marius Leordeanu
Hailin Jin
Andrew Zisserman
Samuel Albanie
Yang Liu
VGen
11
124
0
16 Apr 2021
Video-aided Unsupervised Grammar Induction
Songyang Zhang
Linfeng Song
Lifeng Jin
Kun Xu
Dong Yu
Jiebo Luo
20
26
0
09 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
34
1,125
0
01 Apr 2021
Memory Enhanced Embedding Learning for Cross-Modal Video-Text Retrieval
Rui Zhao
Kecheng Zheng
Zhengjun Zha
Hongtao Xie
Jiebo Luo
22
3
0
29 Mar 2021
HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval
Song Liu
Haoqi Fan
Shengsheng Qian
Yiru Chen
Wenkui Ding
Zhongyuan Wang
22
145
0
28 Mar 2021
Self-Supervised Learning in Multi-Task Graphs through Iterative Consensus Shift
Emanuela Haller
Elena Burceanu
Marius Leordeanu
SSL
11
2
0
26 Mar 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
13
128
0
19 Mar 2021
On Semantic Similarity in Video Retrieval
Michael Wray
Hazel Doughty
Dima Damen
21
66
0
18 Mar 2021
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models
Po-Yao (Bernie) Huang
Mandela Patrick
Junjie Hu
Graham Neubig
Florian Metze
Alexander G. Hauptmann
MLLM
VLM
19
56
0
16 Mar 2021
Open-book Video Captioning with Retrieve-Copy-Generate Network
Ziqi Zhang
Zhongang Qi
C. Yuan
Ying Shan
Bing Li
Ying Deng
Weiming Hu
23
92
0
09 Mar 2021
A Straightforward Framework For Video Retrieval Using CLIP
Jesús Andrés Portillo-Quintero
J. C. Ortíz-Bayliss
Hugo Terashima-Marín
CLIP
316
116
0
24 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Mohit Bansal
Jingjing Liu
CLIP
32
645
0
11 Feb 2021
Look Before you Speak: Visually Contextualized Utterances
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
19
66
0
10 Dec 2020
Rethinking movie genre classification with fine-grained semantic clustering
Edward Fish
Jon Weinbren
Andrew Gilbert
VLM
34
7
0
04 Dec 2020
Top-1 CORSMAL Challenge 2020 Submission: Filling Mass Estimation Using Multi-modal Observations of Human-robot Handovers
Vladimir E. Iashin
Francesca Palermo
Gokhan Solak
Claudio Coppola
9
10
0
02 Dec 2020
SEA: Sentence Encoder Assembly for Video Retrieval by Textual Queries
Xirong Li
Fangming Zhou
Chaoxi Xu
Jiaqi Ji
Gang Yang
6
51
0
24 Nov 2020
QuerYD: A video dataset with high-quality text and audio narrations
Andreea-Maria Oncescu
João F. Henriques
Yang Liu
Andrew Zisserman
Samuel Albanie
VGen
11
11
0
22 Nov 2020
The complementarity of a diverse range of deep learning features extracted from video content for video recommendation
A. Almeida
J. D. Villiers
A. Freitas
Mergandran Velayudan
11
16
0
21 Nov 2020
Graph Based Temporal Aggregation for Video Retrieval
Arvind Srinivasan
Aprameya Bharadwaj
Aveek Saha
Natarajan Subramanyam
11
0
0
04 Nov 2020
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Simon Ging
Mohammadreza Zolfaghari
Hamed Pirsiavash
Thomas Brox
ViT
CLIP
13
168
0
01 Nov 2020
Universal Weighting Metric Learning for Cross-Modal Matching
Jiwei Wei
Xing Xu
Yang Yang
Yanli Ji
Zheng Wang
Heng Tao Shen
13
87
0
07 Oct 2020
Support-set bottlenecks for video-text representation learning
Mandela Patrick
Po-Yao (Bernie) Huang
Yuki M. Asano
Florian Metze
Alexander G. Hauptmann
João Henriques
Andrea Vedaldi
20
242
0
06 Oct 2020
Semi-Supervised Learning for Multi-Task Scene Understanding by Neural Graph Consensus
Marius Leordeanu
Mihai Cristian Pîrvu
Dragos Costea
Alina Marcu
E. Slusanschi
Rahul Sukthankar
SSL
14
9
0
02 Oct 2020
Dual Encoding for Video Retrieval by Text
Jianfeng Dong
Xirong Li
Chaoxi Xu
Xun Yang
Gang Yang
Xun Wang
Meng Wang
8
2
0
10 Sep 2020