Use What You Have: Video Retrieval Using Representations From Collaborative Experts

31 July 2019
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
arXiv: 1907.13487

Papers citing "Use What You Have: Video Retrieval Using Representations From Collaborative Experts"

50 / 214 papers shown
Towards Efficient Partially Relevant Video Retrieval with Active Moment Discovering
Peipei Song
L. Zhang
Long Lan
Weidong Chen
D. Guo
Xun Yang
Meng Wang
14
0
0
15 Apr 2025
TC-MGC: Text-Conditioned Multi-Grained Contrastive Learning for Text-Video Retrieval
Xiaolun Jing
Genke Yang
Jian Chu
26
0
0
07 Apr 2025
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval
Boseung Jeong
Jicheol Park
Sungyeon Kim
Suha Kwak
36
0
0
03 Apr 2025
Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval
Arun V. Reddy
Alexander Martin
Eugene Yang
Andrew Yates
Kate Sanders
Kenton W. Murray
Reno Kriz
Celso M. De Melo
Benjamin Van Durme
Rama Chellappa
48
1
0
24 Mar 2025
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
Zichen Liu
Kunlun Xu
Bing-Huang Su
Xu Zou
Yuxin Peng
Jiahuan Zhou
VLM
AI4TS
65
1
0
20 Mar 2025
Detection, Retrieval, and Explanation Unified: A Violence Detection System Based on Knowledge Graphs and GAT
Wen-Dong Jiang
Chih-Yung Chang
Diptendu Sinha Roy
36
0
0
07 Jan 2025
Hierarchical Banzhaf Interaction for General Video-Language Representation Learning
Peng Jin
H. Li
Li Yuan
Shuicheng Yan
Jie Chen
45
1
0
31 Dec 2024
TokenBinder: Text-Video Retrieval with One-to-Many Alignment Paradigm
Bingqing Zhang
Zhuo Cao
Heming Du
Xin Yu
Xue Li
Jiajun Liu
Sen Wang
VGen
23
0
0
30 Sep 2024
OneEncoder: A Lightweight Framework for Progressive Alignment of Modalities
Bilal Faye
Hanane Azzag
M. Lebbah
ObjD
28
0
0
17 Sep 2024
Dissecting Temporal Understanding in Text-to-Audio Retrieval
Andreea-Maria Oncescu
João F. Henriques
A. Sophia Koepke
26
2
0
01 Sep 2024
T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval
Yili Li
Jing Yu
Keke Gai
Bang Liu
Gang Xiong
Qi Wu
DiffM
VGen
28
2
0
21 Aug 2024
ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval
Ruixiang Zhao
Jian Jia
Yan Li
Xuehan Bai
Quan Chen
Han Li
Peng Jiang
Xirong Li
28
0
0
06 Aug 2024
SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval
Longtao Jiang
Min Wang
Zecheng Li
Yao Fang
Wen-gang Zhou
Houqiang Li
SLR
29
2
0
23 Jul 2024
Fine-Grained Scene Image Classification with Modality-Agnostic Adapter
Yiqun Wang
Zhao Zhou
Xiangcheng Du
Xingjiao Wu
Yingbin Zheng
Cheng Jin
36
0
0
03 Jul 2024
Multi-Scale Temporal Difference Transformer for Video-Text Retrieval
Ni Wang
Dongliang Liao
Xing Xu
20
0
0
23 Jun 2024
An Empirical Study of Excitation and Aggregation Design Adaptions in CLIP4Clip for Video-Text Retrieval
Xiaolun Jing
Genke Yang
Jian Chu
CLIP
29
1
0
25 May 2024
A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision
Charles Raude
Prajwal K R
Liliane Momeni
Hannah Bull
Samuel Albanie
Andrew Zisserman
Gül Varol
SLR
36
5
0
16 May 2024
Unified Video-Language Pre-training with Synchronized Audio
Shentong Mo
Haofan Wang
Huaxia Li
Xu Tang
30
2
0
12 May 2024
Learning text-to-video retrieval from image captioning
Lucas Ventura
Cordelia Schmid
Gül Varol
3DV
31
3
0
26 Apr 2024
ProTA: Probabilistic Token Aggregation for Text-Video Retrieval
Han Fang
Xianghao Zang
Chao Ban
Zerun Feng
Lanxiang Zhou
Zhongjiang He
Yongxiang Li
Hao Sun
27
1
0
18 Apr 2024
VideoDistill: Language-aware Vision Distillation for Video Question Answering
Bo Zou
Chao Yang
Yu Qiao
Chengbin Quan
Youjian Zhao
VGen
42
1
0
01 Apr 2024
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang
Guohao Sun
Pichao Wang
Dongfang Liu
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Zhiqiang Tao
VGen
55
20
0
26 Mar 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
70
177
0
29 Feb 2024
Event-aware Video Corpus Moment Retrieval
Danyang Hou
Liang Pang
Huawei Shen
Xueqi Cheng
20
1
0
21 Feb 2024
Video Editing for Video Retrieval
Bin Zhu
Kevin Flanagan
A. Fragomeni
Michael Wray
Dima Damen
CLIP
29
0
0
04 Feb 2024
SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks
Xingning Dong
Qingpei Guo
Tian Gan
Qing Wang
Jianlong Wu
Xiangyuan Ren
Yuan-Chia Cheng
Wei Chu
21
5
0
31 Jan 2024
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge
Huy Le
Tung Kieu
Anh Nguyen
Ngan Le
VGen
24
1
0
15 Dec 2023
Speaker-Text Retrieval via Contrastive Learning
Xuechen Liu
Xin Wang
Erica Cooper
Xiaoxiao Miao
Junichi Yamagishi
VLM
22
0
0
11 Dec 2023
Lost Your Style? Navigating with Semantic-Level Approach for Text-to-Outfit Retrieval
Junkyu Jang
Eugene Hwang
Sung-Hyuk Park
11
0
0
03 Nov 2023
An Empirical Study of Frame Selection for Text-to-Video Retrieval
Mengxia Wu
Min Cao
Yang Bai
Ziyin Zeng
Chen Chen
Liqiang Nie
Min Zhang
20
3
0
01 Nov 2023
Harvest Video Foundation Models via Efficient Post-Pretraining
Yizhuo Li
Kunchang Li
Yinan He
Yi Wang
Yali Wang
Limin Wang
Yu Qiao
Ping Luo
CLIP
VLM
VGen
38
2
0
30 Oct 2023
Sound of Story: Multi-modal Storytelling with Audio
Jaeyeon Bae
Seokhoon Jeong
Seokun Kang
Namgi Han
Jae-Yon Lee
Hyounghun Kim
Taehwan Kim
26
2
0
30 Oct 2023
InvGC: Robust Cross-Modal Retrieval by Inverse Graph Convolution
Xiangru Jian
Yimu Wang
25
4
0
20 Oct 2023
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks
Yimu Wang
Xiangru Jian
Bo Xue
22
9
0
17 Oct 2023
GMMFormer: Gaussian-Mixture-Model Based Transformer for Efficient Partially Relevant Video Retrieval
Yuting Wang
Jinpeng Wang
Bin Chen
Ziyun Zeng
Shu-Tao Xia
38
8
0
08 Oct 2023
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval
Hao Li
Marie-Jeanne Lesot
Lianli Gao
Xiaosu Zhu
Christophe Marsala
EDL
14
11
0
29 Sep 2023
Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection
Swapnil Bhosale
Abhra Chaudhuri
Alex Lee Robert Williams
Divyank Tiwari
Anjan Dutta
Xiatian Zhu
Pushpak Bhattacharyya
Diptesh Kanojia
28
2
0
29 Sep 2023
Video-adverb retrieval with compositional adverb-action embeddings
Thomas Hummel
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
17
1
0
26 Sep 2023
Unified Coarse-to-Fine Alignment for Video-Text Retrieval
Ziyang Wang
Yi-Lin Sung
Feng Cheng
Gedas Bertasius
Mohit Bansal
93
44
0
18 Sep 2023
In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval
Nina Shvetsova
Anna Kukleva
Bernt Schiele
Hilde Kuehne
DiffM
23
3
0
16 Sep 2023
Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval
Kaiyi Luo
Xulong Zhang
Jianzong Wang
Huaxiong Li
Ning Cheng
Jing Xiao
61
2
0
16 Sep 2023
ATM: Action Temporality Modeling for Video Question Answering
Junwen Chen
Jie Zhu
Yu Kong
19
1
0
05 Sep 2023
Simple Baselines for Interactive Video Retrieval with Questions and Answers
Kaiqu Liang
Samuel Albanie
22
2
0
21 Aug 2023
JEDI: Joint Expert Distillation in a Semi-Supervised Multi-Dataset Student-Teacher Scenario for Video Action Recognition
L. Bicsi
B. Alexe
Radu Tudor Ionescu
Marius Leordeanu
17
2
0
09 Aug 2023
TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval
Kaibin Tian
Rui Zhao
Hu Hu
Runquan Xie
Fengzong Lian
Zhanhui Kang
Xirong Li
CLIP
27
0
0
02 Aug 2023
PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data
Zheng Zhang
Zheng Ning
Chenliang Xu
Yapeng Tian
Toby Jia-Jun Li
59
6
0
27 Jul 2023
Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment
Sarah Ibrahimi
Xiaohang Sun
Pichao Wang
Amanmeet Garg
Ashutosh Sanan
Mohamed Omar
44
14
0
24 Jul 2023
Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model
Peng Wu
Jing Liu
Xiangteng He
Yuxin Peng
Peng Wang
Yanning Zhang
35
29
0
24 Jul 2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection
Qi Zhang
S. Zheng
Qin Jin
17
1
0
20 Jul 2023
Fine-grained Text-Video Retrieval with Frozen Image Encoders
Zuozhuo Dai
Fang Shao
Qingkun Su
Zilong Dong
Siyu Zhu
167
1
0
14 Jul 2023