ResearchTrend.AI
TALL: Temporal Activity Localization via Language Query

5 May 2017
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia

Papers citing "TALL: Temporal Activity Localization via Language Query"

50 / 420 papers shown
Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
Dezhao Luo
Jiabo Huang
Shaogang Gong
Hailin Jin
Yang Liu
VLM
19
9
0
01 Sep 2023
Distraction-free Embeddings for Robust VQA
Atharvan Dogra
Deeksha Varshney
A. Kalyan
A. Deshpande
Neeraj Kumar
14
0
0
31 Aug 2023
DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection
Henghao Zhao
Kevin Qinghong Lin
Rui Yan
Zechao Li
VGen
DiffM
31
1
0
29 Aug 2023
Multi-event Video-Text Retrieval
Gengyuan Zhang
Jisen Ren
Jindong Gu
Volker Tresp
19
13
0
22 Aug 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
24
53
0
21 Aug 2023
Temporal Sentence Grounding in Streaming Videos
Tian Gan
Xiao Wang
Yan Sun
Jianlong Wu
Qingpei Guo
Liqiang Nie
38
2
0
14 Aug 2023
Knowing Where to Focus: Event-aware Transformer for Video Grounding
Jinhyun Jang
Jungin Park
Jin-Hwa Kim
Hyeongjun Kwon
K. Sohn
16
49
0
14 Aug 2023
ViGT: Proposal-free Video Grounding with Learnable Token in Transformer
Kun Li
Dan Guo
Meng Wang
ViT
14
36
0
11 Aug 2023
Encode-Store-Retrieve: Enhancing Memory Augmentation through Language-Encoded Egocentric Perception
Junxiao Shen
John J. Dudley
Per Ola Kristensson
RALM
20
0
0
10 Aug 2023
Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization
Zezhong Lv
Bing-Huang Su
Ji-Rong Wen
16
16
0
10 Aug 2023
Local-Global Information Interaction Debiasing for Dynamic Scene Graph Generation
Xinyu Lyu
Jingwei Liu
Yuyu Guo
Lianli Gao
21
1
0
10 Aug 2023
D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation
Hanjun Li
Xiujun Shu
Su He
Ruizhi Qiao
Wei Wen
Taian Guo
Bei Gan
Xing Sun
12
11
0
08 Aug 2023
Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation
Renjie Liang
Yiming Yang
Hui Lu
Li Li
17
10
0
07 Aug 2023
UniVTG: Towards Unified Video-Language Temporal Grounding
Kevin Qinghong Lin
Pengchuan Zhang
Joya Chen
Shraman Pramanick
Difei Gao
Alex Jinpeng Wang
Rui Yan
Mike Zheng Shou
21
112
0
31 Jul 2023
Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures
Kun Yuan
V. Srivastav
Tong Yu
Joël L. Lavanchy
Pietro Mascagni
Nicolas Padoy
22
20
0
27 Jul 2023
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory
Hongxiang Li
Meng Cao
Xuxin Cheng
Yaowei Li
Zhihong Zhu
Yuexian Zou
24
20
0
26 Jul 2023
Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model
Peng Wu
Jing Liu
Xiangteng He
Yuxin Peng
Peng Wang
Yanning Zhang
35
29
0
24 Jul 2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection
Qi Zhang
S. Zheng
Qin Jin
17
1
0
20 Jul 2023
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick
Yale Song
Sayan Nag
Kevin Qinghong Lin
Hardik Shah
Mike Zheng Shou
Ramalingam Chellappa
Pengchuan Zhang
VLM
39
86
0
11 Jul 2023
MomentDiff: Generative Video Moment Retrieval from Random to Real
P. Li
Chen-Wei Xie
Hongtao Xie
Liming Zhao
Lei Zhang
Yun Zheng
Deli Zhao
Yongdong Zhang
DiffM
VGen
34
56
0
06 Jul 2023
Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment
Yongrae Jo
Seongyun Lee
Aiden Seung Joon Lee
Hyunji Lee
Hanseok Oh
Minjoon Seo
16
2
0
05 Jul 2023
SpotEM: Efficient Video Search for Episodic Memory
Santhosh Kumar Ramakrishnan
Ziad Al-Halah
Kristen Grauman
VLM
28
9
0
28 Jun 2023
Dissecting Multimodality in VideoQA Transformer Models by Impairing Modality Fusion
Isha Rawal
Alexander Matyasko
Shantanu Jaiswal
Basura Fernando
Cheston Tan
21
1
0
15 Jun 2023
A Survey on Video Moment Localization
Meng Liu
Liqiang Nie
Yunxiao Wang
Meng Wang
Yong Rui
27
28
0
13 Jun 2023
MS-DETR: Natural Language Video Localization with Sampling Moment-Moment Interaction
J. Wang
Aixin Sun
Hao Zhang
Xiaoli Li
ViT
19
13
0
30 May 2023
Deep Neural Networks in Video Human Action Recognition: A Review
Zihan Wang
Yang Yang
Zhi Liu
Y. Zheng
51
4
0
25 May 2023
Faster Video Moment Retrieval with Point-Level Supervision
Xun Jiang
Zailei Zhou
Xing Xu
Yang Yang
Guoqing Wang
Heng Tao Shen
29
13
0
23 May 2023
Movie101: A New Movie Understanding Benchmark
Zihao Yue
Qi Zhang
Anwen Hu
Liang Zhang
Ziheng Wang
Qin Jin
VGen
27
17
0
20 May 2023
Joint Moment Retrieval and Highlight Detection Via Natural Language Queries
Richard Luo
Austin Peng
Heidi Yap
Koby Beard
ViT
16
0
0
08 May 2023
Transform-Equivariant Consistency Learning for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Zichuan Xu
Haozhao Wang
Xing Di
Weining Lu
Yu Cheng
44
8
0
06 May 2023
TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis
Mathis Petrovich
Michael J. Black
Gül Varol
VGen
67
76
0
02 May 2023
MH-DETR: Video Moment and Highlight Detection with Cross-modal Transformer
Yifang Xu
Yunzhuo Sun
Yang Li
Yilei Shi
Xiaoxia Zhu
S. Du
ViT
40
33
0
29 Apr 2023
Boundary-Denoising for Video Activity Localization
Mengmeng Xu
Mattia Soldan
Jialin Gao
Shuming Liu
Juan-Manuel Perez-Rua
Bernard Ghanem
19
10
0
06 Apr 2023
Sketch-based Video Object Localization
Sangmin Woo
So-Yeong Jeon
Jinyoung Park
Minji Son
Sumin Lee
Changick Kim
11
0
0
02 Apr 2023
Learning Action Changes by Measuring Verb-Adverb Textual Relationships
Davide Moltisanti
Frank Keller
Hakan Bilen
Laura Sevilla-Lara
26
7
0
27 Mar 2023
Query-Dependent Video Representation for Moment Retrieval and Highlight Detection
WonJun Moon
Sangeek Hyun
S. Park
Dongchan Park
Jae-Pil Heo
ViT
41
106
0
24 Mar 2023
Scanning Only Once: An End-to-end Framework for Fast Temporal Grounding in Long Videos
Yulin Pan
Xiangteng He
Biao Gong
Yiliang Lv
Yujun Shen
Yuxin Peng
Deli Zhao
37
12
0
15 Mar 2023
You Can Ground Earlier than See: An Effective and Efficient Pipeline for Temporal Sentence Grounding in Compressed Videos
Xiang Fang
Daizong Liu
Pan Zhou
Guoshun Nan
23
37
0
14 Mar 2023
Generation-Guided Multi-Level Unified Network for Video Grounding
Xingyi Cheng
Xiangyu Wu
Dong Shen
Hezheng Lin
Fan Yang
19
0
0
14 Mar 2023
Align and Attend: Multimodal Summarization with Dual Contrastive Losses
Bo He
Jun Wang
Jielin Qiu
Trung Bui
Abhinav Shrivastava
Zhaowen Wang
22
65
0
13 Mar 2023
Towards Diverse Temporal Grounding under Single Positive Labels
Hao Zhou
Chongyang Zhang
Yanjun Chen
Chuanping Hu
24
1
0
12 Mar 2023
Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
Teng Wang
Jinrui Zhang
Feng Zheng
Wenhao Jiang
Ran Cheng
Ping Luo
VLM
31
11
0
11 Mar 2023
Text-Visual Prompting for Efficient 2D Temporal Video Grounding
Yimeng Zhang
Xin Chen
Jinghan Jia
Sijia Liu
Ke Ding
16
25
0
09 Mar 2023
Towards Generalisable Video Moment Retrieval: Visual-Dynamic Injection to Image-Text Pre-Training
Dezhao Luo
Jiabo Huang
S. Gong
Hailin Jin
Yang Liu
VGen
21
28
0
28 Feb 2023
Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video
Minsu Kim
Chae Won Kim
Y. Ro
CVBM
DiffM
30
3
0
27 Feb 2023
Localizing Moments in Long Video Via Multimodal Guidance
Wayner Barrios
Mattia Soldan
Alberto M. Ceballos-Arroyo
Fabian Caba Heilbron
Bernard Ghanem
22
20
0
26 Feb 2023
Tracking Objects and Activities with Attention for Temporal Sentence Grounding
Zeyu Xiong
Daizong Liu
Pan Zhou
Jiahao Zhu
13
5
0
21 Feb 2023
Constraint and Union for Partially-Supervised Temporal Sentence Grounding
Chen Ju
Haicheng Wang
Jinxian Liu
Chaofan Ma
Ya-Qin Zhang
Peisen Zhao
Jianlong Chang
Qi Tian
22
15
0
20 Feb 2023
MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal
E. Mavroudi
Xitong Yang
Sainbayar Sukhbaatar
Leonid Sigal
Matt Feiszli
Lorenzo Torresani
Du Tran
12
7
0
16 Feb 2023
Multi-video Moment Ranking with Multimodal Clue
Danyang Hou
Liang Pang
Yanyan Lan
Huawei Shen
Xueqi Cheng
11
0
0
29 Jan 2023