1705.02101
Cited By
TALL: Temporal Activity Localization via Language Query
5 May 2017
J. Gao
Chen Sun
Zhenheng Yang
Ram Nevatia
Papers citing "TALL: Temporal Activity Localization via Language Query"
50 / 420 papers shown
Target Adaptive Context Aggregation for Video Scene Graph Generation
Yao Teng
Limin Wang
Zhifeng Li
Gangshan Wu
29
62
0
18 Aug 2021
MTVR: Multilingual Moment Retrieval in Videos
Jie Lei
Tamara L. Berg
Mohit Bansal
8
11
0
30 Jul 2021
Adaptive Hierarchical Graph Reasoning with Semantic Coherence for Video-and-Language Inference
Juncheng Li
Siliang Tang
Linchao Zhu
Haochen Shi
Xuanwen Huang
Fei Wu
Yi Yang
Yueting Zhuang
17
28
0
26 Jul 2021
Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation
Jiabo Huang
Yang Liu
S. Gong
Hailin Jin
24
61
0
23 Jul 2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries
Jie Lei
Tamara L. Berg
Mohit Bansal
ViT
21
62
0
20 Jul 2021
End-to-end Multi-modal Video Temporal Grounding
Yi-Wen Chen
Yi-Hsuan Tsai
Ming-Hsuan Yang
11
51
0
12 Jul 2021
A Survey on Deep Learning Technique for Video Segmentation
Tianfei Zhou
Fatih Porikli
David J. Crandall
Luc Van Gool
Wenguan Wang
VOS
20
231
0
02 Jul 2021
Weakly Supervised Temporal Adjacent Network for Language Grounding
Yuechen Wang
Jiajun Deng
Wen-gang Zhou
Houqiang Li
24
67
0
30 Jun 2021
Building a Video-and-Language Dataset with Human Actions for Multimodal Logical Inference
Riko Suzuki
Hitomi Yanaka
K. Mineshima
D. Bekki
VGen
MLLM
16
1
0
27 Jun 2021
Listen As You Wish: Audio based Event Detection via Text-to-Audio Grounding in Smart Cities
Haoyu Tang
Yunxiao Wang
Jihua Zhu
Shuai Zhang
Mingzhu Xu
Qinghai Zheng
Yupeng Hu
14
1
0
27 Jun 2021
Video Moment Retrieval with Text Query Considering Many-to-Many Correspondence Using Potentially Relevant Pair
Sho Maeoki
Yusuke Mukuta
Tatsuya Harada
9
4
0
25 Jun 2021
Interventional Video Grounding with Dual Contrastive Learning
Guoshun Nan
Rui Qiao
Yao Xiao
Jun Liu
Sicong Leng
H. Zhang
Wei Lu
16
144
0
21 Jun 2021
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Mohit Bansal
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
24
100
0
08 Jun 2021
Deconfounded Video Moment Retrieval with Causal Intervention
Xun Yang
Fuli Feng
Wei Ji
Meng Wang
Tat-Seng Chua
CML
VGen
29
187
0
03 Jun 2021
Connecting Language and Vision for Natural Language-Based Vehicle Retrieval
Shuai Bai
Zhedong Zheng
Xiaohan Wang
Junyang Lin
Zhu Zhang
Chang Zhou
Yi Yang
Hongxia Yang
11
27
0
31 May 2021
Parallel Attention Network with Sequence Matching for Video Grounding
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Joey Tianyi Zhou
Rick Siow Mong Goh
16
40
0
18 May 2021
Video Corpus Moment Retrieval with Contrastive Learning
Hao Zhang
Aixin Sun
Wei Jing
Guoshun Nan
Liangli Zhen
Joey Tianyi Zhou
Rick Siow Mong Goh
33
81
0
13 May 2021
Aligning Subtitles in Sign Language Videos
Hannah Bull
Triantafyllos Afouras
Gül Varol
Samuel Albanie
Liliane Momeni
Andrew Zisserman
SLR
22
30
0
06 May 2021
Sign Segmentation with Changepoint-Modulated Pseudo-Labelling
Katrin Renz
N. Stache
Neil Fox
Gül Varol
Samuel Albanie
37
18
0
28 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
16
82
0
19 Apr 2021
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu
Tanmay Gupta
Mark Yatskar
Ram Nevatia
Aniruddha Kembhavi
VLM
20
68
0
02 Apr 2021
CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning
Luowei Zhou
Jingjing Liu
Yu Cheng
Zhe Gan
Lei Zhang
15
7
0
01 Apr 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
23
7
0
01 Apr 2021
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Daizong Liu
Xiaoye Qu
Jianfeng Dong
Pan Zhou
Yu Cheng
Wei Wei
Zichuan Xu
Yulai Xie
16
145
0
22 Mar 2021
Decoupled Spatial Temporal Graphs for Generic Visual Grounding
Qi Feng
Yunchao Wei
Mingming Cheng
Yi Yang
19
5
0
18 Mar 2021
Boundary Proposal Network for Two-Stage Natural Language Video Localization
Shaoning Xiao
Long Chen
Songyang Zhang
Wei Ji
Jian Shao
Lu Ye
Jun Xiao
10
160
0
15 Mar 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Mohit Bansal
Jingjing Liu
CLIP
32
646
0
11 Feb 2021
Progressive Localization Networks for Language-based Moment Localization
Qi Zheng
Jianfeng Dong
Xiaoye Qu
Xun Yang
Yabing Wang
Pan Zhou
Baolong Liu
Xun Wang
17
33
0
02 Feb 2021
A Closer Look at Temporal Sentence Grounding in Videos: Dataset and Metric
Yitian Yuan
Xiaohan Lan
Xin Wang
Long Chen
Zhi Wang
Wenwu Zhu
13
51
0
22 Jan 2021
Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language
Songyang Zhang
Houwen Peng
Jianlong Fu
Yijuan Lu
Jiebo Luo
19
51
0
04 Dec 2020
QuerYD: A video dataset with high-quality text and audio narrations
Andreea-Maria Oncescu
João F. Henriques
Yang Liu
Andrew Zisserman
Samuel Albanie
VGen
14
11
0
22 Nov 2020
Boundary-sensitive Pre-training for Temporal Localization in Videos
Mengmeng Xu
Juan-Manuel Perez-Rua
Victor Escorcia
Brais Martínez
Xiatian Zhu
Li Zhang
Bernard Ghanem
Tao Xiang
25
61
0
21 Nov 2020
VLG-Net: Video-Language Graph Matching Network for Video Grounding
Mattia Soldan
Mengmeng Xu
Sisi Qu
Jesper N. Tegnér
Bernard Ghanem
33
69
0
19 Nov 2020
A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus
Bowen Zhang
Hexiang Hu
Joonseok Lee
Mingde Zhao
Sheide Chammas
Vihan Jain
Eugene Ie
Fei Sha
25
30
0
18 Nov 2020
Data-efficient Alignment of Multimodal Sequences by Aligning Gradient Updates and Internal Feature Distributions
Jianan Wang
Boyang Albert Li
Xiangyu Fan
Jing-Hua Lin
Yanwei Fu
23
2
0
15 Nov 2020
Human-centric Spatio-Temporal Video Grounding With Visual Transformers
Zongheng Tang
Yue Liao
Si Liu
Guanbin Li
Xiaojie Jin
Hongxu Jiang
Qian Yu
Dong Xu
19
94
0
10 Nov 2020
Actor and Action Modular Network for Text-based Video Segmentation
Jianhua Yang
Yan Huang
K. Niu
Linjiang Huang
Zhanyu Ma
Liang Wang
11
9
0
02 Nov 2020
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Simon Ging
Mohammadreza Zolfaghari
Hamed Pirsiavash
Thomas Brox
ViT
CLIP
13
168
0
01 Nov 2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
14
72
0
15 Oct 2020
DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
Basura Fernando
Hongdong Li
Stephen Gould
129
11
0
13 Oct 2020
A Simple Yet Effective Method for Video Temporal Grounding with Cross-Modality Attention
Binjie Zhang
Yu Li
Chun Yuan
D. Xu
Pin Jiang
Ying Shan
6
5
0
23 Sep 2020
Frame-wise Cross-modal Matching for Video Moment Retrieval
Haoyu Tang
Jihua Zhu
Meng Liu
Zan Gao
Zhiyong Cheng
29
61
0
22 Sep 2020
Reinforcement Learning for Weakly Supervised Temporal Grounding of Natural Language in Untrimmed Videos
Jie Wu
Guanbin Li
Xiaoguang Han
Liang Lin
OffRL
AI4TS
11
56
0
18 Sep 2020
Linear Temporal Public Announcement Logic: a new perspective for reasoning about the knowledge of multi-classifiers
Amirhoshang Hoseinpour Dehkordi
Majid Alizadeh
A. Movaghar
6
0
0
08 Sep 2020
Video Moment Retrieval via Natural Language Queries
Xinli Yu
Mohsen Malmir
C. He
Yue Liu
Rex Wu
14
1
0
04 Sep 2020
Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
Mayu Otani
Yuta Nakashima
Esa Rahtu
J. Heikkilä
13
74
0
01 Sep 2020
Sentence Guided Temporal Modulation for Dynamic Video Thumbnail Generation
Mrigank Rochan
Mahesh Kumar Krishna Reddy
Yang Wang
8
7
0
31 Aug 2020
VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval
Minuk Ma
Sunjae Yoon
Junyeong Kim
Youngjoon Lee
Sunghun Kang
Chang-Dong Yoo
15
78
0
24 Aug 2020
Text-based Localization of Moments in a Video Corpus
Sudipta Paul
Niluthpol Chowdhury Mithun
A. Roy-Chowdhury
10
14
0
20 Aug 2020
Generating Adjacency Matrix for Video Relocalization
Yuanen Zhou
Mingfei Wang
Ruolin Wang
Shuwei Huo
17
0
0
19 Aug 2020