AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

23 May 2017
Chunhui Gu
Chen Sun
David A. Ross
Carl Vondrick
C. Pantofaru
Yeqing Li
Sudheendra Vijayanarasimhan
G. Toderici
Susanna Ricco
Rahul Sukthankar
Cordelia Schmid
Jitendra Malik
    VGen

Papers citing "AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions"

40 / 190 papers shown
Weakly Supervised Temporal Action Localization Using Deep Metric Learning
Ashraful Islam
Richard J. Radke
19
46
0
21 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
22
19
0
17 Jan 2020
Deep Audio-Visual Learning: A Survey
Hao Zhu
Mandi Luo
Rui Wang
A. Zheng
R. He
31
156
0
14 Jan 2020
Actions as Moving Points
Yixuan Li
Zixu Wang
Limin Wang
Gangshan Wu
22
104
0
14 Jan 2020
Action Genome: Actions as Composition of Spatio-temporal Scene Graphs
Jingwei Ji
Ranjay Krishna
Li Fei-Fei
Juan Carlos Niebles
39
335
0
15 Dec 2019
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization
Okan Kopuklu
Xiangyu Wei
Gerhard Rigoll
25
143
0
15 Nov 2019
A Graph-Based Framework to Bridge Movies and Synopses
Yu Xiong
Chengyi Zhang
Lingfeng Guo
Hang Zhou
Bolei Zhou
Dahua Lin
27
60
0
24 Oct 2019
Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models
César Roberto de Souza
Adrien Gaidon
Yohann Cabon
Naila Murray
A. Peña
31
14
0
12 Oct 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Rohit Girdhar
Deva Ramanan
19
176
0
10 Oct 2019
Weakly-supervised Action Localization with Background Modeling
P. Nguyen
Deva Ramanan
Charless C. Fowlkes
SSL
WSOL
22
157
0
19 Aug 2019
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
Laura Sevilla-Lara
Shengxin Cindy Zha
Zhicheng Yan
Vedanuj Goswami
Matt Feiszli
Lorenzo Torresani
32
75
0
19 Jul 2019
Deformable Tube Network for Action Detection in Videos
Wei Li
Zehuan Yuan
Dashan Guo
Lei Huang
Xiangzhong Fang
Changhu Wang
ViT
MedIm
20
5
0
03 Jul 2019
Learning Video Representations using Contrastive Bidirectional Transformer
Chen Sun
Fabien Baradel
Kevin Patrick Murphy
Cordelia Schmid
SSL
ViT
19
133
0
13 Jun 2019
TVQA+: Spatio-Temporal Grounding for Video Question Answering
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
31
227
0
25 Apr 2019
Large Scale Holistic Video Understanding
Ali Diba
Mohsen Fayyaz
Vivek Sharma
Manohar Paluri
Jurgen Gall
Rainer Stiefelhagen
Luc Van Gool
24
35
0
25 Apr 2019
STEP: Spatio-Temporal Progressive Learning for Video Action Detection
Xitong Yang
Xiaodong Yang
Ming-Yu Liu
Fanyi Xiao
L. Davis
Jan Kautz
27
138
0
19 Apr 2019
Dance with Flow: Two-in-One Stream Action Detection
Jiaojiao Zhao
Cees G. M. Snoek
12
83
0
01 Apr 2019
COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis
Yansong Tang
Dajun Ding
Yongming Rao
Yu Zheng
Danyang Zhang
Lili Zhao
Jiwen Lu
Jie Zhou
16
302
0
07 Mar 2019
IF-TTN: Information Fused Temporal Transformation Network for Video Action Recognition
Ke Yang
Peng Qiao
Dongsheng Li
Y. Dou
ViT
25
8
0
26 Feb 2019
Grounded Video Description
Luowei Zhou
Yannis Kalantidis
Xinlei Chen
Jason J. Corso
Marcus Rohrbach
27
190
0
17 Dec 2018
Action Machine: Rethinking Action Recognition in Trimmed Videos
Jiagang Zhu
Wei Zou
Liang Xu
Yiming Hu
Zheng Zhu
Manyu Chang
Junjie Huang
Guan Huang
Dalong Du
27
37
0
14 Dec 2018
Long-Term Feature Banks for Detailed Video Understanding
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
41
476
0
12 Dec 2018
A Structured Model For Action Detection
Yubo Zhang
P. Tokmakov
M. Hebert
Cordelia Schmid
25
101
0
09 Dec 2018
Video Action Transformer Network
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
ViT
28
702
0
06 Dec 2018
Multimodal Explanations by Predicting Counterfactuality in Videos
Atsushi Kanehira
Kentaro Takemoto
S. Inayoshi
Tatsuya Harada
18
35
0
04 Dec 2018
ARBEE: Towards Automated Recognition of Bodily Expression of Emotion In the Wild
Yu Luo
Jianbo Ye
Reginald B. Adams
Jia Li
M. Newman
J. Z. Wang
53
86
0
28 Aug 2018
Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions
Ke Ning
Linchao Zhu
Ming Cai
Yi Yang
Di Xie
Fei Wu
13
2
0
27 Aug 2018
Predicting Action Tubes
Gurkirt Singh
Suman Saha
Fabio Cuzzolin
ViT
24
22
0
23 Aug 2018
The ActivityNet Large-Scale Activity Recognition Challenge 2018 Summary
Bernard Ghanem
Juan Carlos Niebles
Cees G. M. Snoek
Fabian Caba Heilbron
Humam Alwassel
Victor Escorcia
Ranjay Krishna
S. Buch
Cuong Duc Dao
42
65
0
11 Aug 2018
Actor-Centric Relation Network
Chen Sun
Abhinav Shrivastava
Carl Vondrick
Kevin Patrick Murphy
Rahul Sukthankar
Cordelia Schmid
36
220
0
28 Jul 2018
Diagnosing Error in Temporal Action Detectors
Humam Alwassel
Fabian Caba Heilbron
Victor Escorcia
Bernard Ghanem
32
106
0
27 Jul 2018
A Better Baseline for AVA
Rohit Girdhar
João Carreira
Carl Doersch
Andrew Zisserman
13
66
0
26 Jul 2018
VideoCapsuleNet: A Simplified Network for Action Detection
Kevin Duarte
Y. S. Rawat
M. Shah
MedIm
24
165
0
21 May 2018
SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos
Silvio Giancola
Mohieddine Amine
Tarek Dghaily
Bernard Ghanem
AI4TS
19
193
0
12 Apr 2018
Let's Dance: Learning From Online Dance Videos
Daniel Castro
Steven Hickson
Patsorn Sangkloy
Bhavishya Mittal
Sean Dai
James Hays
Irfan Essa
27
24
0
23 Jan 2018
Moments in Time Dataset: one million videos for event understanding
Mathew Monfort
A. Andonian
Bolei Zhou
K. Ramakrishnan
Sarah Adel Bargal
...
L. Brown
Quanfu Fan
Dan Gutfreund
Carl Vondrick
A. Oliva
45
538
0
09 Jan 2018
Weakly Supervised Action Localization by Sparse Temporal Pooling Network
P. Nguyen
Ting Liu
Gautam Prasad
Bohyung Han
WSOL
19
347
0
14 Dec 2017
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
Saining Xie
Chen Sun
Jonathan Huang
Z. Tu
Kevin Patrick Murphy
3DH
11
1,308
0
13 Dec 2017
From Lifestyle Vlogs to Everyday Interactions
David Fouhey
Weicheng Kuo
Alexei A. Efros
Jitendra Malik
14
123
0
06 Dec 2017
Second-order Temporal Pooling for Action Recognition
A. Cherian
Stephen Gould
EgoV
11
29
0
23 Apr 2017