ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1812.05038
  4. Cited By
Long-Term Feature Banks for Detailed Video Understanding

Long-Term Feature Banks for Detailed Video Understanding

12 December 2018
Chao-Yuan Wu
Christoph Feichtenhofer
Haoqi Fan
Kaiming He
Philipp Krahenbuhl
Ross B. Girshick
ArXivPDFHTML

Papers citing "Long-Term Feature Banks for Detailed Video Understanding"

50 / 306 papers shown
Title
COPILOT: Human-Environment Collision Prediction and Localization from
  Egocentric Videos
COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos
Boxiao Pan
Bokui Shen
Davis Rempe
Despoina Paschalidou
Kaichun Mo
Yanchao Yang
Leonidas J. Guibas
15
2
0
04 Oct 2022
Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain
  Supervision for Domain-adaptive Action Detection
Exploiting Instance-based Mixed Sampling via Auxiliary Source Domain Supervision for Domain-adaptive Action Detection
Yifan Lu
Gurkirt Singh
Suman Saha
Luc Van Gool
TTA
29
2
0
28 Sep 2022
Visual Object Tracking in First Person Vision
Visual Object Tracking in First Person Vision
Matteo Dunnhofer
Antonino Furnari
G. Farinella
C. Micheloni
27
33
0
27 Sep 2022
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video
  Temporal Grounding
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding
Zhijian Hou
Wanjun Zhong
Lei Ji
Difei Gao
Kun Yan
W. Chan
Chong-Wah Ngo
Zheng Shou
Nan Duan
AI4TS
27
24
0
22 Sep 2022
MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic
  Segmentation
MCIBI++: Soft Mining Contextual Information Beyond Image for Semantic Segmentation
Zhenchao Jin
Dongdong Yu
Zehuan Yuan
Lequan Yu
38
21
0
09 Sep 2022
Spatio-Temporal Action Detection Under Large Motion
Spatio-Temporal Action Detection Under Large Motion
Gurkirt Singh
Vasileios Choutas
Suman Saha
F. I. F. Richard Yu
Luc Van Gool
18
12
0
06 Sep 2022
A comprehensive survey on recent deep learning-based methods applied to
  surgical data
A comprehensive survey on recent deep learning-based methods applied to surgical data
Mansoor Ali
Rafael Martinez Garcia Peña
Gilberto Ochoa-Ruiz
Sharib Ali
12
6
0
03 Sep 2022
Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action
  Recognition
Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition
Tianjiao Li
Lin Geng Foo
Qiuhong Ke
Hossein Rahmani
Anran Wang
Jinghua Wang
J. Liu
19
21
0
03 Sep 2022
A Circular Window-based Cascade Transformer for Online Action Detection
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
34
6
0
30 Aug 2022
Identifying Auxiliary or Adversarial Tasks Using Necessary Condition
  Analysis for Adversarial Multi-task Video Understanding
Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding
Stephen Su
Sam Kwong
Qingyu Zhao
De-An Huang
Juan Carlos Niebles
Ehsan Adeli
21
0
0
22 Aug 2022
EgoEnv: Human-centric environment representations from egocentric video
EgoEnv: Human-centric environment representations from egocentric video
Tushar Nagarajan
Santhosh Kumar Ramakrishnan
Ruta Desai
James M. Hillis
Kristen Grauman
EgoV
21
19
0
22 Jul 2022
Is an Object-Centric Video Representation Beneficial for Transfer?
Is an Object-Centric Video Representation Beneficial for Transfer?
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
ViT
31
26
0
20 Jul 2022
ViGAT: Bottom-up event recognition and explanation in video using
  factorized graph attention network
ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network
Nikolaos Gkalelis
Dimitrios Daskalakis
Vasileios Mezaris
8
10
0
20 Jul 2022
Learning Sequence Representations by Non-local Recurrent Neural Memory
Learning Sequence Representations by Non-local Recurrent Neural Memory
Wenjie Pei
Xin Feng
Canmiao Fu
Qi Cao
Guangming Lu
Yu-Wing Tai
AI4TS
19
1
0
20 Jul 2022
Learning from Label Relationships in Human Affect
Learning from Label Relationships in Human Affect
Niki Maria Foteinopoulou
Ioannis Patras
CVBM
25
8
0
12 Jul 2022
Beyond Transfer Learning: Co-finetuning for Action Localisation
Beyond Transfer Learning: Co-finetuning for Action Localisation
Anurag Arnab
Xuehan Xiong
A. Gritsenko
Rob Romijnders
Josip Djolonga
Mostafa Dehghani
Chen Sun
Mario Lucic
Cordelia Schmid
25
8
0
08 Jul 2022
Explore Spatio-temporal Aggregation for Insubstantial Object Detection:
  Benchmark Dataset and Baseline
Explore Spatio-temporal Aggregation for Insubstantial Object Detection: Benchmark Dataset and Baseline
Kailai Zhou
Yibo Wang
Tao Lv
Yunqian Li
Linsen Chen
Qiu Shen
Xun Cao
17
10
0
23 Jun 2022
One-stage Action Detection Transformer
One-stage Action Detection Transformer
Lijun Li
Lian Zhuo
Bangyin Zhang
ViT
22
0
0
21 Jun 2022
It's Time for Artistic Correspondence in Music and Video
It's Time for Artistic Correspondence in Music and Video
Dídac Surís
Carl Vondrick
Bryan C. Russell
Justin Salamon
11
37
0
14 Jun 2022
A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal
  Action Detector
A Simple and Efficient Pipeline to Build an End-to-End Spatial-Temporal Action Detector
Lin Sui
Chen-Da Liu-Zhang
Lixin Gu
Feng Han
22
8
0
07 Jun 2022
Revisiting the "Video" in Video-Language Understanding
Revisiting the "Video" in Video-Language Understanding
S. Buch
Cristobal Eyzaguirre
Adrien Gaidon
Jiajun Wu
L. Fei-Fei
Juan Carlos Niebles
27
155
0
03 Jun 2022
A CLIP-Hitchhiker's Guide to Long Video Retrieval
A CLIP-Hitchhiker's Guide to Long Video Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
CLIP
121
62
0
17 May 2022
Retrieval-Enhanced Machine Learning
Retrieval-Enhanced Machine Learning
Hamed Zamani
Fernando Diaz
Mostafa Dehghani
Donald Metzler
Michael Bendersky
11
49
0
02 May 2022
The Wisdom of Crowds: Temporal Progressive Attention for Early Action
  Prediction
The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction
Alexandros Stergiou
Dima Damen
AI4TS
EgoV
EDL
17
7
0
28 Apr 2022
Temporal Relevance Analysis for Video Action Models
Temporal Relevance Analysis for Video Action Models
Quanfu Fan
Donghyun Kim
Chun-Fu Chen
Chen
Stan Sclaroff
Kate Saenko
Sarah Adel Bargal
FAtt
22
0
0
25 Apr 2022
A Multi-Person Video Dataset Annotation Method of Spatio-Temporally
  Actions
A Multi-Person Video Dataset Annotation Method of Spatio-Temporally Actions
Fan Yang
16
5
0
21 Apr 2022
THORN: Temporal Human-Object Relation Network for Action Recognition
THORN: Temporal Human-Object Relation Network for Action Recognition
Mohammed Guermal
Rui Dai
F. Brémond
EgoV
14
3
0
20 Apr 2022
LaMemo: Language Modeling with Look-Ahead Memory
LaMemo: Language Modeling with Look-Ahead Memory
Haozhe Ji
Rongsheng Zhang
Zhenyu Yang
Zhipeng Hu
Minlie Huang
KELM
RALM
CLL
11
3
0
15 Apr 2022
SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric
  Action Recognition
SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition
Victor Escorcia
Ricardo Guerrero
Xiatian Zhu
Brais Martínez
EgoV
28
9
0
10 Apr 2022
E^2TAD: An Energy-Efficient Tracking-based Action Detector
E^2TAD: An Energy-Efficient Tracking-based Action Detector
Xin Hu
Zhenyu Wu
Haoyuan Miao
Siqi Fan
Taiyu Long
...
Pengcheng Pi
Yi Wu
Zhou Ren
Zhangyang Wang
G. Hua
19
2
0
09 Apr 2022
Hierarchical Self-supervised Representation Learning for Movie
  Understanding
Hierarchical Self-supervised Representation Learning for Movie Understanding
Fanyi Xiao
Kaustav Kundu
Joseph Tighe
Davide Modolo
SSL
37
24
0
06 Apr 2022
Learning from Untrimmed Videos: Self-Supervised Video Representation
  Learning with Hierarchical Consistency
Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Yi Tian Xu
Xiang Wang
Mingqian Tang
Changxin Gao
Rong Jin
Nong Sang
SSL
AI4TS
23
17
0
06 Apr 2022
TALLFormer: Temporal Action Localization with a Long-memory Transformer
TALLFormer: Temporal Action Localization with a Long-memory Transformer
Feng Cheng
Gedas Bertasius
ViT
24
91
0
04 Apr 2022
Exploiting Temporal Relations on Radar Perception for Autonomous Driving
Exploiting Temporal Relations on Radar Perception for Autonomous Driving
Peizhao Li
Puzuo Wang
K. Berntorp
Hongfu Liu
19
43
0
03 Apr 2022
A-ACT: Action Anticipation through Cycle Transformations
A-ACT: Action Anticipation through Cycle Transformations
Akash Gupta
Jingen Liu
Liefeng Bo
A. Roy-Chowdhury
Tao Mei
20
5
0
02 Apr 2022
MeMOT: Multi-Object Tracking with Memory
MeMOT: Multi-Object Tracking with Memory
Jiarui Cai
Mingze Xu
Wei Li
Yuanjun Xiong
Wei Xia
Z. Tu
Stefano Soatto
VOT
27
148
0
31 Mar 2022
Stochastic Backpropagation: A Memory Efficient Strategy for Training
  Video Models
Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
Feng Cheng
Ming Xu
Yuanjun Xiong
Hao Chen
Xinyu Li
Wei Li
Wei Xia
14
16
0
31 Mar 2022
Global Tracking Transformers
Global Tracking Transformers
Xingyi Zhou
Tianwei Yin
V. Koltun
Philipp Krahenbuhl
VOT
21
133
0
24 Mar 2022
Point3D: tracking actions as moving points with 3D CNNs
Point3D: tracking actions as moving points with 3D CNNs
Shentong Mo
Jingfei Xia
Xiaoqing Ellen Tan
Bhiksha Raj
3DPC
18
5
0
20 Mar 2022
Local-Global Context Aware Transformer for Language-Guided Video
  Segmentation
Local-Global Context Aware Transformer for Language-Guided Video Segmentation
Chen Liang
Wenguan Wang
Tianfei Zhou
Jiaxu Miao
Yawei Luo
Yi Yang
VOS
24
74
0
18 Mar 2022
Gate-Shift-Fuse for Video Action Recognition
Gate-Shift-Fuse for Video Action Recognition
Swathikiran Sudhakaran
Sergio Escalera
O. Lanz
20
22
0
16 Mar 2022
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
HAKE: A Knowledge Engine Foundation for Human Activity Understanding
Yong-Lu Li
Xinpeng Liu
Xiaoqian Wu
Yizhuo Li
Zuoyu Qiu
Liang Xu
Yue Xu
Haoshu Fang
Cewu Lu
24
38
0
14 Feb 2022
OWL (Observe, Watch, Listen): Audiovisual Temporal Context for
  Localizing Actions in Egocentric Videos
OWL (Observe, Watch, Listen): Audiovisual Temporal Context for Localizing Actions in Egocentric Videos
Merey Ramazanova
Victor Escorcia
Fabian Caba Heilbron
Chen Zhao
Bernard Ghanem
20
3
0
10 Feb 2022
A Coding Framework and Benchmark towards Compressed Video Understanding
A Coding Framework and Benchmark towards Compressed Video Understanding
Yuan Tian
Guo Lu
Yichao Yan
Guangtao Zhai
L. Chen
Zhiyong Gao
33
21
0
06 Feb 2022
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient
  Long-Term Video Recognition
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition
Chao-Yuan Wu
Yanghao Li
K. Mangalam
Haoqi Fan
Bo Xiong
Jitendra Malik
Christoph Feichtenhofer
ViT
32
198
0
20 Jan 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
20
11
0
17 Jan 2022
Video Transformers: A Survey
Video Transformers: A Survey
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
20
103
0
16 Jan 2022
Hand-Object Interaction Reasoning
Hand-Object Interaction Reasoning
Jian Ma
Dima Damen
19
7
0
13 Jan 2022
Multiview Transformers for Video Recognition
Multiview Transformers for Video Recognition
Shen Yan
Xuehan Xiong
Anurag Arnab
Zhichao Lu
Mi Zhang
Chen Sun
Cordelia Schmid
ViT
24
211
0
12 Jan 2022
ACGNet: Action Complement Graph Network for Weakly-supervised Temporal
  Action Localization
ACGNet: Action Complement Graph Network for Weakly-supervised Temporal Action Localization
Zichen Yang
Jie Qin
Di Huang
20
56
0
21 Dec 2021
Previous
1234567
Next