ResearchTrend.AI

arXiv: 2311.00180
Object-centric Video Representation for Long-term Action Anticipation

31 October 2023
Ce Zhang
Changcheng Fu
Shijie Wang
Nakul Agarwal
Kwonjoon Lee
Chiho Choi
Chen Sun

Papers citing "Object-centric Video Representation for Long-term Action Anticipation"

19 / 19 papers shown
Title
Action Anticipation from SoccerNet Football Video Broadcasts
Mohamad Dalal
Artur Xarles
A. Cioppa
Silvio Giancola
Marc Van Droogenbroeck
Bernard Ghanem
Albert Clapés
Sergio Escalera
T. Moeslund
AI4TS
36
0
0
16 Apr 2025
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
Manuel Benavent-Lledo
David Mulero-Pérez
David Ortiz-Perez
José García Rodríguez
Antonis Argyros
24
0
0
28 Oct 2024
Human Action Anticipation: A Survey
Bolin Lai
Sam Toyer
Tushar Nagarajan
Rohit Girdhar
S. Zha
James M. Rehg
Kris M. Kitani
Kristen Grauman
Ruta Desai
Miao Liu
AI4TS
41
1
0
17 Oct 2024
TR-LLM: Integrating Trajectory Data for Scene-Aware LLM-Based Human Action Prediction
Kojiro Takeyama
Yimeng Liu
Misha Sra
29
1
0
05 Oct 2024
Anticipation through Head Pose Estimation: a preliminary study
Federico Figari Tomenotti
Nicoletta Noceti
23
0
0
10 Aug 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Himangi Mittal
Nakul Agarwal
Shao-Yuan Lo
Kwonjoon Lee
41
14
0
30 May 2024
Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos
Junyi Ma
Jingyi Xu
Xieyuanli Chen
Hesheng Wang
VGen
32
7
0
07 May 2024
Uncertainty-boosted Robust Video Activity Anticipation
Zhaobo Qi
Shuhui Wang
Weigang Zhang
Qingming Huang
36
5
0
29 Apr 2024
DITTO: Demonstration Imitation by Trajectory Transformation
Nick Heppert
Max Argus
Tim Welschehold
Thomas Brox
Abhinav Valada
67
16
0
22 Mar 2024
VS-TransGRU: A Novel Transformer-GRU-based Framework Enhanced by Visual-Semantic Fusion for Egocentric Action Anticipation
Congqi Cao
Ze Sun
Qinyi Lv
Lingtong Min
Yanning Zhang
ViT
18
3
0
08 Jul 2023
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
22
14
0
22 Jan 2023
Omnivore: A Single Model for Many Visual Modalities
Rohit Girdhar
Mannat Singh
Nikhil Ravi
Laurens van der Maaten
Armand Joulin
Ishan Misra
223
225
0
20 Jan 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
232
1,024
0
13 Oct 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
225
899
0
28 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Yin Cui
Boqing Gong
ViT
248
577
0
22 Apr 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
280
1,982
0
09 Feb 2021
Feature Pyramid Networks for Object Detection
Tsung-Yi Lin
Piotr Dollár
Ross B. Girshick
Kaiming He
Bharath Hariharan
Serge J. Belongie
ObjD
183
21,819
0
09 Dec 2016
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
245
31,257
0
16 Jan 2013