ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2026 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2303.17839
  4. Cited By
Learning Procedure-aware Video Representation from Instructional Videos
  and Their Narrations

Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations

Computer Vision and Pattern Recognition (CVPR), 2023
31 March 2023
Yiwu Zhong
Licheng Yu
Yang Bai
Shangwen Li
Xueting Yan
Yin Li
    AI4TS
ArXiv (abs)PDFHTMLGithub (53★)

Papers citing "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"

33 / 33 papers shown
Learning Skill-Attributes for Transferable Assessment in Video
Learning Skill-Attributes for Transferable Assessment in Video
Kumar Ashutosh
Kristen Grauman
229
2
0
17 Nov 2025
Learning to Recognize Correctly Completed Procedure Steps in Egocentric Assembly Videos through Spatio-Temporal Modeling
Learning to Recognize Correctly Completed Procedure Steps in Egocentric Assembly Videos through Spatio-Temporal ModelingComputer Vision and Image Understanding (CVIU), 2025
Tim J. Schoonbeek
Shao-Hsuan Hung
Dan Lehman
H. Onvlee
Jacek Kustra
Peter H. N. de With
Fons van der Sommen
177
0
0
14 Oct 2025
LEGO Co-builder: Exploring Fine-Grained Vision-Language Modeling for Multimodal LEGO Assembly Assistants
LEGO Co-builder: Exploring Fine-Grained Vision-Language Modeling for Multimodal LEGO Assembly Assistants
Haochen Huang
Jiahuan Pei
Mohammad Aliannejadi
Xin Sun
Moonisa Ahsan
Chuang Yu
Zhaochun Ren
Pablo César
Junxiao Wang
VLM
269
2
0
07 Jul 2025
Vision Generalist Model: A Survey
Vision Generalist Model: A SurveyInternational Journal of Computer Vision (IJCV), 2025
Ziyi Wang
Yongming Rao
Shuofeng Sun
Xinrun Liu
Yi Wei
...
Zuyan Liu
Yanbo Wang
Hongmin Liu
Jie Zhou
Jiwen Lu
320
0
0
11 Jun 2025
EgoVIS@CVPR: What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
EgoVIS@CVPR: What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung
Frangil Ramirez
Juhyung Ha
Yi-Ting Chen
David J. Crandall
Yi-Hsuan Tsai
384
0
0
30 May 2025
HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos
HiERO: understanding the hierarchy of human behavior enhances reasoning on egocentric videos
Simone Alberto Peirone
Francesca Pistilli
Giuseppe Averta
452
2
0
19 May 2025
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Memory-efficient Streaming VideoLLMs for Real-time Procedural Video Understanding
Dibyadip Chatterjee
Edoardo Remelli
Yale Song
Bugra Tekin
Abhay Mittal
...
Shreyas Hampali
Eric Sauser
Shugao Ma
Angela Yao
Fadime Sener
VLM
302
6
0
10 Apr 2025
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung
Frangil Ramirez
Juhyung Ha
Yi-Ting Chen
David J. Crandall
Yi-Hsuan Tsai
761
2
0
27 Mar 2025
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering
Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering
Thanh-Son Nguyen
Hong Yang
Tzeh Yuan Neoh
Hao Zhang
Ee Yeo Keat
Basura Fernando
NAI
534
3
0
19 Mar 2025
Stitch-a-Demo: Video Demonstrations from Multistep Descriptions
Stitch-a-Demo: Video Demonstrations from Multistep Descriptions
Chi Hsuan Wu
Kumar Ashutosh
Kristen Grauman
VGen3DV
364
1
0
18 Mar 2025
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary
VLog: Video-Language Models by Generative Retrieval of Narration VocabularyComputer Vision and Pattern Recognition (CVPR), 2025
Kevin Qinghong Lin
Mike Zheng Shou
VGen
1.1K
5
0
12 Mar 2025
Learning to Generate Long-term Future Narrations Describing Activities of Daily Living
Learning to Generate Long-term Future Narrations Describing Activities of Daily Living
Ramanathan Rajendiran
Debaditya Roy
Basura Fernando
VGen
369
1
0
03 Mar 2025
Task Graph Maximum Likelihood Estimation for Procedural Activity Understanding in Egocentric Videos
Task Graph Maximum Likelihood Estimation for Procedural Activity Understanding in Egocentric Videos
Luigi Seminara
G. Farinella
Antonino Furnari
323
1
0
25 Feb 2025
Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos
Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric VideosNeural Information Processing Systems (NeurIPS), 2024
Luigi Seminara
G. Farinella
Antonino Furnari
654
27
0
10 Jan 2025
ACE: Action Concept Enhancement of Video-Language Models in Procedural
  Videos
ACE: Action Concept Enhancement of Video-Language Models in Procedural VideosIEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Reza Ghoddoosian
Nakul Agarwal
Isht Dwivedi
Behzad Darisuh
317
0
0
23 Nov 2024
TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos
TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos
Leonardo Plini
Luca Scofano
Edoardo De Matteis
Guido Maria DÁmely di Melendugno
Alessandro Flaborea
Andrea Sanchietti
G. Farinella
Fabio Galasso
Antonino Furnari
LRMEgoV
452
9
0
04 Nov 2024
Human Action Anticipation: A Survey
Human Action Anticipation: A Survey
Bolin Lai
Sam Toyer
Tushar Nagarajan
Rohit Girdhar
S. Zha
James M. Rehg
Kris Kitani
Kristen Grauman
Ruta Desai
Miao Liu
AI4TS
404
10
0
17 Oct 2024
Enhancing Temporal Modeling of Video LLMs via Time Gating
Enhancing Temporal Modeling of Video LLMs via Time GatingConference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Zi-Yuan Hu
Yiwu Zhong
Shijia Huang
Michael R. Lyu
Liwei Wang
VLM
231
8
0
08 Oct 2024
VEDIT: Latent Prediction Architecture For Procedural Video
  Representation Learning
VEDIT: Latent Prediction Architecture For Procedural Video Representation LearningInternational Conference on Learning Representations (ICLR), 2024
Han Lin
Tushar Nagarajan
Nicolas Ballas
Mido Assran
Mojtaba Komeili
Joey Tianyi Zhou
Koustuv Sinha
AI4TS
362
8
0
04 Oct 2024
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths
  Vision Computation
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision ComputationNeural Information Processing Systems (NeurIPS), 2024
Shiwei Wu
Joya Chen
Kevin Qinghong Lin
Qimeng Wang
Yan Gao
Qianli Xu
Tong Xu
Yao Hu
Enhong Chen
Mike Zheng Shou
VLM
313
46
0
29 Aug 2024
ExpertAF: Expert Actionable Feedback from Video
ExpertAF: Expert Actionable Feedback from VideoComputer Vision and Pattern Recognition (CVPR), 2024
Kumar Ashutosh
Tushar Nagarajan
Georgios Pavlakos
Kris Kitani
Kristen Grauman
VGen
511
13
0
01 Aug 2024
VideoLLM-online: Online Video Large Language Model for Streaming Video
VideoLLM-online: Online Video Large Language Model for Streaming Video
Joya Chen
Zhaoyang Lv
Shiwei Wu
Kevin Qinghong Lin
Chenan Song
Difei Gao
Jia-Wei Liu
Ziteng Gao
Dongxing Mao
Mike Zheng Shou
MLLMMoMe
348
155
0
17 Jun 2024
Learning Object States from Actions via Large Language Models
Learning Object States from Actions via Large Language Models
Masatoshi Tateno
Takuma Yagi
Ryosuke Furuta
Yoichi Sato
166
2
0
02 May 2024
PREGO: online mistake detection in PRocedural EGOcentric videos
PREGO: online mistake detection in PRocedural EGOcentric videosComputer Vision and Pattern Recognition (CVPR), 2024
Alessandro Flaborea
Guido Maria DÁmely di Melendugno
Leonardo Plini
Luca Scofano
Edoardo De Matteis
Antonino Furnari
G. Farinella
Yuta Kyuragi
EgoV
346
35
0
02 Apr 2024
Beyond Embeddings: The Promise of Visual Table in Visual Reasoning
Beyond Embeddings: The Promise of Visual Table in Visual Reasoning
Yiwu Zhong
Zi-Yuan Hu
Michael R. Lyu
Liwei Wang
292
7
0
27 Mar 2024
Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of
  Instructional Videos
Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos
Kumaranage Ravindu Yasas Nagasinghe
Honglu Zhou
Malitha Gunawardhana
Martin Renqiang Min
Daniel Harari
Muhammad Haris Khan
339
17
0
05 Mar 2024
OSCaR: Object State Captioning and State Change Representation
OSCaR: Object State Captioning and State Change Representation
Nguyen Nguyen
Jing Bi
Ali Vosoughi
Yapeng Tian
Pooyan Fazli
Chenliang Xu
682
16
0
27 Feb 2024
CI w/o TN: Context Injection without Task Name for Procedure Planning
CI w/o TN: Context Injection without Task Name for Procedure Planning
Xinjie Li
302
0
0
23 Feb 2024
Detours for Navigating Instructional Videos
Detours for Navigating Instructional VideosComputer Vision and Pattern Recognition (CVPR), 2024
Kumar Ashutosh
Zihui Xue
Tushar Nagarajan
Kristen Grauman
645
8
0
03 Jan 2024
GenHowTo: Learning to Generate Actions and State Transformations from
  Instructional Videos
GenHowTo: Learning to Generate Actions and State Transformations from Instructional VideosComputer Vision and Pattern Recognition (CVPR), 2023
Tomávs Souvcek
Dima Damen
Michael Wray
Ivan Laptev
Josef Sivic
VGen
303
45
0
12 Dec 2023
Masked Diffusion with Task-awareness for Procedure Planning in
  Instructional Videos
Masked Diffusion with Task-awareness for Procedure Planning in Instructional Videos
Fen Fang
Yun Liu
Ali Koksal
Qianli Xu
Joo-Hwee Lim
VGenDiffM
322
6
0
14 Sep 2023
Video-Mined Task Graphs for Keystep Recognition in Instructional Videos
Video-Mined Task Graphs for Keystep Recognition in Instructional VideosNeural Information Processing Systems (NeurIPS), 2023
Kumar Ashutosh
Santhosh Kumar Ramakrishnan
Triantafyllos Afouras
Kristen Grauman
370
44
0
17 Jul 2023
Learning to Ground Instructional Articles in Videos through Narrations
Learning to Ground Instructional Articles in Videos through NarrationsIEEE International Conference on Computer Vision (ICCV), 2023
E. Mavroudi
Triantafyllos Afouras
Lorenzo Torresani
DiffM
303
27
0
06 Jun 2023
1
Page 1 of 1