ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2408.00672
29
2

ExpertAF: Expert Actionable Feedback from Video

1 August 2024
Kumar Ashutosh
Tushar Nagarajan
Georgios Pavlakos
Kris M. Kitani
Kristen Grauman
    VGen
ArXivPDFHTML
Abstract

Feedback is essential for learning a new skill or improving one's current skill-level. However, current methods for skill-assessment from video only provide scores or compare demonstrations, leaving the burden of knowing what to do differently on the user. We introduce a novel method to generate actionable feedback (AF) from video of a person doing a physical activity, such as basketball or soccer. Our method takes a video demonstration and its accompanying 3D body pose and generates (1) free-form expert commentary describing what the person is doing well and what they could improve, and (2) a visual expert demonstration that incorporates the required corrections. We show how to leverage Ego-Exo4D's [29] videos of skilled activity and expert commentary together with a strong language model to create a weakly-supervised training dataset for this task, and we devise a multimodal video-language model to infer coaching feedback. Our method is able to reason across multi-modal input combinations to output full spectrum, actionable coaching-expert commentary, expert video retrieval, and expert pose generation-outperforming strong vision-language models on both established metrics and human preference studies.

View on arXiv
@article{ashutosh2025_2408.00672,
  title={ ExpertAF: Expert Actionable Feedback from Video },
  author={ Kumar Ashutosh and Tushar Nagarajan and Georgios Pavlakos and Kris Kitani and Kristen Grauman },
  journal={arXiv preprint arXiv:2408.00672},
  year={ 2025 }
}
Comments on this paper