ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.15252
35
41

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

21 June 2024
Xuan He
Dongfu Jiang
Ge Zhang
Max W.F. Ku
Achint Soni
Sherman Siu
Haonan Chen
Abhranil Chandra
Ziyan Jiang
Aaran Arulraj
Kai Wang
Quy Duc Do
Yuansheng Ni
Bohan Lyu
Yaswanth Narsupalli
Rongqi Fan
Zhiheng Lyu
Yuchen Lin
Wenhu Chen
    EGVM
    VGen
    ALM
ArXivPDFHTML
Abstract

The recent years have witnessed great advances in video generation. However, the development of automatic video metrics is lagging significantly behind. None of the existing metric is able to provide reliable scores over generated videos. The main barrier is the lack of large-scale human-annotated dataset. In this paper, we release VideoFeedback, the first large-scale dataset containing human-provided multi-aspect score over 37.6K synthesized videos from 11 existing video generative models. We train VideoScore (initialized from Mantis) based on VideoFeedback to enable automatic video quality assessment. Experiments show that the Spearman correlation between VideoScore and humans can reach 77.1 on VideoFeedback-test, beating the prior best metrics by about 50 points. Further result on other held-out EvalCrafter, GenAI-Bench, and VBench show that VideoScore has consistently much higher correlation with human judges than other metrics. Due to these results, we believe VideoScore can serve as a great proxy for human raters to (1) rate different video models to track progress (2) simulate fine-grained human feedback in Reinforcement Learning with Human Feedback (RLHF) to improve current video generation models.

View on arXiv
Comments on this paper