VMAF And Variants: Towards A Unified VQA
Video quality assessment (VQA) is a fast-growing field: it is maturing in the full-reference (FR) case, yet remains challenging in the rapidly expanding no-reference (NR) case. We investigate variants of the popular VMAF video quality assessment algorithm for the FR case, using both support vector regression and feedforward neural networks. We then extend it to the NR case, using different features but similar learning machinery, to develop a partially unified framework for VQA. When fully trained, FR algorithms such as VMAF perform well on test datasets, with over 90% agreement in both the Pearson correlation coefficient (PCC) and the Spearman rank correlation coefficient (SRCC); to gauge performance in the wild, however, we train and test from scratch on each database. With an 80/20 train/test split, we still achieve over 90% performance on average in both PCC and SRCC, an 8-9% gain over VMAF. Moreover, we obtain reasonable performance (~75%) even when we ignore the reference and treat FR as NR, partly justifying our attempt at unification. In the true NR case, we reduce complexity relative to the leading recent algorithms VIDEVAL and RAPIQUE, yet achieve 90% in SRCC (a ~12% gain) while roughly matching them in PCC (78% vs. 79.6%). At lower complexity, we can still achieve 87% in SRCC and 70% in PCC. In short, we find encouraging improvements in trainability in both the FR and NR cases, while also constraining computational complexity relative to leading methods.
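The evaluation protocol described above (an 80/20 train/test split, a support vector regressor mapping quality features to subjective scores, and PCC/SRCC as the agreement metrics) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the features, scores, and SVR hyperparameters here are synthetic placeholders, whereas the paper uses VMAF-style features and real subjective databases.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Hypothetical stand-ins for per-video quality features and subjective
# scores (MOS); the paper's actual VMAF-style features are not shown here.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                            # 200 videos, 6 features
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=200)  # synthetic MOS

# 80/20 train/test split, as described in the abstract.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Support vector regression from features to quality scores
# (hyperparameters here are illustrative, not the paper's).
model = SVR(kernel="rbf", C=10.0)
model.fit(X_tr, y_tr)
pred = model.predict(X_te)

# Evaluate with the two correlations reported in the abstract:
# PCC (linear agreement) and SRCC (monotonic rank agreement).
pcc, _ = pearsonr(y_te, pred)
srcc, _ = spearmanr(y_te, pred)
print(f"PCC={pcc:.3f}  SRCC={srcc:.3f}")
```

The same split-train-evaluate loop applies whether the features come from an FR method (reference plus distorted video) or an NR method (distorted video only); only the feature extractor changes, which is the sense in which the framework is partially unified.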