
Leveraging Static Relationships for Intra-Type and Inter-Type Message Passing in Video Question Answering

Lili Liang
Guanglu Sun
Abstract

Video Question Answering (VideoQA) is an important research direction in artificial intelligence: it requires machines to understand video content and to reason about and answer natural language questions. Methods based on static relationship reasoning have made progress, but they still recognize and represent static relationships inaccurately, and they do not fully exploit the static relationship information in videos for in-depth reasoning. This paper therefore proposes a reasoning method that performs intra-type and inter-type message passing based on static relationships. The method constructs a dual graph for intra-type message passing and a heterogeneous graph, built from static relationships, for inter-type message passing. The intra-type model captures the neighborhood information of question-relevant targets and relationships in the dual graph and updates the dual graph to obtain intra-type clues for answering the question. The inter-type model captures the neighborhood information of question-relevant targets and relationships of different categories in the heterogeneous graph and updates the heterogeneous graph to obtain inter-type clues. Finally, the answer is inferred by combining the intra-type and inter-type clues. Experimental results on the ANetQA and NExT-QA datasets demonstrate the effectiveness of the method.
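The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of question-guided message passing over two graphs. Everything in it is an assumption made for illustration: the toy node names, the two-dimensional features, the dot-product attention, the 0.5/0.5 update rule, and the choice to link same-type nodes in the intra-type graph and target-relationship pairs in the inter-type graph. It is not the authors' implementation.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def message_pass(features, neighbors, question):
    """One round of question-guided message passing (illustrative only).

    features:  dict mapping node name -> feature vector (list of floats)
    neighbors: dict mapping node name -> list of neighbor node names
    question:  question embedding with the same dimension as the features
    """
    updated = {}
    for node, feat in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # Isolated nodes keep their features unchanged.
            updated[node] = list(feat)
            continue
        # Neighbors that align with the question receive more weight.
        weights = softmax([dot(features[n], question) for n in nbrs])
        msg = [sum(w * features[n][i] for w, n in zip(weights, nbrs))
               for i in range(len(feat))]
        # Assumed update rule: average of the node's own feature and the message.
        updated[node] = [0.5 * f + 0.5 * m for f, m in zip(feat, msg)]
    return updated

# Hypothetical toy scene: two targets and one static relationship.
features = {"person": [1.0, 0.0], "ball": [0.0, 1.0], "holds": [1.0, 1.0]}
question = [1.0, 0.0]

# Assumed intra-type graph: edges only between nodes of the same type.
intra_edges = {"person": ["ball"], "ball": ["person"], "holds": []}
# Assumed inter-type graph: edges between targets and relationships.
inter_edges = {"person": ["holds"], "ball": ["holds"], "holds": ["person", "ball"]}

intra_clues = message_pass(features, intra_edges, question)
inter_clues = message_pass(features, inter_edges, question)

# Combine both kinds of clues per node by concatenation (an assumption).
combined = {n: intra_clues[n] + inter_clues[n] for n in features}
```

In this sketch the two graphs share node features and differ only in their edges, which keeps the contrast between intra-type and inter-type neighborhoods easy to see; a real model would use learned projections and an answer decoder on top of the combined clues.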

@article{liang2025_2504.02417,
  title={Leveraging Static Relationships for Intra-Type and Inter-Type Message Passing in Video Question Answering},
  author={Lili Liang and Guanglu Sun},
  journal={arXiv preprint arXiv:2504.02417},
  year={2025}
}