ResearchTrend.AI
Discovering Spatio-Temporal Rationales for Video Question Answering

22 July 2023
Yicong Li
Junbin Xiao
Chun Feng
Xiang Wang
Tat-Seng Chua

Papers citing "Discovering Spatio-Temporal Rationales for Video Question Answering"

14 / 14 papers shown
Video Flow as Time Series: Discovering Temporal Consistency and Variability for VideoQA
Zijie Song
Zhenzhen Hu
Yixiao Ma
Jia Li
Richang Hong
16
0
0
08 Apr 2025
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
Zhiyuan Liu
Yanchen Luo
Han Huang
Enzhi Zhang
Sihang Li
Junfeng Fang
Yaorui Shi
X. Wang
Kenji Kawaguchi
Tat-Seng Chua
100
3
0
18 Feb 2025
Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
Miran Heo
Min-Hung Chen
De-An Huang
Sifei Liu
Subhashree Radhakrishnan
Seon Joo Kim
Yu-Chun Wang
Ryo Hachiuma
ObjD
VLM
116
2
0
14 Jan 2025
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei
Shengqiong Wu
Wei Ji
H. Zhang
M. Zhang
M. Lee
W. Hsu
LRM
VGen
44
55
0
08 Jan 2025
When SAM2 Meets Video Shadow and Mirror Detection
Leiping Jie
VLM
27
1
0
26 Dec 2024
Scene-Text Grounding for Text-Based Video Question Answering
Sheng Zhou
Junbin Xiao
Xun Yang
Peipei Song
Dan Guo
Angela Yao
Meng Wang
Tat-Seng Chua
52
1
0
22 Sep 2024
High-Order Evolving Graphs for Enhanced Representation of Traffic Dynamics
Aditya Humnabadkar
Arindam Sikdar
Benjamin Cave
Huaizhong Zhang
P. Bakaki
Ardhendu Behera
14
0
0
17 Sep 2024
Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering
Haibo Wang
Chenghang Lai
Yixuan Sun
Weifeng Ge
13
5
0
19 Jan 2024
Can I Trust Your Answer? Visually Grounded Video Question Answering
Junbin Xiao
Angela Yao
Yicong Li
Tat-Seng Chua
25
46
0
04 Sep 2023
Video Graph Transformer for Video Question Answering
Junbin Xiao
Pan Zhou
Tat-Seng Chua
Shuicheng Yan
ViT
134
73
0
12 Jul 2022
Discovering Invariant Rationales for Graph Neural Networks
Ying-Xin Wu
Xiang Wang
An Zhang
Xiangnan He
Tat-Seng Chua
OOD
AI4CE
89
222
0
30 Jan 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
245
554
0
28 Sep 2021
Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering
Jungin Park
Jiyoung Lee
K. Sohn
123
99
0
29 Apr 2021
Beyond VQA: Generating Multi-word Answer and Rationale to Visual Questions
Radhika Dua
Sai Srinivas Kancheti
V. Balasubramanian
LRM
30
22
0
24 Oct 2020