STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training

AAAI Conference on Artificial Intelligence (AAAI), 2023

20 February 2023

Papers citing "STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training"

A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions

...

579

2,632

09 Nov 2023

VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools

Lei Hou

328

16 Oct 2023

Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook

Ming Jin

...

489

172

16 Oct 2023

On the Relationship between Self-Attention and Convolutional Layers

International Conference on Learning Representations (ICLR), 2019

Jean-Baptiste Cordonnier

Andreas Loukas

Martin Jaggi

812

625

08 Nov 2019