A Survey on Video Temporal Grounding with Multimodal Large Language Model. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025 |
MotionPro: A Precise Motion Controller for Image-to-Video Generation. Computer Vision and Pattern Recognition (CVPR), 2025 |
Object-Shot Enhanced Grounding Network for Egocentric Video. Computer Vision and Pattern Recognition (CVPR), 2025 |
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos. Computer Vision and Pattern Recognition (CVPR), 2024 |
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning. International Conference on Learning Representations (ICLR), 2024 |