
Title |
|---|
Scaling Vision Pre-Training to 4K Resolution. Computer Vision and Pattern Recognition (CVPR), 2025 |
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models. Computer Vision and Pattern Recognition (CVPR), 2024 |
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks. International Conference on Learning Representations (ICLR), 2024 |