
arXiv: 2407.17453

$VILA^2$: VILA Augmented VILA

24 July 2024

Pavlo Molchanov

Song Han

Papers citing "$VILA^2$: VILA Augmented VILA"

Scaling Vision Pre-Training to 4K Resolution

Computer Vision and Pattern Recognition (CVPR), 2025

...

Pavlo Molchanov

905

12

0

25 Mar 2025

Diving into Self-Evolving Training for Multimodal Reasoning

387

28

0

23 Dec 2024

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Computer Vision and Pattern Recognition (CVPR), 2024

Yu-Chiang Frank Wang

393

6

0

02 Dec 2024

VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks

International Conference on Learning Representations (ICLR), 2024

Rogerio Bonatti

404

25

0

24 Oct 2024

Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models

Alexander Visheratin

Daiqing Li

419

89

0

16 Sep 2024