ResearchTrend.AI
arXiv: 2407.06189
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

8 July 2024
Orr Zohar
Xiaohan Wang
Yonatan Bitton
Idan Szpektor
Serena Yeung-Levy
    VLM
    LRM

Papers citing "Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision"

8 / 8 papers shown
VideoSAVi: Self-Aligned Video Language Models without Human Supervision
Yogesh Kulkarni
Pooyan Fazli
VLM
90
2
0
01 Dec 2024
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
Shiyi Zhang
Wen-Dao Dai
Sujia Wang
Xiangwei Shen
Jiwen Lu
Jie Zhou
Yansong Tang
42
25
0
07 Apr 2024
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Siddharth Karamcheti
Suraj Nair
Ashwin Balakrishna
Percy Liang
Thomas Kollar
Dorsa Sadigh
MLLM
VLM
57
95
0
12 Feb 2024
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Avi Singh
John D. Co-Reyes
Rishabh Agarwal
Ankesh Anand
Piyush Patil
...
Yamini Bansal
Ethan Dyer
Behnam Neyshabur
Jascha Narain Sohl-Dickstein
Noah Fiedel
ALM
LRM
ReLM
SyDa
147
143
0
11 Dec 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li-ming Yuan
VLM
MLLM
185
576
0
16 Nov 2023
GPT-4V(ision) Unsuitable for Clinical Care and Education: A Clinician-Evaluated Assessment
Senthujan Senkaiahliyan
Augustin Toma
Jun Ma
An-Wen Chan
Andrew Ha
Kevin R. An
Hrishikesh Suresh
Barry Rubin
Bo Wang
LM&MA
ELM
38
11
0
14 Nov 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022