VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

4 December 2023
Yizhou Wang
Ruiyi Zhang
Haoliang Wang
Uttaran Bhattacharya
Yun Fu
Gang Wu
    MLLM

Papers citing "VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding"

15 / 15 papers shown
ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction
Yiqiao Jin
Stefano Petrangeli
Yu Shen
Gang Wu
LLMAG
LM&Ro
117
0
0
26 Mar 2025
The Quest for Visual Understanding: A Journey Through the Evolution of Visual Question Answering
Anupam Pandey
Deepjyoti Bodo
Arpan Phukan
Asif Ekbal
33
0
0
13 Jan 2025
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding
Heqing Zou
Tianze Luo
Guiyang Xie
Victor Zhang
...
Guangcong Wang
Juanyang Chen
Zhuochen Wang
Hansheng Zhang
Huaijian Zhang
VLM
34
6
0
27 Sep 2024
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision
Orr Zohar
Xiaohan Wang
Yonatan Bitton
Idan Szpektor
Serena Yeung-Levy
VLM
LRM
48
8
0
08 Jul 2024
Hallucination Mitigation Prompts Long-term Video Understanding
Yiwei Sun
Zhihang Liu
Chuanbin Liu
Bowei Pu
Zhihan Zhang
Hongtao Xie
VLM
MLLM
33
2
0
17 Jun 2024
Dense Connector for MLLMs
Huanjin Yao
Wenhao Wu
Taojiannan Yang
Yuxin Song
Mengxi Zhang
Haocheng Feng
Yifan Sun
Zhiheng Li
Wanli Ouyang
Jingdong Wang
MLLM
VLM
32
16
0
22 May 2024
FreeVA: Offline MLLM as Training-Free Video Assistant
Wenhao Wu
VLM
OffRL
32
19
0
13 May 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
50
81
0
29 Dec 2023
Valley: Video Assistant with Large Language model Enhanced abilitY
Ruipu Luo
Ziwang Zhao
Min Yang
Junwei Dong
Da Li
Pengcheng Lu
Tao Wang
Linmei Hu
Ming-Hui Qiu
MLLM
40
188
0
12 Jun 2023
VideoLLM: Modeling Video Sequence with Large Language Models
Guo Chen
Yin-Dong Zheng
Jiahao Wang
Jilan Xu
Yifei Huang
...
Yi Wang
Yali Wang
Yu Qiao
Tong Lu
Limin Wang
MLLM
92
76
0
22 May 2023
Explainable Anomaly Detection in Images and Videos: A Survey
Yizhou Wang
Dongliang Guo
Sheng R. Li
Octavia Camps
Yun Fu
16
5
0
13 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
265
4,223
0
30 Jan 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
4,048
0
24 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
245
557
0
28 Sep 2021