ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.03384
  4. Cited By
LongVLM: Efficient Long Video Understanding via Large Language Models

LongVLM: Efficient Long Video Understanding via Large Language Models

4 April 2024
Yuetian Weng
Mingfei Han
Haoyu He
Xiaojun Chang
Bohan Zhuang
    VLM
ArXivPDFHTML

Papers citing "LongVLM: Efficient Long Video Understanding via Large Language Models"

10 / 10 papers shown
Title
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Xiangrui Liu
Yan Shu
Zheng Liu
Ao Li
Yang Tian
Bo Zhao
VGen
VLM
86
0
0
24 Mar 2025
LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
LVAgent: Long Video Understanding by Multi-Round Dynamical Collaboration of MLLM Agents
Boyu Chen
Zhengrong Yue
Siran Chen
Z. Wang
Yang Liu
Peng Li
Y. Wang
VLM
61
0
0
13 Mar 2025
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding
Shehreen Azad
Vibhav Vineet
Y. S. Rawat
VLM
49
1
0
11 Mar 2025
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
Yi Wang
Xinhao Li
Ziang Yan
Yinan He
Jiashuo Yu
...
Kai Chen
Wenhai Wang
Yu Qiao
Yali Wang
Limin Wang
64
19
0
21 Jan 2025
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
Xiao Wang
Qingyi Si
Jianlong Wu
Shiyu Zhu
Li Cao
Liqiang Nie
VLM
62
6
0
29 Dec 2024
Video LLMs for Temporal Reasoning in Long Videos
Video LLMs for Temporal Reasoning in Long Videos
Fawad Javed Fateh
Umer Ahmed
Hamza Khan
M. Zia
Quoc-Huy Tran
VLM
75
0
0
04 Dec 2024
SEAL: Semantic Attention Learning for Long Video Representation
SEAL: Semantic Attention Learning for Long Video Representation
Lan Wang
Yujia Chen
Wen-Sheng Chu
Vishnu Naresh Boddeti
Du Tran
VLM
67
0
0
02 Dec 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image
  Encoders and Large Language Models
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
212
682
0
13 Oct 2021
1