VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding

21 March 2024
Ahmad A Mahmood
Ashmal Vayani
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
    LRM

Papers citing "VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding"

10 / 10 papers shown
SB-Bench: Stereotype Bias Benchmark for Large Multimodal Models
Vishal Narnaware
Ashmal Vayani
Rohit Gupta
Swetha Sirnam
Mubarak Shah
104
3
0
12 Feb 2025
VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
Xueqing Wu
Yuheng Ding
Bingxuan Li
Pan Lu
Da Yin
Kai-Wei Chang
Nanyun Peng
LRM
88
3
0
03 Dec 2024
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
Ashmal Vayani
Dinura Dissanayake
Hasindri Watawana
Noor Ahsan
Nevasini Sasikumar
...
Monojit Choudhury
Ivan Laptev
Mubarak Shah
Salman Khan
Fahad A Khan
118
8
0
25 Nov 2024
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
Omkar Thawakar
Ashmal Vayani
Salman Khan
Hisham Cholakal
Rao M. Anwer
M. Felsberg
Timothy Baldwin
Eric P. Xing
Fahad Shahbaz Khan
38
10
0
26 Feb 2024
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
36
76
0
29 Dec 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
1,899
0
30 Jan 2023
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
S. Hoi
SyDa
ALM
116
232
0
05 Jul 2022
Truthful AI: Developing and governing AI that does not lie
Owain Evans
Owen Cotton-Barratt
Lukas Finnveden
Adam Bales
Avital Balwit
Peter Wills
Luca Righetti
William Saunders
HILM
207
107
0
13 Oct 2021
MURAL: Multimodal, Multitask Retrieval Across Languages
Aashi Jain
Mandy Guo
Krishna Srinivasan
Ting-Li Chen
Sneha Kudugunta
Chao Jia
Yinfei Yang
Jason Baldridge
VLM
104
50
0
10 Sep 2021
Deep Learning-Based Human Pose Estimation: A Survey
Ce Zheng
Wenhan Wu
C. L. P. Chen
Taojiannan Yang
Sijie Zhu
Ju Shen
N. Kehtarnavaz
M. Shah
3DH
90
383
0
24 Dec 2020