Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy

International Conference on Learning Representations (ICLR), 2024
11 February 2024
Simon Ging
M. A. Bravo
Thomas Brox
    VLM

Papers citing "Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy"

7 / 7 papers shown
Reasoning-Enhanced Domain-Adaptive Pretraining of Multimodal Large Language Models for Short Video Content Governance
Zixuan Wang
Yu Sun
Hongwei Wang
Baoyu Jing
Xiang Shen
Xin Dong
Zhuolin Hao
Hongyu Xiong
Yang Song
LRM, VLM
200
1
0
25 Sep 2025
Object Detection with Multimodal Large Vision-Language Models: An In-depth Review
Information Fusion (Inf. Fusion), 2025
Ranjan Sapkota
Manoj Karkee
ObjD, VLM
271
13
0
25 Aug 2025
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Tingyu Song
Tongyan Hu
Guo Gan
Yilun Zhao
250
0
0
29 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
1.1K
27
0
05 May 2025
Directional Gradient Projection for Robust Fine-Tuning of Foundation Models
International Conference on Learning Representations (ICLR), 2025
Chengyue Huang
Junjiao Tian
Brisa Maneechotesuwan
Shivang Chopra
Z. Kira
467
7
0
21 Feb 2025
Patent Figure Classification using Large Vision-language Models
European Conference on Information Retrieval (ECIR), 2025
Sushil Awale
Eric Müller-Budack
Ralph Ewerth
191
1
0
22 Jan 2025
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge
Computer Vision and Pattern Recognition (CVPR), 2024
Vishwesh Nath
Wenqi Li
Dong Yang
Andriy Myronenko
Mingxin Zheng
...
Baris Turkbey
Holger Roth
Daguang Xu
VLM
511
24
0
19 Nov 2024