arXiv: 2403.04306
Effectiveness Assessment of Recent Large Vision-Language Models
7 March 2024
Yao Jiang
Xinyu Yan
Ge-Peng Ji
Keren Fu
Meijun Sun
Huan Xiong
Deng-Ping Fan
Fahad Shahbaz Khan
Papers citing "Effectiveness Assessment of Recent Large Vision-Language Models"
17 / 17 papers shown
TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs
Zijian Zhang
Xuhui Zheng
X. Wu
Chong Peng
Xuezhi Cao
30
0
0
10 Apr 2025
LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models
Jian Liang
Wenke Huang
Guancheng Wan
Qu Yang
Mang Ye
MoMe
CLL
AI4CE
57
1
0
21 Mar 2025
A Review on Geometry and Surface Inspection in 3D Concrete Printing
K. Mawas
M. Maboudi
M. Gerke
57
0
0
10 Mar 2025
An Expert Ensemble for Detecting Anomalous Scenes, Interactions, and Behaviors in Autonomous Driving
Tianchen Ji
Neeloy Chakraborty
Andre Schreiber
Katherine Rose Driggs-Campbell
37
1
0
23 Feb 2025
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
62
0
0
02 Dec 2024
Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
Satvik Dixit
Laurie M. Heller
Chris Donahue
VLM
60
5
0
18 Nov 2024
Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based Recognition
Zongyou Yu
Qiang Qu
Xiaoming Chen
Chen Wang
MLLM
24
1
0
15 Sep 2024
VLM4Bio: A Benchmark Dataset to Evaluate Pretrained Vision-Language Models for Trait Discovery from Biological Images
M. Maruf
Arka Daw
Kazi Sajeed Mehrab
Harish Babu Manogaran
Abhilash Neog
...
Wei-Lun Chao
Charles V. Stewart
T. Berger-Wolf
Wasila Dahdul
Anuj Karpatne
CoGe
24
0
0
28 Aug 2024
La-SoftMoE CLIP for Unified Physical-Digital Face Attack Detection
Hang Zou
Chenxi Du
Hui Zhang
Yuan Zhang
A. Liu
Jun Wan
Zhen Lei
AAML
26
4
0
23 Aug 2024
Fine-Tuned Large Language Model for Visualization System: A Study on Self-Regulated Learning in Education
Lin Gao
Jing Lu
Zekai Shao
Ziyue Lin
Shengbin Yue
Chio-in Ieong
Yi Sun
Rory James Zauner
Zhongyu Wei
Siming Chen
27
0
0
30 Jul 2024
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation
Shuting He
Henghui Ding
VOS
24
23
0
04 Apr 2024
VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification
Lanfeng Zhong
Xin Liao
Shaoting Zhang
Xiaofan Zhang
Guotai Wang
VLM
16
4
0
23 Mar 2024
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World
Weiyun Wang
Yiming Ren
Hao Luo
Tiantong Li
Chenxiang Yan
...
Qingyun Li
Lewei Lu
Xizhou Zhu
Yu Qiao
Jifeng Dai
MLLM
36
46
0
29 Feb 2024
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
152
280
0
14 Oct 2023
Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval
Jae Myung Kim
A. Sophia Koepke
Cordelia Schmid
Zeynep Akata
68
25
0
06 Apr 2023
Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners
Zhenhailong Wang
Manling Li
Ruochen Xu
Luowei Zhou
Jie Lei
...
Chenguang Zhu
Derek Hoiem
Shih-Fu Chang
Mohit Bansal
Heng Ji
MLLM
VLM
162
134
0
22 May 2022
RGB-D Salient Object Detection via 3D Convolutional Neural Networks
Qian Chen
Ze Liu
Y. Zhang
Keren Fu
Qijun Zhao
H. Du
3DPC
24
148
0
25 Jan 2021