2404.17534
Cited By
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models
26 April 2024
Yuhang Huang, Zihan Wu, Chongyang Gao, Jiawei Peng, Xu Yang
Papers citing "Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models"
4 / 4 papers shown
Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images
Zalan Fabian, Zhongqi Miao, Chunyuan Li, Yuanhan Zhang, Ziwei Liu, ..., Laura Siabatto, Andrés Link, Pablo Arbelaez, Rahul Dodhia, J. L. Ferres
38 | 10 | 0
02 Nov 2023

De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei, Chenxi Liu, Siyuan Qiao, Zhishuai Zhang, Alan Yuille, Jiahui Yu
VLM, DiffM
29 | 10 | 0
01 Nov 2023

Fine-grained Image Captioning with CLIP Reward
Jaemin Cho, Seunghyun Yoon, Ajinkya Kale, Franck Dernoncourt, Trung Bui, Mohit Bansal
CLIP
123 | 76 | 0
26 May 2022

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi
MLLM, BDL, VLM, CLIP
390 | 4,125 | 0
28 Jan 2022