ResearchTrend.AI
arXiv: 2404.17534

Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models

26 April 2024
Yuhang Huang
Zihan Wu
Chongyang Gao
Jiawei Peng
Xu Yang

Papers citing "Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models"

4 / 4 papers shown
Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images
Zalan Fabian
Zhongqi Miao
Chunyuan Li
Yuanhan Zhang
Ziwei Liu
...
Laura Siabatto
Andrés Link
Pablo Arbelaez
Rahul Dodhia
J. L. Ferres
38
10
0
02 Nov 2023
De-Diffusion Makes Text a Strong Cross-Modal Interface
Chen Wei
Chenxi Liu
Siyuan Qiao
Zhishuai Zhang
Alan Yuille
Jiahui Yu
VLM
DiffM
29
10
0
01 Nov 2023
Fine-grained Image Captioning with CLIP Reward
Jaemin Cho
Seunghyun Yoon
Ajinkya Kale
Franck Dernoncourt
Trung Bui
Mohit Bansal
CLIP
123
76
0
26 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
390
4,125
0
28 Jan 2022