2306.00228
Cited By
Using Visual Cropping to Enhance Fine-Detail Question Answering of BLIP-Family Models
31 May 2023
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski
Papers citing "Using Visual Cropping to Enhance Fine-Detail Question Answering of BLIP-Family Models"
4 / 4 papers shown
FIRE: Food Image to REcipe generation
P. Chhikara
Dhiraj Chaurasia
Yifan Jiang
Omkar Masur
Filip Ilievski
21
21
0
28 Aug 2023
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
380
4,010
0
28 Jan 2022
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
51
244
0
14 Jul 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
922
0
24 Sep 2019