NICE: CVPR 2023 Challenge on Zero-shot Image Captioning


5 September 2023
Taehoon Kim
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Mark A Marsden
Alessandra Sala
Seung Wook Kim
Bohyung Han
Kyoung Mu Lee
Honglak Lee
Kyounghoon Bae
Xiangyu Wu
Yi Gao
Hailiang Zhang
Yang Yang
Weili Guo
Jianfeng Lu
Youngtaek Oh
Jae-Won Cho
Dong-Jin Kim
In So Kweon
Junmo Kim
Woohyun Kang
Won Young Jhoo
Byungseok Roh
Jonghwan Mun
Solgil Oh
Kenan E. Ak
G. Lee
Yan Xu
Mingwei Shen
Kyomin Hwang
Wonsik Shin
Kamin Lee
Wonhark Park
Dongkwan Lee
Nojun Kwak
Yujin Wang
Yimu Wang
Tiancheng Gu
Xingchang Lv
Mingmao Sun
    VLM

Papers citing "NICE: CVPR 2023 Challenge on Zero-shot Image Captioning"

5 / 5 papers shown
Group-based Distinctive Image Captioning with Memory Difference Encoding and Attention
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
38
0
0
03 Apr 2025
Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality
Youngtaek Oh
Jae-Won Cho
Dong-Jin Kim
In So Kweon
Junmo Kim
VLM
CoGe
CLIP
27
4
0
07 Oct 2024
Video-Text Retrieval by Supervised Sparse Multi-Grained Learning
Yimu Wang
Peng Shi
8
5
0
19 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
265
4,229
0
30 Jan 2023
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
1,081
0
17 Feb 2021