ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2307.16525
  4. Cited By
Transferable Decoding with Visual Entities for Zero-Shot Image
  Captioning

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

31 July 2023
Junjie Fei
Teng Wang
Jinrui Zhang
Zhenyu He
Chengjie Wang
Feng Zheng
    VLM
ArXivPDFHTML

Papers citing "Transferable Decoding with Visual Entities for Zero-Shot Image Captioning"

12 / 12 papers shown
Title
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
43
0
0
03 Jan 2025
Controllable Contextualized Image Captioning: Directing the Visual
  Narrative through User-Defined Highlights
Controllable Contextualized Image Captioning: Directing the Visual Narrative through User-Defined Highlights
Shunqi Mao
Chaoyi Zhang
Hang Su
Hwanjun Song
Igor Shalyminov
Weidong Cai
26
1
0
16 Jul 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
18
13
0
06 Mar 2024
Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models
Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models
Baoshuo Kan
Teng Wang
Wenpeng Lu
Xiantong Zhen
Weili Guan
Feng Zheng
VPVLM
VLM
13
25
0
22 Aug 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only
  Training
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
40
81
0
06 Mar 2023
Text-Only Training for Image Captioning using Noise-Injected CLIP
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
41
69
0
01 Nov 2022
GroupViT: Semantic Segmentation Emerges from Text Supervision
GroupViT: Semantic Segmentation Emerges from Text Supervision
Jiarui Xu
Shalini De Mello
Sifei Liu
Wonmin Byeon
Thomas Breuel
Jan Kautz
X. Wang
ViT
VLM
175
494
0
22 Feb 2022
Open-vocabulary Object Detection via Vision and Language Knowledge
  Distillation
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
220
897
0
28 Apr 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize
  Long-Tail Visual Concepts
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
1,077
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Neural Baby Talk
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
189
432
0
27 Mar 2018
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image
  Captioning
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
83
1,440
0
06 Dec 2016
1