ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1611.01646
  4. Cited By
Boosting Image Captioning with Attributes

Boosting Image Captioning with Attributes

5 November 2016
Ting Yao
Yingwei Pan
Yehao Li
Zhaofan Qiu
Tao Mei
    VLM
ArXivPDFHTML

Papers citing "Boosting Image Captioning with Attributes"

50 / 222 papers shown
Title
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
27
0
0
23 Apr 2025
ChatBEV: A Visual Language Model that Understands BEV Maps
ChatBEV: A Visual Language Model that Understands BEV Maps
Qingyao Xu
S. Chen
Guang Chen
Yanfeng Wang
Y. Zhang
46
0
0
18 Mar 2025
A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning
Qing Zhou
Tao Yang
Junyu Gao
W. Ni
Junzheng Wu
Qi Wang
48
0
0
06 Mar 2025
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
Chunbai Zhang
Chao Wang
Yang Zhou
Yan Peng
LRM
ReLM
60
0
0
02 Feb 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
45
0
0
03 Jan 2025
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
27
0
0
09 Aug 2024
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for
  Image-Text Matching
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
Jie Wang
Joemon M. Jose
34
0
0
05 Jun 2024
Enhancing Multimodal Large Language Models with Multi-instance Visual
  Prompt Generator for Visual Representation Enrichment
Enhancing Multimodal Large Language Models with Multi-instance Visual Prompt Generator for Visual Representation Enrichment
Wenliang Zhong
Wenyi Wu
Qi Li
Rob Barton
Boxin Du
Shioulin Sam
Karim Bouyarmane
Ismail B. Tutar
Junzhou Huang
25
3
0
05 Jun 2024
Image Captioning via Dynamic Path Customization
Image Captioning via Dynamic Path Customization
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
Xiaopeng Hong
Yongjian Wu
Rongrong Ji
32
0
0
01 Jun 2024
To See is to Believe: Prompting GPT-4V for Better Visual Instruction
  Tuning
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
Junke Wang
Lingchen Meng
Zejia Weng
Bo He
Zuxuan Wu
Yu-Gang Jiang
MLLM
VLM
27
93
0
13 Nov 2023
Targeted Image Data Augmentation Increases Basic Skills Captioning
  Robustness
Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness
Valentin Barriere
Felipe del Rio
Andres Carvallo De Ferari
Carlos Aspillaga
Eugenio Herrera-Berg
Cristian Buc Calderon
DiffM
22
0
0
27 Sep 2023
Informative Scene Graph Generation via Debiasing
Informative Scene Graph Generation via Debiasing
Lianli Gao
Xinyu Lyu
Yuyu Guo
Yuxuan Hu
Yuanyou Li
Lu Xu
Hengtao Shen
Jingkuan Song
18
5
0
10 Aug 2023
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention
  and Text Attributes
AIC-AB NET: A Neural Network for Image Captioning with Spatial Attention and Text Attributes
Guoyun Tu
Ying Liu
Vladimir Vlassov
125
1
0
14 Jul 2023
Learning Knowledge-Rich Sequential Model for Planar Homography
  Estimation in Aerial Video
Learning Knowledge-Rich Sequential Model for Planar Homography Estimation in Aerial Video
Pu Li
Xiaobai Liu
6
1
0
05 Apr 2023
One Small Step for Generative AI, One Giant Leap for AGI: A Complete
  Survey on ChatGPT in AIGC Era
One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era
Chaoning Zhang
Chenshuang Zhang
Chenghao Li
Yu Qiao
Sheng Zheng
...
Sung-Ho Bae
Lik-Hang Lee
Pan Hui
In So Kweon
Choong Seon Hong
LM&MA
AI4MH
LRM
ELM
31
130
0
04 Apr 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning
  Evaluation
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
13
55
0
21 Mar 2023
Towards Local Visual Modeling for Image Captioning
Towards Local Visual Modeling for Image Captioning
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Yiyi Zhou
R. Ji
ViT
19
71
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image
  Captioning
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
29
4
0
08 Feb 2023
Summarize the Past to Predict the Future: Natural Language Descriptions
  of Context Boost Multimodal Object Interaction Anticipation
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
Razvan-George Pasca
Alexey Gavryushin
Muhammad Hamza
Yen-Ling Kuo
Kaichun Mo
Luc Van Gool
Otmar Hilliges
Xi Wang
22
14
0
22 Jan 2023
An Image captioning algorithm based on the Hybrid Deep Learning
  Technique (CNN+GRU)
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
21
10
0
06 Jan 2023
Semantic-Conditional Diffusion Networks for Image Captioning
Semantic-Conditional Diffusion Networks for Image Captioning
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
22
62
0
06 Dec 2022
Exploring Discrete Diffusion Models for Image Captioning
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng-Wei Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
23
17
0
21 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach
  to Cross-Modal Sarcasm Generation
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
24
1
0
20 Nov 2022
OSIC: A New One-Stage Image Captioner Coined
OSIC: A New One-Stage Image Captioner Coined
Bo Wang
Zhao Zhang
Ming Zhao
Xiaojie Jin
Mingliang Xu
Meng Wang
VLM
23
3
0
04 Nov 2022
DiMBERT: Learning Vision-Language Grounded Representations with
  Disentangled Multimodal-Attention
DiMBERT: Learning Vision-Language Grounded Representations with Disentangled Multimodal-Attention
Fenglin Liu
Xian Wu
Shen Ge
Xuancheng Ren
Wei Fan
Xu Sun
Yuexian Zou
VLM
73
12
0
28 Oct 2022
Prophet Attention: Predicting Attention with Future Attention for Image
  Captioning
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
24
46
0
19 Oct 2022
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval
Xuri Ge
Fuhai Chen
Songpei Xu
Fuxiang Tao
J. Jose
25
26
0
17 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image
  Captioning
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
37
10
0
04 Oct 2022
M^4I: Multi-modal Models Membership Inference
M^4I: Multi-modal Models Membership Inference
Pingyi Hu
Zihan Wang
Ruoxi Sun
Hu Wang
Minhui Xue
37
26
0
15 Sep 2022
Distinctive Image Captioning via CLIP Guided Group Optimization
Distinctive Image Captioning via CLIP Guided Group Optimization
Youyuan Zhang
Jiuniu Wang
Hao Wu
Wenjia Xu
VLM
29
8
0
08 Aug 2022
Boosting Video-Text Retrieval with Explicit High-Level Semantics
Boosting Video-Text Retrieval with Explicit High-Level Semantics
Haoran Wang
Di Xu
Dongliang He
Fu Li
Zhong Ji
Jungong Han
Errui Ding
24
11
0
08 Aug 2022
Efficient Modeling of Future Context for Image Captioning
Efficient Modeling of Future Context for Image Captioning
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
29
14
0
22 Jul 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual
  Features
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
25
106
0
20 Jul 2022
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised
  Referring Expression Grounding
Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding
Xuejing Liu
Liang Li
Shuhui Wang
Zhengjun Zha
Dechao Meng
Qi Tian
Qingming Huang
16
42
0
18 Jul 2022
Learning To Generate Scene Graph from Head to Tail
Learning To Generate Scene Graph from Head to Tail
Chao Zheng
Xinyu Lyu
Yuyu Guo
Pengpeng Zeng
Jingkuan Song
Lianli Gao
17
10
0
23 Jun 2022
Bypass Network for Semantics Driven Image Paragraph Captioning
Bypass Network for Semantics Driven Image Paragraph Captioning
Qinjie Zheng
Chaoyue Wang
Dadong Wang
17
1
0
21 Jun 2022
Comprehending and Ordering Semantics for Image Captioning
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
13
87
0
14 Jun 2022
Prompt-based Learning for Unpaired Image Captioning
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Xiao Wang
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
C. L. P. Chen
VLM
21
31
0
26 May 2022
Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity
  Resolution
Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution
Georgios Tziafas
S. Kasaei
16
2
0
24 May 2022
Guiding Attention using Partial-Order Relationships for Image Captioning
Guiding Attention using Partial-Order Relationships for Image Captioning
Murad Popattia
Muhammad Rafi
Rizwan Qureshi
Shah Nawaz
19
4
0
15 Apr 2022
Image Captioning In the Transformer Age
Image Captioning In the Transformer Age
Yangliu Xu
Li Li
Haiyang Xu
Songfang Huang
Fei Huang
Jianfei Cai
ViT
14
5
0
15 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
30
16
0
08 Apr 2022
Exploiting long-term temporal dynamics for video captioning
Exploiting long-term temporal dynamics for video captioning
Yuyu Guo
Jingqiu Zhang
Lianli Gao
17
18
0
22 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
8
89
0
31 Jan 2022
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular
  Vision-Language Pre-training
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training
Yehao Li
Jiahao Fan
Yingwei Pan
Ting Yao
Weiyao Lin
Tao Mei
MLLM
ObjD
25
19
0
11 Jan 2022
Compact Bidirectional Transformer for Image Captioning
Compact Bidirectional Transformer for Image Captioning
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Meng Wang
VLM
14
16
0
06 Jan 2022
Translational Concept Embedding for Generalized Compositional Zero-shot
  Learning
Translational Concept Embedding for Generalized Compositional Zero-shot Learning
He Huang
Wei Tang
Jiawei Zhang
Philip S. Yu
CoGe
23
2
0
20 Dec 2021
Calorie Aware Automatic Meal Kit Generation from an Image
Calorie Aware Automatic Meal Kit Generation from an Image
Ahmad Babaeian Jelodar
Yu Sun
20
2
0
18 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
24
45
0
29 Nov 2021
Scaling Up Vision-Language Pre-training for Image Captioning
Scaling Up Vision-Language Pre-training for Image Captioning
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Zhengyuan Yang
Zicheng Liu
Yumao Lu
Lijuan Wang
MLLM
VLM
28
246
0
24 Nov 2021
12345
Next