Auto-Encoding Scene Graphs for Image Captioning

6 December 2018
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
arXiv: 1812.02378

Papers citing "Auto-Encoding Scene Graphs for Image Captioning"

50 / 310 papers shown
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
Zuyao Chen
Jinlin Wu
Zhen Lei
Zhaoxiang Zhang
Changwen Chen
261
27
0
18 Nov 2023
Semantic Scene Graph Generation Based on an Edge Dual Scene Graph and Message Passing Neural Network
H. Kim
Sangwon Kim
Jong Taek Lee
B. Ko
219
3
0
02 Nov 2023
LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Kibum Kim
Kanghoon Yoon
Jaeyeong Jeon
Yeonjun In
Jinyoung Moon
Donghyun Kim
Chanyoung Park
406
29
0
16 Oct 2023
Logical Bias Learning for Object Relation Prediction
Xinyu Zhou
Zihan Ji
Anna Zhu
CVBM
170
0
0
01 Oct 2023
Predicate Classification Using Optimal Transport Loss in Scene Graph Generation
Sorachi Kurita
Satoshi Oyama
Itsuki Noda
OT
142
0
0
19 Sep 2023
RepSGG: Novel Representations of Entities and Relationships for Scene Graph Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hengyue Liu
B. Bhanu
314
5
0
06 Sep 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2023
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
148
30
0
23 Aug 2023
The Expressive Power of Graph Neural Networks: A Survey
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Bingxue Zhang
Changjun Fan
Shixuan Liu
Kuihua Huang
Xiang Zhao
Jin-Yu Huang
Zhong Liu
404
41
0
16 Aug 2023
A Comprehensive Analysis of Real-World Image Captioning and Scene Identification
Sai Suprabhanu Nallapaneni
Subrahmanyam Konakanchi
162
2
0
05 Aug 2023
Improving Scene Graph Generation with Superpixel-Based Interaction Learning
ACM Multimedia (ACM MM), 2023
Jingyi Wang
Can Zhang
Jinfa Huang
Bo Ren
Zhidong Deng
147
7
0
04 Aug 2023
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model
ACM Multimedia (ACM MM), 2023
Ka Leong Cheng
Wenpo Song
Zheng Ma
Wenhao Zhu
Zi-Yue Zhu
Jianbing Zhang
CLIP, VLM
138
18
0
02 Aug 2023
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
IEEE Transactions on Multimedia (IEEE TMM), 2023
Zijie Song
Zhenzhen Hu
Yuanen Zhou
Ye Zhao
Richang Hong
Meng Wang
147
18
0
19 Jul 2023
Unbiased Scene Graph Generation via Two-stage Causal Modeling
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Shuzhou Sun
Shuaifeng Zhi
Qing Liao
J. Heikkilä
Tianpeng Liu
CML
236
50
0
11 Jul 2023
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
184
5
0
05 Jul 2023
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards
Yangjun Mao
Jun Xiao
Dong Zhang
Meng Cao
Jian Shao
Yueting Zhuang
Long Chen
EGVM
156
10
0
25 Jun 2023
Single-Stage Visual Relationship Learning using Conditional Queries
Neural Information Processing Systems (NeurIPS), 2023
Alakh Desai
Tz-Ying Wu
Subarna Tripathi
Nuno Vasconcelos
190
9
0
09 Jun 2023
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning
Jianghui Wang
Yuxuan Wang
Dongyan Zhao
Zilong Zheng
290
3
0
04 Jun 2023
MemeGraphs: Linking Memes to Knowledge Graphs
IEEE International Conference on Document Analysis and Recognition (ICDAR), 2023
Vasiliki Kougia
Simon Fetzel
Thomas Kirchmair
Erion Çano
Sina Moayed Baharlou
Sahand Sharifzadeh
Benjamin Roth
245
12
0
28 May 2023
Weakly-Supervised Learning of Visual Relations in Multimodal Pretraining
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Emanuele Bugliarello
Aida Nematzadeh
Lisa Anne Hendricks
SSL
239
6
0
23 May 2023
Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Harman Singh
Pengchuan Zhang
Qifan Wang
Mengjiao MJ Wang
Wenhan Xiong
Jingfei Du
Yu Chen
CoGe, VLM
301
33
0
23 May 2023
Cross2StrA: Unpaired Cross-lingual Image Captioning with Cross-lingual Cross-modal Structure-pivoted Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shengqiong Wu
Hao Fei
Wei Ji
Tat-Seng Chua
155
33
0
20 May 2023
Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Hao Fei
Qianfeng Liu
Meishan Zhang
Tat-Seng Chua
LRM
271
61
0
20 May 2023
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
International Conference on Image Analysis and Processing (ICIAP), 2023
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
241
7
0
20 May 2023
Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shengqiong Wu
Hao Fei
Yixin Cao
Lidong Bing
Tat-Seng Chua
192
51
0
19 May 2023
Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs
IEEE International Conference on Robotics and Automation (ICRA), 2023
Jingyi Wang
Jinfa Huang
Can Zhang
Zhidong Deng
305
10
0
15 May 2023
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yufen Huang
Jiji Tang
Zhuo Chen
Rongsheng Zhang
Xinfeng Zhang
...
Zeng Zhao
Zhou Zhao
Tangjie Lv
Zhipeng Hu
Wen Zhang
VLM
259
47
0
06 May 2023
Transforming Visual Scene Graphs to Image Captions
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xu Yang
Jiawei Peng
Zihua Wang
Haiyang Xu
Qinghao Ye
Chenliang Li
Mingshi Yan
Feisi Huang
Zhangzikang Li
Yu Zhang
235
24
0
03 May 2023
Multimodal Graph Transformer for Multimodal Question Answering
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Xuehai He
Xin Eric Wang
274
10
0
30 Apr 2023
Textual Explanations for Automated Commentary Driving
Marc Alexander Kühn
Daniel Omeiza
Lars Kunze
170
9
0
12 Apr 2023
SPAN: Learning Similarity between Scene Graphs and Images with Transformers
Yuren Cong
Wentong Liao
Bodo Rosenhahn
M. Yang
230
6
0
02 Apr 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Computer Vision and Pattern Recognition (CVPR), 2023
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
280
84
0
21 Mar 2023
A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?
Chaoning Zhang
Chenshuang Zhang
Sheng Zheng
Yu Qiao
Chenghao Li
...
Lik-Hang Lee
Yang Yang
Heng Tao Shen
In So Kweon
Choong Seon Hong
271
193
0
21 Mar 2023
Location-Free Scene Graph Generation
Ege Özsoy
Felix Holm
Tobias Czempiel
Benjamin Busam
Nassir Navab
239
4
0
20 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
132
1
0
17 Mar 2023
Knowledge-augmented Few-shot Visual Relation Detection
Tianyu Yu
Yongqian Li
Jiaoyan Chen
Hai-Tao Zheng
Haitao Zheng
...
Qingbin Liu
Wenqiang Liu
Dongxiao Huang
Bei Wu
Yexin Wang
152
6
0
09 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
The Visual Computer (TVC), 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
252
30
0
07 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Computer Vision and Pattern Recognition (CVPR), 2023
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL, DiffM
162
56
0
04 Mar 2023
Towards Local Visual Modeling for Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Weihao Ye
Rongrong Ji
ViT
207
101
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
150
4
0
08 Feb 2023
SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for Scene Graph Generation
Yuxiang Zhang
Zhenbo Liu
Shuai Wang
ReLM, LRM
173
1
0
19 Dec 2022
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Feiqi Cao
Siwen Luo
F. Núñez
Zean Wen
Josiah Poon
Caren Han
GNN
372
5
0
16 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2022
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
197
111
0
06 Dec 2022
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
British Machine Vision Conference (BMVC), 2022
Osman Ulger
Julian Wiederer
Mohsen Ghafoorian
Vasileios Belagiannis
Pascal Mettes
154
0
0
06 Dec 2022
Unbiased Heterogeneous Scene Graph Generation with Relation-aware Message Passing Neural Network
AAAI Conference on Artificial Intelligence (AAAI), 2022
Kanghoon Yoon
Kibum Kim
Jinyoung Moon
Chanyoung Park
342
39
0
01 Dec 2022
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
Computer Vision and Pattern Recognition (CVPR), 2022
Shuquan Ye
Yujia Xie
Dongdong Chen
Yichong Xu
Lu Yuan
Chenguang Zhu
Jing Liao
VLM
115
18
0
29 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Elad Hirsch
A. Tal
VLM, 3DV
142
5
0
27 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM, VLM
215
30
0
21 Nov 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
123
1
0
20 Nov 2022
A survey on knowledge-enhanced multimodal learning
Artificial Intelligence Review (Artif Intell Rev), 2022
Maria Lymperaiou
Giorgos Stamou
417
21
0
19 Nov 2022
Probabilistic Debiasing of Scene Graphs
Computer Vision and Pattern Recognition (CVPR), 2022
Bashirul Azam Biswas
Qian Ji
198
15
0
11 Nov 2022