ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1812.02378
  4. Cited By
Auto-Encoding Scene Graphs for Image Captioning

Auto-Encoding Scene Graphs for Image Captioning

6 December 2018
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai
ArXivPDFHTML

Papers citing "Auto-Encoding Scene Graphs for Image Captioning"

50 / 96 papers shown
Title
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
27
0
0
23 Apr 2025
A Causal Adjustment Module for Debiasing Scene Graph Generation
A Causal Adjustment Module for Debiasing Scene Graph Generation
Li Liu
Shuzhou Sun
Shuaifeng Zhi
Fan Shi
Zhen Liu
J. Heikkilä
Yongxiang Liu
CML
52
2
0
22 Mar 2025
Disentangling Fine-Tuning from Pre-Training in Visual Captioning with Hybrid Markov Logic
Disentangling Fine-Tuning from Pre-Training in Visual Captioning with Hybrid Markov Logic
Monika Shah
Somdeb Sarkhel
Deepak Venugopal
MLLM
BDL
VLM
83
0
0
18 Mar 2025
Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning
Jianjie Luo
Jingwen Chen
Yehao Li
Yingwei Pan
Jianlin Feng
Hongyang Chao
Ting Yao
DiffM
VLM
45
0
0
03 Jan 2025
Situational Scene Graph for Structured Human-centric Situation Understanding
Situational Scene Graph for Structured Human-centric Situation Understanding
Chinthani Sugandhika
Chen Li
Deepu Rajan
Basura Fernando
111
1
0
30 Oct 2024
BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
BCTR: Bidirectional Conditioning Transformer for Scene Graph Generation
Peng Hao
Xiaobing Wang
Yingying Jiang
Hanchao Jia
Xiaoshuai Hao
Shaowei Cui
Junhang Wei
Xiaoshuai Hao
47
3
0
26 Jul 2024
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu
Hao Fei
Xiangtai Li
Jiayi Ji
Hanwang Zhang
Tat-Seng Chua
Shuicheng Yan
MLLM
59
31
0
07 Jun 2024
MeaCap: Memory-Augmented Zero-shot Image Captioning
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng
Yan Xie
Hao Zhang
Chiyu Chen
Zhengjue Wang
Boli Chen
VLM
20
14
0
06 Mar 2024
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Action Scene Graphs for Long-Form Understanding of Egocentric Videos
Ivan Rodin
Antonino Furnari
Kyle Min
Subarna Tripathi
G. Farinella
EgoV
19
12
0
06 Dec 2023
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph
  Generation via Visual-Concept Alignment and Retention
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention
Zuyao Chen
Jinlin Wu
Zhen Lei
Zhaoxiang Zhang
Changwen Chen
23
11
0
18 Nov 2023
Predicate Classification Using Optimal Transport Loss in Scene Graph
  Generation
Predicate Classification Using Optimal Transport Loss in Scene Graph Generation
Sorachi Kurita
Satoshi Oyama
Itsuki Noda
OT
22
0
0
19 Sep 2023
With a Little Help from your own Past: Prototypical Memory Networks for
  Image Captioning
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
51
19
0
23 Aug 2023
The Expressive Power of Graph Neural Networks: A Survey
The Expressive Power of Graph Neural Networks: A Survey
Bingxue Zhang
Changjun Fan
Shixuan Liu
Kuihua Huang
Xiang Zhao
Jin-Yu Huang
Zhong Liu
40
19
0
16 Aug 2023
Improving Scene Graph Generation with Superpixel-Based Interaction
  Learning
Improving Scene Graph Generation with Superpixel-Based Interaction Learning
Jingyi Wang
Can Zhang
Jinfa Huang
Bo Ren
Zhidong Deng
21
7
0
04 Aug 2023
Multimodal Prompt Learning for Product Title Generation with Extremely
  Limited Labels
Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang-ju Yang
Fenglin Liu
Zheng Li
Qingyu Yin
Chenyu You
Bing Yin
Yuexian Zou
VLM
26
5
0
05 Jul 2023
A request for clarity over the End of Sequence token in the
  Self-Critical Sequence Training
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
24
6
0
20 May 2023
Information Screening whilst Exploiting! Multimodal Relation Extraction
  with Feature Denoising and Multimodal Topic Modeling
Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling
Shengqiong Wu
Hao Fei
Yixin Cao
Lidong Bing
Tat-Seng Chua
32
31
0
19 May 2023
Textual Explanations for Automated Commentary Driving
Textual Explanations for Automated Commentary Driving
Marc Alexander Kühn
Daniel Omeiza
Lars Kunze
22
6
0
12 Apr 2023
SPAN: Learning Similarity between Scene Graphs and Images with
  Transformers
SPAN: Learning Similarity between Scene Graphs and Images with Transformers
Yuren Cong
Wentong Liao
Bodo Rosenhahn
M. Yang
20
6
0
02 Apr 2023
Location-Free Scene Graph Generation
Location-Free Scene Graph Generation
Ege Ozsoy
Felix Holm
Tobias Czempiel
Tobias Czempiel
Benjamin Busam
Nassir Navab
Benjamin Busam
37
4
0
20 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
11
1
0
17 Mar 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image
  Captioning
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
21
4
0
08 Feb 2023
SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for
  Scene Graph Generation
SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for Scene Graph Generation
Yuxiang Zhang
Zhenbo Liu
Shuai Wang
ReLM
LRM
19
1
0
19 Dec 2022
SceneGATE: Scene-Graph based co-Attention networks for TExt visual
  question answering
SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering
Feiqi Cao
Siwen Luo
F. Núñez
Zean Wen
Josiah Poon
Caren Han
GNN
16
4
0
16 Dec 2022
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs
Osman Ulger
Julian Wiederer
Mohsen Ghafoorian
Vasileios Belagiannis
Pascal Mettes
35
0
0
06 Dec 2022
How to Describe Images in a More Funny Way? Towards a Modular Approach
  to Cross-Modal Sarcasm Generation
How to Describe Images in a More Funny Way? Towards a Modular Approach to Cross-Modal Sarcasm Generation
Jie Ruan
Yue Wu
Xiaojun Wan
Yuesheng Zhu
19
1
0
20 Nov 2022
Probabilistic Debiasing of Scene Graphs
Probabilistic Debiasing of Scene Graphs
Bashirul Azam Biswas
Qian Ji
22
11
0
11 Nov 2022
Prophet Attention: Predicting Attention with Future Attention for Image
  Captioning
Prophet Attention: Predicting Attention with Future Attention for Image Captioning
Fenglin Liu
Xuancheng Ren
Xian Wu
Wei Fan
Yuexian Zou
Xu Sun
19
46
0
19 Oct 2022
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for
  Text-to-Image Generation
Swinv2-Imagen: Hierarchical Vision Transformer Diffusion Models for Text-to-Image Generation
Rui Li
Weihua Li
Yi Yang
Hanyu Wei
Jianhua Jiang
Quan-wei Bai
DiffM
19
11
0
18 Oct 2022
Learning to Collocate Visual-Linguistic Neural Modules for Image
  Captioning
Learning to Collocate Visual-Linguistic Neural Modules for Image Captioning
Xu Yang
Hanwang Zhang
Chongyang Gao
Jianfei Cai
MLLM
25
10
0
04 Oct 2022
Unbiased Scene Graph Generation using Predicate Similarities
Unbiased Scene Graph Generation using Predicate Similarities
Misaki Ohashi
Yusuke Matsui
25
1
0
03 Oct 2022
Towards Open-vocabulary Scene Graph Generation with Prompt-based
  Finetuning
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
VLM
18
50
0
17 Aug 2022
Context-aware Mixture-of-Experts for Unbiased Scene Graph Generation
Context-aware Mixture-of-Experts for Unbiased Scene Graph Generation
Liguang Zhou
Yuhongze Zhou
Tin Lun Lam
Yangsheng Xu
EDL
MoE
21
2
0
15 Aug 2022
Rethinking the Evaluation of Unbiased Scene Graph Generation
Rethinking the Evaluation of Unbiased Scene Graph Generation
Xingchen Li
Long Chen
Jian Shao
Shaoning Xiao
Songyang Zhang
Jun Xiao
27
12
0
03 Aug 2022
Integrating Object-aware and Interaction-aware Knowledge for Weakly
  Supervised Scene Graph Generation
Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation
Xingchen Li
Long Chen
Wenbo Ma
Yi Yang
Jun Xiao
11
26
0
03 Aug 2022
GRIT: Faster and Better Image captioning Transformer Using Dual Visual
  Features
GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features
Van-Quang Nguyen
Masanori Suganuma
Takayuki Okatani
ViT
25
106
0
20 Jul 2022
Exploring the sequence length bottleneck in the Transformer for Image
  Captioning
Exploring the sequence length bottleneck in the Transformer for Image Captioning
Jiapeng Hu
Roberto Cavicchioli
Alessandro Capotondi
ViT
33
3
0
07 Jul 2022
Comprehending and Ordering Semantics for Image Captioning
Comprehending and Ordering Semantics for Image Captioning
Yehao Li
Yingwei Pan
Ting Yao
Tao Mei
13
87
0
14 Jun 2022
Visual Transformer for Object Detection
Visual Transformer for Object Detection
M. Yang
ViT
17
6
0
01 Jun 2022
Controllable Image Captioning
Luka Maxwell
28
0
0
28 Apr 2022
Attention Mechanism based Cognition-level Scene Understanding
Attention Mechanism based Cognition-level Scene Understanding
Xuejiao Tang
Tai Le Quy
LRM
23
0
0
17 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
On Distinctive Image Captioning via Comparing and Reweighting
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
30
16
0
08 Apr 2022
Two-stream Hierarchical Similarity Reasoning for Image-text Matching
Two-stream Hierarchical Similarity Reasoning for Image-text Matching
Ran Chen
Hanli Wang
Lei Wang
Sam Kwong
11
9
0
10 Mar 2022
Knowledge-enriched Attention Network with Group-wise Semantic for Visual
  Storytelling
Knowledge-enriched Attention Network with Group-wise Semantic for Visual Storytelling
Tengpeng Li
Hanli Wang
Bin He
Changan Chen
DiffM
19
9
0
10 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
25
27
0
21 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
8
88
0
31 Jan 2022
RelTR: Relation Transformer for Scene Graph Generation
RelTR: Relation Transformer for Scene Graph Generation
Yuren Cong
M. Yang
Bodo Rosenhahn
ViT
87
132
0
27 Jan 2022
SA-VQA: Structured Alignment of Visual and Semantic Representations for
  Visual Question Answering
SA-VQA: Structured Alignment of Visual and Semantic Representations for Visual Question Answering
Peixi Xiong
Quanzeng You
Pei Yu
Zicheng Liu
Ying Wu
10
5
0
25 Jan 2022
A Cognitive Explainer for Fetal ultrasound images classifier Based on
  Medical Concepts
A Cognitive Explainer for Fetal ultrasound images classifier Based on Medical Concepts
Ying-Shuai Wanga
Yunxia Liua
Licong Dongc
Xuzhou Wua
Huabin Zhangb
Qiongyu Yed
Desheng Sunc
Xiaobo Zhoue
Kehong Yuan
11
0
0
19 Jan 2022
Representing Videos as Discriminative Sub-graphs for Action Recognition
Representing Videos as Discriminative Sub-graphs for Action Recognition
Dong Li
Zhaofan Qiu
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
19
25
0
11 Jan 2022
12
Next