Auto-Encoding Scene Graphs for Image Captioning

6 December 2018
Xu Yang
Kaihua Tang
Hanwang Zhang
Jianfei Cai

Papers citing "Auto-Encoding Scene Graphs for Image Captioning"

50 / 310 papers shown
A Cognitive Explainer for Fetal ultrasound images classifier Based on Medical Concepts
Ying-Shuai Wang
Yunxia Liu
Licong Dong
Xuzhou Wu
Huabin Zhang
Qiongyu Ye
Desheng Sun
Xiaobo Zhou
Kehong Yuan
236
0
0
19 Jan 2022
BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR
Neurocomputing, 2022
Ushasi Chaudhuri
Ruchika Chavan
Biplab Banerjee
Anjan Dutta
Zeynep Akata
159
22
0
17 Jan 2022
Representing Videos as Discriminative Sub-graphs for Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2021
Dong Li
Zhaofan Qiu
Yingwei Pan
Ting Yao
Houqiang Li
Tao Mei
199
31
0
11 Jan 2022
Incremental Object Grounding Using Scene Graphs
J. Yi
Yoonwoo Kim
Sonia Chernova
LM&Ro
257
9
0
06 Jan 2022
Image Captioning via Compact Bidirectional Architecture
Zijie Song
Yuanen Zhou
Zhenzhen Hu
Daqing Liu
Huixia Ben
Richang Hong
Meng Wang
VLM
164
18
0
06 Jan 2022
Scene Graph Generation: A Comprehensive Survey
Neurocomputing, 2022
Guangming Zhu
Liang Zhang
Youliang Jiang
Yixuan Dang
Haoran Hou
...
Mingtao Feng
Xia Zhao
Qiguang Miao
Syed Afaq Ali Shah
Bennamoun
3DV
395
127
0
03 Jan 2022
SGTR: End-to-end Scene Graph Generation with Transformer
Computer Vision and Pattern Recognition (CVPR), 2021
Rongjie Li
Songyang Zhang
Xuming He
ViT
446
139
0
24 Dec 2021
A Survey of Natural Language Generation
ACM Computing Surveys (CSUR), 2021
Chenhe Dong
Hai-Tao Zheng
Haifan Gong
Mengzhao Chen
Junxin Li
Ying Shen
Min Yang
3DV
308
62
0
22 Dec 2021
Exploiting Long-Term Dependencies for Generating Dynamic Scene Graphs
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2021
Shengyu Feng
Subarna Tripathi
Hesham Mostafa
Marcel Nassar
Somdeb Majumdar
264
33
0
18 Dec 2021
SGEITL: Scene Graph Enhanced Image-Text Learning for Visual Commonsense Reasoning
Zhecan Wang
Haoxuan You
Liunian Harold Li
Alireza Zareian
Suji Park
Yiqing Liang
Kai-Wei Chang
Shih-Fu Chang
ReLM, LRM
168
38
0
16 Dec 2021
Neural Belief Propagation for Scene Graph Generation
Daqi Liu
M. Bober
J. Kittler
GNN
156
11
0
10 Dec 2021
Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lin Liang
Zhe Gan
Lijuan Wang
Yezhou Yang
Zicheng Liu
ViT, VLM
220
120
0
09 Dec 2021
Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs
Kaifeng Gao
Long Chen
Yulei Niu
Jian Shao
Jun Xiao
199
36
0
08 Dec 2021
UNITER-Based Situated Coreference Resolution with Rich Multimodal Input
Yichen Huang
Yuchen Wang
Yik-Cheung Tam
147
9
0
07 Dec 2021
Consensus Graph Representation Learning for Better Grounded Image Captioning
Wenqiao Zhang
Haochen Shi
Siliang Tang
Jun Xiao
Qiang Yu
Yueting Zhuang
214
60
0
02 Dec 2021
Relational Graph Learning for Grounded Video Description Generation
Wenqiao Zhang
Xinze Wang
Siliang Tang
Haizhou Shi
Haochen Shi
Jun Xiao
Yueting Zhuang
Wenjie Wang
175
34
0
02 Dec 2021
Neural Attention for Image Captioning: Review of Outstanding Methods
Zanyar Zohourianshahzadi
Jugal Kalita
VLM
168
57
0
29 Nov 2021
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Computer Vision and Pattern Recognition (CVPR), 2021
Yoad Tewel
Yoav Shalev
Idan Schwartz
Lior Wolf
VLM
295
232
0
29 Nov 2021
Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets
Marcella Cornia
Lorenzo Baraldi
G. Fiameni
Rita Cucchiara
281
14
0
24 Nov 2021
Scaling Up Vision-Language Pre-training for Image Captioning
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Zhengyuan Yang
Zicheng Liu
Yumao Lu
Lijuan Wang
MLLM, VLM
357
294
0
24 Nov 2021
L-Verse: Bidirectional Generation Between Image and Text
Computer Vision and Pattern Recognition (CVPR), 2021
Taehoon Kim
Gwangmo Song
Sihaeng Lee
Sangyun Kim
Yewon Seo
Soonyoung Lee
S. Kim
Honglak Lee
Kyunghoon Bae
972
28
0
22 Nov 2021
Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation
Fenglin Liu
Chenyu You
Xian Wu
Shen Ge
Sheng Wang
Xu Sun
MedIm
276
120
0
08 Nov 2021
Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval
Zhihao Fan
Zhongyu Wei
Zejun Li
Siyuan Wang
Jianqing Fan
167
7
0
05 Nov 2021
Unifying Multimodal Transformer for Bi-directional Image and Text Generation
Yupan Huang
Hongwei Xue
Bei Liu
Yutong Lu
197
63
0
19 Oct 2021
MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation
Rishabh Baghel
Abhishek Trivedi
Tejas Ravichandran
Ravi Kiran Sarvadevabhatla
DiffM
148
1
0
17 Oct 2021
Self-Annotated Training for Controllable Image Captioning
Zhangzi Zhu
Tianlei Wang
Hong Qu
175
2
0
16 Oct 2021
Topic Scene Graph Generation by Attention Distillation from Caption
IEEE International Conference on Computer Vision (ICCV), 2021
Wenbin Wang
R. Wang
X. Chen
DiffM
208
16
0
12 Oct 2021
Geometry Attention Transformer with Position-aware LSTMs for Image Captioning
Chi-Yin Wang
Yulin Shen
Luping Ji
ViT
233
65
0
01 Oct 2021
Geometry-Entangled Visual Semantic Transformer for Image Captioning
Ling Cheng
Wei Wei
Feida Zhu
Yong Liu
Chunyan Miao
ViT
154
3
0
29 Sep 2021
Scene Graph Generation for Better Image Captioning?
Maximilian Mozes
Martin Schmitt
Vladimir Golkov
Hinrich Schütze
Zorah Lähner
GNN
176
5
0
23 Sep 2021
Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian
Yanan Zhang
Haichang Li
Rui Wang
Xiaohui Hu
102
8
0
17 Sep 2021
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning
Shikha Dubey
Farrukh Olimov
M. Rafique
Joonmo Kim
M. Jeon
ViT
174
46
0
16 Sep 2021
Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval
Zhihao Fan
Zhongyu Wei
Zejun Li
Siyuan Wang
Haijun Shan
Xuanjing Huang
Jianqing Fan
CLIP
87
12
0
12 Sep 2021
BGT-Net: Bidirectional GRU Transformer Network for Scene Graph Generation
Naina Dhingra
Florian Ritter
A. Kunz
189
43
0
11 Sep 2021
Learning to Generate Scene Graph from Natural Language Supervision
Yiwu Zhong
Jing Shi
Jianwei Yang
Chenliang Xu
Yin Li
SSL
228
85
0
06 Sep 2021
Auto-Parsing Network for Image Captioning and Visual Question Answering
IEEE International Conference on Computer Vision (ICCV), 2021
Xu Yang
Chongyang Gao
Hanwang Zhang
Jianfei Cai
217
43
0
24 Aug 2021
Learning of Visual Relations: The Devil is in the Tails
IEEE International Conference on Computer Vision (ICCV), 2021
Alakh Desai
Tz-Ying Wu
Subarna Tripathi
Nuno Vasconcelos
210
97
0
22 Aug 2021
Semantic Compositional Learning for Low-shot Scene Graph Generation
Tao He
Lianli Gao
Jingkuan Song
Jianfei Cai
Yuan-Fang Li
CoGe
208
10
0
19 Aug 2021
Cross-Modal Graph with Meta Concepts for Video Captioning
IEEE Transactions on Image Processing (TIP), 2021
Hao Wang
Guosheng Lin
Guosheng Lin
Chunyan Miao
275
9
0
14 Aug 2021
Interpretable Visual Understanding with Cognitive Attention Network
International Conference on Artificial Neural Networks (ICANN), 2021
Xuejiao Tang
Wenbin Zhang
Yi Yu
Kea Turner
Hanyu Wang
Mengyu Wang
Eirini Ntoutsi
262
19
0
06 Aug 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning
ACM Multimedia (ACM MM), 2021
Xinzhi Dong
Chengjiang Long
Wenju Xu
Chunxia Xiao
ViT
249
71
0
05 Aug 2021
Distributed Attention for Grounded Image Captioning
Nenglun Chen
Xingjia Pan
Runnan Chen
Lei Yang
Zhiwen Lin
Yuqiang Ren
Haolei Yuan
Xiaowei Guo
Feiyue Huang
Wenping Wang
303
22
0
02 Aug 2021
ReFormer: The Relational Transformer for Image Captioning
ACM Multimedia (ACM MM), 2021
Xuewen Yang
Yingru Liu
Xin Wang
ViT
199
62
0
29 Jul 2021
Image Scene Graph Generation (SGG) Benchmark
Xiao Han
Jianwei Yang
Houdong Hu
Lei Zhang
Jianfeng Gao
Pengchuan Zhang
124
40
0
27 Jul 2021
Spatial-Temporal Transformer for Dynamic Scene Graph Generation
IEEE International Conference on Computer Vision (ICCV), 2021
Yuren Cong
Wentong Liao
H. Ackermann
Bodo Rosenhahn
M. Yang
ViT
246
149
0
26 Jul 2021
Boosting Entity-aware Image Captioning with Multi-modal Knowledge Graph
IEEE Transactions on Multimedia (IEEE Trans. Multimedia), 2021
Wentian Zhao
Yao Hu
Heda Wang
Xinxiao Wu
Jiebo Luo
173
65
0
26 Jul 2021
What and When to Look?: Temporal Span Proposal Network for Video Relation Detection
Sangmin Woo
Junhyug Noh
Kangil Kim
298
2
0
15 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV, VLM, MLLM
379
343
0
14 Jul 2021
Zero-Shot Scene Graph Relation Prediction through Commonsense Knowledge Integration
Xuan Kan
Hejie Cui
Carl Yang
210
47
0
11 Jul 2021
Controlled Caption Generation for Images Through Adversarial Attacks
Nayyer Aafaq
Naveed Akhtar
Wei Liu
M. Shah
Lin Wang
AAML
123
12
0
07 Jul 2021