1704.06485
Cited By
Attend to You: Personalized Image Captioning with Context Sequence Memory Networks
21 April 2017
C. C. Park
Byeongchang Kim
Gunhee Kim
Papers citing
"Attend to You: Personalized Image Captioning with Context Sequence Memory Networks"
50 / 67 papers shown
Title
Personalized Generation In Large Model Era: A Survey
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yiyan Xu
Jinghao Zhang
Alireza Salemi
Xinting Hu
Wenjie Wang
Fuli Feng
Hamed Zamani
Xiangnan He
Tat-Seng Chua
3DV
543
27
0
04 Mar 2025
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
IEEE International Conference on Computer Vision (ICCV), 2023
R. Yasarla
H. Cai
Jisoo Jeong
Y. Shi
Risheek Garrepalli
Fatih Porikli
MDE
573
27
0
17 Jan 2025
Personalized Representation from Personalized Generation
Shobhita Sundaram
Julia Chae
Yonglong Tian
Sara Beery
Phillip Isola
310
4
0
20 Dec 2024
Personalized Visual Instruction Tuning
International Conference on Learning Representations (ICLR), 2024
Renjie Pi
Jianshu Zhang
Tianyang Han
Jipeng Zhang
Boyao Wang
Tong Zhang
MLLM
208
13
0
09 Oct 2024
Context-Aware Image Descriptions for Web Accessibility
International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2024
Ananya Gubbi Mohanbabu
Amy Pavel
VLM
178
21
0
04 Sep 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
416
0
0
09 Aug 2024
Influencer: Empowering Everyday Users in Creating Promotional Posts via AI-infused Exploration and Customization
Xuye Liu
Annie Sun
Pengcheng An
Tengfei Ma
Jian Zhao
121
1
0
20 Jul 2024
A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE)
Daniel Sonntag
Michael Barz
Thiago S. Gouvêa
VLM
256
6
0
27 Jun 2024
MyVLM: Personalizing VLMs for User-Specific Queries
Yuval Alaluf
Elad Richardson
Sergey Tulyakov
Kfir Aberman
Daniel Cohen-Or
MLLM
VLM
298
41
0
21 Mar 2024
"It's Kind of Context Dependent": Understanding Blind and Low Vision People's Video Accessibility Preferences Across Viewing Scenarios
Lucy Jiang
Crescentia Jung
Mahika Phutane
Abigale Stangl
Shiri Azenkot
250
29
0
16 Mar 2024
Social Media Ready Caption Generation for Brands
Himanshu Maheshwari
Koustava Goswami
Apoorv Saxena
Balaji Vasan Srinivasan
155
1
0
03 Jan 2024
Impressions: Understanding Visual Semiotics and Aesthetic Impact
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Julia Kruk
Caleb Ziems
Diyi Yang
135
3
0
27 Oct 2023
PGA: Personalizing Grasping Agents with Single Human-Robot Interaction
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Junghyun Kim
Gi-Cheon Kang
Suhyung Choi
Seoyun Yang
Minjoon Jung
Byoung-Tak Zhang
204
0
0
19 Oct 2023
A Comparative Study of Pre-trained CNNs and GRU-Based Attention for Image Caption Generation
International Conference on Robotics and Computer Vision (ICRCV), 2023
Rashid Khan
Bingding Huang
Haseeb Hassan
Asim Zaman
Z. Ye
156
3
0
11 Oct 2023
Multi-grained Temporal Prototype Learning for Few-shot Video Object Segmentation
IEEE International Conference on Computer Vision (ICCV), 2023
Nian Liu
Kepan Nan
Wangbo Zhao
Yuanwei Liu
Xiwen Yao
Salman Khan
Hisham Cholakkal
Rao Muhammad Anwer
Junwei Han
Fahad Shahbaz Khan
VOS
241
11
0
20 Sep 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Computer Vision and Pattern Recognition (CVPR), 2023
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
194
57
0
04 Mar 2023
HumanDiffusion: a Coarse-to-Fine Alignment Diffusion Framework for Controllable Text-Driven Person Image Generation
Kai Zhang
Muyi Sun
Jianxin Sun
Binghao Zhao
Kunbo Zhang
Zhenan Sun
Tieniu Tan
DiffM
146
14
0
11 Nov 2022
Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Gaurav Verma
Vishwa Vinay
Ryan A. Rossi
Srijan Kumar
VLM
AAML
192
9
0
04 Nov 2022
Unsupervised domain adaptation semantic segmentation of high-resolution remote sensing imagery with invariant domain-level prototype memory
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2022
Jingru Zhu
Ya Guo
Geng Sun
Libo Yang
M. Deng
Jie Chen
224
74
0
16 Aug 2022
Motion-aware Memory Network for Fast Video Salient Object Detection
IEEE Transactions on Image Processing (IEEE TIP), 2022
Xingke Zhao
Haoran Liang
Peipei Li
Guodao Sun
Dongdong Zhao
Ronghua Liang
Xiaofei He
163
26
0
01 Aug 2022
Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2022
Chia-Wen Kuo
Z. Kira
256
80
0
09 May 2022
SceneTrilogy: On Human Scene-Sketch and its Complementarity with Photo and Text
Computer Vision and Pattern Recognition (CVPR), 2022
Pinaki Nath Chowdhury
A. Bhunia
Aneeshan Sain
Subhadeep Koley
Tao Xiang
Yi-Zhe Song
397
37
0
25 Apr 2022
On Distinctive Image Captioning via Comparing and Reweighting
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
177
23
0
08 Apr 2022
"This is my unicorn, Fluffy": Personalizing frozen vision-language representations
European Conference on Computer Vision (ECCV), 2022
Niv Cohen
Rinon Gal
E. Meirom
Gal Chechik
Yuval Atzmon
VLM
MLLM
339
102
0
04 Apr 2022
A Deep Neural Framework for Image Caption Generation Using GRU-Based Attention Mechanism
Rashid Khan
Shujah Islam
Khadija Kanwal
Mansoor Iqbal
Md. Imran Hossain
Z. Ye
3DV
94
20
0
03 Mar 2022
Cross Modal Retrieval with Querybank Normalisation
Computer Vision and Pattern Recognition (CVPR), 2021
Simion-Vlad Bogolin
Ioana Croitoru
Hailin Jin
Yang Liu
Samuel Albanie
262
115
0
23 Dec 2021
Consensus Graph Representation Learning for Better Grounded Image Captioning
Wenqiao Zhang
Haochen Shi
Siliang Tang
Jun Xiao
Qiang Yu
Yueting Zhuang
238
60
0
02 Dec 2021
Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources
Sahar Abdelnabi
Rakibul Hasan
Mario Fritz
381
100
0
30 Nov 2021
Universal Face Restoration With Memorized Modulation
Jia Li
Huaibo Huang
Xiaofei Jia
Ran He
CVBM
163
2
0
03 Oct 2021
Memory-based Semantic Segmentation for Off-road Unstructured Natural Environments
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Youngsaeng Jin
D. Han
Hanseok Ko
124
14
0
12 Aug 2021
Personalized Image Semantic Segmentation
IEEE International Conference on Computer Vision (ICCV), 2021
Yu Zhang
Chang-Bin Zhang
Peng-Tao Jiang
Ming-Ming Cheng
Feng Mao
276
9
0
24 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
435
343
0
14 Jul 2021
Evaluation of Automated Image Descriptions for Visually Impaired Students
International Conference on Artificial Intelligence in Education (AIED), 2021
Anett Hoppe
D. Morris
Ralph Ewerth
189
5
0
29 Jun 2021
Human-like Controllable Image Captioning with Verb-specific Semantic Roles
Computer Vision and Pattern Recognition (CVPR), 2021
Long Chen
Zhihong Jiang
Jun Xiao
Wei Liu
245
81
0
22 Mar 2021
#PraCegoVer: A Large Dataset for Image Captioning in Portuguese
International Conference on Data Technologies and Applications (DATA), 2021
G. O. D. Santos
Esther Luna Colombini
Sandra Avila
198
12
0
21 Mar 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2021
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
428
273
0
20 Feb 2021
CNN with large memory layers
R. Karimov
Yury Malkov
Karim Iskakov
Victor Lempitsky
212
0
0
27 Jan 2021
Understanding Guided Image Captioning Performance across Domains
Conference on Computational Natural Language Learning (CoNLL), 2020
Edwin G. Ng
Bo Pang
P. Sharma
Radu Soricut
361
28
0
04 Dec 2020
Structural and Functional Decomposition for Personality Image Captioning in a Communication Game
Findings (Findings), 2020
Minh-Thu Nguyen
Duy Phung
Minh Hoai
Thien Huu Nguyen
189
5
0
17 Nov 2020
Boost Image Captioning with Knowledge Reasoning
Machine-mediated learning (ML), 2020
Feicheng Huang
Zhiwen Wang
Haiyang Wei
Canlong Zhang
Huifang Ma
107
27
0
02 Nov 2020
Learning to Summarize Long Texts with Memory Compression and Transfer
Jaehong Park
Jonathan Pilault
C. Pal
99
0
0
21 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Neurocomputing, 2020
Wei Chen
Weiping Wang
Tianpeng Liu
M. Lew
VLM
329
36
0
16 Oct 2020
Enriching Video Captions With Contextual Text
International Conference on Pattern Recognition (ICPR), 2020
Philipp Rimle
Pelin Dogan
Markus Gross
149
3
0
29 Jul 2020
Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets
European Conference on Computer Vision (ECCV), 2020
Jiuniu Wang
Wenjia Xu
Qingzhong Wang
Antoni B. Chan
147
46
0
14 Jul 2020
Clue: Cross-modal Coherence Modeling for Caption Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Malihe Alikhani
Piyush Sharma
Shengjie Li
Radu Soricut
Matthew Stone
185
59
0
02 May 2020
Transferring Cross-domain Knowledge for Video Sign Language Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Dongxu Li
Xin Yu
Chenchen Xu
L. Petersson
Hongdong Li
SLR
321
121
0
08 Mar 2020
Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose
Ran Yi
Zipeng Ye
Juyong Zhang
Hujun Bao
Yong Liu
CVBM
230
147
0
24 Feb 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
222
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
Multimedia Tools and Applications (MTA), 2020
C. Sur
151
17
0
15 Feb 2020
aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption
Neural Processing Letters (NPL), 2020
C. Sur
115
10
0
27 Jan 2020