Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2308.12383
Cited By
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
23 August 2023
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning"
21 / 21 papers shown
Title
HoloDx: Knowledge- and Data-Driven Multimodal Diagnosis of Alzheimer's Disease
Qiuhui Chen
Jintao Wang
Gang Wang
Yi Hong
39
0
0
27 Apr 2025
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
17
0
0
23 Apr 2025
Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation
Fulvio Sanguigni
Davide Morelli
Marcella Cornia
Rita Cucchiara
DiffM
23
0
0
18 Apr 2025
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
20
0
0
09 Oct 2024
See or Guess: Counterfactually Regularized Image Captioning
Qian Cao
Xu Chen
Ruihua Song
Xiting Wang
Xinting Huang
Yuchen Ren
CML
29
0
0
29 Aug 2024
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Nicholas Moratelli
Davide Caffagni
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
CLIP
21
1
0
26 Aug 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
24
0
0
09 Aug 2024
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
20
1
0
29 Jul 2024
TTM-RE: Memory-Augmented Document-Level Relation Extraction
Chufan Gao
Xuan Wang
Jimeng Sun
14
0
0
09 Jun 2024
Towards Retrieval-Augmented Architectures for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
VLM
17
1
0
21 May 2024
iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval
Lorenzo Agnolucci
Alberto Baldrati
Marco Bertini
A. Bimbo
30
9
0
05 May 2024
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs
Davide Caffagni
Federico Cocchi
Nicholas Moratelli
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
KELM
16
34
0
23 Apr 2024
Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing
Zi Yang
Nan Hua
RALM
18
4
0
10 Jan 2024
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso
Davide Morelli
Marcella Cornia
Lorenzo Baraldi
A. Bimbo
Rita Cucchiara
DiffM
19
22
0
02 Apr 2023
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
51
244
0
14 Jul 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
174
342
0
13 Jul 2021
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition
Jun Chen
Aniket Agarwal
Sherif Abdelkarim
Deyao Zhu
Mohamed Elhoseiny
ViT
60
15
0
24 Apr 2021
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
106
164
0
19 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
922
0
24 Sep 2019
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
186
412
0
27 Mar 2018
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
78
443
0
06 Dec 2016
1