ResearchTrend.AI
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning

23 August 2023
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
    VLM

Papers citing "With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning"

21 / 21 papers shown
HoloDx: Knowledge- and Data-Driven Multimodal Diagnosis of Alzheimer's Disease
Qiuhui Chen
Jintao Wang
Gang Wang
Yi Hong
39
0
0
27 Apr 2025
Tri-FusionNet: Enhancing Image Description Generation with Transformer-based Fusion Network and Dual Attention Mechanism
Lakshita Agarwal
Bindu Verma
ViT
17
0
0
23 Apr 2025
Fashion-RAG: Multimodal Fashion Image Editing via Retrieval-Augmented Generation
Fulvio Sanguigni
Davide Morelli
Marcella Cornia
Rita Cucchiara
DiffM
23
0
0
18 Apr 2025
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
20
0
0
09 Oct 2024
See or Guess: Counterfactually Regularized Image Captioning
Qian Cao
Xu Chen
Ruihua Song
Xiting Wang
Xinting Huang
Yuchen Ren
CML
29
0
0
29 Aug 2024
Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization
Nicholas Moratelli
Davide Caffagni
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
CLIP
21
1
0
26 Aug 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
24
0
0
09 Aug 2024
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
20
1
0
29 Jul 2024
TTM-RE: Memory-Augmented Document-Level Relation Extraction
Chufan Gao
Xuan Wang
Jimeng Sun
14
0
0
09 Jun 2024
Towards Retrieval-Augmented Architectures for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Alessandro Nicolosi
Rita Cucchiara
VLM
17
1
0
21 May 2024
iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval
Lorenzo Agnolucci
Alberto Baldrati
Marco Bertini
A. Bimbo
30
9
0
05 May 2024
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs
Davide Caffagni
Federico Cocchi
Nicholas Moratelli
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
KELM
16
34
0
23 Apr 2024
Attendre: Wait To Attend By Retrieval With Evicted Queries in Memory-Based Transformers for Long Context Processing
Zi Yang
Nan Hua
RALM
18
4
0
10 Jan 2024
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images
Roberto Amoroso
Davide Morelli
Marcella Cornia
Lorenzo Baraldi
A. Bimbo
Rita Cucchiara
DiffM
19
22
0
02 Apr 2023
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
51
244
0
14 Jul 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
174
342
0
13 Jul 2021
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition
Jun Chen
Aniket Agarwal
Sherif Abdelkarim
Deyao Zhu
Mohamed Elhoseiny
ViT
60
15
0
24 Apr 2021
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
106
164
0
19 Mar 2020
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
922
0
24 Sep 2019
Neural Baby Talk
Jiasen Lu
Jianwei Yang
Dhruv Batra
Devi Parikh
VLM
186
412
0
27 Mar 2018
Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Jiasen Lu
Caiming Xiong
Devi Parikh
R. Socher
78
443
0
06 Dec 2016