Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
arXiv: 2212.13563

IEEE International Conference on Computer Vision (ICCV), 2023
27 December 2022
Woohyun Kang
Jonghwan Mun
Sungjun Lee
Byungseok Roh
    VLM
ArXiv (abs) | PDF | HTML | GitHub (46★)

Papers citing "Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning"

16 / 16 papers shown
Testing chatbots on the creation of encoders for audio conditioned image generation
Jorge E. León
Miguel Carrasco
56
0
0
09 Sep 2025
Effectively obtaining acoustic, visual and textual data from videos
Jorge E. León
Miguel Carrasco
VGen
63
1
0
06 Sep 2025
Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions
Tommaso Galliena
Tommaso Apicella
Stefano Rosa
Pietro Morerio
Alessio Del Bue
Lorenzo Natale
211
0
0
11 Apr 2025
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding
Wenxuan Zhu
Bing Li
Cheng Zheng
Jinjie Mai
Jun-Cheng Chen
...
Abdullah Hamdi
Sara Rojas Martinez
Chia-Wen Lin
Mohamed Elhoseiny
Bernard Ghanem
VLM
177
1
0
22 Mar 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
329
7
0
10 Feb 2025
DiffDoctor: Diagnosing Image Diffusion Models Before Treating
Yiyang Wang
Xi Chen
Xiaogang Xu
S. Ji
Yongxu Liu
Yujun Shen
Hengshuang Zhao
DiffM
242
0
0
21 Jan 2025
Development of Image Collection Method Using YOLO and Siamese Network
Chan Young Shin
Ah Hyun Lee
Jun Young Lee
Ji Min Lee
Soo Jin Park
47
0
0
16 Oct 2024
CtrlSynth: Controllable Image Text Synthesis for Data-Efficient Multimodal Learning
Qingqing Cao
Mahyar Najibi
Sachin Mehta
CLIP, DiffM
145
1
0
15 Oct 2024
Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
Uri Berger
Gabriel Stanovsky
Omri Abend
Lea Frermann
176
1
0
09 Aug 2024
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion
Philipp Allgeuer
Kyra Ahrens
Stefan Wermter
CLIP, VLM
191
5
0
15 Jul 2024
LEMoN: Label Error Detection using Multimodal Neighbors
Haoran Zhang
Aparna Balagopalan
Nassim Oufattole
Hyewon Jeong
Yan Wu
Jiacheng Zhu
Elisa Kreiss
245
2
0
10 Jul 2024
Don't drop your samples! Coherence-aware training benefits Conditional diffusion
Nicolas Dufour
Victor Besnier
Vicky Kalogeiton
David Picard
DiffM
285
7
0
30 May 2024
The Solution for the CVPR2024 NICE Image Captioning Challenge
Longfei Huang
Shupeng Zhong
Xiangyu Wu
Ruoxuan Li
115
0
0
19 Apr 2024
Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation
Bowen Huang
Yanwei Zheng
Chuanlin Lan
Xinpeng Zhao
Yifei Zou
Dongxiao Yu
199
0
0
23 Mar 2024
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?
Hasan Hammoud
Hani Itani
Fabio Pizzati
Juil Sock
Adel Bibi
Guohao Li
CLIP, VLM
320
51
0
02 Feb 2024
NICE: CVPR 2023 Challenge on Zero-shot Image Captioning
Taehoon Kim
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Mark A Marsden
...
Yujin Wang
Yimu Wang
Tiancheng Gu
Xingchang Lv
Mingmao Sun
VLM
174
7
0
05 Sep 2023