arXiv: 2301.07088

Vision Learners Meet Web Image-Text Pairs

17 January 2023
Bingchen Zhao
Quan Cui
Hao Wu
Osamu Yoshie
Cheng Yang
Oisin Mac Aodha
    VLM

Papers citing "Vision Learners Meet Web Image-Text Pairs"

12 / 12 papers shown
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning
Bingchen Zhao
Yongshuo Zong
Letian Zhang
Timothy Hospedales
VLM
25
15
0
18 Jun 2024
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning
Bingchen Zhao
Haoqin Tu
Chen Wei
Jieru Mei
Cihang Xie
6
31
0
18 Dec 2023
Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics
Haoqin Tu
Bingchen Zhao
Chen Wei
Cihang Xie
MLLM
17
13
0
13 Sep 2023
Unsupervised Camouflaged Object Segmentation as Domain Adaptation
Yi Zhang
Chengyi Wu
18
3
0
08 Aug 2023
RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank
Q. Garrido
Randall Balestriero
Laurent Najman
Yann LeCun
SSL
43
71
0
05 Oct 2022
Self-Supervised Visual Representation Learning with Semantic Grouping
Xin Wen
Bingchen Zhao
Anlin Zheng
X. Zhang
Xiaojuan Qi
SSL
101
71
0
30 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
385
4,010
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Improving Contrastive Learning by Visualizing Feature Transformation
Rui Zhu
Bingchen Zhao
Jingen Liu
Zhenglong Sun
C. L. P. Chen
SSL
96
77
0
06 Aug 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
238
3,359
0
09 Mar 2020