Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark

14 February 2022
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
Xiaodan Liang
Lewei Yao
Runhui Huang
Wei Zhang
Xin Jiang
Chunjing Xu
Hang Xu
    VLM

Papers citing "Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark"

14 / 64 papers shown
InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition
Pan Zhang
Xiaoyi Dong
Bin Wang
Yuhang Cao
Chao Xu
...
Conghui He
Xingcheng Zhang
Yu Qiao
Dahua Lin
Jiaqi Wang
MLLM
61
222
0
26 Sep 2023
Bridge Diffusion Model: bridge non-English language-native text-to-image diffusion model with English communities
Shanyuan Liu
Dawei Leng
Yuhui Yin
DiffM
13
7
0
02 Sep 2023
A Survey on Multimodal Large Language Models
Shukang Yin
Chaoyou Fu
Sirui Zhao
Ke Li
Xing Sun
Tong Bill Xu
Enhong Chen
MLLM
LRM
36
552
0
23 Jun 2023
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Hugo Laurençon
Lucile Saulnier
Léo Tronchon
Stas Bekman
Amanpreet Singh
...
Siddharth Karamcheti
Alexander M. Rush
Douwe Kiela
Matthieu Cord
Victor Sanh
25
227
0
21 Jun 2023
Optimal Linear Subspace Search: Learning to Construct Fast and High-Quality Schedulers for Diffusion Models
Zhongjie Duan
Chengyu Wang
Cen Chen
Jun Huang
Weining Qian
DiffM
19
12
0
24 May 2023
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Feilong Chen
Minglun Han
Haozhi Zhao
Qingyang Zhang
Jing Shi
Shuang Xu
Bo Xu
MLLM
33
115
0
07 May 2023
Edit Everything: A Text-Guided Generative System for Images Editing
Defeng Xie
Ruichen Wang
Jiancang Ma
Chen Chen
H. Lu
D. Yang
Fobo Shi
Xiaodong Lin
DiffM
80
31
0
27 Apr 2023
Natural Language-Assisted Sign Language Recognition
Ronglai Zuo
Fangyun Wei
Brian Mak
SLR
18
37
0
21 Mar 2023
BrainCLIP: Bridging Brain and Visual-Linguistic Representation Via CLIP for Generic Natural Visual Stimulus Decoding
Yulong Liu
Yongqiang Ma
Wei Zhou
Guibo Zhu
Nanning Zheng
VLM
25
34
0
25 Feb 2023
ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training
Bin Shan
Weichong Yin
Yu Sun
Hao Tian
Hua-Hong Wu
Haifeng Wang
VLM
22
19
0
30 Sep 2022
WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning
Krishna Srinivasan
K. Raman
Jiecao Chen
Michael Bendersky
Marc Najork
VLM
197
308
0
02 Mar 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
273
1,081
0
17 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,689
0
11 Feb 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,740
0
26 Sep 2016