arXiv:2305.18565
PaLI-X: On Scaling up a Multilingual Vision and Language Model
29 May 2023
Xi Chen
Josip Djolonga
Piotr Padlewski
Basil Mustafa
Soravit Changpinyo
Jialin Wu
Carlos Riquelme Ruiz
Sebastian Goodman
Xiao Wang
Yi Tay
Siamak Shakeri
Mostafa Dehghani
Daniel M. Salz
Mario Lucic
Michael Tschannen
Arsha Nagrani
Hexiang Hu
Mandar Joshi
Bo Pang
Ceslee Montgomery
Paulina Pietrzyk
Marvin Ritter
A. Piergiovanni
Matthias Minderer
Filip Pavetić
Austin Waters
Gang Li
Ibrahim M. Alabdulmohsin
Lucas Beyer
J. Amelot
Kenton Lee
Andreas Steiner
Yang Li
Daniel Keysers
Anurag Arnab
Yuanzhong Xu
Keran Rong
Alexander Kolesnikov
Mojtaba Seyedhosseini
A. Angelova
Xiaohua Zhai
N. Houlsby
Radu Soricut
VLM
Papers citing "PaLI-X: On Scaling up a Multilingual Vision and Language Model"
11 / 161 papers shown
Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities
Hexiang Hu
Yi Luan
Yang Chen
Urvashi Khandelwal
Mandar Joshi
Kenton Lee
Kristina Toutanova
Ming-Wei Chang
VLM
43
54
0
22 Feb 2023
Underspecification in Scene Description-to-Depiction Tasks
Ben Hutchinson
Jason Baldridge
Vinodkumar Prabhakaran
DiffM
53
32
0
11 Oct 2022
Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding
Kenton Lee
Mandar Joshi
Iulia Turc
Hexiang Hu
Fangyu Liu
Julian Martin Eisenschlos
Urvashi Khandelwal
Peter Shaw
Ming-Wei Chang
Kristina Toutanova
CLIP
VLM
148
259
0
07 Oct 2022
PreSTU: Pre-Training for Scene-Text Understanding
Jihyung Kil
Soravit Changpinyo
Xi Chen
Hexiang Hu
Sebastian Goodman
Wei-Lun Chao
Radu Soricut
VLM
123
29
0
12 Sep 2022
Pre-training image-language transformers for open-vocabulary tasks
A. Piergiovanni
Weicheng Kuo
A. Angelova
VLM
ViT
23
8
0
09 Sep 2022
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
Ashish V. Thapliyal
Jordi Pont-Tuset
Xi Chen
Radu Soricut
VGen
67
71
0
25 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Meta-learning via Language Model In-context Tuning
Yanda Chen
Ruiqi Zhong
Sheng Zha
George Karypis
He He
210
155
0
15 Oct 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
233
341
0
22 Sep 2021
Screen2Words: Automatic Mobile UI Summarization with Multimodal Learning
Bryan Wang
Gang Li
Xin Zhou
Zhourong Chen
Tovi Grossman
Yang Li
159
152
0
07 Aug 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
220
698
0
28 Apr 2021