Images Speak in Images: A Generalist Painter for In-Context Visual Learning

5 December 2022
Xinlong Wang
Wen Wang
Yue Cao
Chunhua Shen
Tiejun Huang
    VLM
    MLLM

Papers citing "Images Speak in Images: A Generalist Painter for In-Context Visual Learning"

50 / 197 papers shown
Title
Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model
Dian Zheng
Xiao-Ming Wu
Shuzhou Yang
Jian Zhang
Jianfang Hu
Wei-Shi Zheng
DiffM
33
27
0
17 Mar 2024
Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity
Zhuo Zhi
Ziquan Liu
M. Elbadawi
Adam Daneshmend
Mine Orlu
Abdul Basit
Andreas Demosthenous
Miguel R. D. Rodrigues
22
2
0
14 Mar 2024
Explore In-Context Segmentation via Latent Diffusion Models
Chaoyang Wang
Xiangtai Li
Henghui Ding
Lu Qi
Jiangning Zhang
Yunhai Tong
Chen Change Loy
Shuicheng Yan
DiffM
63
6
0
14 Mar 2024
Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts
Jiawen Zhu
Guansong Pang
VLM
53
34
0
11 Mar 2024
In-context Prompt Learning for Test-time Vision Recognition with Frozen Vision-language Model
Junhui Yin
Xinyu Zhang
Lin Wu
Xianghua Xie
Xiaojie Wang
VPVLM
VLM
MLLM
25
2
0
10 Mar 2024
Part-aware Personalized Segment Anything Model for Patient-Specific Segmentation
Chenhui Zhao
Liyue Shen
VLM
19
3
0
08 Mar 2024
InstructGIE: Towards Generalizable Image Editing
Zichong Meng
Changdi Yang
Jun Liu
Hao Tang
Pu Zhao
Yanzhi Wang
DiffM
39
6
0
08 Mar 2024
SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising
Tao Zhou
Wenhan Luo
Qi Ye
Zhiguo Shi
Jiming Chen
VLM
47
3
0
07 Mar 2024
Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception
Junwen He
Yifan Wang
Lijun Wang
Huchuan Lu
Jun-Yan He
Jinpeng Lan
Bin Luo
Xuansong Xie
MLLM
VLM
24
19
0
05 Mar 2024
Grounding Language Models for Visual Entity Recognition
Zilin Xiao
Ming Gong
Paola Cascante-Bonilla
Xingyao Zhang
Jie Wu
Vicente Ordonez
VLM
33
8
0
28 Feb 2024
VRP-SAM: SAM with Visual Reference Prompt
Yanpeng Sun
Jiahui Chen
Shan Zhang
Xinyu Zhang
Qiang Chen
Gang Zhang
Errui Ding
Jingdong Wang
Zechao Li
36
31
0
27 Feb 2024
Video as the New Language for Real-World Decision Making
Sherry Yang
Jacob Walker
Jack Parker-Holder
Yilun Du
Jake Bruce
Andre Barreto
Pieter Abbeel
Dale Schuurmans
VGen
21
45
0
27 Feb 2024
A Simple Framework Uniting Visual In-context Learning with Masked Image Modeling to Improve Ultrasound Segmentation
Yuyue Zhou
B. Felfeliyan
Shrimanti Ghosh
Jessica Knight
Fatima Alves-Pereira
Christopher Keen
Jessica Küpper
A. Hareendranathan
Jacob L. Jaremko
29
0
0
22 Feb 2024
UnlearnCanvas: Stylized Image Dataset for Enhanced Machine Unlearning Evaluation in Diffusion Models
Yihua Zhang
Chongyu Fan
Yimeng Zhang
Yuguang Yao
Jinghan Jia
...
Gaoyuan Zhang
Gaowen Liu
Ramana Rao Kompella
Xiaoming Liu
Sijia Liu
DiffM
32
4
0
19 Feb 2024
Data-efficient Large Vision Models through Sequential Autoregression
Jianyuan Guo
Zhiwei Hao
Chengcheng Wang
Yehui Tang
Han Wu
Han Hu
Kai Han
Chang Xu
VLM
13
10
0
07 Feb 2024
Generalizable Entity Grounding via Assistance of Large Language Model
Lu Qi
Yi-Wen Chen
Lehan Yang
Tiancheng Shen
Xiangtai Li
Weidong Guo
Yu-Syuan Xu
Ming-Hsuan Yang
VLM
57
9
0
04 Feb 2024
Can MLLMs Perform Text-to-Image In-Context Learning?
Yuchen Zeng
Wonjun Kang
Yicong Chen
Hyung Il Koo
Kangwook Lee
MLLM
23
9
0
02 Feb 2024
Tyche: Stochastic In-Context Learning for Medical Image Segmentation
Marianne Rakic
Hallee E. Wong
Jose Javier Gonzalez Ortiz
Beth Cimini
John Guttag
Adrian V. Dalca
20
10
0
24 Jan 2024
OMG-Seg: Is One Model Good Enough For All Segmentation?
Xiangtai Li
Haobo Yuan
Wei Li
Henghui Ding
Size Wu
Wenwei Zhang
Yining Li
Kai Chen
Chen Change Loy
VLM
MLLM
ViT
69
48
0
18 Jan 2024
A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting
Wouter Van Gansbeke
Bert De Brabandere
DiffM
22
11
0
18 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
18
51
0
05 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
29
13
0
31 Dec 2023
UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
Jiannan Wu
Yi-Xin Jiang
Bin Yan
Huchuan Lu
Zehuan Yuan
Ping Luo
VOS
24
17
0
25 Dec 2023
Generative Multimodal Models are In-Context Learners
Quan Sun
Yufeng Cui
Xiaosong Zhang
Fan Zhang
Qiying Yu
...
Yueze Wang
Yongming Rao
Jingjing Liu
Tiejun Huang
Xinlong Wang
MLLM
LRM
45
244
0
20 Dec 2023
In-Context Reinforcement Learning for Variable Action Spaces
Viacheslav Sinii
Alexander Nikulin
Vladislav Kurenkov
Ilya Zisman
Sergey Kolesnikov
13
14
0
20 Dec 2023
3DAxiesPrompts: Unleashing the 3D Spatial Task Capabilities of GPT-4V
Dingning Liu
Xiaomeng Dong
Renrui Zhang
Xu Luo
Peng Gao
Xiaoshui Huang
Yongshun Gong
Zhihui Wang
24
10
0
15 Dec 2023
Tokenize Anything via Prompting
Ting Pan
Lulu Tang
Xinlong Wang
Shiguang Shan
VLM
18
22
0
14 Dec 2023
ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image
Hallee E. Wong
Marianne Rakic
John Guttag
Adrian V. Dalca
16
17
0
12 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
34
62
0
11 Dec 2023
Flexible visual prompts for in-context learning in computer vision
Thomas Foster
Ioana Croitoru
Robert Dorfman
Christoffer Edlund
Thomas Varsavsky
Jon Almazán
VLM
VOS
22
0
0
11 Dec 2023
UPOCR: Towards Unified Pixel-Level OCR Interface
Dezhi Peng
Zhenhua Yang
Jiaxin Zhang
Chongyu Liu
Yongxin Shi
Kai Ding
Fengjun Guo
Lianwen Jin
21
10
0
05 Dec 2023
Towards More Unified In-context Visual Understanding
Dianmo Sheng
Dongdong Chen
Zhentao Tan
Qiankun Liu
Qi Chu
Jianmin Bao
Tao Gong
Bin Liu
Shengwei Xu
Nenghai Yu
MLLM
VLM
22
10
0
05 Dec 2023
UniGS: Unified Representation for Image Generation and Segmentation
Lu Qi
Lehan Yang
Weidong Guo
Yu-Syuan Xu
Bo Du
Varun Jampani
Ming-Hsuan Yang
22
17
0
04 Dec 2023
IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks
Jiarui Xu
Yossi Gandelsman
Amir Bar
Jianwei Yang
Jianfeng Gao
Trevor Darrell
Xiaolong Wang
VLM
16
3
0
04 Dec 2023
Retrieval-augmented Multi-modal Chain-of-Thoughts Reasoning for Large Language Models
Bingshuai Liu
Chenyang Lyu
Zijun Min
Zhanyu Wang
Jinsong Su
Longyue Wang
LRM
10
7
0
04 Dec 2023
Universal Segmentation at Arbitrary Granularity with Language Instruction
Yong Liu
Cairong Zhang
Yitong Wang
Jiahao Wang
Yujiu Yang
Yansong Tang
VLM
VOS
47
15
0
04 Dec 2023
Improving In-Context Learning in Diffusion Models with Visual Context-Modulated Prompts
Tianqi Chen
Yongfei Liu
Zhendong Wang
Jianbo Yuan
Quanzeng You
Hongxia Yang
Mingyuan Zhou
VLM
24
5
0
03 Dec 2023
Sequential Modeling Enables Scalable Learning for Large Vision Models
Yutong Bai
Xinyang Geng
K. Mangalam
Amir Bar
Alan Yuille
Trevor Darrell
Jitendra Malik
Alexei A. Efros
MLLM
VLM
22
151
0
01 Dec 2023
InstructSeq: Unifying Vision Tasks with Instruction-conditioned Multi-modal Sequence Generation
Rongyao Fang
Shilin Yan
Zhaoyang Huang
Jingqiu Zhou
Hao Tian
Jifeng Dai
Hongsheng Li
MLLM
22
8
0
30 Nov 2023
Exploiting Diffusion Prior for Generalizable Dense Prediction
Hsin-Ying Lee
Hung-Yu Tseng
Hsin-Ying Lee
Ming-Hsuan Yang
DiffM
MDE
19
16
0
30 Nov 2023
LLaFS: When Large Language Models Meet Few-Shot Segmentation
Lanyun Zhu
Tianrun Chen
Deyi Ji
Jieping Ye
Jun Liu
VLM
26
37
0
28 Nov 2023
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation
Lingchen Meng
Shiyi Lan
Hengduo Li
Jose M. Alvarez
Zuxuan Wu
Yu-Gang Jiang
VLM
ISeg
MLLM
23
6
0
24 Nov 2023
Visual In-Context Prompting
Feng Li
Qing Jiang
Hao Zhang
Tianhe Ren
Shilong Liu
...
Hongyang Li
Chunyuan Li
Jianwei Yang
Lei Zhang
Jianfeng Gao
VLM
LRM
MLLM
24
30
0
22 Nov 2023
An Embodied Generalist Agent in 3D World
Jiangyong Huang
Silong Yong
Xiaojian Ma
Xiongkun Linghu
Puhao Li
Yan Wang
Qing Li
Song-Chun Zhu
Baoxiong Jia
Siyuan Huang
LM&Ro
23
131
0
18 Nov 2023
Imagine the Unseen World: A Benchmark for Systematic Generalization in Visual World Models
Yeongbin Kim
Gautam Singh
Junyeong Park
Çağlar Gülçehre
Sungjin Ahn
OCL
VLM
29
1
0
15 Nov 2023
EviPrompt: A Training-Free Evidential Prompt Generation Method for Segment Anything Model in Medical Images
Yinsong Xu
Jiaqi Tang
Aidong Men
Qingchao Chen
VLM
MedIm
16
5
0
10 Nov 2023
PolyMaX: General Dense Prediction with Mask Transformer
Xuan S. Yang
Liangzhe Yuan
Kimberly Wilber
Astuti Sharma
Xiuye Gu
...
Stephanie Debats
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Liang-Chieh Chen
21
14
0
09 Nov 2023
Instruct Me More! Random Prompting for Visual In-Context Learning
Jiahao Zhang
Bowen Wang
Liangzhi Li
Yuta Nakashima
Hajime Nagahara
VLM
20
15
0
07 Nov 2023
Foundational Models in Medical Imaging: A Comprehensive Survey and Future Vision
Bobby Azad
Reza Azad
Sania Eskandari
Afshin Bozorgpour
A. Kazerouni
I. Rekik
Dorit Merhof
VLM
MedIm
93
57
0
28 Oct 2023
Vision Language Models in Autonomous Driving: A Survey and Outlook
Xingcheng Zhou
Mingyu Liu
Ekim Yurtsever
B. L. Žagar
Walter Zimmer
Hu Cao
Alois C. Knoll
VLM
20
33
0
22 Oct 2023