ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.09403
  4. Cited By
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal
  Language Models

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

13 June 2024
Yushi Hu
Weijia Shi
Xingyu Fu
Dan Roth
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Ranjay Krishna
    LRM
ArXivPDFHTML

Papers citing "Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models"

14 / 14 papers shown
Title
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
Shaokun Zhang
Yi Dong
Jieyu Zhang
Jan Kautz
Bryan Catanzaro
Andrew Tao
Qingyun Wu
Zhiding Yu
Guilin Liu
LLMAG
OffRL
KELM
LRM
83
0
0
25 Apr 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
69
7
0
16 Mar 2025
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
Jinyang Wu
Mingkuan Feng
Shuai Zhang
Ruihan Jin
Feihu Che
Zengqi Wen
J. Tao
LRM
44
7
0
04 Feb 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Yiyao Yu
Xinzhe Ni
Zicheng Lin
Jin Zeng
Yujiu Yang
LRM
45
12
0
08 Jan 2025
VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
Xueqing Wu
Yuheng Ding
Bingxuan Li
Pan Lu
Da Yin
Kai-Wei Chang
Nanyun Peng
LRM
90
3
0
03 Dec 2024
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Zixian Ma
Weikai Huang
Jieyu Zhang
Tanmay Gupta
Ranjay Krishna
47
18
0
17 Mar 2024
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Chris Liu
Renrui Zhang
Longtian Qiu
Siyuan Huang
Weifeng Lin
...
Hao Shao
Pan Lu
Hongsheng Li
Yu Qiao
Peng Gao
MLLM
116
106
0
08 Feb 2024
OLMo: Accelerating the Science of Language Models
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
124
349
0
01 Feb 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
130
681
0
19 Jan 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic
  Visual-Linguistic Tasks
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
126
895
0
21 Dec 2023
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
Penghao Wu
Saining Xie
LRM
42
105
0
21 Dec 2023
Analyzing Modular Approaches for Visual Question Decomposition
Analyzing Modular Approaches for Visual Question Decomposition
Apoorv Khandelwal
Ellie Pavlick
Chen Sun
32
4
0
10 Nov 2023
ReAct: Synergizing Reasoning and Acting in Language Models
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
208
2,413
0
06 Oct 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
313
8,261
0
28 Jan 2022
1