arXiv: 2407.19666
Take A Step Back: Rethinking the Two Stages in Visual Reasoning

29 July 2024
Mingyu Zhang
Jiting Cai
Mingyu Liu
Yue Xu
Cewu Lu
Yong-Lu Li
    LRM

Papers citing "Take A Step Back: Rethinking the Two Stages in Visual Reasoning"

7 / 7 papers shown
G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness
Jaehyun Jeon
Janghan Yoon
Minsoo Kim
Sumin Shim
Yejin Choi
Hanbin Kim
Youngjae Yu
AAML
33
0
0
08 May 2025
Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning
Xiaoqian Wu
Yong-Lu Li
Jianhua Sun
Cewu Lu
45
16
0
29 Nov 2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning
Jun Chen
Deyao Zhu
Xiaoqian Shen
Xiang Li
Zechun Liu
Pengchuan Zhang
Raghuraman Krishnamoorthi
Vikas Chandra
Yunyang Xiong
Mohamed Elhoseiny
MLLM
154
280
0
14 Oct 2023
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models
Manli Shu
Weili Nie
De-An Huang
Zhiding Yu
Tom Goldstein
Anima Anandkumar
Chaowei Xiao
VLM
VPVLM
169
278
0
15 Sep 2022
Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space
Steeven Janny
Fabien Baradel
Natalia Neverova
M. Nadri
Greg Mori
Christian Wolf
CML
33
15
0
01 Feb 2022
PIP: Physical Interaction Prediction via Mental Simulation with Span Selection
Jiafei Duan
Samson Yu
Soujanya Poria
B. Wen
Cheston Tan
29
7
0
10 Sep 2021
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
Xingjian Shi
Zhourong Chen
Hao Wang
Dit-Yan Yeung
W. Wong
W. Woo
198
7,816
0
13 Jun 2015