ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.05298
  4. Cited By
Improving Vision-and-Language Reasoning via Spatial Relations Modeling

Improving Vision-and-Language Reasoning via Spatial Relations Modeling

9 November 2023
Cheng Yang
Rui Xu
Ye Guo
Peixiang Huang
Yiru Chen
Wenkui Ding
Zhongyuan Wang
Hong Zhou
    LRM
ArXivPDFHTML

Papers citing "Improving Vision-and-Language Reasoning via Spatial Relations Modeling"

7 / 7 papers shown
Title
Intelligence of Things: A Spatial Context-Aware Control System for Smart Devices
Intelligence of Things: A Spatial Context-Aware Control System for Smart Devices
Sukanth Kalivarathan
Muhmmad Abrar Raja Mohamed
Aswathy Ravikumar
S Harini
16
0
0
16 Apr 2025
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
VIKSER: Visual Knowledge-Driven Self-Reinforcing Reasoning Framework
Chunbai Zhang
Chao Wang
Yang Zhou
Yan Peng
LRM
ReLM
58
0
0
02 Feb 2025
Can Vision Language Models Learn from Visual Demonstrations of Ambiguous
  Spatial Reasoning?
Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning?
Bowen Zhao
Leo Parker Dirac
Paulina Varshavskaya
VLM
LRM
16
0
0
25 Sep 2024
I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models
  Through 3D Reconstruction
I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models Through 3D Reconstruction
Zaiqiao Meng
Hao Zhou
Yifang Chen
28
4
0
19 Jul 2024
SCAAT: Improving Neural Network Interpretability via Saliency
  Constrained Adaptive Adversarial Training
SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training
Rui Xu
Wenkang Qin
Peixiang Huang
Hao Wang
Lin Luo
FAtt
AAML
23
2
0
09 Nov 2023
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,689
0
11 Feb 2021
Unified Vision-Language Pre-Training for Image Captioning and VQA
Unified Vision-Language Pre-Training for Image Captioning and VQA
Luowei Zhou
Hamid Palangi
Lei Zhang
Houdong Hu
Jason J. Corso
Jianfeng Gao
MLLM
VLM
250
927
0
24 Sep 2019
1