ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2207.07568
  4. Cited By
Reasoning about Actions over Visual and Linguistic Modalities: A Survey

Reasoning about Actions over Visual and Linguistic Modalities: A Survey

15 July 2022
Shailaja Keyur Sampat
Maitreya Patel
Subhasish Das
Yezhou Yang
Chitta Baral
    ReLM
    LM&Ro
    LRM
ArXivPDFHTML

Papers citing "Reasoning about Actions over Visual and Linguistic Modalities: A Survey"

12 / 12 papers shown
Title
Perception in Reflection
Perception in Reflection
Yana Wei
Liang Zhao
Kangheng Lin
En Yu
Yuang Peng
...
Jianjian Sun
Haoran Wei
Zheng Ge
Xiangyu Zhang
Vishal M. Patel
31
0
0
09 Apr 2025
Hybrid Reasoning Based on Large Language Models for Autonomous Car
  Driving
Hybrid Reasoning Based on Large Language Models for Autonomous Car Driving
Mehdi Azarafza
M. Nayyeri
Charles Steinmetz
Steffen Staab
A. Rettberg
LRM
36
9
0
21 Feb 2024
Implicit Affordance Acquisition via Causal Action-Effect Modeling in the
  Video Domain
Implicit Affordance Acquisition via Causal Action-Effect Modeling in the Video Domain
Hsiu-yu Yang
Carina Silberer
19
1
0
18 Dec 2023
Large Language Models are Visual Reasoning Coordinators
Large Language Models are Visual Reasoning Coordinators
Liangyu Chen
Bo Li
Sheng Shen
Jingkang Yang
Chunyuan Li
Kurt Keutzer
Trevor Darrell
Ziwei Liu
VLM
LRM
34
47
0
23 Oct 2023
A Survey on Image-text Multimodal Models
A Survey on Image-text Multimodal Models
Ruifeng Guo
Jingxuan Wei
Linzhuang Sun
Khai Le-Duc
Guiyong Chang
Dawei Liu
Sibo Zhang
Zhengbing Yao
Mingjun Xu
Liping Bu
VLM
21
5
0
23 Sep 2023
A Survey on Interpretable Cross-modal Reasoning
A Survey on Interpretable Cross-modal Reasoning
Dizhan Xue
Shengsheng Qian
Zuyi Zhou
Changsheng Xu
LRM
29
4
0
05 Sep 2023
MuG: A Multimodal Classification Benchmark on Game Data with Tabular,
  Textual, and Visual Fields
MuG: A Multimodal Classification Benchmark on Game Data with Tabular, Textual, and Visual Fields
Jiaying Lu
Yongchen Qian
Shifan Zhao
Yuanzhe Xi
Carl Yang
VLM
19
3
0
06 Feb 2023
Learning Action-Effect Dynamics for Hypothetical Vision-Language
  Reasoning Task
Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task
Shailaja Keyur Sampat
Pratyay Banerjee
Yezhou Yang
Chitta Baral
19
2
0
07 Dec 2022
Learning Action-Effect Dynamics from Pairs of Scene-graphs
Learning Action-Effect Dynamics from Pairs of Scene-graphs
Shailaja Keyur Sampat
Pratyay Banerjee
Yezhou Yang
Chitta Baral
GNN
18
0
0
07 Dec 2022
CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties
  via Video Question Answering
CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering
Maitreya Patel
Tejas Gokhale
Chitta Baral
Yezhou Yang
44
9
0
07 Nov 2022
Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense
  Language Understanding
Tiered Reasoning for Intuitive Physics: Toward Verifiable Commonsense Language Understanding
Shane Storks
Qiaozi Gao
Yichi Zhang
J. Chai
ReLM
LRM
39
22
0
10 Sep 2021
A Persistent Spatial Semantic Representation for High-level Natural
  Language Instruction Execution
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
D. Fox
Animesh Garg
Yoav Artzi
LM&Ro
212
133
0
12 Jul 2021
1