Compositional Attention Networks for Machine Reasoning

8 March 2018

Drew A. Hudson

Christopher D. Manning

Papers citing "Compositional Attention Networks for Machine Reasoning"

50 of 330 citing papers shown

SRNN: Spatiotemporal Relational Neural Network for Intuitive Physics Understanding

Fei Yang

145

10 Nov 2025

Memorizing Long-tail Data Can Help Generalization Through Composition

384

18 Oct 2025

Taming a Retrieval Framework to Read Images in Humanlike Manner for Augmenting Generation of MLLMs

128

12 Oct 2025

The Artificial Intelligence Cognitive Examination: A Survey on the Evolution of Multimodal Evaluation from Recognition to Reasoning

Mayank Ravishankara

Varindra V. Persad Maharaj

ELM

202

05 Oct 2025

Can Constructions "SCAN" Compositionality ?

Ganesh Katrapati

Manish Shrivastava

108

24 Sep 2025

Reasoning in Computer Vision: Taxonomy, Models, Tasks, and Methodologies

Ayushman Sarkar

Mohd Yamani Idna Idris

Zhenyu Yu

LRM

168

14 Aug 2025

Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance

...

468

12 Aug 2025

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

164

04 Aug 2025

Think before You Simulate: Symbolic Reasoning to Orchestrate Neural Computation for Counterfactual Question Answering

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

282

12 Jun 2025

DeepTraverse: A Depth-First Search Inspired Network for Algorithmic Visual Understanding

Bin Guo

John H.L. Hansen

232

11 Jun 2025

Inherently Faithful Attention Maps for Vision Transformers

541

10 Jun 2025

A Good CREPE needs more than just Sugar: Investigating Biases in Compositional Vision-Language Benchmarks

199

09 Jun 2025

Multi-Sourced Compositional Generalization in Visual Question Answering

International Joint Conference on Artificial Intelligence (IJCAI), 2025

217

29 May 2025

Pay Attention to What and Where? Interpretable Feature Extractor in Vision-based Deep Reinforcement Learning

Tien Pham

Angelo Cangelosi

214

14 Apr 2025

Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models

Computer Vision and Pattern Recognition (CVPR), 2025

289

21 Mar 2025

Neuro Symbolic Knowledge Reasoning for Procedural Video Question Answering

429

19 Mar 2025

Visual Graph Question Answering with ASP and LLMs for Language Parsing

International Conference on Logic Programming (ICLP), 2025

351

13 Feb 2025

The Quest for Visual Understanding: A Journey Through the Evolution of Visual Question Answering

429

13 Jan 2025

Consistency of Compositional Generalization across Multiple Levels

AAAI Conference on Artificial Intelligence (AAAI), 2024

251

18 Dec 2024

Editable-DeepSC: Reliable Cross-Modal Semantic Communications for Facial Editing

724

24 Nov 2024

Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios

Neural Information Processing Systems (NeurIPS), 2024

363

20 Nov 2024

A Comprehensive Survey on Visual Question Answering Datasets and Algorithms

289

17 Nov 2024

Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

Neural Information Processing Systems (NeurIPS), 2024

...

Taylor W. Webb

411

31 Oct 2024

Compositional Physical Reasoning of Objects and Events from Videos

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

368

02 Aug 2024

SADL: An Effective In-Context Learning Method for Compositional Visual QA

Truyen Tran

280

02 Jul 2024

On the Role of Visual Grounding in VQA

Daniel Reich

Tanja Schultz

221

26 Jun 2024

How Could AI Support Design Education? A Study Across Fields Fuels Situating Analytics

115

26 Apr 2024

Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts

Övgü Özdemir

Erdem Akagündüz

307

12 Apr 2024

REFACTOR: Learning to Extract Theorems from Proofs

183

26 Feb 2024

ContPhy: Continuum Physical Concept Learning and Reasoning from Videos

Joshua B. Tenenbaum

Chuang Gan

LRM

209

09 Feb 2024

Generalizing Visual Question Answering from Synthetic to Human-Written Questions via a Chain of QA with a Large Language Model

European Conference on Artificial Intelligence (ECAI), 2024

340

12 Jan 2024

Towards Goal-Oriented Agents for Evolving Problems Observed via Conversation

SGAI Conferences (SGAI), 2024

181

11 Jan 2024

STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering

AAAI Conference on Artificial Intelligence (AAAI), 2024

Yueqian Wang

Yuxuan Wang

Kai Chen

Dongyan Zhao

213

08 Jan 2024

Detection-based Intermediate Supervision for Visual Question Answering

178

26 Dec 2023

Interactive Visual Task Learning for Robots

Weiwei Gu

Anant Sah

N. Gopalan

233

20 Dec 2023

EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering

Yanfei Zhong

182

19 Dec 2023

Benchmarks for Physical Reasoning AI

355

17 Dec 2023

GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models

Haicheng Liao

Chengzhong Xu

245

06 Dec 2023

Attribute Diversity Determines the Systematicity Gap in VQA

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Ian Berlot-Attwell

Kumar Krishna Agrawal

A. M. Carrell

Yash Sharma

Naomi Saphra

260

15 Nov 2023

GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs

Chuang Gan

310

08 Nov 2023

ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Nischal Reddy Chandra

Marjorie Freedman

R. Weischedel

Nanyun Peng

283

02 Nov 2023

3D-Aware Visual Question Answering about Parts, Poses and Occlusions

Neural Information Processing Systems (NeurIPS), 2023

318

27 Oct 2023

What's Left? Concept Grounding with Logic-Enhanced Foundation Models

Neural Information Processing Systems (NeurIPS), 2023

Joy Hsu

Jiayuan Mao

Joshua B. Tenenbaum

Jiajun Wu

VLM ReLM LRM

384

24 Oct 2023

Harnessing Dataset Cartography for Improved Compositional Generalization in Transformers

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Osman Batur İnce

Tanin Zeraati

Semih Yagcioglu

Yadollah Yaghoobzadeh

Erkut Erdem

Aykut Erdem

165

18 Oct 2023

LLark: A Multimodal Instruction-Following Language Model for Music

International Conference on Machine Learning (ICML), 2023

322

11 Oct 2023

InstructDET: Diversifying Referring Object Detection with Generalized Instructions

International Conference on Learning Representations (ICLR), 2023

...

441

08 Oct 2023

Sentence Attention Blocks for Answer Grounding

IEEE International Conference on Computer Vision (ICCV), 2023

Seyedalireza Khoshsirat

Chandra Kambhamettu

207

20 Sep 2023

D3: Data Diversity Design for Systematic Generalization in Visual Question Answering

175

15 Sep 2023

Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding

European Conference on Computer Vision (ECCV), 2023

Cheng Shi

Sibei Yang

LRM

166

03 Sep 2023

Learning the meanings of function words from grounded language using a visual question answering model

Cognitive Sciences (CogSci), 2023

Eva Portelance

Michael C. Frank

Dan Jurafsky

NAI

274

16 Aug 2023