IQA: Visual Question Answering in Interactive Environments

9 December 2017
Daniel Gordon
Aniruddha Kembhavi
Mohammad Rastegari
Joseph Redmon
D. Fox
Ali Farhadi
    LM&Ro

Papers citing "IQA: Visual Question Answering in Interactive Environments"

48 / 48 papers shown
Title
Visual Environment-Interactive Planning for Embodied Complex-Question Answering
Ning Lan
Baoshan Ou
Xuemei Xie
G. Shi
LM&Ro
60
1
0
01 Apr 2025
Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering
Kaixuan Jiang
Y. Liu
Weixing Chen
Jingzhou Luo
Ziliang Chen
Ling Pan
G. Li
Liang Lin
51
2
0
14 Mar 2025
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments
Dongping Li
Tielong Cai
Tianci Tang
Wenhao Chai
Katherine Rose Driggs-Campbell
Gaoang Wang
LM&Ro
56
0
0
11 Mar 2025
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Rasoul Shafipour
David Harrison
Maxwell Horton
Jeffrey Marker
Houman Bedayat
Sachin Mehta
Mohammad Rastegari
Mahyar Najibi
Saman Naderiparizi
MQ
43
3
0
14 Oct 2024
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories
Yifan Song
Weimin Xiong
Xiutian Zhao
Dawei Zhu
Wenhao Wu
Ke Wang
Cheng Li
Wei Peng
Sujian Li
LLMAG
24
9
0
10 Oct 2024
ReALFRED: An Embodied Instruction Following Benchmark in Photo-Realistic Environments
Taewoong Kim
Cheolhong Min
Byeonghwi Kim
Jinyeon Kim
Wonje Jeung
Jonghyun Choi
LM&Ro
34
4
0
26 Jul 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
Which way is 'right'?: Uncovering limitations of Vision-and-Language Navigation model
Meera Hahn
Amit Raj
James M. Rehg
30
3
0
30 Nov 2023
Towards AGI in Computer Vision: Lessons Learned from GPT and Large Language Models
Lingxi Xie
Longhui Wei
Xiaopeng Zhang
Kaifeng Bi
Xiaotao Gu
Jianlong Chang
Qi Tian
29
7
0
14 Jun 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
49
21
0
07 Apr 2023
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
Kolby Nottingham
Prithviraj Ammanabrolu
Alane Suhr
Yejin Choi
Hannaneh Hajishirzi
Sameer Singh
Roy Fox
LLMAG
LM&Ro
17
76
0
28 Jan 2023
ScanEnts3D: Exploiting Phrase-to-3D-Object Correspondences for Improved Visio-Linguistic Models in 3D Scenes
Ahmed Abdelreheem
Kyle Olszewski
Hsin-Ying Lee
Peter Wonka
Panos Achlioptas
3DPC
20
28
0
12 Dec 2022
Navigating to Objects in the Real World
Théophile Gervet
Soumith Chintala
Dhruv Batra
Jitendra Malik
Devendra Singh Chaplot
27
122
0
02 Dec 2022
A General Purpose Supervisory Signal for Embodied Agents
Kunal Pratap Singh
Jordi Salvador
Luca Weihs
Aniruddha Kembhavi
SSL
19
3
0
01 Dec 2022
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Matt Deitke
Eli VanderBilt
Alvaro Herrasti
Luca Weihs
Jordi Salvador
...
Winson Han
Eric Kolve
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
25
233
0
14 Jun 2022
Episodic Memory Question Answering
Samyak Datta
Sameer Dharur
Vincent Cartillier
Ruta Desai
Mukul Khanna
Dhruv Batra
Devi Parikh
EgoV
11
31
0
03 May 2022
Habitat-Web: Learning Embodied Object-Search Strategies from Human Demonstrations at Scale
Ram Ramrakhya
Eric Undersander
Dhruv Batra
Abhishek Das
LM&Ro
13
109
0
07 Apr 2022
Continuous Scene Representations for Embodied AI
S. Gadre
Kiana Ehsani
Shuran Song
Roozbeh Mottaghi
23
46
0
31 Mar 2022
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
B. Wong
Joya Chen
You Wu
Stan Weixian Lei
Dongxing Mao
Difei Gao
Mike Zheng Shou
EgoV
27
27
0
08 Mar 2022
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following
Xiaofeng Gao
Qiaozi Gao
Ran Gong
Kaixiang Lin
Govind Thattai
Gaurav Sukhatme
LM&Ro
78
70
0
27 Feb 2022
Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models, Benchmark and Efficient Evaluation
Marco Rosano
Antonino Furnari
Luigi Gulino
C. Santoro
G. Farinella
EgoV
27
5
0
02 Feb 2022
Shaping embodied agent behavior with activity-context priors from egocentric video
Tushar Nagarajan
Kristen Grauman
EgoV
LM&Ro
38
13
0
14 Oct 2021
Are you doing what I say? On modalities alignment in ALFRED
Ting-Rui Chiang
Yi-Ting Yeh
Ta-Chung Chi
Yau-Shian Wang
17
1
0
12 Oct 2021
Pano-AVQA: Grounded Audio-Visual Question Answering on 360° Videos
Heeseung Yun
Youngjae Yu
Wonsuk Yang
Kangil Lee
Gunhee Kim
10
78
0
11 Oct 2021
Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language
Shuyan Zhou
Pengcheng Yin
Graham Neubig
LM&Ro
9
1
0
16 Sep 2021
Knowledge-based Embodied Question Answering
Sinan Tan
Mengmeng Ge
Di Guo
Huaping Liu
F. Sun
15
20
0
16 Sep 2021
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Valts Blukis
Chris Paxton
D. Fox
Animesh Garg
Yoav Artzi
LM&Ro
212
133
0
12 Jul 2021
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents
Grgur Kovač
Rémy Portelas
Katja Hofmann
Pierre-Yves Oudeyer
ALM
16
6
0
02 Jul 2021
Learning to Map for Active Semantic Goal Navigation
G. Georgakis
Bernadette Bucher
Karl Schmeckpeper
Siddharth Singh
Kostas Daniilidis
25
73
0
29 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
39
45
0
26 Jun 2021
A Survey on Human-aware Robot Navigation
Ronja Möller
Antonino Furnari
S. Battiato
Aki Härmä
G. Farinella
31
87
0
22 Jun 2021
RobustNav: Towards Benchmarking Robustness in Embodied Navigation
Prithvijit Chattopadhyay
Judy Hoffman
Roozbeh Mottaghi
Aniruddha Kembhavi
13
55
0
08 Jun 2021
Hierarchical Task Learning from Language Instructions with Unified Transformers and Self-Monitoring
Yichi Zhang
J. Chai
20
78
0
07 Jun 2021
How to Train PointGoal Navigation Agents on a (Sample and Compute) Budget
Erik Wijmans
Irfan Essa
Dhruv Batra
3DPC
17
10
0
11 Dec 2020
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule
Shuhei Kurita
Kyunghyun Cho
LM&Ro
9
23
0
16 Sep 2020
Semantic Curiosity for Active Visual Learning
Devendra Singh Chaplot
Helen Jiang
Saurabh Gupta
Abhinav Gupta
ObjD
16
72
0
16 Jun 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
19
350
0
21 Apr 2020
Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments
Jacob Krantz
Erik Wijmans
Arjun Majumdar
Dhruv Batra
Stefan Lee
19
263
0
06 Apr 2020
TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines
Jingxiang Lin
Unnat Jain
A. Schwing
LRM
ReLM
26
9
0
31 Oct 2019
CraftAssist: A Framework for Dialogue-enabled Interactive Agents
Jonathan Gray
Kavya Srinet
Yacine Jernite
Haonan Yu
Zhuoyuan Chen
Demi Guo
Siddharth Goyal
C. L. Zitnick
Arthur Szlam
14
38
0
19 Jul 2019
Vision-and-Dialog Navigation
Jesse Thomason
Michael Murray
Maya Cakmak
Luke Zettlemoyer
LM&Ro
32
322
0
10 Jul 2019
Embodied Visual Recognition
Jianwei Yang
Zhile Ren
Mingze Xu
Xinlei Chen
David J. Crandall
Devi Parikh
Dhruv Batra
27
26
0
09 Apr 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
13
128
0
02 Apr 2019
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
Howard Chen
Alane Suhr
Dipendra Kumar Misra
Noah Snavely
Yoav Artzi
12
379
0
29 Nov 2018
Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction
Dipendra Kumar Misra
Andrew Bennett
Valts Blukis
Eyvind Niklasson
Max Shatkhin
Yoav Artzi
LM&Ro
16
186
0
04 Sep 2018
CAD2RL: Real Single-Image Flight without a Single Real Image
Fereshteh Sadeghi
Sergey Levine
SSL
216
809
0
13 Nov 2016
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju
Michael Cogswell
Abhishek Das
Ramakrishna Vedantam
Devi Parikh
Dhruv Batra
FAtt
18
19,446
0
07 Oct 2016
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
Akira Fukui
Dong Huk Park
Daylen Yang
Anna Rohrbach
Trevor Darrell
Marcus Rohrbach
144
1,464
0
06 Jun 2016