2204.00598
Cited By
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
1 April 2022
Andy Zeng
Maria Attarian
Brian Ichter
K. Choromanski
Adrian S. Wong
Stefan Welker
F. Tombari
Aveek Purohit
Michael S. Ryoo
Vikas Sindhwani
Johnny Lee
Vincent Vanhoucke
Peter R. Florence
ReLM
LRM
Papers citing
"Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language"
50 / 443 papers shown
Wings: Learning Multimodal LLMs without Text-only Forgetting
Yi-Kai Zhang
Shiyin Lu
Yang Li
Yanqing Ma
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
VLM
35
6
0
05 Jun 2024
AD-H: Autonomous Driving with Hierarchical Agents
Zaibin Zhang
Shiyu Tang
Yuanhang Zhang
Talas Fu
Yifan Wang
Yang Liu
Dong Wang
Jing Shao
Lijun Wang
H. Lu
52
3
0
05 Jun 2024
Position: Foundation Agents as the Paradigm Shift for Decision Making
Xiaoqian Liu
Xingzhou Lou
Jianbin Jiao
Junge Zhang
OffRL
LLMAG
45
6
0
27 May 2024
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Yang Zhang
Shixin Yang
Chenjia Bai
Fei Wu
Xiu Li
Zhen Wang
Xuelong Li
LLMAG
36
25
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
82
42
0
23 May 2024
A Survey of Robotic Language Grounding: Tradeoffs between Symbols and Embeddings
Vanya Cohen
J. Liu
Raymond J. Mooney
Stefanie Tellex
David Watkins
LM&Ro
43
12
0
21 May 2024
LLM+Reasoning+Planning for Supporting Incomplete User Queries in Presence of APIs
Sudhir Agarwal
A. Sreepathy
David H. Alonso
Prarit Lamba
LRM
60
1
0
21 May 2024
Meta-Control: Automatic Model-based Control Synthesis for Heterogeneous Robot Skills
Tianhao Wei
Liqian Ma
Rui Chen
Weiye Zhao
Changliu Liu
45
3
0
18 May 2024
SIGMA: An Open-Source Interactive System for Mixed-Reality Task Assistance Research
D. Bohus
Sean Andrist
Nick Saw
Ann Paradiso
Ishani Chakraborty
Mahdi Rad
38
9
0
16 May 2024
Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation
Manh Luong
Khai Nguyen
Nhat Ho
Reza Haf
D.Q. Phung
Lizhen Qu
30
12
0
16 May 2024
A Prompt-driven Task Planning Method for Multi-drones based on Large Language Model
Yaohua Liu
27
0
0
14 May 2024
Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions
Xinglin Chen
Yishuai Cai
Yunxin Mao
Minglong Li
Wenjing Yang
Weixia Xu
Ji Wang
54
6
0
13 May 2024
OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs
Jiahao Nick Li
Yan Xu
Tovi Grossman
Stephanie Santosa
Michelle Li
36
13
0
06 May 2024
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere
Cameron Buckner
LRM
66
13
0
06 May 2024
Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
Yunhao Ge
Fangyin Wei
Siddharth Gururani
Nayeon Lee
Xuan Li
Huayu Chen
CoGe
DiffM
35
14
0
30 Apr 2024
SciDaSynth: Interactive Structured Knowledge Extraction and Synthesis from Scientific Literature with Large Language Model
Xingbo Wang
S. Huey
Rui Sheng
Saurabh Mehta
Fei Wang
44
4
0
21 Apr 2024
Action Contextualization: Adaptive Task Planning and Action Tuning using Large Language Models
Sthithpragya Gupta
Kunpeng Yao
Loic Niederhauser
A. Billard
31
1
0
19 Apr 2024
Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions
Leena Mathur
Paul Pu Liang
Louis-Philippe Morency
LLMAG
38
7
0
17 Apr 2024
Private Attribute Inference from Images with Vision-Language Models
Batuhan Tömekçe
Mark Vero
Robin Staab
Martin Vechev
VLM
PILM
68
7
0
16 Apr 2024
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering
Juhong Min
Shyamal Buch
Arsha Nagrani
Minsu Cho
Cordelia Schmid
LRM
44
20
0
09 Apr 2024
Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models
Yutao Ouyang
Jinhan Li
Yunfei Li
Zhongyu Li
Chao Yu
K. Sreenath
Yi Wu
54
15
0
08 Apr 2024
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Zaid Khan
B. Vijaykumar
S. Schulter
Yun Fu
Manmohan Chandraker
LRM
ReLM
34
6
0
06 Apr 2024
Can only LLMs do Reasoning?: Potential of Small Language Models in Task Planning
Gawon Choi
Hyemin Ahn
LM&Ro
LRM
34
1
0
05 Apr 2024
Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity
Jacob Varley
Sumeet Singh
Deepali Jain
Krzysztof Choromanski
Andy Zeng
Somnath Basu Roy Chowdhury
Kumar Avinava Dubey
Vikas Sindhwani
LM&Ro
34
14
0
04 Apr 2024
VLRM: Vision-Language Models act as Reward Models for Image Captioning
Maksim Dzabraev
Alexander Kunitsyn
Andrei Ivaniuta
VLM
MLLM
31
3
0
02 Apr 2024
IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation
Jiacui Huang
Hongtao Zhang
Mingbo Zhao
Zhou Wu
LM&Ro
39
5
0
28 Mar 2024
Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
Zhuowan Li
Bhavan A. Jasani
Peng Tang
Shabnam Ghadar
LRM
39
8
0
25 Mar 2024
SceneX: Procedural Controllable Large-scale Scene Generation via Large-language Models
Mengqi Zhou
Jun Hou
Chuanchen Luo
Yuxi Wang
Zhaoxiang Zhang
Junran Peng
60
0
0
23 Mar 2024
Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs
Yusuke Mikami
Andrew Melnik
Jun Miura
Ville Hautamaki
LM&Ro
LRM
66
4
0
20 Mar 2024
Grounding Spatial Relations in Text-Only Language Models
Gorka Azkune
Ander Salaberria
Eneko Agirre
42
0
0
20 Mar 2024
Improved Baselines for Data-efficient Perceptual Augmentation of LLMs
Théophane Vallaeys
Mustafa Shukor
Matthieu Cord
Jakob Verbeek
56
12
0
20 Mar 2024
What AIs are not Learning (and Why)
M. Stefik
44
0
0
19 Mar 2024
SpatialPIN: Enhancing Spatial Reasoning Capabilities of Vision-Language Models through Prompting and Interacting 3D Priors
Chenyang Ma
Kai Lu
Ta-Ying Cheng
Niki Trigoni
Andrew Markham
LRM
40
7
0
18 Mar 2024
Learning Useful Representations of Recurrent Neural Network Weight Matrices
Vincent Herrmann
Francesco Faccio
Jürgen Schmidhuber
23
7
0
18 Mar 2024
Context-aware LLM-based Safe Control Against Latent Risks
Quang Khanh Luu
Xiyu Deng
Anh Van Ho
Yorie Nakahira
54
4
0
18 Mar 2024
ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models
Runyu Ma
Jelle Luijkx
Zlatan Ajanović
Jens Kober
LM&Ro
LRM
40
7
0
14 Mar 2024
Scaling Instructable Agents Across Many Simulated Worlds
SIMA Team
Maria Abi Raad
Arun Ahuja
Catarina Barros
F. Besse
...
Daan Wierstra
Duncan Williams
Nathaniel Wong
Sarah York
Nick Young
LM&Ro
115
38
0
13 Mar 2024
Human I/O: Towards a Unified Approach to Detecting Situational Impairments
Xingyu Bruce Liu
Jiahao Nick Li
David Kim
Xiang 'Anthony' Chen
Andrea Colaço
34
13
0
06 Mar 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
67
12
0
05 Mar 2024
SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos
Yulei Niu
Wenliang Guo
Long Chen
Xudong Lin
Shih-Fu Chang
52
9
0
03 Mar 2024
Learning with Language-Guided State Abstractions
Andi Peng
Ilia Sucholutsky
Belinda Z. Li
T. Sumers
Thomas L. Griffiths
Jacob Andreas
Julie A. Shah
LM&Ro
49
13
0
28 Feb 2024
PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models
Dingkun Guo
Yuqi Xiang
Shuqi Zhao
Xinghao Zhu
Masayoshi Tomizuka
Mingyu Ding
Wei Zhan
32
10
0
26 Feb 2024
Language Agents as Optimizable Graphs
Mingchen Zhuge
Wenyi Wang
Louis Kirsch
Francesco Faccio
Dmitrii Khizbullin
Jürgen Schmidhuber
LLMAG
29
19
0
26 Feb 2024
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
Yao Mu
Junting Chen
Qinglong Zhang
Shoufa Chen
Qiaojun Yu
...
Wenhai Wang
Jifeng Dai
Yu Qiao
Mingyu Ding
Ping Luo
42
21
0
25 Feb 2024
Selective "Selective Prediction": Reducing Unnecessary Abstention in Vision-Language Reasoning
Tejas Srinivasan
Jack Hessel
Tanmay Gupta
Bill Yuchen Lin
Yejin Choi
Jesse Thomason
Khyathi Raghavi Chandu
24
7
0
23 Feb 2024
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation
Junting Chen
Yao Mu
Qiaojun Yu
Tianming Wei
Silang Wu
...
Wenqi Shao
Yu Qiao
Huazhe Xu
Mingyu Ding
Ping Luo
LM&Ro
34
11
0
22 Feb 2024
Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models
Jinyi Liu
Yifu Yuan
Jianye Hao
Fei Ni
Lingzhi Fu
Yibin Chen
Yan Zheng
LM&Ro
118
4
0
22 Feb 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
43
29
0
20 Feb 2024
Modularized Networks for Few-shot Hateful Meme Detection
Rui Cao
Roy Ka-Wei Lee
Jing Jiang
35
4
0
19 Feb 2024
Learning to Learn Faster from Human Feedback with Language Model Predictive Control
Jacky Liang
Fei Xia
Wenhao Yu
Andy Zeng
Montse Gonzalez Arenas
...
N. Heess
Kanishka Rao
Nik Stewart
Jie Tan
Carolina Parada
LM&Ro
61
34
0
18 Feb 2024