2306.08129
AVIS: Autonomous Visual Information Seeking with Large Language Model Agent
13 June 2023
Ziniu Hu
Ahmet Iscen
Chen Sun
Kai-Wei Chang
Yizhou Sun
David A. Ross
Cordelia Schmid
Alireza Fathi
Papers citing
"AVIS: Autonomous Visual Information Seeking with Large Language Model Agent"
14 / 14 papers shown
Title
Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
Imad Eddine Toubal
Aditya Avinash
N. Alldrin
Jan Dlabal
Wenlei Zhou
...
Chun-Ta Lu
Howard Zhou
Ranjay Krishna
Ariel Fuxman
Tom Duerig
VLM
75
7
0
05 Mar 2024
Cross-modal Retrieval for Knowledge-based Visual Question Answering
Paul Lerner
Olivier Ferret
C. Guinaudeau
28
7
0
11 Jan 2024
V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs
Penghao Wu
Saining Xie
LRM
49
122
0
21 Dec 2023
Vamos: Versatile Action Models for Video Understanding
Shijie Wang
Qi Zhao
Minh Quan Do
Nakul Agarwal
Kwonjoon Lee
Chen Sun
27
19
0
22 Nov 2023
How to Bridge the Gap between Modalities: Survey on Multimodal Large Language Model
Shezheng Song
Xiaopeng Li
Shasha Li
Shan Zhao
Jie Yu
Jun Ma
Xiaoguang Mao
Weimin Zhang
66
4
0
10 Nov 2023
Towards Robust Multi-Modal Reasoning via Model Selection
Xiangyan Liu
Rongxue Li
Wei Ji
Tao Lin
LLMAG
LRM
22
3
0
12 Oct 2023
AvalonBench: Evaluating LLMs Playing the Game of Avalon
Jonathan Light
Min Cai
Sheng Shen
Ziniu Hu
LLMAG
15
0
0
08 Oct 2023
HallE-Control: Controlling Object Hallucination in Large Multimodal Models
Bohan Zhai
Shijia Yang
Chenfeng Xu
Sheng Shen
Kurt Keutzer
Chunyuan Li
Manling Li
MLLM
18
12
0
03 Oct 2023
Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities
Hexiang Hu
Yi Luan
Yang Chen
Urvashi Khandelwal
Mandar Joshi
Kenton Lee
Kristina Toutanova
Ming-Wei Chang
VLM
43
55
0
22 Feb 2023
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
233
2,470
0
06 Oct 2022
TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s
Felix Chern
Blake A. Hechtman
Andy Davis
Ruiqi Guo
David Majnemer
Surinder Kumar
94
22
0
28 Jun 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
An Empirical Study of GPT-3 for Few-Shot Knowledge-Based VQA
Zhengyuan Yang
Zhe Gan
Jianfeng Wang
Xiaowei Hu
Yumao Lu
Zicheng Liu
Lijuan Wang
169
402
0
10 Sep 2021