Large language models (LLMs) have emerged as a powerful method for discovery. Instead of utilizing numerical data, LLMs utilize associated variable semantic metadata to predict variable relationships. Simultaneously, LLMs demonstrate impressive abilities to act as black-box optimizers when given an objective and a sequence of trials. We study LLMs at the intersection of these two capabilities by applying LLMs to the task of interactive graph discovery: given a ground truth graph G* capturing variable relationships and a budget of B edge experiments over R rounds, minimize the distance between the predicted graph Ĝ and G* at the end of the R-th round. To solve this task we propose IGDA, an LLM-based pipeline incorporating two key components: 1) an LLM uncertainty-driven method for edge experiment selection, and 2) a local graph update strategy utilizing binary feedback from experiments to improve predictions for unselected neighboring edges. Experiments on eight different real-world graphs show our approach often outperforms all baselines, including a state-of-the-art numerical method for interactive graph discovery. Further, we conduct a rigorous series of ablations dissecting the impact of each pipeline component. Finally, to assess the impact of memorization, we apply our interactive graph discovery strategy to a complex, new (as of July 2024) causal graph on protein transcription factors, finding strong performance in a setting where memorization is impossible. Overall, our results show IGDA to be a powerful method for graph discovery, complementary to existing numerically driven approaches.
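The interactive loop described above (select uncertain edges, run edge experiments, update the prediction, repeat for R rounds) can be sketched in plain Python. This is a minimal illustration, not the paper's method: the uncertainty scores and the neighbor-update rule below are simple placeholder heuristics standing in for IGDA's LLM-driven edge selection and local graph update components, and all names are hypothetical.

```python
def graph_distance(pred, true, nodes):
    # Hamming distance: count ordered node pairs where the predicted
    # and ground-truth graphs disagree on edge presence.
    return sum(
        ((u, v) in pred) != ((u, v) in true)
        for u in nodes for v in nodes if u != v
    )

def interactive_discovery(true_edges, pred_edges, uncertainty,
                          rounds, budget_per_round):
    """Sketch of the interactive loop: each round, spend part of the
    experiment budget on the most uncertain edges, receive binary
    feedback (edge present / absent), and update the prediction."""
    for _ in range(rounds):
        # 1) Pick the edges we are least sure about
        #    (placeholder for LLM uncertainty-driven selection).
        candidates = sorted(uncertainty, key=uncertainty.get, reverse=True)
        selected = candidates[:budget_per_round]
        for edge in selected:
            # 2) Edge experiment: binary feedback from the ground truth.
            if edge in true_edges:
                pred_edges.add(edge)
            else:
                pred_edges.discard(edge)
            uncertainty[edge] = 0.0
            # 3) Local update: lower uncertainty on edges sharing a node
            #    (placeholder for the paper's local graph update strategy).
            for other in uncertainty:
                if other != edge and set(other) & set(edge):
                    uncertainty[other] *= 0.5
    return pred_edges
```

With a total budget B = rounds × budget_per_round that covers every candidate edge, the sketch recovers the ground truth exactly; the interesting regime studied in the paper is B smaller than the number of edges, where the local update must correct unqueried edges.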
@article{havrilla2025_2502.17189,
  title={IGDA: Interactive Graph Discovery through Large Language Model Agents},
  author={Alex Havrilla and David Alvarez-Melis and Nicolo Fusi},
  journal={arXiv preprint arXiv:2502.17189},
  year={2025}
}