Rooted in the explosion of deep learning over the past decade, this thesis spans from AlphaGo to ChatGPT to empirically examine the fundamental concepts needed to realize the vision of an artificial scientist: a machine with the capacity to autonomously generate original research and contribute to the expansion of human knowledge. The investigation begins with Olivaw, an AlphaGo Zero-like agent that discovers Othello knowledge from scratch but is unable to communicate it. This realization leads to the development of the Explanatory Learning (EL) framework, a formalization of the problem faced by a scientist when trying to explain a new phenomenon to their peers. The effective EL prescriptions allow us to crack Zendo, a popular board game simulating the scientific endeavor. This success comes with a fundamental insight: an artificial scientist must develop its own interpretation of the language used to explain its findings, and not rely on a rigid existing interpreter. Questioning the very process of learning an interpreter, we turn our attention to the inner functioning of modern multimodal models. This culminates in a simple idea to build CLIP-like models where interpretation and perception are explicitly disentangled: a cost-effective approach that couples two unimodal models using little multimodal data and no further training. Finally, we discuss what ChatGPT and its siblings are still missing to become artificial scientists, and introduce the Big-Bench Symbol Interpretation Task, a benchmark about interpreting Zendo-like explanations that sees LLMs going no further than random chance while being instead fully solved by humans.
View on arXiv@article{norelli2025_2411.11672, title={ Artificial Scientific Discovery }, author={ Antonio Norelli }, journal={arXiv preprint arXiv:2411.11672}, year={ 2025 } }