Program Semantic Inequivalence Game with Large Language Models

2 May 2025

Antonio Valerio Miceli-Barone

Vaishak Belle

Ali Payani

LRM


Main: 8 pages, Bibliography: 4 pages, Appendix: 8 pages; 4 figures, 4 tables

Abstract

Large Language Models (LLMs) can achieve strong performance on everyday coding tasks, but they can fail on complex tasks that require non-trivial reasoning about program semantics. Finding training examples to teach LLMs to solve these tasks can be challenging.
