Can Large Language Models Understand Intermediate Representations?

Intermediate Representations (IRs) are essential in compiler design and program analysis, yet their comprehension by Large Language Models (LLMs) remains underexplored. This paper presents a pioneering empirical study to investigate the capabilities of LLMs, including GPT-4, GPT-3, Gemma 2, LLaMA 3.1, and Code Llama, in understanding IRs. We analyze their performance across four tasks: Control Flow Graph (CFG) reconstruction, decompilation, code summarization, and execution reasoning. Our results indicate that while LLMs demonstrate competence in parsing IR syntax and recognizing high-level structures, they struggle with control flow reasoning, execution semantics, and loop handling. Specifically, they often misinterpret branching instructions, omit critical IR operations, and rely on heuristic-based reasoning, leading to errors in CFG reconstruction, IR decompilation, and execution reasoning. The study underscores the necessity for IR-specific enhancements in LLMs, recommending fine-tuning on structured IR datasets and integration of explicit control flow models to augment their comprehension and handling of IR-related tasks.
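To make the CFG reconstruction task concrete, the following is a minimal illustrative sketch (not taken from the paper): given LLVM-style IR text, the model is expected to recover the edges between basic blocks from block labels and branch targets. The IR snippet and the parsing logic here are assumptions for illustration only.

import re

IR = """
entry:
  %cmp = icmp slt i32 %n, 1
  br i1 %cmp, label %base, label %loop

loop:
  %i = phi i32 [ 1, %entry ], [ %next, %loop ]
  %next = add i32 %i, 1
  %done = icmp sge i32 %next, %n
  br i1 %done, label %exit, label %loop

base:
  br label %exit

exit:
  ret i32 0
"""

def reconstruct_cfg(ir_text):
    """Map each basic-block label to the labels it can branch to."""
    cfg, current = {}, None
    for line in ir_text.splitlines():
        line = line.strip()
        label = re.match(r"^([\w.]+):$", line)
        if label:
            current = label.group(1)
            cfg.setdefault(current, [])
        elif current and line.startswith("br"):
            # Successors are the operands written as "label %target".
            cfg[current] = re.findall(r"label %([\w.]+)", line)
        elif current and line.startswith("ret"):
            cfg[current] = []  # function exit: no successors
    return cfg

if __name__ == "__main__":
    for block, succs in reconstruct_cfg(IR).items():
        print(block, "->", succs)

A deterministic parser like this recovers the CFG trivially; the study's point is that LLMs asked to perform the same mapping from raw IR text often misread branch targets and loop back-edges, which is where their errors concentrate.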
@article{jiang2025_2502.06854,
  title   = {Can Large Language Models Understand Intermediate Representations?},
  author  = {Hailong Jiang and Jianfeng Zhu and Yao Wan and Bo Fang and Hongyu Zhang and Ruoming Jin and Qiang Guan},
  journal = {arXiv preprint arXiv:2502.06854},
  year    = {2025}
}