
Benchmarking graph construction by large language models for coherence-driven inference

Main: 10 pages
16 figures
6 tables
Appendix: 22 pages
Abstract

We devise an algorithm to generate propositions that objectively instantiate graphs supporting coherence-driven inference. We also benchmark the ability of large language models (LLMs) to reconstruct coherence graphs from (a simple transformation of) propositions expressed in natural language, with promising results from a single prompt to reasoning-optimized LLMs. For example, o1/3/4-mini achieve perfect reconstruction half of the time on sparse graphs. Coherence-driven inference on consistency evaluations by LLMs may advance machine cognition capabilities.
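To make the setting concrete, here is a minimal sketch of coherence-driven inference in the standard constraint-satisfaction formulation (propositions as nodes, positive edges for coherence, negative edges for incoherence, and an accept/reject labeling chosen to maximize satisfied constraints). This is an illustrative toy, not the paper's benchmark algorithm; the function names and the brute-force search are assumptions for exposition.

```python
from itertools import product

def coherence(assignment, edges):
    # Sum the weights of satisfied constraints: a positive edge is
    # satisfied when both propositions get the same truth value, a
    # negative edge when they differ.
    total = 0
    for u, v, w in edges:
        same = assignment[u] == assignment[v]
        if (w > 0 and same) or (w < 0 and not same):
            total += abs(w)
    return total

def best_assignments(nodes, edges):
    # Brute-force search over all accept/reject labelings
    # (exponential, but fine for toy graphs).
    best, argmax = -1, []
    for bits in product([True, False], repeat=len(nodes)):
        assignment = dict(zip(nodes, bits))
        score = coherence(assignment, edges)
        if score > best:
            best, argmax = score, [assignment]
        elif score == best:
            argmax.append(assignment)
    return best, argmax

# Toy graph: p coheres with q; r is incoherent with both.
nodes = ["p", "q", "r"]
edges = [("p", "q", +1), ("q", "r", -1), ("p", "r", -1)]
score, assignments = best_assignments(nodes, edges)
# All three constraints are satisfiable: accept {p, q} and reject r
# (or the mirror-image labeling), giving score 3.
```

Reconstructing such a graph from natural-language propositions, as benchmarked in the paper, amounts to recovering the signed edge set that a labeling like this optimizes over.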
