Multiagent Transition Systems with Safety and Liveness Faults: A Compositional Foundation for Fault-Resilient Distributed Computing

We present a novel mathematical framework for the specification and analysis of distributed computing systems and their implementations, with the following components: (1) Transition systems that allow the specification and analysis of computations with safety and liveness faults and their fault resilience. (2) Notions of safe, live and complete implementations among transition systems and their composition, with which the correctness (safety and liveness) and completeness of a protocol stack as a whole follows from each protocol implementing correctly and completely the protocol above it in the stack. (2) Applying the notion of monotonicity, pertinent to histories of distributed computing systems, to ease the specification and proof of correctness of implementations among distributed computing systems. (4) Multiagent transition systems, further characterized as centralized/distributed and synchronous/asynchronous; safety and liveness fault-resilience of implementations among them and their composition. (5) An algebraic and operational characterization of a protocol (family of distributed multiagent transition systems) being grassroots, which means that the protocol may be deployed independently at multiple locations and over time, and that such deployments can subsequently interoperate once interconnected; sufficient conditions for a protocol to be grassroots; and the notion of a grassroots implementation among protocols. The framework is being employed in the specification of a grassroots ordering consensus protocol stack, with sovereign cryptocurrencies, an NFT trade network, and an efficient Byzantine atomic broadcast protocol as the first applications.
View on arXiv