On the Resilience of Fast Failover Routing Against Dynamic Link Failures

Modern communication networks feature local fast failover mechanisms in the data plane to respond swiftly to link failures with pre-installed rerouting rules. This paper explores resilient routing meant to tolerate simultaneous link failures, ensuring packet delivery provided that the source and destination remain connected. While past theoretical works studied failover routing under static link failures, i.e., links which permanently and simultaneously fail, real-world networks often face link flapping: dynamic down states caused by, e.g., numerous short-lived software-related faults. Thus, in this initial work, we re-investigate the resilience of failover routing against link flapping by categorizing link failures into static, semi-dynamic (removing the assumption that links fail simultaneously), and dynamic (removing the assumption that links fail permanently) types, shedding light on the capabilities and limitations of failover routing under these scenarios. We show that -edge-connected graphs exhibit -resilient routing against dynamic failures for . We further show that this result extends to arbitrary if it is possible to rewrite bits in the packet header. Rewriting bits suffices to cope with semi-dynamic failures. However, on general graphs, tolerating dynamic failures becomes impossible without bit-rewriting. Even by rewriting bits, resilient routing cannot resolve dynamic failures, demonstrating the limitation of local fast rerouting.