Abstract
Content delivery networks and edge peering facilities
have unique operating constraints which require novel
approaches to load balancing. Unlike in traditional,
centralized datacenter networks, physical space at these
sites is heavily constrained. This limitation drives the need
for both greater efficiency, to maximize the ability to absorb
denial-of-service attacks and flash crowds at the edge, and
seamless failover, to minimize the impact of maintenance
on service availability.
This paper introduces Faild, a distributed load balancer
which runs on commodity hardware and achieves
graceful failover without relying on network state, providing
a cost-effective and scalable alternative to existing
proposals. Faild allows any individual component
of the edge network to be removed from service without
breaking existing connections, a property which has
proved instrumental in sustaining the growth of a large
global edge network over the past four years. Drawing on
this operational experience, we further document
unexpected protocol interactions, stemming from
misconfigured devices in the wild, which have significant
ramifications for transport protocol design.