Modern cloud applications (e.g., publish-subscribe systems, streaming telemetry, and database and state-machine replication) frequently exhibit one-to-many communication patterns while requiring sub-millisecond latencies and high throughput. IP multicast can meet these requirements, but its control- and data-plane scalability limitations make it challenging to offer as a service to the hundreds of thousands of tenants typical of cloud environments. Tenants must therefore rely on unicast-based approaches (e.g., application-layer or overlay-based multicast) in their applications, which reduces throughput, increases end-host CPU utilization, and results in higher and less predictable latencies.
In this talk, we will present Elmo, a system that overcomes the data- and control-plane scalability limitations that pose a barrier to multicast deployment in public clouds. Our key insight is that emerging programmable switches and the unique characteristics of data-center topologies (namely, symmetry and the limited number of switches on any path) enable the use of efficient source-routed multicast in these cloud environments. In our approach, software switches (e.g., PISCES, a P4-enabled OVS) encode the multicast forwarding policy inside packets, which hardware switches (e.g., Barefoot Tofino) then process at line rate. Doing so relieves pressure on switching hardware resources (e.g., group tables) and reduces control-plane overhead during churn. Besides presenting the overall architecture of Elmo, we will discuss the software-switch optimizations that enable Elmo's design.
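As a rough illustration of the source-routed idea described above, the following Python sketch has a sender-side function pack a per-switch output-port bitmap into a packet header, and a switch-side function forward by reading only its own bitmap. This is a toy model under stated assumptions, not Elmo's actual header layout or encoding: the `Packet` class, the `encode_policy` and `forward` functions, the switch IDs, and the 8-port switch size are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Packet:
    payload: bytes
    # Hypothetical header: switch ID -> bitmap of output ports for this group.
    port_bitmaps: Dict[int, int] = field(default_factory=dict)


def encode_policy(payload: bytes, tree: Dict[int, Iterable[int]]) -> Packet:
    """Sender side (software switch in this toy model).

    `tree` maps each switch ID on the multicast tree to the output ports
    that switch should replicate the packet to.
    """
    bitmaps = {}
    for switch_id, ports in tree.items():
        bitmap = 0
        for port in ports:
            bitmap |= 1 << port  # set one bit per output port
        bitmaps[switch_id] = bitmap
    return Packet(payload, bitmaps)


def forward(switch_id: int, packet: Packet, num_ports: int = 8) -> List[int]:
    """Switch side: pick output ports from the header alone.

    No per-group table lookup is needed, because the forwarding state
    travels inside the packet.
    """
    bitmap = packet.port_bitmaps.get(switch_id, 0)
    return [p for p in range(num_ports) if bitmap & (1 << p)]


# Usage: switch 1 replicates to ports 0 and 3, switch 2 to port 1.
pkt = encode_policy(b"update", {1: [0, 3], 2: [1]})
print(forward(1, pkt))  # [0, 3]
print(forward(2, pkt))  # [1]
```

The point of the sketch is only that per-group state moves from switch tables into the packet; keeping that header compact enough to be practical is the harder problem, and nothing here attempts it.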
In a three-tier data-center topology with 27,000 hosts, our evaluation shows that Elmo can support a million multicast groups using a 325-byte packet header; with a 20 Gbps network interface card (NIC), software switches encapsulate packets with these headers while sustaining the line rate of the link; hardware switches require as few as 1,100 multicast group-table entries on average; and the traffic overhead remains within 5% of ideal multicast.