Kin-Yip Liu – Director, Customer Solutions Architecture – Cavium Networks
When describing packet processing applications, data plane processing is typically referred to as the fast path, while control plane, management, and exception processing are referred to as the slow path. As the name implies, the fast path must be highly optimized by removing unnecessary overhead, utilizing multiple cores, and applying HW acceleration. Since fast path processing typically does not require many OS services, an effective way to minimize overhead is to run the fast path cores without an OS. This approach is often called bare-metal. Although bare-metal may look like a purely SW architecture choice, its benefits cannot be realized without proper HW support. Most multicore processors on the market today do not offer the required HW support. In other words, most multicore processors today cannot support a bare-metal fast path optimally.
What kind of HW support can enable optimal bare-metal fast path processing?
At a minimum, fast path processing tasks need to be scheduled for execution on multiple cores. Memory management support is also a fundamental requirement. Synchronization among the cores needs to be supported efficiently, ideally without using SW locks. HW acceleration engines need to be accessible through a well-defined API. Ideally, the HW support for scheduling and synchronization understands packet flows and their ordering and dependencies, as well as the QoS levels of the processing tasks.
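To make the idea of flow-aware scheduling concrete, the following is a minimal SW sketch of the decision such a scheduler has to make: map every packet of a flow to the same core, so that per-flow ordering is preserved and per-flow state needs no lock. It is a stand-in for illustration only, not a description of how any particular processor implements scheduling. The `flow_key` layout, the FNV-1a hash, and the function names are all assumptions introduced here.

```c
#include <stdint.h>

/* Hypothetical 5-tuple identifying a packet flow (field names are illustrative). */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  protocol;
};

/* Mix one 32-bit value into an FNV-1a hash, one byte at a time. */
uint32_t fnv1a_mix(uint32_t hash, uint32_t value)
{
    for (int i = 0; i < 4; i++) {
        hash ^= (value >> (8 * i)) & 0xffu;
        hash *= 16777619u;              /* 32-bit FNV prime */
    }
    return hash;
}

/* Hash the fields individually so struct padding never affects the result. */
uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 2166136261u;           /* 32-bit FNV offset basis */
    h = fnv1a_mix(h, k->src_ip);
    h = fnv1a_mix(h, k->dst_ip);
    h = fnv1a_mix(h, ((uint32_t)k->src_port << 16) | k->dst_port);
    h = fnv1a_mix(h, k->protocol);
    return h;
}

/* Pick the core that owns this flow. num_cores must be greater than zero. */
unsigned pick_core(const struct flow_key *k, unsigned num_cores)
{
    return flow_hash(k) % num_cores;
}
```

Static hashing of this kind keeps a flow in order but cannot rebalance load or honor QoS levels on its own, which is part of why scheduling support in HW is attractive.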
Cavium Networks’ OCTEON processor families have supported all of the above features in HW for three generations, and in every single processor SKU. As such, OCTEON processors are built to excel in bare-metal fast path processing. These OCTEON HW features are complemented by a SW library to form a fast path run-time environment called Simple Executive.
What about other multicore processors that do not support these HW features?
Some of these processors do not support the bare-metal approach at all. Others claim to support it through a proprietary kernel that has been optimized by removing the irrelevant features of a full-blown kernel. However, without comprehensive HW support, such bare-metal approaches cannot be highly optimized. This is because synchronizing and scheduling processing across multiple cores inherently incurs significant overhead, and that overhead cannot be minimized without proper HW support.
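As a rough illustration of where this overhead comes from, consider the simplest piece of shared state, a packet counter. The sketch below shows a common SW workaround: give each core a private, cache-line-padded counter so the fast path never takes a lock or contends for a cache line. The core count, the 64-byte cache line size, and all names are assumptions made for this example.

```c
#include <stdint.h>

#define NUM_CORES  4    /* illustrative core count */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One counter per core. The padding keeps consecutive counters
 * 64 bytes apart, so no two counters fall in the same cache line. */
struct core_stats {
    uint64_t packets;
    char pad[CACHE_LINE - sizeof(uint64_t)];
};

/* Fast path: called only by the owning core, so a plain increment
 * is enough and no lock or atomic operation is needed. */
void count_packet(struct core_stats *stats, unsigned core_id)
{
    stats[core_id].packets++;
}

/* Slow path: sum every core's private counter. In a real concurrent
 * system this read would need atomic or volatile access, and the
 * total may be slightly stale. */
uint64_t total_packets(const struct core_stats *stats)
{
    uint64_t total = 0;
    for (unsigned i = 0; i < NUM_CORES; i++)
        total += stats[i].packets;
    return total;
}
```

This trick only works for state that can be partitioned per core. State that is inherently shared across cores, such as the ordering of packets within a flow, still requires coordination, and that is the overhead SW alone cannot remove.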