By Yann Rapaport, 6WIND Customer Support and Service Manager
This post is the first of a series of seven that will discuss High Availability (HA) capabilities for packet processing software. The first five posts will describe architecture concepts and the last two will illustrate these concepts using a real-world example.
Let’s start with the analysis of the system requirements.
A multicore-based system provides huge processing capabilities. A single failure in this kind of equipment can affect a very large number of users with long service interruptions, which is unacceptable. Multicore packet processing software therefore needs to implement specific mechanisms to provide HA-ready solutions.
A High Availability architecture relies on spare elements that are inactive, that is, not in operational use. The goal of the system is to replace a failing active element with an inactive one and to restore the expected level of service within the shortest possible time. Several strategies can be implemented, depending on how much service interruption is acceptable.
In the simplest strategy, once a failure has been detected, an inactive element is configured to replace the failing one. This means that the whole configuration has to be restored and the system has to provide complete information to the new element before the service can restart. If we take routing as an example, the routing protocols have to be configured on the new element and then have to complete the route learning process before the service is available. This can take a long time (several minutes), which is not compatible with high availability requirements.
To avoid such a long interruption of service, a more sophisticated architecture can be implemented based on a “1+1” architecture. As described in the following figure, a pair of elements (one active, one inactive) is used. A dedicated process maintains a coherent view of the system in both elements by synchronizing the required information between them. In case of a failure of the active element, the inactive one has all the information ready to provide the expected level of service within a very short period of time. If we take synchronization between routing protocols as an example, the inactive element receives all the routing table updates from the active element, ensuring that it has exactly the same level of information. It should be noted that each Control Plane protocol (ARP/NDP, IPsec, NAT, firewall…) manages its own specific information, so a dedicated synchronization mechanism is required for each of them.
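The route synchronization example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ControlPlane`, `RouteSync` and their methods are hypothetical names, not part of 6WINDGate or of any routing stack, and a real implementation would replicate updates over a reliable channel rather than through in-process calls.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    """One element of the 1+1 pair (hypothetical model, not a 6WINDGate API)."""
    name: str
    active: bool = False
    routes: dict = field(default_factory=dict)  # prefix -> next hop


class RouteSync:
    """Replicates every routing table update from the active element to the inactive one."""

    def __init__(self, active: ControlPlane, standby: ControlPlane):
        self.active = active
        self.standby = standby

    def add_route(self, prefix: str, next_hop: str) -> None:
        self.active.routes[prefix] = next_hop
        self.standby.routes[prefix] = next_hop  # replicate the update

    def del_route(self, prefix: str) -> None:
        self.active.routes.pop(prefix, None)
        self.standby.routes.pop(prefix, None)   # replicate the withdrawal

    def failover(self) -> ControlPlane:
        """Swap roles; the new active element already holds the routes."""
        self.active.active, self.standby.active = False, True
        self.active, self.standby = self.standby, self.active
        return self.active


cp_a = ControlPlane("cp-a", active=True)
cp_b = ControlPlane("cp-b")
sync = RouteSync(cp_a, cp_b)
sync.add_route("10.0.0.0/8", "192.0.2.1")
new_active = sync.failover()   # cp-b takes over without relearning routes
```

Because each Control Plane protocol manages its own information, a similar replication path would be needed per protocol; the sketch covers routes only.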
Besides synchronization mechanisms, the packet processing architecture also has to provide monitoring services and graceful restart capabilities.
Monitoring services periodically check the health of the packet processing components to detect possible issues, in order to prevent a complete shutdown or to anticipate switching from the active to the inactive element. These services alert the HA framework that supervises the whole system. The HA framework then decides whether to re-launch a software component or to partially or totally reboot the system.
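As a rough sketch of this split between monitoring and decision making, the Python below models a heartbeat-based monitor that alerts an HA framework. The class names, the heartbeat timeout and the escalation policy (relaunch a component a limited number of times, then reboot) are all assumptions made for illustration; the post does not specify how the decision is taken.

```python
class HAFramework:
    """Receives alerts and picks a recovery action (hypothetical policy)."""

    def __init__(self, max_relaunches: int = 2):
        self.max_relaunches = max_relaunches
        self.relaunches = {}  # component name -> relaunch count so far

    def on_alert(self, component: str) -> str:
        count = self.relaunches.get(component, 0)
        if count < self.max_relaunches:
            self.relaunches[component] = count + 1
            return "relaunch"
        return "reboot"  # assumed escalation once relaunches are exhausted


class Monitor:
    """Flags a component when its last heartbeat is older than `timeout` seconds."""

    def __init__(self, framework: HAFramework, timeout: float):
        self.framework = framework
        self.timeout = timeout
        self.last_seen = {}  # component name -> time of last heartbeat

    def heartbeat(self, component: str, now: float) -> None:
        self.last_seen[component] = now

    def check(self, now: float) -> dict:
        """Return {component: action} for every component past its deadline."""
        actions = {}
        for component, seen in self.last_seen.items():
            if now - seen > self.timeout:
                actions[component] = self.framework.on_alert(component)
        return actions


framework = HAFramework(max_relaunches=1)
monitor = Monitor(framework, timeout=3.0)
monitor.heartbeat("fast-path", now=0.0)
first = monitor.check(now=4.0)    # heartbeat missed: framework asks for a relaunch
second = monitor.check(now=5.0)   # still silent: framework escalates to a reboot
```

Time is passed in explicitly so the behaviour is deterministic; a real monitor would run `check` from a periodic timer.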
Graceful restart makes it possible to restart packet processing software components without interrupting the traffic. Each key software component must support it. If a software component does not maintain internal state, it only has to be stopped and restarted. More complex protocols require specific mechanisms; some of these, such as graceful restart for the OSPF routing protocol, have been standardized.
Finally, it is very valuable to use the system’s redundancy to enhance the availability of the equipment. A “1+1” architecture provides redundant interfaces, so the equipment can be connected to the network through several physical paths. If we refer to the above figure, it means that the inactive Fast Path is now active.
Each part of the architecture will be detailed in further posts.
More information about 6WINDGate architecture can be found here.
6WINDGate High Availability Architecture Overview is available here.
You can check 6WINDGate FAQ here.



5 Comments
Hello Yann
In the 1+1 architecture, if your inactive CP gets its information from the active CP, I think it will be difficult to know whether some information was lost due to a hardware failure of the active CP. Perhaps a better way would be to have the same inputs on both CPs and to process the data on both CPs, with synchronization of the processing and the outputs. In case of failure, the inactive CP just has to finish processing the last inputs and output the results to be coherent with the next state of the broken active CP.
Bruno
It can work for some protocols that rely on stateless communications with their peers. But many protocols require duplex exchanges, so duplicating the data would still create its own set of issues.
In order to handle failures of an active CP, the HA mechanisms have to be used along with graceful restart (GR). Then, when there is a planned or unplanned outage and a switch of activity is needed, the protocols have to execute their GR to recover any loss of data that the peers assume has already been provided.
The GRs are part of the protocols (see OSPF GR for instance).
- Vincent
I would also like to add that multicasting the data to both active and standby has some reliability consequences. The biggest source of failures is usually a software failure triggered by an external event, such as incoming data. If the same data is sent to both the active and standby components, the probability of a simultaneous failure of both is much higher. This is why one of the critical rules for a highly available system is to make sure that the data processing on active and standby is different. From that point of view, it is better to process data on the active only, create the required states (if any) and share these states with the standby. The only important addition to this mechanism is that you have to update the standby state BEFORE you update the active state, to ensure that no state is lost if a failure occurs at any point in time.
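The standby-first ordering suggested in this comment can be illustrated with a minimal Python sketch. Everything here is hypothetical (the `Replica` class, the `update_state` helper, the NAT-style keys and the simulated crash flag); it only shows why writing to the standby before the active means the standby never lags behind.

```python
class Replica:
    """Holds protocol state for one element (hypothetical model)."""

    def __init__(self, name: str):
        self.name = name
        self.state = {}


class ActiveCrashed(Exception):
    """Raised to simulate the active element failing in the middle of an update."""


def update_state(active: Replica, standby: Replica, key, value,
                 crash_before_active: bool = False) -> None:
    """Apply one state change to the standby first, then to the active."""
    standby.state[key] = value        # step 1: the standby learns the state
    if crash_before_active:           # simulated failure between the two steps
        raise ActiveCrashed(active.name)
    active.state[key] = value         # step 2: the active commits it


active, standby = Replica("active"), Replica("standby")
update_state(active, standby, "nat:10.0.0.5:1234", "203.0.113.7:40001")
try:
    update_state(active, standby, "nat:10.0.0.6:80", "203.0.113.7:40002",
                 crash_before_active=True)
except ActiveCrashed:
    pass  # the standby already holds the entry the active never committed
```

With the opposite order, a crash between the two steps would leave the active holding state that the standby never saw.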
Hi
You have mentioned the 1+1 redundancy model as a whole. But some routing/switching protocols (OSPF, for example) have graceful restart as an inherent feature. How do the two go hand in hand in terms of manageability, and how are they all aligned under the same HA umbrella?
Moreover, checkpointing and maintaining consistent data for multiple VRFs may amount to a huge configuration (a large chunk of persistent storage) for stateful/stateless modules, and these overhead procedures may eat up the advantage we are trying to achieve with HA.
I would love to see some facts or empirical data, on a reference platform, for failovers leading to a seamless transition.
Hi Saumya,
The redundancy model also works hand in hand with some protocol extensions (BGP GR, OSPF GR): when the inactive blade becomes active, its OSPF (for instance) runs an OSPF GR in order to refresh and get the current OSPF LSAs. In the meantime, the FIB does not flap, thanks to the route synchronization of the active/inactive model of the Control Planes.
In the case of VRFs, when there are hundreds or thousands of VRs, please read:
http://www.multicorepacketprocessing.com/implementing-virtual-routing-on-multicore-cpus-part-22/
You can distribute the activity of the VRs over multiple CPs. It means that one CP will be active for a specific VR X and inactive for VR Y, while another CP will be inactive for VR X and active for VR Y. This improves the distribution of the workload.
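One simple way to picture this distribution is a round-robin assignment of active and inactive roles per VR. The Python below is only a sketch: `assign_vrs` and the CP names are hypothetical, and round-robin is an assumed policy, not necessarily how 6WINDGate places VRs.

```python
def assign_vrs(vr_ids, cps):
    """Map each VR to an (active CP, inactive CP) pair, round-robin over the CPs."""
    if len(cps) < 2:
        raise ValueError("at least two CPs are needed for a 1+1 pair")
    plan = {}
    for i, vr in enumerate(vr_ids):
        active = cps[i % len(cps)]
        inactive = cps[(i + 1) % len(cps)]  # the next CP backs up this VR
        plan[vr] = (active, inactive)
    return plan


plan = assign_vrs(["X", "Y"], ["cp1", "cp2"])
# cp1 is active for VR X and inactive for VR Y; cp2 is the reverse.
```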
To see and experiment with data, I suggest that you “play” with 6WINDGate running in a cluster of QEMU instances that simulate a set of ATCA boards; or, even better, if you have a cluster of multicore CPUs, you can run it there and measure real numbers.