Written by Asterfusion
Whether it’s a conventional data center or a modern cloud-based one, a shared necessity exists: high availability.
At the same time, both face a common challenge: single points of failure in the network. Physical networks have traditionally used LAG (Link Aggregation) technology to build highly reliable connections between devices, ensuring high availability of the links between servers and switches or between switches themselves. However, LAG has its limits: it protects against link failures, but it cannot eliminate the risk of a device failure.
Consequently, there is a need for a technology that expands the single device-to-device connection into connections to multiple devices, guaranteeing both multi-device redundancy and multi-link redundancy, while still ensuring reliable communication between the devices. Enter MC-LAG (Multi-Chassis Link Aggregation), the focus of this article. Because MC-LAG is an extension of LAG, it helps to understand how LAG works before turning to the principles of MC-LAG.
A Link Aggregation Group (LAG), often referred to as a Port Channel, combines multiple physical links into a single logical port at the data link layer. This increases bandwidth and adds link redundancy. The illustration below shows the physical topology when LAG is used to connect servers and switches.
In order to facilitate dynamic aggregation of physical links, Link Aggregation requires the implementation of LACP (Link Aggregation Control Protocol) between two devices for seamless negotiation.
LACP (Link Aggregation Control Protocol) is part of the IEEE 802.3ad standard (since carried forward in IEEE 802.1AX) and provides a way to manage the bundling of several physical ports into one logical channel. It lets a network device form link bundles automatically by exchanging LACP packets with the device at the other end.
An aggregated link provides multi-link redundancy and load balancing by hashing packet header fields, such as the 5-tuple (source and destination IP address, protocol, and source and destination port), to pick a member link for each flow. As a result, the bandwidth of all links can be used. Once the aggregated link is established, LACP maintains the link state and adjusts or dissolves the aggregated link if the aggregation conditions change.
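The hash-based link selection described above can be sketched in a few lines of Python. This is only an illustration, not how any particular switch or NIC driver implements it: the `FiveTuple` type, the `pick_member_link` function, the interface names, and the use of CRC32 as the hash are all assumptions made for the example, and real devices use their own hardware hash functions.

```python
import zlib
from typing import NamedTuple, Sequence


class FiveTuple(NamedTuple):
    """Header fields that identify a flow (hypothetical helper type)."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def pick_member_link(flow: FiveTuple, active_links: Sequence[str]) -> str:
    """Map a flow to one active member link of the aggregated group."""
    if not active_links:
        raise ValueError("no active member links in the LAG")
    key = "|".join(str(value) for value in flow).encode()
    # Assumption: CRC32 stands in for the vendor-specific hash.
    return active_links[zlib.crc32(key) % len(active_links)]


flow = FiveTuple("10.0.0.1", "10.0.0.2", 6, 49152, 443)
links = ["eth1", "eth2"]

first_choice = pick_member_link(flow, links)
# The same flow always hashes to the same link, which keeps its packets in order.
assert pick_member_link(flow, links) == first_choice

# If that link fails, the flow is rehashed onto the remaining member link.
remaining = [link for link in links if link != first_choice]
print(first_choice, "->", pick_member_link(flow, remaining))
```

Hashing per flow rather than per packet is what lets all links carry traffic without reordering packets inside a single connection.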
LAG provides redundancy between a single pair of devices: data keeps flowing even if a physical link in the aggregated group goes down. However, it does not address data path interruptions caused by a switch going down. MC-LAG was developed to solve this problem.
MC-LAG extends the original link aggregation technology from one-to-one devices to one-to-many devices, as depicted in the illustration below.
With node redundancy in place, traffic is load-balanced between the two switches using a hash algorithm. MC-LAG also has a built-in anti-loop mechanism, which removes the need for STP or for Layer 3 routing and forwarding configuration and so simplifies network configuration.
MC-LAG relies on the working mechanism of LACP to achieve cross-device link redundancy. When the server negotiates with the two switches, the switches need to present themselves as a single device, just as in the ordinary LAG scenario. To create this illusion, the system IDs carried in the LACP packets on the cross-device redundant links must match; that is, the system ID SwitchA uses when negotiating with the server must be identical to the one SwitchB uses. This is the foundation of MC-LAG's ability to implement cross-device link aggregation.
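To see why the matching system ID matters, consider a simplified view from the server's side. The sketch below is not a real LACP implementation: `PartnerInfo`, `group_ports_by_partner`, the port names, and the example system IDs and key are hypothetical, and the full LACP state machines are omitted. It only models the rule that ports can join the same aggregation when the partner information they receive is the same.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class PartnerInfo(NamedTuple):
    """What the server learns about the switch side from received LACP packets."""
    system_id: str  # system ID advertised by the switch
    key: int        # aggregation key advertised for the port


def group_ports_by_partner(
    ports: Dict[str, PartnerInfo],
) -> Dict[Tuple[str, int], List[str]]:
    """Group server ports that see the same partner system ID and key."""
    groups: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for port, partner in ports.items():
        groups[(partner.system_id, partner.key)].append(port)
    return dict(groups)


# Two independent switches advertise different system IDs:
# the server cannot bundle the two ports into one LAG.
independent = {
    "nic0": PartnerInfo("00:00:00:00:00:0a", 10),  # seen from SwitchA
    "nic1": PartnerInfo("00:00:00:00:00:0b", 10),  # seen from SwitchB
}
print(len(group_ports_by_partner(independent)))  # prints 2

# An MC-LAG pair advertises one shared system ID:
# both ports land in the same group and can form a single LAG.
mclag_pair = {
    "nic0": PartnerInfo("00:00:00:00:00:01", 10),  # seen from SwitchA
    "nic1": PartnerInfo("00:00:00:00:00:01", 10),  # seen from SwitchB
}
print(len(group_ports_by_partner(mclag_pair)))  # prints 1
```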
As depicted in the diagram below, ServerA and ServerB are connected to the access switches via MC-LAG. Under normal circumstances, traffic sent from ServerA to ServerB follows the forwarding path indicated by the green line, which by default passes through SwitchA.
If the link between ServerB and SwitchA fails, whether because of the physical port on the ServerB side or the physical port on the SwitchA side, the forwarding path of the traffic from ServerA to ServerB changes to the one represented by the brown line.
If SwitchA encounters a device failure, SwitchB takes over the forwarding of traffic from ServerA to ServerB, following the route demonstrated by the red line.
If a PeerLink fault occurs, the traffic sent from ServerA to ServerB remains forwarded via the green path.
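The four scenarios above can be summarized as a small decision function. This is a sketch for illustration only: the `Fault` enum and `forwarding_path` function are hypothetical names, only single faults are modeled, and the detour over the peer link in the link-failure case is an assumption about what the brown line shows, since the text does not spell out the hops.

```python
from enum import Enum
from typing import List


class Fault(Enum):
    """Single-fault scenarios from the text (hypothetical names)."""
    NONE = "none"
    SWITCH_A_TO_SERVER_B_LINK = "link between SwitchA and ServerB down"
    SWITCH_A_DOWN = "SwitchA device failure"
    PEER_LINK = "PeerLink fault"


def forwarding_path(fault: Fault) -> List[str]:
    """Return the hops for ServerA -> ServerB traffic under one fault."""
    if fault is Fault.SWITCH_A_DOWN:
        # SwitchB takes over forwarding entirely.
        return ["ServerA", "SwitchB", "ServerB"]
    if fault is Fault.SWITCH_A_TO_SERVER_B_LINK:
        # Assumption: SwitchA can no longer reach ServerB directly,
        # so it hands the traffic to SwitchB over the peer link.
        return ["ServerA", "SwitchA", "SwitchB", "ServerB"]
    # No fault, or a PeerLink fault: the default path via SwitchA is unaffected.
    return ["ServerA", "SwitchA", "ServerB"]


for fault in Fault:
    print(f"{fault.value}: {' -> '.join(forwarding_path(fault))}")
```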
Networking Solution 2: Establishing MC-LAG at the Access Layer
The same MC-LAG technology applies where a server's dual network cards require active-active access. Both NICs of the server are active and share one MAC address, and they share load on a per-flow basis. Therefore, configure the switch ports connected to the server as member ports of the same MC-LAG; the MAC and ARP entries of the two ports are then synchronized in real time.
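The real-time synchronization of MAC and ARP entries can be pictured with the following minimal sketch. It is a toy model under stated assumptions, not the behavior of any specific switch software: the `McLagSwitch` class, its `pair_with` and `learn` methods, and the group name `mclag1` are hypothetical, and the actual synchronization channel between the peers is reduced to a direct method call.

```python
from typing import Dict, Optional


class McLagSwitch:
    """Toy model of one switch in an MC-LAG pair (hypothetical class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.mac_table: Dict[str, str] = {}  # MAC address -> MC-LAG group
        self.arp_table: Dict[str, str] = {}  # IP address -> MAC address
        self.peer: Optional["McLagSwitch"] = None

    def pair_with(self, other: "McLagSwitch") -> None:
        """Make the two switches MC-LAG peers of each other."""
        self.peer = other
        other.peer = self

    def learn(self, mac: str, ip: str, group: str, from_peer: bool = False) -> None:
        """Record an entry and, if learned locally, push it to the peer."""
        self.mac_table[mac] = group
        self.arp_table[ip] = mac
        if self.peer is not None and not from_peer:
            # from_peer=True stops the peer from echoing the entry back.
            self.peer.learn(mac, ip, group, from_peer=True)


switch_a = McLagSwitch("SwitchA")
switch_b = McLagSwitch("SwitchB")
switch_a.pair_with(switch_b)

# The server's flow happens to hash to SwitchA, which learns the entry first.
switch_a.learn("aa:bb:cc:dd:ee:01", "10.0.0.10", "mclag1")

# SwitchB now holds the same entries without having seen the server's traffic.
assert switch_b.mac_table == switch_a.mac_table
assert switch_b.arp_table == switch_a.arp_table
```

Because the server shares load per flow, either switch may be the first to see a given address, so entries learned on one member port need to be usable on the other.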
As a cross-device link aggregation technology, MC-LAG (also written M-LAG) not only increases bandwidth, improves link reliability, and shares load across links, but also brings faster failover and greater network stability.
Overall, by providing a more flexible and scalable option for link aggregation, MC-LAG can help organizations build more efficient and reliable networks.
Asterfusion Networks is the leading provider of open networking infrastructure solutions. We provide an open, disaggregated, and highly programmable network fabric for next generation data centers and campus with white-box switching.