Preface
This Compute Backend Fabric Configuration Guide describes the standardized networking solution for a small-scale AI computing backend fabric and provides configuration and maintenance guidance. The solution implements a single-tier Clos network using Asterfusion data center switches, based on the Rail-only architecture.
Target Audience
Intended for solution planners, designers, and on-site implementation engineers who are familiar with:
- Asterfusion data center switches
- RoCE, PFC, ECN, and related technologies
1 Overview
The Rail-only architecture is well suited to small-scale AI backend fabrics.
As shown in the figure above, the Rail-only architecture adopts a single-tier network design that physically partitions the entire cluster network into 8 independent rails. GPUs on different nodes communicate within a rail, so traffic reaches its destination in a single hop.
Compared to the traditional Clos architecture, the Rail-only architecture eliminates the Spine layer. With fewer network tiers, it requires fewer switches and optical modules, which reduces hardware costs. It is a low-cost, high-performance network architecture tailored for large AI model training in small-scale compute clusters.
2 Typical Configuration Example
2.1 Network Topology
This example illustrates an AI cluster consisting of 32 compute nodes (128 GPUs total, 4 per server), with 4 CX732Q-N switches deployed as Leaf nodes. The key design principles are summarized as follows:
- Each GPU connects to a dedicated NIC, and NICs follow the “NIC N to Leaf N” rule. Each Rail uses an independent subnet.
- Single-tier Clos architecture.
- Easy RoCE enabled on Leaf switches.
The Gateway VLAN IP address planning is as follows:
Table 1: Gateway VLAN IP Address Planning
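The addressing pattern used by the sample configuration files in the appendix can be sketched in a few lines of Python. This is a minimal sketch, assuming the four rail subnets are consecutive /26 blocks of 10.10.1.0/24 and that each gateway takes the first usable address of its subnet, which is what the appendix shows. The names `PARENT`, `BASE_VLAN`, and `gateway_plan` are illustrative and not part of any Asterfusion tool.

```python
import ipaddress

# Assumption: rail subnets are consecutive /26 blocks of 10.10.1.0/24,
# matching the gateway addresses in the appendix sample files.
PARENT = ipaddress.ip_network("10.10.1.0/24")
BASE_VLAN = 100  # VLAN 101..104 for rails 1..4, as in the appendix


def gateway_plan(rails=4):
    """Return one (leaf, vlan_id, gateway/prefix) tuple per rail."""
    subnets = list(PARENT.subnets(new_prefix=26))
    plan = []
    for rail in range(1, rails + 1):
        subnet = subnets[rail - 1]
        gateway = subnet.network_address + 1  # first usable host address
        plan.append(
            (f"Leaf{rail}", BASE_VLAN + rail, f"{gateway}/{subnet.prefixlen}")
        )
    return plan


for leaf, vlan, gateway in gateway_plan():
    print(f"{leaf}: vlan {vlan}, gateway {gateway}")
```

A /26 subnet leaves 62 usable addresses per rail, which covers the gateway plus one NIC from each of the 32 compute nodes.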
2.2 Configuration Overview
Table 2: Configuration Overview
| Configure NIC-side interface breakout (Optional) |
| Configure Gateway VLAN and IP address |
2.3 Configuring Leaf Switches
2.3.1 Configure NIC-side Interface Breakout (Optional)
When connecting 400G NICs to CX864E-N switches, split each 800G port into two 400G interfaces.
Table 3: Interface Breakout Configuration
After completing the configuration, verify the interface status using the `show interface summary` command.
2.3.2 Configure Gateway VLAN and IP Address
Table 4: Configuring VLAN and Interface IP Addresses
Verify VLAN configuration using the `show vlan summary` command.
2.3.3 Enable Easy RoCE
The CX-N series switches support queues 0-7 (8 queues in total). Queue 3 and queue 4 are lossless (supporting up to two lossless queues), while others are lossy.
The default template uses system-default DSCP mapping. PFC and ECN are enabled for queue 3 and queue 4, and Strict Priority (SP) scheduling is set for queues 6 and 7.
When creating a template, you can specify three parameters:
- cable-length: Specifies the cable length, affecting PFC and ECN parameter calculations. Options: `5m`/`40m`/`100m`/`300m`. If the exact length is unavailable, choose the closest value (e.g., choose `5m` for a 10m cable).
- incast-level: Specifies the traffic Incast model, affecting PFC parameter calculation. Options: `low` (e.g. 1:1) / `medium` (e.g. 3:1) / `high` (e.g. 10:1). `low` is typically used for GPU backend fabric.
- traffic-model: Specifies the business type: throughput-sensitive, latency-sensitive, or balanced. This affects ECN parameter calculation. Options: `throughput`/`latency`/`balance`. `balance` and `throughput` are typically used for GPU backend fabric.
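The three parameters above can be combined into the template and policy commands that appear in the appendix sample files. The following Python helper is a minimal sketch: the function names are hypothetical, and the policy name pattern `roce_lossless_<cable>_<incast>_<model>` is inferred from the single example in the appendix, so confirm it on the switch before relying on it.

```python
CABLE_LENGTHS_M = (5, 40, 100, 300)
INCAST_LEVELS = ("low", "medium", "high")
TRAFFIC_MODELS = ("throughput", "latency", "balance")


def closest_cable_length(actual_m):
    """Pick the supported cable-length option nearest to the real length."""
    return min(CABLE_LENGTHS_M, key=lambda option: abs(option - actual_m))


def easy_roce_commands(actual_cable_m, incast_level="low",
                       traffic_model="throughput"):
    """Build the template and service-policy command lines as strings."""
    if incast_level not in INCAST_LEVELS:
        raise ValueError(f"unsupported incast-level: {incast_level}")
    if traffic_model not in TRAFFIC_MODELS:
        raise ValueError(f"unsupported traffic-model: {traffic_model}")
    cable = f"{closest_cable_length(actual_cable_m)}m"
    template = (
        f"qos roce lossless cable-length {cable} "
        f"incast-level {incast_level} traffic-model {traffic_model}"
    )
    # Assumption: policy naming follows the pattern seen in the appendix.
    policy = (
        f"qos service-policy roce_lossless_{cable}_"
        f"{incast_level}_{traffic_model}"
    )
    return [template, policy]


for line in easy_roce_commands(10):
    print(line)
```

For a 10m cable with the defaults shown, the helper selects `5m` and reproduces the two `qos` lines used in the appendix.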
If the provided lossless RoCE configuration does not fully suit your scenario, refer to RoCE Parameter Adjustment/Optimization for fine-tuning.
Table 5: Enabling Easy RoCE
Verify RoCE configuration using the `show qos roce` command.
3 Maintenance
3.1 RoCE Parameter Adjustment/Optimization
When default configurations are insufficient, use the following commands to optimize performance.
3.1.1 Modify DSCP Mapping
Table 6: Modify DSCP Mapping
Note: The COS value represents the Queue ID the packet is mapped to.
3.1.2 Modify Queue Scheduling Policy
If the interface has been bound to a lossless RoCE policy, unbind it before modifying.
Table 7: Modify Queue Scheduling Policy
3.1.3 Adjust PFC and ECN Thresholds
ECN thresholds are adjusted via `min_th`, `max_th`, and `probability`:
- `min_th` sets the lower absolute threshold for ECN marking (Bytes).
- `max_th` sets the upper absolute threshold for ECN marking (Bytes).
- `probability` sets the maximum marking probability [1-100].
PFC thresholds are adjusted via the dynamic threshold coefficient `dynamic_th`:
PFC threshold = 2^dynamic_th × remaining available buffer. Other parameters can remain unchanged during modification.
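To make the formula concrete, the short Python sketch below evaluates it for the coefficient values recommended in this section. The remaining-buffer figure of 1,000,000 bytes is a made-up input chosen only for illustration; the real value depends on the switch model and on instantaneous buffer occupancy.

```python
def pfc_threshold_bytes(dynamic_th, remaining_buffer_bytes):
    """PFC threshold = 2^dynamic_th x remaining available buffer."""
    return (2 ** dynamic_th) * remaining_buffer_bytes


# Hypothetical remaining buffer, used only to show the scaling.
REMAINING_BUFFER_BYTES = 1_000_000

for dynamic_th in (1, 2, 3):
    threshold = pfc_threshold_bytes(dynamic_th, REMAINING_BUFFER_BYTES)
    print(f"dynamic_th={dynamic_th}: threshold={threshold} bytes")
```

Each increment of `dynamic_th` doubles the threshold for the same remaining buffer, so small changes to the coefficient have a large effect on when PFC is triggered.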
Recommended values for CX864E-N:
- PFC dynamic_th: 1, 2, 3
- WRED min (Bytes): 1,000,000, 2,000,000, 3,000,000
- WRED max (Bytes): 8,000,000, 10,000,000, 12,000,000
- WRED probability (%): 10, 30, 50, 70, 90
Recommended values for other models:
- PFC dynamic_th: 1, 2, 3
- WRED min (Bytes): 1,000,000, 2,000,000, 3,000,000
- WRED max (Bytes): 4,000,000 , 5,000,000, 6,000,000
- WRED probability (%): 10, 30, 50, 70, 90
Note: Try ECN adjustment first, then PFC. Follow the principle: WRED Min < WRED Max < PFC xON < PFC xOFF. This ensures ECN triggers rate adjustment early during congestion to avoid unnecessary PFC, while still allowing PFC to trigger promptly when necessary to prevent packet loss.
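The ordering rule in the note can be checked before a change is applied. The following Python function is a minimal sketch of such a check; its name and the xON/xOFF values in the example call are hypothetical, since this guide does not list concrete PFC xON/xOFF figures.

```python
def check_thresholds(wred_min, wred_max, pfc_xon, pfc_xoff, probability):
    """Return a list of rule violations; an empty list means the set is valid."""
    problems = []
    if not wred_min < wred_max:
        problems.append("WRED min must be lower than WRED max")
    if not wred_max < pfc_xon:
        problems.append("WRED max must be lower than PFC xON")
    if not pfc_xon < pfc_xoff:
        problems.append("PFC xON must be lower than PFC xOFF")
    if not 1 <= probability <= 100:
        problems.append("probability must be within [1-100]")
    return problems


# WRED values from the recommended list; xON/xOFF values are illustrative only.
print(check_thresholds(1_000_000, 4_000_000, 5_000_000, 6_000_000, 50))
```

An empty result means ECN marking starts before PFC can trigger, which is the behavior the note asks for.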
The specific command lines for adjusting the PFC and ECN thresholds are as follows:
Table 8: Adjust PFC and ECN Thresholds
3.2 Common O&M Commands
3.2.1 Interface Status Maintenance
Table 9: Interface Status Information
3.2.2 Common Table Entry Maintenance
Table 10: Common Table Entries
3.2.3 RoCE Statistics Maintenance
Table 11: RoCE Statistics
4 Appendix: Configuration Files (Sample)
4.1 Leaf1
!
hostname Leaf1
!
interface loopback 0
 ip address 10.1.0.111/32
!
interface vlan 101
 ip address 10.10.1.1/26
exit
!
interface range ethernet 0/0-0/248
 switchport access vlan 101
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce_lossless_5m_low_throughput
4.2 Leaf2
!
hostname Leaf2
!
interface loopback 0
 ip address 10.1.0.112/32
!
interface vlan 102
 ip address 10.10.1.65/26
exit
!
interface range ethernet 0/0-0/248
 switchport access vlan 102
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce_lossless_5m_low_throughput
4.3 Leaf3
!
hostname Leaf3
!
interface loopback 0
 ip address 10.1.0.113/32
!
interface vlan 103
 ip address 10.10.1.129/26
exit
!
interface range ethernet 0/0-0/248
 switchport access vlan 103
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce_lossless_5m_low_throughput
!
4.4 Leaf4
!
hostname Leaf4
!
interface loopback 0
 ip address 10.1.0.114/32
!
interface vlan 104
 ip address 10.10.1.193/26
exit
!
interface range ethernet 0/0-0/248
 switchport access vlan 104
!
qos roce lossless cable-length 5m incast-level low traffic-model throughput
qos service-policy roce_lossless_5m_low_throughput
!
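The four sample files differ only in hostname, loopback address, VLAN ID, and gateway address, so they can be generated from one template. The Python sketch below reproduces that pattern from the values shown above. `render_leaf_config` is a hypothetical helper, not an Asterfusion tool, and the output should be reviewed against the target switch before it is applied.

```python
def render_leaf_config(index):
    """Render the sample configuration text for Leaf<index> (1 to 4)."""
    if not 1 <= index <= 4:
        raise ValueError("this sketch covers Leaf1 to Leaf4 only")
    vlan_id = 100 + index
    gateway_host = 1 + 64 * (index - 1)  # first host of each /26 block
    lines = [
        "!",
        f"hostname Leaf{index}",
        "!",
        "interface loopback 0",
        f" ip address 10.1.0.{110 + index}/32",
        "!",
        f"interface vlan {vlan_id}",
        f" ip address 10.10.1.{gateway_host}/26",
        "exit",
        "!",
        "interface range ethernet 0/0-0/248",
        f" switchport access vlan {vlan_id}",
        "!",
        "qos roce lossless cable-length 5m incast-level low "
        "traffic-model throughput",
        "qos service-policy roce_lossless_5m_low_throughput",
        "!",
    ]
    return "\n".join(lines)


print(render_leaf_config(2))
```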