To meet the UEC’s targets of 2–10 μs one-way latency and up to 85% average network utilization, Asterfusion has introduced a suite of advanced technologies. In a 256K-GPU cluster built on a 2-tier Clos network, these technologies reduce leaf-to-leaf latency to under 2 μs and raise average network utilization to over 96.8%.
256K GPUs: ultra-large scale
Under 2 μs: leaf-to-leaf latency
Over 96.8%: average network utilization
RoCEv2 with PFC and ECN
Enables lossless networking for AI.
Uses PFC for per-class lossless transport.
Uses DCQCN + ECN for per-flow bandwidth control.
Uses ECMP to improve bandwidth utilization.
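The per-flow bandwidth control mentioned above can be illustrated with a simplified sender-side reaction in the style of DCQCN. This is a minimal sketch, not Asterfusion's or any NIC vendor's implementation: the class name DcqcnSender, the gain g = 1/256, and the starting rate are assumptions for illustration, and the rate-recovery phases of DCQCN are omitted.

```python
from dataclasses import dataclass


@dataclass
class DcqcnSender:
    """Hypothetical, simplified DCQCN-style reaction point (rate recovery omitted)."""
    rate_gbps: float           # current sending rate
    target_gbps: float = 0.0   # rate remembered for later recovery
    alpha: float = 1.0         # congestion estimate in [0, 1]
    g: float = 1 / 256         # gain of the alpha moving average (assumed value)

    def on_cnp(self):
        """React to a Congestion Notification Packet caused by ECN-marked traffic."""
        self.target_gbps = self.rate_gbps
        self.rate_gbps *= 1 - self.alpha / 2             # multiplicative decrease
        self.alpha = (1 - self.g) * self.alpha + self.g  # congestion seen: raise estimate

    def on_quiet_period(self):
        """No CNP arrived in the update window: decay the congestion estimate."""
        self.alpha *= 1 - self.g


sender = DcqcnSender(rate_gbps=400.0)
sender.on_cnp()            # first CNP with alpha = 1 halves the rate
sender.on_quiet_period()   # estimate decays once congestion clears
```

Because alpha decays while no congestion is reported, a later CNP cuts the rate by less than half, which is what lets each flow settle near its fair share instead of oscillating.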
INT-Driven Adaptive Routing (IAR)
Enables Adaptive Load Balancing (ALB) across multiple network paths.
Driven by real-time link utilization measured via In-Band Network Telemetry (INT).
Supports flow-based, flowlet-based, and packet spray forwarding with WCMP.
Raises network utilization to as much as 97%.
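A telemetry-driven path choice of this kind can be sketched as follows. The sketch assumes, purely for illustration, that INT reports a utilization value per hop and that a path is scored by its most heavily loaded hop; the path names and numbers are hypothetical, and the actual IAR decision logic is not described in this document.

```python
def path_cost(hop_utilizations):
    """Score a path by its bottleneck: the highest per-hop utilization."""
    return max(hop_utilizations)


def pick_path(paths):
    """Return the name of the path whose bottleneck link is least utilized."""
    return min(paths, key=lambda name: path_cost(paths[name]))


# Hypothetical per-hop utilization (0.0 to 1.0) collected via telemetry.
paths = {
    "via-spine-1": [0.35, 0.92],
    "via-spine-2": [0.50, 0.41],
    "via-spine-3": [0.20, 0.78],
}
best = pick_path(paths)  # "via-spine-2": its worst hop is only 50% utilized
```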
Flowlet
Splits a flow into multiple flowlets based on inter-packet idle time.
Forwards individual flowlets of a flow over different paths.
Provides better load balancing than flow-based ECMP.
Preserves in-order delivery at the destination NIC, so no reordering is required, provided the idle gap between flowlets exceeds the latency difference between paths.
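The idle-time rule above can be sketched as a small function that groups packet arrival times into flowlets. The function name, the 50 μs gap threshold, and the sample timestamps are assumptions chosen for illustration, not values from the switch.

```python
def split_into_flowlets(arrival_times_us, gap_us):
    """Group packet arrival times into flowlets.

    A new flowlet starts whenever the idle time since the previous
    packet of the same flow exceeds gap_us.
    """
    flowlets = []
    for t in arrival_times_us:
        if flowlets and t - flowlets[-1][-1] <= gap_us:
            flowlets[-1].append(t)   # still within the current burst
        else:
            flowlets.append([t])     # idle gap exceeded: start a new flowlet
    return flowlets


arrivals = [0, 1, 2, 60, 61, 150]            # microseconds, one flow
flowlets = split_into_flowlets(arrivals, gap_us=50)
# flowlets == [[0, 1, 2], [60, 61], [150]]
```

Each resulting flowlet can be sent on a different path; packets within one flowlet stay on the same path and therefore stay in order.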
Packet Spray
Forwards individual packets of a flow over different paths.
Achieves near-perfect load balancing.
Aligned with the UEC (Ultra Ethernet Consortium) standard.
Requires large buffers at destination NIC for packet reordering.
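The reordering cost can be made concrete with a small receiver-side sketch. It assumes each packet carries a sequence number starting at 0 and that nothing is lost; the function name and the sample arrival order are hypothetical.

```python
def reorder(packets_in_arrival_order):
    """Deliver payloads in sequence order, buffering packets that arrive early.

    Returns the in-order payloads and the peak number of packets still
    held in the buffer after each arrival has been processed.
    """
    buffer = {}
    next_seq = 0
    delivered = []
    peak_buffered = 0
    for seq, payload in packets_in_arrival_order:
        buffer[seq] = payload
        while next_seq in buffer:            # release everything now in order
            delivered.append(buffer.pop(next_seq))
            next_seq += 1
        peak_buffered = max(peak_buffered, len(buffer))
    return delivered, peak_buffered


# Packets sprayed over different paths arrive out of order.
arrived = [(2, "c"), (0, "a"), (3, "d"), (1, "b")]
delivered, peak = reorder(arrived)
# delivered == ["a", "b", "c", "d"], peak == 2
```

The peak grows with the latency spread between paths and with the link rate, which is why packet spray calls for large buffers at the destination NIC.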
WCMP
Assigns different weights to different paths.
Forwards more traffic along higher-weighted paths.
Achieves near-optimal network utilization across multiple paths.
Supports large-scale load balancing: 256 ways × 4K groups.
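One common way to realize weighted forwarding is to replicate each path in a lookup table in proportion to its weight and hash flows into that table. The sketch below assumes that approach; the weights, path names, flow key format, and the use of CRC32 as the hash are illustrative assumptions, not details of the switch.

```python
import zlib


def build_wcmp_table(weights):
    """Replicate each path in the table as many times as its integer weight."""
    table = []
    for path, weight in weights.items():
        table.extend([path] * weight)
    return table


def wcmp_next_hop(flow_key, table):
    """Hash the flow key into the table, so heavier paths are chosen more often."""
    return table[zlib.crc32(flow_key.encode()) % len(table)]


weights = {"spine-1": 3, "spine-2": 1}   # spine-1 should carry about 3x the flows
table = build_wcmp_table(weights)        # ["spine-1", "spine-1", "spine-1", "spine-2"]
hop = wcmp_next_hop("10.0.0.1->10.0.1.1:4791", table)
```

With equal weights the table reduces to plain ECMP; the weights only change how many table slots each path occupies.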
Multicast Acceleration
Duplicates RDMA messages at the switch rather than on GPUs.
Significantly reduces the workload on GPUs.
Dramatically reduces network traffic.
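The saving on the sending side can be estimated with simple arithmetic. This sketch assumes one sender delivering the same message to several receivers; the function name and the sizes are hypothetical.

```python
def sender_link_bytes(message_bytes, receivers, switch_replication):
    """Bytes the sending GPU must put on its own link for one message.

    Without switch replication the sender transmits one copy per receiver;
    with it, the sender transmits a single copy and the switch duplicates it.
    """
    copies = 1 if switch_replication else receivers
    return message_bytes * copies


unicast = sender_link_bytes(1_000_000, receivers=8, switch_replication=False)
multicast = sender_link_bytes(1_000_000, receivers=8, switch_replication=True)
# unicast == 8_000_000 bytes, multicast == 1_000_000 bytes
```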
PTP
Synchronizes clocks across all GPUs via PTP over Ethernet.
Supports PTP Grandmaster mode and SyncE on the switch.
Ensures 20 ns clock accuracy via switch-based hardware timestamping.
Accelerates distributed GPU computing and reduces Job Completion Time (JCT).
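The role of hardware timestamps can be seen in the standard PTP delay request-response calculation, sketched below. It assumes the path delay is the same in both directions; the timestamp values are made up for illustration.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Compute clock offset and one-way delay from four PTP timestamps.

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)
    Assumes a symmetric path delay.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2   # slave clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2    # one-way path delay
    return offset, delay


# Illustrative timestamps in nanoseconds.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1650, t3=2000, t4=2350)
# offset_ns == 150.0, delay_ns == 500.0
```

Any error in the four timestamps feeds directly into the computed offset, which is why taking them in switch hardware rather than in software matters for nanosecond-level accuracy.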