AI Networking
Advanced Technologies

Powered by RoCEv2, INT-Driven Adaptive Routing, Flowlet, Packet Spray, WCMP, Multicast Acceleration, and PTP for the AI Era.
Speed Meets Scale — Without Limits

To meet the UEC's targets of 2–10 μs one-way latency and up to 85% average network utilization, Asterfusion has introduced a suite of advanced technologies. In a 256K-GPU cluster built on a 2-tier Clos network, these technologies reduce leaf-to-leaf latency to under 2 μs and raise average network utilization to over 96.8%.

256K GPU

Ultra Large Scale

Under 2μs

Leaf-to-leaf Latency

Over 96.8%

Average Network Utilization

RoCEv2 – PFC ECN

  • Enables lossless networking for AI.
  • Uses PFC for per-class lossless transport.
  • Uses DCQCN + ECN for per-flow bandwidth control.
  • Uses ECMP to improve bandwidth utilization.
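
The ECN and DCQCN bullets above can be illustrated with a small model. This is a minimal sketch, not Asterfusion's implementation: the switch marks packets with a probability that rises between two queue thresholds, and the sender cuts its rate when a Congestion Notification Packet (CNP) arrives. The thresholds, the gain `g`, the 400 Gbps line rate, and the single combined recovery timer are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


def ecn_mark_probability(queue_kb, kmin_kb=100.0, kmax_kb=400.0, pmax=0.2):
    """Switch side: probability of ECN-marking a packet (thresholds are assumed values)."""
    if queue_kb <= kmin_kb:
        return 0.0
    if queue_kb >= kmax_kb:
        return 1.0
    # Linear ramp from 0 to pmax between the two thresholds.
    return pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)


@dataclass
class DcqcnSender:
    """Sender side: simplified DCQCN-style rate control (illustrative only)."""
    line_rate_gbps: float = 400.0  # assumed port speed
    g: float = 1 / 256             # assumed gain for the congestion estimate
    alpha: float = 1.0             # congestion estimate in the range 0..1
    current_rate: float = 400.0    # rate the flow may use right now
    target_rate: float = 400.0     # rate the flow recovers toward

    def on_cnp(self):
        """A CNP arrived: remember the old rate, cut the rate, raise alpha."""
        self.target_rate = self.current_rate
        self.current_rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_quiet_period(self):
        """No CNP for one timer period: decay alpha and recover halfway to target.

        Simplification: real DCQCN uses separate timers and more recovery stages.
        """
        self.alpha *= 1 - self.g
        self.current_rate = min(
            self.line_rate_gbps, (self.current_rate + self.target_rate) / 2
        )
```

In this model each flow backs off independently when its packets are marked, which is what gives per-flow bandwidth control, while PFC remains the per-class backstop against loss.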

INT-Driven Adaptive Routing (IAR)

  • Enables Adaptive Load Balancing (ALB) across multiple network paths.
  • Driven by real-time link utilization measured via In-Band Network Telemetry (INT).
  • Supports flow-based, flowlet-based, and packet spray forwarding with WCMP.
  • Improves network utilization to up to 97%.
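
One way to picture utilization-driven path selection is the sketch below. It assumes each candidate path reports a list of per-hop link utilizations (values from 0 to 1) collected via INT. The report format, the path names, and the two helper functions are hypothetical and do not describe Asterfusion's actual data model.

```python
def path_utilization(hop_utilizations):
    """Treat a path as being as congested as its busiest hop."""
    return max(hop_utilizations)


def pick_path(int_reports):
    """Return the path whose busiest hop is least utilized.

    int_reports: dict mapping a path name to a list of per-hop utilizations.
    """
    return min(int_reports, key=lambda path: path_utilization(int_reports[path]))


def headroom_weights(int_reports, total=16):
    """Turn spare capacity into approximate integer weights for weighted forwarding."""
    headroom = {p: 1.0 - path_utilization(u) for p, u in int_reports.items()}
    spare = sum(headroom.values())
    if spare == 0:
        # Every path is saturated: fall back to equal weights.
        return {p: 1 for p in int_reports}
    return {p: max(1, round(total * h / spare)) for p, h in headroom.items()}
```

For example, with reports `{"spine1": [0.9, 0.2], "spine2": [0.4, 0.5], "spine3": [0.1, 0.7]}`, `pick_path` selects `spine2`, because its busiest hop (0.5) is the least loaded of the three.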

Flowlet

  • Splits a flow into multiple flowlets based on inter-packet idle time.
  • Forwards individual flowlets of a flow over different paths.
  • Provides better load balancing than flow-based ECMP.
  • Preserves in-order delivery at the destination NIC, so no reordering is required.
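
The idle-time rule can be sketched as a small table keyed by flow. This is an illustrative model only: the 50 μs gap and the round-robin choice of the next path are assumptions, and a real switch would typically pick the new path based on load.

```python
class FlowletTable:
    """Keeps a flow on one path until it pauses longer than the idle gap."""

    def __init__(self, num_paths, gap_us=50.0):
        self.num_paths = num_paths
        self.gap_us = gap_us   # assumed idle threshold in microseconds
        self.entries = {}      # flow_id -> (last_seen_us, path)
        self.next_path = 0     # round-robin pointer (an assumption)

    def route(self, flow_id, now_us):
        """Return the path index for a packet of flow_id arriving at now_us."""
        entry = self.entries.get(flow_id)
        if entry is None or now_us - entry[0] > self.gap_us:
            # Idle gap exceeded (or first packet): start a new flowlet on a new path.
            path = self.next_path
            self.next_path = (self.next_path + 1) % self.num_paths
        else:
            # Still inside the same flowlet: stay on the same path.
            path = entry[1]
        self.entries[flow_id] = (now_us, path)
        return path
```

Packets of one flow arriving at 0, 10, 20, 100, and 110 μs are routed to paths 0, 0, 0, 1, 1: the 80 μs pause starts a new flowlet. Ordering holds as long as the gap is larger than the latency difference between paths, since the earlier flowlet has drained before the next one can arrive.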

Packet Spray

  • Forwards individual packets of a flow over different paths.
  • Achieves near-perfect load balancing.
  • Aligned with the UEC (Ultra Ethernet Consortium) standard.
  • Requires large buffers at destination NIC for packet reordering.
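
The trade-off between per-packet spraying and receiver-side reordering can be seen in a toy model. The round-robin sprayer and the `ReorderBuffer` class below are hypothetical illustrations, not a description of any NIC's actual behavior.

```python
def spray(sequence_numbers, num_paths):
    """Assign each packet to a path round-robin; returns (seq, path) pairs."""
    return [(seq, seq % num_paths) for seq in sequence_numbers]


class ReorderBuffer:
    """Receiver side: holds out-of-order packets until the gap is filled."""

    def __init__(self):
        self.expected = 0   # next sequence number to deliver
        self.pending = {}   # seq -> payload, for packets that arrived early
        self.max_held = 0   # peak number of packets held at once

    def receive(self, seq, payload):
        """Accept one packet and return the payloads now deliverable in order."""
        self.pending[seq] = payload
        self.max_held = max(self.max_held, len(self.pending))
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered
```

If packets 0 to 3 arrive in the order 2, 0, 3, 1, the buffer still delivers 0, 1, 2, 3, but it has to hold up to three packets at once. The more the path latencies differ, the more memory the destination needs.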

WCMP

  • Assigns different weights to different paths.
  • Forwards more traffic along higher-weighted paths.
  • Achieves near-optimal network utilization across multiple paths.
  • Supports large-scale load balancing: 256 ways × 4K groups.
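
A common way to realize weighted forwarding is to give each next hop a number of table slots equal to its weight and hash flows into the table. The sketch below assumes that approach, uses CRC32 as a stand-in hash, and invents the next-hop names; hardware implementations differ in detail.

```python
import zlib


def build_wcmp_table(weights):
    """Expand {next_hop: integer_weight} into a table with one slot per unit of weight."""
    table = []
    for next_hop, weight in weights.items():
        table.extend([next_hop] * weight)
    return table


def select_next_hop(table, flow_key):
    """Hash the flow key into the table; heavier next hops own more slots."""
    index = zlib.crc32(flow_key.encode()) % len(table)
    return table[index]
```

With weights `{"spine1": 3, "spine2": 1}` the table has four slots, three of which point to `spine1`, so roughly three quarters of flows take that path. Under this reading, "256 ways" would correspond to the table size per group, though the source does not spell that out.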

Multicast Acceleration

  • Duplicates RDMA messages at the switch rather than on GPUs.
  • Significantly reduces the workload on GPUs.
  • Dramatically reduces network traffic.
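
The traffic saving is easy to quantify at the sender's uplink. The helper below is a hypothetical back-of-the-envelope calculation that counts only the bytes leaving the sending GPU's NIC.

```python
def sender_egress_bytes(message_bytes, receivers, switch_replication):
    """Bytes the sending NIC must transmit to reach all receivers."""
    # With replication in the switch, the sender transmits a single copy.
    copies = 1 if switch_replication else receivers
    return message_bytes * copies
```

Sending a 1 MB message to 8 receivers costs 8 MB of sender egress with unicast copies but only 1 MB when the switch duplicates the message.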

PTP

  • Synchronizes clocks across all GPUs via PTP over Ethernet.
  • Supports PTP Grandmaster mode and SyncE on the switch.
  • Ensures 20 ns clock accuracy via switch-based hardware timestamping.
  • Accelerates distributed GPU computing and reduces Job Completion Time (JCT).
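
For context, the standard PTP delay request-response exchange derives the clock offset from four timestamps. The sketch below assumes a symmetric path; the function name and the use of nanosecond integers are illustrative choices.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Compute (offset, one_way_delay) of a follower clock relative to the master.

    t1: master sends Sync            (master clock)
    t2: follower receives Sync       (follower clock)
    t3: follower sends Delay_Req     (follower clock)
    t4: master receives Delay_Req    (master clock)
    Assumes the forward and reverse path delays are equal.
    """
    delay = ((t2 - t1) + (t4 - t3)) / 2
    offset = ((t2 - t1) - (t4 - t3)) / 2
    return offset, delay
```

For instance, `ptp_offset_and_delay(1000, 1600, 2000, 2400)` gives an offset of 100 ns and a one-way delay of 500 ns. Hardware timestamping in the switch matters because any jitter in capturing these four timestamps feeds directly into the offset estimate.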