560ns Ultra-Low Latency: Asterfusion’s 800G AI Switch Architecture In-Depth Analysis

Written by Asterfusion

April 16, 2025

In the era of explosive growth in AI training and inference, high-performance computing (HPC), and cloud computing, one thing is clear — network latency isn’t just a number anymore; it’s a game-changer. In cutting-edge scenarios like AIGC and ultra-low latency storage networks, even microseconds can be the difference between peak performance and a major bottleneck. That’s why Asterfusion is thrilled to introduce its flagship powerhouse — the CX864E-N 64x800G Ultra Ethernet Switch. Engineered for the most demanding environments, this switch boasts an industry-leading 560ns port forwarding latency, unlocking the blazing-fast performance required to fuel the next generation of AI-driven networks.

But this isn’t just a spec sheet marvel — the CX864E-N has already proven itself in the real world. It’s been in mass production since last year and is now deployed at scale in data centers run by top-tier internet giants and cloud service providers. For teams building high-bandwidth, ultra-low-latency, and energy-efficient networks, it’s quickly become the platform of choice.

Our 800G design made waves at MWC 2025 in Barcelona, drawing crowds of customers and industry veterans alike. So instead of keeping things under wraps, we’re doing the opposite — pulling back the curtain and showcasing what we’ve built. Because at Asterfusion, we believe in openness, collaboration, and driving the future of open networking together.

And here’s the best part: the CX864E-N is ready to ship in as little as two weeks. With top-tier performance, fast delivery, and unbeatable value, it’s a global benchmark for cost-effectiveness in its class. So now, let’s lift the curtain and take a deep dive into this groundbreaking switch — exploring its internal structure and engineering brilliance, and discovering how it redefines the 800G era of data center networking with architectural excellence.

01 Overview of Asterfusion’s 64-Port 800G AI Switch – CX864E-N External Hardware

Front Panel

On the front panel, you’ll find a compact 2U chassis housing 64x 800G OSFP ports. Besides 800G, the ports support lower speeds (25G/50G/100G/200G/400G), enabling seamless migration from existing 100GE/200GE/400GE networks to next-generation 800GE infrastructure and safeguarding your previous investments.

Asterfusion CX864E-N front panel

Each 800G port on the CX864E-N is equipped with a hot-swappable OSFP optical transceiver. Notably, Asterfusion designs, manufactures, and sells a wide range of 800G OSFP modules, with SKUs including 2VR4, 2SR4, 2DR4, 2FR4, and 2LR4, supporting transmission distances from 50 meters to 10 kilometers. This comprehensive portfolio meets the diverse connectivity needs within modern data centers. Fully tested and validated, these modules offer 100% compatibility with Asterfusion switches, ensuring a plug-and-play, worry-free experience for customers.

CX864E-N equipped with an 800G OSFP optical transceiver

As shown in the image, this module is the OT-800G-OSFP-2VR4, based on the OSFP form factor and designed for transmission up to 50 meters. With a maximum power consumption of under 13.5W, it’s ideal for 8×100GbE Ethernet links and high-performance applications such as intra-data center and AI network interconnects. It delivers a powerful combination of performance and energy efficiency.

OT-800G-OSFP-2VR4

One of the key innovations of this in-house developed OSFP module is its integrated heatsink design — a core advantage. It ensures exceptional thermal performance under high-speed, high-load conditions. Combined with the module’s low power profile, it reduces the thermal burden on the switch system, enhancing both stability and long-term reliability of the entire device.

Want to dive deeper into the selection and technical specs of our 800G optical modules? Check out our detailed article: 👉 Explore the OT Series 800G Optical Modules.

On the management side, the CX864E-N is equipped with an RJ45 MGMT port, a USB 2.0 port, and an RJ45 Console port, giving you full flexibility for out-of-band management.

CX864E-N’s management ports

You’ll also notice six LED indicators here. On either side of the RJ45 port are:

  • Left (LINK/ACTIVE LED): Indicates network link and data activity for the MGMT port
  • Right (SYS LED): Displays overall system status

Alongside them is a vertical column of status LEDs, from top to bottom:

  • BMC – Baseboard Management Controller status
  • P – Power module status
  • F – Fan module status
  • L – Locator LED, used to identify the device during maintenance
CX864E-N’s LEDs

In terms of airflow design, the front panel features three rows of small ventilation holes, arranged both horizontally and vertically. These air inlets allow cool external air to enter the chassis, working in tandem with the internal fan system to enhance overall thermal efficiency.

A special feature worth highlighting: the CX864E-N includes two additional 10G SFP+ ports, dedicated to enhancing in-band network telemetry (INT) and other advanced management functionalities. Why is this important? On a high-performance 800G switch, each port handles substantial data traffic, and any anomaly can have amplified consequences. These extra ports provide a granular, real-time traffic monitoring mechanism, essential for ensuring optimal network health and visibility. (Of course, customers are also free to repurpose the two 10G SFP+ ports to suit their specific network requirements.)

Curious about how INT works? Learn more here: 👉 What is INT (In-Band Network Telemetry)?
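To make the idea concrete, here is a minimal Python sketch of how a collector might summarize the per-hop records that INT attaches to a packet. The `HopMetadata` fields and the `summarize_path` helper are hypothetical names chosen for illustration; they are not the AsterNOS or Teralynx data format, and real INT headers are defined by the INT specification and vary by deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopMetadata:
    """One hop's worth of telemetry (hypothetical field layout)."""
    switch_id: str
    ingress_ts_ns: int   # time the packet entered the switch
    egress_ts_ns: int    # time the packet left the switch
    queue_depth: int     # egress queue occupancy seen by the packet


def summarize_path(hops):
    """Reduce a packet's hop records to path, total switch latency, hot spot."""
    latencies = [h.egress_ts_ns - h.ingress_ts_ns for h in hops]
    worst = max(hops, key=lambda h: h.queue_depth)
    return {
        "path": [h.switch_id for h in hops],
        "total_switch_latency_ns": sum(latencies),
        "most_congested": worst.switch_id,
    }


# Illustrative values only, not measured data.
example_hops = [
    HopMetadata("leaf1", 0, 560, 10),
    HopMetadata("spine1", 700, 1300, 80),
    HopMetadata("leaf2", 1400, 1960, 5),
]
summary = summarize_path(example_hops)
```

A monitoring system fed through the dedicated ports could apply this kind of reduction to flag the hop where queues are building up.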

Rear Panel

Moving to the rear, the CX864E-N is equipped with 3+1 fan modules and 1+1 power supply units, providing robust cooling and stable power delivery. All components are hot-swappable, allowing for easy maintenance and replacement without any system downtime.

Asterfusion CX864E-N rear panel

The power supplies are mounted on the left, each rated at 3200W and compliant with 80Plus Titanium efficiency standards — giving you maximum power with minimal waste.

1+1 hot-swappable PSU modules
3+1 hot-swappable fan modules

You might be wondering: Why only four fans, when many 800G switches on the market come with six or even eight? That’s not a compromise — it’s a result of smart internal design, which we’ll uncover in the next section.

02 Inside the 64-Port 800G AI Switch: A Deep Dive into Hardware Design

Now, let’s lift the lid on this high-performance switch and take a closer look at the engineering brilliance inside. From the moment the top cover is removed, you’re greeted by a meticulously crafted internal layout — compact, clean, and remarkably elegant. We’ll guide you from left to right as we explore the switch’s architecture and what makes it a standout in the 800G era.

Asterfusion CX864E-N Internal design

Advanced Cooling

The first thing that catches the eye is the large-scale heatsink module spanning across the mainboard — a custom-engineered 3D vapor chamber air-cooling solution. While some vendors opt for water cooling in high-power scenarios, Asterfusion takes a different approach: a highly efficient air-cooled design that performs just as well, even under full-load power conditions reaching up to 2180W.

This choice wasn’t just about performance — it’s about energy efficiency and operational simplicity. Our thermal solution provides stable cooling at a lower system power footprint compared to many industry peers. Even under heavy workloads, the system requires only ~60% fan speed to maintain optimal temperatures — effectively keeping noise levels low and minimizing environmental disruption. This reflects a thoughtful balance between cooling efficiency and acoustic performance.
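As a quick sanity check on these numbers, the sketch below compares the 2180W full-load figure with the 3200W rating of a single power supply. It assumes that in the 1+1 configuration one PSU must be able to carry the whole system alone. The constant and function names are ours, for illustration only.

```python
PSU_RATED_W = 3200    # rating of one power supply, from this article
FULL_LOAD_W = 2180    # full-load system power, from this article


def psu_utilization(load_w: float, rated_w: float) -> float:
    """Fraction of a single PSU's rating consumed by the given load."""
    return load_w / rated_w


def psu_headroom_w(load_w: float, rated_w: float) -> float:
    """Watts left on a single PSU after the load is served."""
    return rated_w - load_w


utilization = psu_utilization(FULL_LOAD_W, PSU_RATED_W)
headroom = psu_headroom_w(FULL_LOAD_W, PSU_RATED_W)
print(f"single-PSU utilization: {utilization:.1%}, headroom: {headroom} W")
```

Under that assumption, a single supply runs at roughly two-thirds of its rating at full load, which leaves about 1 kW of margin if the other supply fails.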

large-scale heatsink module

We even removed the heatsink to show you its impressive scale compared to the switch itself.

But what lies beneath this massive heatsink?

The Heart of the System: Marvell Teralynx 10

Right below sits the powerhouse: the Marvell Teralynx 10 ASIC, strategically placed behind the OSFP interface array.

Marvell Teralynx 10 ASIC

This is the brain of the switch — a 5nm single-chip programmable switch boasting an eye-watering 51.2 Tbps of total bandwidth and ultra-low port-to-port latency of just ~560 ns, placing it firmly at the top of its class.

What makes the Teralynx 10 truly special isn’t just raw performance — it’s the deterministic, nanosecond-level latency that makes it ideal for AI training, inference, and large-scale parallel computing. In these applications, lower latency translates to faster synchronization, higher throughput, and reduced energy waste, all of which boost cluster-level efficiency.
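To see why per-hop latency matters at cluster scale, here is a rough back-of-the-envelope sketch in Python. It assumes each switch hop costs the ~560 ns quoted above and that fiber adds about 5 ns per meter. It ignores NIC, serialization, and queuing delay, and the cable lengths are made-up examples, so treat the results as illustrative rather than measured.

```python
SWITCH_LATENCY_NS = 560   # port-to-port figure quoted in this article
FIBER_NS_PER_M = 5        # assumption: roughly 5 ns of propagation per meter


def path_latency_ns(switch_hops: int, total_cable_m: int) -> int:
    """Switch forwarding time plus cable propagation along one path."""
    return switch_hops * SWITCH_LATENCY_NS + total_cable_m * FIBER_NS_PER_M


# A leaf-spine-leaf path crosses three switches;
# a three-tier path (leaf-agg-core-agg-leaf) crosses five.
two_tier_ns = path_latency_ns(switch_hops=3, total_cable_m=60)
three_tier_ns = path_latency_ns(switch_hops=5, total_cable_m=120)
print(two_tier_ns, three_tier_ns)
```

Every hop removed from the path saves one full forwarding delay, which is why hop count and per-hop latency compound in collective operations that synchronize many nodes.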

Engineered for AI, HPC, and Beyond

To meet the demands of cutting-edge networking, Teralynx 10 delivers a wide array of architectural innovations:

  • 200+ MB of On-Chip Buffering: Critical for high-performance RoCE (RDMA over Converged Ethernet), this large buffer helps prevent congestion and queuing delays. Compared to external HBM solutions used by competitors, on-chip memory delivers lower power consumption, reduced latency, and better cost efficiency.
  • Built-in In-Band Telemetry (INT): Enables real-time tracking of packet paths, latency, and packet loss, providing deep visibility for congestion control and analytics, which is crucial in dynamic AI workloads.
  • Flowlet-Based Load Balancing: Flowlet scheduling optimizes traffic distribution at microburst levels, maintaining throughput while reducing buffer strain. With its 200+ MB on-chip buffer, the chip achieves outstanding flow control even in complex, high-speed networks.
  • High-Radix Architecture (512×100GbE): Simplifies large-scale fabric designs. Move from complex three-tier topologies to streamlined two-tier architectures — cutting cables, costs, and complexity.
  • Industry-Leading Energy Efficiency: In real-world AI training deployments, TL10 can save over 1 MW of power compared to competing solutions. It delivers superior performance per watt and rack density, giving you a massive TCO advantage.
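The flowlet idea from the list above can be sketched in a few lines of Python. This is a toy software model under our own assumptions (a fixed idle-gap threshold and a least-loaded path choice), not a description of how Teralynx 10 implements it in hardware. `FlowletBalancer` and its parameters are hypothetical names.

```python
class FlowletBalancer:
    """Toy flowlet switch: re-pick the path only after an idle gap."""

    def __init__(self, num_paths: int, gap_ns: int):
        self.gap_ns = gap_ns                 # idle time that ends a flowlet (assumed)
        self.path_bytes = [0] * num_paths    # bytes sent on each path so far
        self.flows = {}                      # flow id -> (last_seen_ns, path)

    def route(self, flow_id: str, now_ns: int, size: int) -> int:
        last = self.flows.get(flow_id)
        if last is None or now_ns - last[0] > self.gap_ns:
            # New flowlet: earlier packets have drained, so moving is safe.
            path = min(range(len(self.path_bytes)),
                       key=self.path_bytes.__getitem__)
        else:
            # Same flowlet: stay on the current path to avoid reordering.
            path = last[1]
        self.flows[flow_id] = (now_ns, path)
        self.path_bytes[path] += size
        return path


lb = FlowletBalancer(num_paths=2, gap_ns=500)
# Packets at t=0 and t=100 belong to one flowlet; t=1000 starts a new one.
paths = [lb.route("flow-a", t, 1000) for t in (0, 100, 1000)]  # -> [0, 0, 1]
```

The point of the gap check is that a long-lived flow can be rebalanced between bursts without delivering packets out of order, which plain per-flow hashing cannot do.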

In summary, Marvell’s Teralynx 10 is far more than a high-throughput switch chip — it’s a purpose-built engine for the next generation of AI, HPC, and data center fabric infrastructure, where programmability, bandwidth, and ultra-low latency are non-negotiable.
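The high-radix claim above can be checked with simple arithmetic. The Python sketch below uses the standard folded-Clos sizing formulas and assumes a non-blocking fabric built from identical switches with all ports at the same speed. The function names are ours, and real deployments often use oversubscription or mixed speeds, which changes the totals.

```python
ASIC_GBPS = 51_200   # 51.2 Tbps, from this article


def radix_at(port_gbps: int) -> int:
    """Number of ports the ASIC exposes at a given port speed."""
    return ASIC_GBPS // port_gbps


def two_tier_endpoints(radix: int) -> int:
    """Leaf-spine: each leaf uses half its ports down and half up."""
    return radix * radix // 2


def three_tier_endpoints(radix: int) -> int:
    """Classic three-tier fat tree built from same-radix switches."""
    return radix ** 3 // 4


radix_100g = radix_at(100)                       # 512 ports of 100GbE
endpoints_100g = two_tier_endpoints(radix_100g)  # 131072 endpoints at 100G
endpoints_800g = two_tier_endpoints(radix_at(800))
```

At a radix of 512, a two-tier fabric already reaches more than 130,000 100G endpoints under these assumptions, which is why a third tier can often be avoided.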

And yes — we’ll be sharing real-world latency test results on TL10-based hardware at the end of this article. Stay tuned.

Power Components & PTP Module

Let’s continue our deep dive into the internal design of the CX864E-N. Just above the Teralynx 10 ASIC, you’ll notice a set of angled power components. This non-parallel layout isn’t just visually distinct; it’s been carefully optimized to maximize power integrity, ensuring clean, stable power delivery to the core ASIC and minimizing electrical noise during high-speed data transmission.

Angled power components

Above the power modules sits our Precision Time Protocol (PTP) module, supporting up to 10ns accuracy for both PTP and SyncE synchronization. Designed as a pluggable, optional component, this module allows for flexible deployment based on customer needs. To make its structure and placement clear, we’ve included comparison visuals showing the switch with and without the PTP module installed.
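For readers less familiar with PTP, the sketch below shows the textbook delay request-response calculation from IEEE 1588 that turns four timestamps into a clock offset. It assumes a symmetric path delay in both directions and is not a description of the module's firmware. The function name and sample timestamps are ours.

```python
def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int):
    """Return (slave_offset_ns, one_way_delay_ns).

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)
    """
    master_to_slave = t2 - t1   # path delay plus offset
    slave_to_master = t4 - t3   # path delay minus offset
    offset = (master_to_slave - slave_to_master) / 2
    delay = (master_to_slave + slave_to_master) / 2
    return offset, delay


# Illustrative timestamps in nanoseconds, not measurements.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1600, t3=2000, t4=2500)
```

With these sample values the slave clock is 50 ns ahead of the master and the one-way delay is 550 ns. Any asymmetry between the two directions shows up directly as offset error, which is why hardware timestamping matters for nanosecond-level accuracy.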

COMe Module

This compact rectangular component is our COMe module, built on the x86 architecture and powered by an Intel Xeon processor. It delivers powerful computing capabilities and supports advanced networking features such as INT-based routing. Running on our self-developed SONiC-based AsterNOS operating system, it provides a stable and efficient control plane core for the switch, ensuring flexible orchestration and reliable operation even in complex network environments.

COMe module

BMC Module

To the right of the COMe module is the BMC (Baseboard Management Controller) module. Like the PTP module, it’s also pluggable, offering modularity and upgradeability for out-of-band management and allowing customers to scale performance and functionality as needed.

BMC (Baseboard Management Controller) module

NVMe and SSD slots

On the left of the COMe module, we’ve integrated two full-length 2280 NVMe slots and one M.2 SATA slot (compatible with 2280 and 2242 sizes). This setup provides flexible local storage expansion for customers. Even more exciting — the two NVMe slots can optionally house up to two Hailo-10 AI acceleration engines, enabling real-time, low-latency, energy-efficient edge AI inference capabilities.

NVMe and M.2 SATA slots

Efficient Fan System

On the cooling front, the switch is equipped with four hot-swappable fan modules located at the rear, forming a clean and efficient air-cooling system. Even under full-load operation at 2180W, this thermal design delivers stable and reliable performance — with no need to cram in extra fans to keep temperatures in check. The result? Lower power consumption, simplified system complexity, and better overall efficiency.

Remember: fewer components mean higher reliability and lower operational costs — exactly what modern data centers demand.

Four hot-swappable fan modules

Engineering Excellence

One final engineering highlight: inside the entire switch chassis, only a single internal cable is used. All other board-to-board connections rely on high-performance board-to-board connectors. This design drastically reduces signal integrity issues and simplifies maintenance. Compared to traditional designs using multiple internal cables, our approach ensures higher reliability and better long-term performance stability.

A single internal cable

And finally, let’s talk about what holds it all together — the PCB. Built using world-class manufacturing processes and industry-proven, high-performance materials, our PCB is engineered to meet the most demanding standards of 112G high-speed SerDes. With advanced techniques like VIPPO, blind vias, and back drilling, it ensures exceptional signal integrity, minimal loss, and reduced crosstalk — all critical for next-gen data transmission.

But it’s not just about performance. The internal structure is a true work of engineering art: clean, streamlined, elegant, and powerful, with every layer designed for both function and form. It reflects not only our deep technical expertise but also our relentless commitment to industrial design and rock-solid reliability.

03 Asterfusion 800G Ultra Ethernet Switch Software Overview

The Asterfusion CX864E-N switch is powered by AsterNOS, our enterprise-grade SONiC distribution. We are committed to building the industry’s leading commercial SONiC platform, enabling customers to create high-performance, intelligent network systems. As an early member of the Ultra Ethernet Consortium (UEC), Asterfusion leverages Ultra Ethernet technologies to push network utilization to 90% and beyond, accelerating AI network deployments and the evolution of data centers.

SONiC for AI Data Center

Accelerating AI Networks, Unlocking Supercomputing Potential

The Asterfusion CX864E-N delivers over 90% network utilization for AI training and inference workloads. This is achieved through advanced technologies such as flowlet-based load balancing, INT-based intelligent routing, and WCMP. These innovations significantly enhance AI workload efficiency while reducing both the capital and operational expenditures of data center infrastructure.
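As an illustration of the WCMP idea, the Python sketch below spreads flows across next hops in proportion to configured weights by replicating entries in a hash table. This is a generic textbook approach under our own assumptions, not AsterNOS code. The function names, next-hop names, and weights are hypothetical.

```python
import zlib


def build_wcmp_table(weights: dict) -> list:
    """Replicate each next hop in the table according to its weight."""
    table = []
    for next_hop, weight in weights.items():
        table.extend([next_hop] * weight)
    return table


def pick_next_hop(table: list, flow_key: str) -> str:
    """Hash the flow key so every packet of a flow takes the same hop."""
    index = zlib.crc32(flow_key.encode()) % len(table)
    return table[index]


# Hypothetical case: spine1 has three times the usable capacity of spine2.
table = build_wcmp_table({"spine1": 3, "spine2": 1})
hop = pick_next_hop(table, "10.0.0.1:49152->10.0.1.9:4791/udp")
```

Unlike plain ECMP, where every next hop gets an equal share, the weights let the fabric send less traffic toward a path that has lost capacity, for example after a link failure.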

👉 Learn more: The AI Data Center Revolution: Can Ultra Ethernet Unlock 90%+ Network Utilization?

Enterprise-Class Features, One Step Ahead

  • Enterprise-Ready SONiC with Turnkey H/W Solution: AsterNOS builds on community SONiC with enhanced enterprise-grade features like EVPN multi-homing, RoCEv2, and Ansible automation, tailored for complex deployments. Fully compatible with our own open networking hardware, it delivers a seamless, turnkey solution.
  • Accelerated Release Cadence & Responsive Support: Unlike the community’s biannual release schedule, AsterNOS follows a quarterly release cycle with frequent patch updates, ensuring faster response and resolution to customer needs and issues.
  • Expert Support & Tailored Services: With over 120 dedicated SONiC software engineers and nearly a decade of relentless innovation, we’ve refined our commercial SONiC distribution to deliver unmatched flexibility and expertise. From custom feature development and precision issue resolution to full-spectrum pre-sales consulting and responsive post-sales support, we’re here to power your success at every stage of the journey.
  • Dual-Style CLI for Enhanced Usability: In addition to a Linux-style Bash CLI, AsterNOS provides a Cisco-style command-line interface based on Klish. This dual-mode approach simplifies onboarding for network engineers and reduces the learning curve.

Appendix: Asterfusion 800G Low Latency Data Center Switch (TL10) Test Data

CX864E-N’s Serpentine Throughput Test
CX864E-N’s Port to Port Forwarding Delay

Below is a comparison of inference metrics when running DeepSeek models over InfiniBand (IB) switches and Asterfusion RoCE switches. (Blue is the IB switch; yellow is the Asterfusion RoCE switch.)

Top left (lower is better); Top right (higher is better); Bottom left (lower is better); Bottom right (higher is better)

Conclusion

As AI, HPC, and cloud infrastructure evolve at breakneck speed, performance bottlenecks are no longer dictated by compute power—but by every microsecond of latency and every watt of power consumed across the network.

The Asterfusion CX864E-N was born to tackle this generational challenge head-on. Engineered for next-generation AI workloads and ultra-low latency networking, it delivers 560ns forwarding latency, 64×800G OSFP full-port density, a single-chip architecture powered by Marvell Teralynx 10, 200+ MB of on-chip buffer, a cable-free midplane interconnect, a custom PTP timing and AI module… Every component and circuit trace reflects precision engineering tailored for the demands of modern AI workloads.

At the software core, the CX864E-N runs AsterNOS—our enterprise-grade AI network operating system built on SONiC. It delivers a tightly integrated software-hardware architecture across both the control and data planes:

  • Supports advanced features like RoCEv2 and EVPN multi-homing, optimized for large-scale AI network orchestration and observability;
  • Designed for AI-specific traffic patterns with intelligent flow control mechanisms such as In-band Network Telemetry (INT) and WCMP, enabling flowlet-level link utilization;
  • Quarterly release cycles, rapid patching, and robust technical support ensure long-term stability and continuous innovation for fast-moving AI infrastructures.

This isn’t just a “spec bump” to a traditional switch architecture—it’s a full-scale reconstruction of the data center core.

The CX864E-N has already been validated in production environments by leading hyperscalers and cloud service providers.

Not a prototype. Not a concept machine. This is a battle-tested powerhouse built for real-world 800G AI networking.

And it’s rewriting the core metrics of next-generation AI data center networks.
