
What You Should Know about the P4 Programming Language & P4 Programmable Switches

written by Asterfusion

January 26, 2022

What is P4 programming language?

P4 is a domain-specific programming language that describes how programmable forwarding hardware, such as an ASIC, an FPGA, or a NIC, processes packets. The name stands for Programming Protocol-independent Packet Processors.

P4 was originally designed for programmable switches (especially ASIC-based ones) and has since expanded to many other scenarios. The term “target” refers to the hardware that a P4 program runs on.

A network device usually includes a control plane and a data (forwarding) plane. P4 is designed for programming the target’s data plane.

The following picture shows the difference between a traditional switch and a P4 programmable switch:


In a traditional switch, the ASIC determines which functions the data plane can support. The control plane processes control packets (such as routing protocol packets) and asynchronous events (such as port up/down); its purpose is to control forwarding behavior by correctly populating the ASIC's tables. The functions supported by the ASIC therefore determine what the switch can do. A P4 programmable switch is different: the functions of the data plane are not fixed but are defined by the program.
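The split between the two planes can be sketched in a few lines of Python. This is only an illustrative model, not P4 code and not any vendor's API: the names `ExactMatchTable`, `forward`, and `drop` are hypothetical, and a packet is simplified to a dict of header fields. The control plane fills the table, and the data plane looks up each packet and runs the matching action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A packet is modeled as a plain dict of header fields (an assumption for this sketch).
Packet = Dict[str, object]
Action = Callable[[Packet], Packet]


def forward(port: int) -> Action:
    """Build an action that sends the packet out of the given port."""
    def _act(pkt: Packet) -> Packet:
        return {**pkt, "egress_port": port}
    return _act


def drop(pkt: Packet) -> Packet:
    """Default action: mark the packet as dropped (no egress port)."""
    return {**pkt, "egress_port": None}


@dataclass
class ExactMatchTable:
    """Hypothetical data-plane table: exact match on one field, then run an action."""
    key_field: str
    default_action: Action = drop
    entries: Dict[object, Action] = field(default_factory=dict)

    def add_entry(self, key: object, action: Action) -> None:
        """Called by the control plane to program forwarding behavior."""
        self.entries[key] = action

    def apply(self, pkt: Packet) -> Packet:
        """Called by the data plane for every packet."""
        action = self.entries.get(pkt.get(self.key_field), self.default_action)
        return action(pkt)


# Control plane: populate the table.
l2_table = ExactMatchTable(key_field="dst_mac")
l2_table.add_entry("aa:bb:cc:00:00:01", forward(1))
l2_table.add_entry("aa:bb:cc:00:00:02", forward(2))

# Data plane: per-packet lookup.
known = l2_table.apply({"dst_mac": "aa:bb:cc:00:00:02"})
unknown = l2_table.apply({"dst_mac": "ff:ff:ff:ff:ff:ff"})
print(known["egress_port"], unknown["egress_port"])
```

In a fixed-function switch the set of tables and actions is baked into the ASIC; in a programmable switch, the program itself declares which tables exist, what they match on, and which actions they can run.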

  • P4 is protocol independent: it has no built-in support for even the most common protocols, such as IP, TCP, VXLAN, or MPLS. Instead, the programmer describes the header formats and field names of the protocols needed in the program, and the compiled program running on the target device interprets and processes them.
  • Data plane programming: P4 programmability lets users develop new, customized functions and remove unnecessary functions and tables to reduce complexity, while offering better visibility through diagnostics, telemetry, OAM, and so on. Modularity lets users compose packet forwarding behavior from libraries; because the forwarding behavior is specified once, it can be compiled to many devices. Rather than being fixed in the ASIC, protocol handling is defined in software, which gives precise control over how packets are processed.
The P4 programmable ASIC enables the network to serve applications
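To make the idea of protocol independence concrete, here is a minimal Python sketch (again, not P4 itself). The parser below knows nothing about any protocol; it only follows a header layout supplied by the program. The "ToyTunnel" header, its field names, and the `parse_header` helper are all invented for illustration.

```python
import struct
from typing import Dict, List, Tuple

# Header layout declared by the "program": (field name, struct format code).
# "ToyTunnel" is a made-up protocol used only for illustration.
TOY_TUNNEL_HEADER: List[Tuple[str, str]] = [
    ("version", "B"),     # 1 byte
    ("flags", "B"),       # 1 byte
    ("tenant_id", "H"),   # 2 bytes
    ("flow_label", "I"),  # 4 bytes
]


def parse_header(layout: List[Tuple[str, str]], data: bytes) -> Tuple[Dict[str, int], bytes]:
    """Extract the declared fields from the front of `data`; return (fields, payload)."""
    fmt = "!" + "".join(code for _, code in layout)  # "!" = network byte order, no padding
    size = struct.calcsize(fmt)
    if len(data) < size:
        raise ValueError("packet shorter than declared header")
    values = struct.unpack(fmt, data[:size])
    fields = {name: value for (name, _), value in zip(layout, values)}
    return fields, data[size:]


# Build a sample packet and parse it using only the declared layout.
raw = struct.pack("!BBHI", 1, 0, 42, 7) + b"payload"
fields, payload = parse_header(TOY_TUNNEL_HEADER, raw)
print(fields, payload)
```

Supporting a new protocol here means adding another layout, not changing the parser, which is roughly the relationship between a P4 program and its target.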

History of P4 programming language

The idea of P4 was born in 2013, proposed by Professor Nick McKeown of Stanford University, and the first formal specification of the language, called P4_14, was released in 2014. The first P4 workshop was held in June 2015 at Stanford University. After that, an updated specification, P4_16, was released in 2016.

Professor Nick McKeown not only has a strong academic reputation but is also a pioneer in the SDN industry. He has led and participated in many SDN open-source projects, including the OpenFlow protocol and NOX, the first SDN controller. He has also founded several successful SDN companies, including Nicira (acquired by VMware) and Barefoot Networks (acquired by Intel).

P4 programmable switches’ application scenarios

The first is replacing traditional network elements. A notable example is Facebook's SilkRoad Layer 4 load balancing implementation. By flexibly scheduling on-chip resources, it achieved load balancing with up to 10 million stateful flow table entries on a Barefoot Tofino chip, with throughput reaching the Tbps level. This performance far exceeds that of the Layer 4 load balancing equipment on the market.


The second is dedicated clusters. The network switches that forward packets must cooperate with the server cluster to form a complete system, so programmable switches can take part in multi-node distributed collaboration and coordination. For example, pairwise unicast can be turned into multicast to accelerate coordination. In addition, part of the servers' logic can be offloaded to the programmable switch chip. An architecture that integrates programmable switches with server clusters can greatly optimize dedicated distributed clusters.

Third, a fabric cluster: small switches connected in a CLOS fabric that together act as one large switch. In a data center, the goal is a cloud-native fabric cluster. At present there are two main approaches to fabric control in data centers. In the first, the fabric implements only simple underlay routing and the complex logic is handled by the host or SmartNIC, as represented by SONiC. In the second, cloud network functions such as tenant isolation, load balancing, flow control, and INT are all pushed down into the fabric, as represented by ONF's Stratum.

Fourth, In-band Network Telemetry (INT), which was P4's best-known feature when it came out. It mainly addresses four pain points of intranet traffic tracking:

  • First, which path does a packet take?
  • Second, why was this path chosen, and which protocol was responsible?
  • Third, how long does the packet stay at each hop?
  • Fourth, which other flows share the same physical link?

INT takes full advantage of P4's programmability: each hop adds an INT tag to the packet, and after the last hop these tags are uploaded to backend systems, which analyze them to answer the four questions above. As a result, online packet-level visualization becomes possible. This is very important for network diagnosis, monitoring, operation, and maintenance, and it makes the network data plane increasingly transparent and observable.
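The per-hop tagging described above can be sketched as follows. This is a simplified Python model rather than the INT wire format: the `HopRecord` fields (switch ID, ingress and egress timestamps, queue depth) and the helper names are assumptions chosen for illustration, and the timestamps are made-up sample values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class HopRecord:
    """Hypothetical per-hop INT tag; real INT metadata is defined by the INT spec."""
    switch_id: str
    ingress_ns: int   # time the packet entered the switch
    egress_ns: int    # time the packet left the switch
    queue_depth: int  # queue occupancy seen by the packet


def add_int_record(stack: List[HopRecord], record: HopRecord) -> List[HopRecord]:
    """What each hop does: append its own tag to the packet's INT stack."""
    return stack + [record]


def summarize(stack: List[HopRecord]) -> Tuple[List[str], Dict[str, int], str]:
    """What the backend does: recover the path and the per-hop latency."""
    path = [r.switch_id for r in stack]
    hop_latency = {r.switch_id: r.egress_ns - r.ingress_ns for r in stack}
    slowest = max(hop_latency, key=hop_latency.get)
    return path, hop_latency, slowest


# A packet crossing three hops (sample values, not measurements).
stack: List[HopRecord] = []
stack = add_int_record(stack, HopRecord("leaf1", 1000, 1400, 2))
stack = add_int_record(stack, HopRecord("spine1", 2000, 5000, 40))
stack = add_int_record(stack, HopRecord("leaf2", 6000, 6300, 1))

path, hop_latency, slowest = summarize(stack)
print(path, hop_latency, slowest)
```

The path and per-hop latency fall out of the tags directly; questions such as which other flows share a link would need the backend to correlate records from many packets.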

To summarize:

  • NF: replacing or even optimizing the implementation of traditional network elements, such as load balancing, security, distributed denial-of-service (DDoS) defense, firewalls, cloud gateways, and TAP network packet brokers.
  • Cluster: accelerating specific distributed clusters with Barefoot Tofino programmable switches. For example, NetCache accelerates distributed key-value stores, NetChain accelerates distributed coordination, and SwitchML accelerates machine learning.
  • Fabric: building a data center switch with the CLOS architecture. The goal is to use network slicing and intranet load balancing to achieve cloud-native fabric clusters.
  • Telemetry: online diagnosis and visualization of the data plane, keeping even an ultra-high-speed data plane observable.

What’s the relationship between P4 programmable switches and Intel Tofino1/2?

The application scenarios above need strong hardware support to be realized. The Intel Tofino1 and Intel Tofino2 chips are specially designed to support the P4 language.
If you think of a P4 program as an instruction manual for a network device, then the Tofino chip is the tool that understands and executes those instructions. The instructions tell the chip how to deal with network traffic, such as how to forward packets and how to filter them. This allows you to customize the network device to your needs without having to replace the hardware.
Hardware and software synergy: Intel Tofino1 and Intel Tofino2 provide hardware platforms that support P4 programmability. This means that network operators can write P4 programs to define packet processing behavior on Tofino-based switches.
Flexibility: The combination of P4 and Tofino chips enables a high degree of flexibility and customization, allowing operators to implement new protocols, modify existing protocols, and innovate in ways not possible with traditional fixed-function switches.
Performance: Both Intel Tofino1 and Intel Tofino2 provide the high-performance packet processing that is critical for modern data centers and networks requiring low latency and high throughput.

What is the difference between Intel Tofino1 and Intel Tofino2?

Intel Tofino1 and Intel Tofino2 are generations of programmable Ethernet switch ASICs developed by Barefoot Networks, which was acquired by Intel. Both are part of the Tofino family, designed for high-performance, flexible and customizable network applications. Here are the key differences between Intel Tofino1 and Intel Tofino2 :

Throughput and Port Density
  • Intel Tofino1: Offers up to 6.5 Tbps of throughput.
  • Intel Tofino2: Provides up to 12.8 Tbps of throughput, nearly doubling the capacity of Tofino1.

Port Configurations
  • Intel Tofino1: Supports up to 256 x 25 Gbps ports or 65 x 100 Gbps ports.
  • Intel Tofino2: Supports up to 512 x 25 Gbps ports or 128 x 100 Gbps ports, significantly increasing port density.

Process Node
  • Intel Tofino1: Built on a 16nm process technology.
  • Intel Tofino2: Utilizes a more advanced 7nm process technology, leading to better power efficiency and higher performance.

Programmability and Flexibility
Both Tofino1 and Tofino2 are fully programmable using the P4 programming language, allowing for extensive customization of packet processing functions. However, Tofino2 includes enhancements in its architecture that enable more complex and efficient P4 programs, accommodating a broader range of use cases and optimizations.

Compared to Tofino1, Tofino2 also adds some advanced capabilities. It introduces new features and enhancements such as improved telemetry, advanced load balancing, and enhanced support for new protocols, building upon the capabilities of Tofino1. In addition, Tofino2 has more on-chip memory and buffering than Tofino1, which allows for better handling of high-throughput and latency-sensitive applications.

There are concerns about the future of P4 programmable switches following Intel’s announcement that it is discontinuing development of the Tofino family of chips. Asterfusion’s take on this issue is set out at the end of the article, so read on if interested.

Application Scenarios

Tofino1: Suitable for high-performance networking applications where programmability and customization are key, such as data centers and enterprise networks.
Tofino2: Extends the use cases of Tofino1 to include more demanding environments requiring higher throughput, greater scalability, and more complex processing tasks.
In summary, Tofino2 builds upon the foundation established by Tofino1 by offering higher performance, increased port density, improved power efficiency, and enhanced programmability. The transition from a 16nm to a 7nm process node also contributes to the significant advancements in Tofino2, making it more suitable for modern, high-demand networking environments.

Asterfusion P4 Programmable Switches based on Barefoot Tofino

Asterfusion offers 3.3 Tbps to 12.8 Tbps programmable network switches based on Intel Tofino. They are well suited to leaf/spine fabrics and to smart gateways in data center, enterprise, and cloud service provider networks.

Highlight 1: When P4 meets DPU, Intel Tofino encounters Marvell Octeon

The Asterfusion X-T series is a unique line of P4 switches designed to combine high-performance, programmable L2 to L4 switching with extensible stateful processing power from a DPU, for the first time in network history.

It pairs Intel Tofino switching silicon with Marvell OcteonTX 9/10 DPUs over an 800G data path connection. Asterfusion combines a P4-based data path on the Tofino switch with DPDK-based traffic processing on the ARM64 DPU to provide large stateful tables for load balancing, NAT, and NVMe over Fabrics applications.

X-T programmable switches adopt an integrated computing and network architecture that offers the general programmability of a CPU while retaining the extreme performance of the Intel Tofino ASIC. Combining a Tbit-level fast path for high-performance wire-speed forwarding with a slow path for in-depth data processing enables in-depth service processing and application offloading, and improves the overall computing power and efficiency of the data center.
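One common way to combine a fast path and a slow path is sketched below in Python. It is a generic illustration under stated assumptions, not a description of how the X-T series is actually implemented: the `FastPath` and `SlowPath` classes, the round-robin backend choice, and the idea that the slow path installs an entry in the fast-path table after its first decision are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# 5-tuple flow key: (src_ip, dst_ip, src_port, dst_port, protocol)
FlowKey = Tuple[str, str, int, int, str]


class FastPath:
    """Stands in for the switch ASIC: a plain exact-match flow table."""
    def __init__(self) -> None:
        self.flow_table: Dict[FlowKey, str] = {}

    def lookup(self, key: FlowKey) -> Optional[str]:
        return self.flow_table.get(key)


class SlowPath:
    """Stands in for the DPU: makes the stateful decision, then offloads it."""
    def __init__(self, backends: List[str]) -> None:
        self.backends = backends
        self.sessions: Dict[FlowKey, str] = {}

    def handle(self, key: FlowKey, fast_path: FastPath) -> str:
        backend = self.sessions.get(key)
        if backend is None:
            # Round-robin over backends for each new session (illustrative policy).
            backend = self.backends[len(self.sessions) % len(self.backends)]
            self.sessions[key] = backend
        fast_path.flow_table[key] = backend  # later packets stay on the fast path
        return backend


def process(key: FlowKey, fast: FastPath, slow: SlowPath) -> Tuple[str, str]:
    """Return (backend, path_taken) for one packet of the given flow."""
    backend = fast.lookup(key)
    if backend is not None:
        return backend, "fast"
    return slow.handle(key, fast), "slow"


fast, slow = FastPath(), SlowPath(["backend-0", "backend-1"])
flow_a: FlowKey = ("10.0.0.1", "10.0.1.1", 40000, 80, "tcp")
flow_b: FlowKey = ("10.0.0.2", "10.0.1.1", 40001, 80, "tcp")

first = process(flow_a, fast, slow)   # first packet goes through the slow path
second = process(flow_a, fast, slow)  # later packets hit the fast path
other = process(flow_b, fast, slow)   # a new flow gets the next backend
print(first, second, other)
```

The point of the split is that only the first packet of a flow pays for the flexible, stateful decision, while the rest are forwarded by a simple table lookup.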

Highlight 2: Ongoing Expert Support

Asterfusion has been researching open-source networking since 2014 and has accumulated a wealth of expertise and R&D experience, which enables it to provide expert-level support for the problems encountered during open networking development.

Highlight 3: Barefoot Tofino based P4 switches support various innovative application scenarios

Tofino programmable switches support the development of a wide range of network applications. Based on the integrated computing and network architecture and expert support, the Barefoot Tofino P4 series programmable hardware platform can cope with the challenges of various innovative application scenarios.

  • Traffic scheduling gateway: precise port rate limiting and traffic scheduling across main and backup links, with high performance that delivers a strong ROI.
  • NFV gateways: state-based load balancing and state-based network address translation, reducing the burden on data centers.
  • Large/small flow separation: meets both the high bandwidth needs of large flows and the high concurrency needs of small flows.
  • Distributed INT-driven intelligent network optimization: provides local, real-time network telemetry information, improving the overall user experience of the application system.

Asterfusion Tofino based P4 Programmable switches’ software choices:

  • BSP patches for SONiC community version
  • DPDK, VPP and VirtIO on the Octeon TX DPU

Does P4 programmability still have a future after Intel stopped further development of the Tofino switching chip?

At the beginning of 2023, Intel announced that it would stop further development of the Tofino switching chip. Does that mean Tofino switches can no longer operate and that P4 programmability has no future?

No. On the one hand, P4 brings real innovative value to network programmability, especially in the border gateway scenario, and P4 switches have already seen many real-world industrial deployments.

On the other hand, Intel ceasing further development of the Tofino series does not amount to a complete shutdown of production; the released products are still being sold and maintained as normal.

The closer a device sits to the core backbone, the higher its throughput requirements and the lower its requirements for business flexibility.

In border and gateway scenarios, which must handle more complex network interconnection, the business flexibility that P4 brings matters more. There, Tofino's multi-terabit performance is more than enough, and bandwidth as large as that of Tofino2 is not even needed.

P4 programmable technology still has a wide range of application prospects, mainly reflected in the following aspects:

  1. Diversified hardware support: In addition to Intel Tofino, other vendors, such as Xilinx, provide hardware that supports P4 programming, which means that P4 programming can continue to develop on different hardware platforms.
  2. Flexible Network Functions: The P4 programming language allows network engineers to customize the packet processing logic according to specific needs, thus enabling more flexible and efficient network functions.
  3. Community and Ecosystem: The P4 programming language has an active developer community and ecosystem that continues to drive technological advancement and application innovation.
  4. Emerging Application Scenarios: With the development of emerging technologies such as 5G, IoT, and edge computing, the demand for flexible and programmable networks will further increase, and the P4 programming language has broad application prospects in these areas.

Asterfusion's Barefoot Tofino based P4 programmable switches help customers who need to program their network data plane, especially teams that have the in-house expertise to program networking chips.

Asterfusion offers generous educational discounts to academic and research programs for P4 programmable switch research and experiments.

For inquiries, contact bd@cloudswit.ch.
