What is DPDK?
DPDK, or Data Plane Development Kit, is a powerful tool designed to enhance the performance of network applications by providing a streamlined framework for data plane operations. Essentially, it allows for the direct mapping of packets from network cards to user space, bypassing the kernel to achieve faster processing speeds. This capability is crucial for companies aiming to build efficient networking applications, such as virtualized services. In this context, VPP (Vector Packet Processing) operates in user space and often uses DPDK as a method for packet input, acting as an application that leverages DPDK’s capabilities for packet processing.
Why Do We Need DPDK?
DPDK (Data Plane Development Kit) is essential for accelerating packet processing on multi-core CPUs, enabling the creation of faster and more efficient networking applications. By offering a robust programming framework, DPDK empowers developers to handle high-speed data traffic effectively, making it a critical tool in high-performance environments like telecommunications and data centers.
Advantages of DPDK
DPDK is renowned for its robust features that cater to high-performance networking needs:
- User-Space Packet Processing: By executing operations in user space, DPDK reduces latency and boosts throughput, avoiding the overhead of kernel-space processing.
- Poll Mode Drivers (PMDs): These drivers continuously poll network interfaces for incoming packets, enhancing performance by eliminating the need for interrupts.
- Advanced Memory Management: DPDK includes mechanisms for efficient memory management, such as huge pages and memory pools, optimizing buffer allocation and deallocation.
- Multicore Scalability: Designed to exploit multicore architectures, DPDK allows parallel processing across multiple cores, significantly increasing performance.
- Support for Multiple NICs: It supports various vendor network interface cards, ensuring adaptability across different environments.
- Flexible Packet Processing Framework: DPDK provides libraries for both stateless and stateful packet processing, enabling the development of custom networking applications.
- High Throughput and Low Latency: Focused on performance optimization, DPDK achieves high throughput rates while minimizing latency, essential for telecom and cloud services.
- Rich Ecosystem: With a vibrant community and numerous contributed libraries, DPDK’s functionality is continually enhanced, making it easier to use in diverse scenarios.
- Open Source: As an open-source project, DPDK allows developers to modify and improve the code, ensuring ongoing updates and feature enhancements.
These features make DPDK ideal for applications requiring fast and efficient packet processing, such as routers, firewalls, and other network appliances.
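The poll-mode and memory-pool ideas above can be sketched with a toy simulation (plain Python, not DPDK's actual C API — `Mempool` and `rx_burst` here are illustrative stand-ins for what DPDK implements in `rte_mempool` and `rte_eth_rx_burst`):

```python
from collections import deque

class Mempool:
    """Preallocated buffer pool: buffers are recycled, never freed back to the OS."""
    def __init__(self, n, buf_size):
        self.free = deque(bytearray(buf_size) for _ in range(n))
    def get(self):
        return self.free.popleft() if self.free else None
    def put(self, buf):
        self.free.append(buf)

def rx_burst(queue, burst=32):
    """Dequeue up to `burst` packets in one call (poll-mode: no interrupts)."""
    pkts = []
    while queue and len(pkts) < burst:
        pkts.append(queue.popleft())
    return pkts

# Usage: the application spins on rx_burst instead of sleeping on interrupts.
pool = Mempool(n=4, buf_size=2048)
nic_queue = deque(b"pkt%d" % i for i in range(100))  # stand-in for a NIC RX ring
processed = 0
while True:
    pkts = rx_burst(nic_queue, burst=32)
    if not pkts:
        break  # a real PMD would keep spinning; we stop once the queue drains
    for p in pkts:
        buf = pool.get()   # buffer comes from the pool, no per-packet malloc
        # ... parse/forward p using buf ...
        pool.put(buf)      # return the buffer for reuse
        processed += 1
print(processed)  # 100
```

The point of the pool is that buffer allocation cost is paid once at startup; the point of the burst is that one function call amortizes queue-access overhead over many packets.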
What is Vector Packet Processing (VPP)?
Vector Packet Processing (VPP) is a cutting-edge component of the Fast Data — Input/Output (FD.io) project, under the Linux Foundation’s umbrella. Its mission is to deliver a high-speed, user-space network stack that efficiently operates on common architectures like x86, Arm, and Power.
VPP excels by processing the largest available vector of packets from the network I/O layer and routing them through a sophisticated Packet Processing graph. By integrating with the Data Plane Development Kit (DPDK), VPP gains direct access to hardware network card resources, significantly boosting its processing capabilities.
Why Do We Need VPP?
Traditional packet processing methods handle packets one at a time, which can be inefficient under high network I/O conditions. VPP, however, leverages vector processing to offer several advantages:
- Batch Processing of Multiple Packets: VPP groups packets into a “vector” and processes them simultaneously at each node. This reduces the overhead of resource preparation and context switching, dramatically increasing per-packet processing speed.
- Utilizing SIMD Parallel Processing: Modern CPUs support Single Instruction Multiple Data (SIMD), allowing a single instruction to operate on multiple data points at once. This parallelism significantly enhances processing speed compared to the scalar model. For example, Arm's Scalable Vector Extension Version 2 (SVE2), supported by the Neoverse N2 cores in Marvell OCTEON 10 processors, allows flexible vector lengths; at the architecture's maximum 2048-bit width, a single instruction can operate on 64 32-bit IPv4 addresses (2048 / 32 = 64).
- Optimizing Cache Usage: Vector processing takes advantage of modern CPUs' large L1/L2 caches to load multiple packets simultaneously, minimizing frequent memory access. For instance, a 64 KB L1 data cache, as found on the Marvell OCTEON 10's Neoverse N2 cores, can hold roughly 43 full 1500-byte packets (65,536 / 1500) or 3,276 20-byte IPv4 headers at once, reducing cache-memory swaps and enhancing packet processing efficiency.
In summary, vector packet processing harnesses batch processing, SIMD parallelism, and optimized cache utilization to significantly enhance processing speed, making it exceptionally well-suited for high-performance networking environments.
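The batch idea can be made concrete with a minimal sketch (toy Python, not VPP's actual node API): each graph node is invoked once per *vector* of packets, so any per-node setup cost is paid once per batch rather than once per packet.

```python
# Toy packet-processing graph: each "node" handles a whole vector per call.
VECTOR_SIZE = 256  # VPP's default vector is up to 256 packets

def node_decrement_ttl(pkts):
    # per-call setup (cache warm-up, function dispatch) happens once per vector
    for p in pkts:
        p["ttl"] -= 1
    return pkts

def node_filter_ipv4(pkts):
    return [p for p in pkts if p["version"] == 4]

GRAPH = [node_decrement_ttl, node_filter_ipv4]

def process_vector(pkts):
    """Push one vector of packets through every node in the graph."""
    for node in GRAPH:
        pkts = node(pkts)
    return pkts

packets = [{"version": 4, "ttl": 64} for _ in range(1000)]
out = []
for i in range(0, len(packets), VECTOR_SIZE):
    out.extend(process_vector(packets[i:i + VECTOR_SIZE]))
print(len(out), out[0]["ttl"])  # 1000 63
```

With scalar processing, the graph would be traversed 1000 times; here it is traversed only four times (1000 / 256, rounded up), which is the overhead reduction the bullet points above describe.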
Optimizing Network Performance: Traditional Linux Stack vs. VPP
In the realm of high-performance networking, the traditional Linux network stack and Vector Packet Processing (VPP) represent two distinct approaches to packet processing. Understanding their differences is crucial for optimizing network performance.
The traditional Linux network stack operates within kernel space, prioritizing generality and software flexibility. However, it faces several bottlenecks in high-performance environments:
- Kernel-to-User Space Switching Overhead: Network data processing often requires frequent switches between user space and kernel space, introducing significant latency, especially under high traffic conditions.
- Layered Processing Overhead: The stack processes data layer-by-layer according to the OSI model. Each layer involves protocol parsing and data copying, which can lead to inefficiencies.
- Soft Interrupts and Single-Thread Limitations: The Linux kernel’s reliance on soft interrupts and single-threaded processing limits its ability to fully utilize multi-core CPUs. Even with Receive Side Scaling (RSS) distributing data flows across multiple cores, scheduling and synchronization overheads prevent full parallel processing.
Advantages of VPP’s User-Space Network Stack
VPP’s host stack, a user-space implementation of various transport, session, and application layer protocols, addresses these issues through several innovative methods:
- Reducing Context Switching: By operating in user space, VPP eliminates the need for kernel-to-user space switching. It leverages the Data Plane Development Kit (DPDK) to directly access network cards, bypassing the kernel network stack.
- Integrated Protocol Processing: VPP combines protocol layers such as TCP, IP, and Session, minimizing redundant data transfers. This integration allows protocol processing to occur in the same memory area, reducing the need for repeated data copying.
- User-Space Multi-Threaded Processing: Unlike the kernel’s single-threaded model, VPP fully utilizes modern multi-core CPUs by processing multiple data flows in parallel through a user-space thread pool. This approach reduces thread scheduling overhead and allows for more flexible task allocation, resulting in nearly linear improvements in throughput.
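One way to picture the multi-threaded model is the sketch below (toy Python; VPP's workers are core-pinned C threads, and the flow-hashing scheme here is purely illustrative). Hashing each flow to a fixed worker keeps per-flow state on one thread, so no locks are needed on that state:

```python
import threading, queue

N_WORKERS = 4  # stands in for worker threads pinned to CPU cores

def worker(inq, results):
    """Drain packets for the flows assigned to this worker; None is the stop signal."""
    while True:
        item = inq.get()
        if item is None:
            break
        flow_id, payload = item
        # per-flow byte counter: safe without locks, since a flow
        # is only ever handled by this one worker
        results[flow_id] = results.get(flow_id, 0) + len(payload)

queues = [queue.Queue() for _ in range(N_WORKERS)]
results = [{} for _ in range(N_WORKERS)]
threads = [threading.Thread(target=worker, args=(queues[i], results[i]))
           for i in range(N_WORKERS)]
for t in threads:
    t.start()

# Dispatch: hash each flow to a fixed worker (RSS-like, but in user space).
for flow_id in range(16):
    for _ in range(10):
        queues[hash(flow_id) % N_WORKERS].put((flow_id, b"x" * 100))
for q in queues:
    q.put(None)
for t in threads:
    t.join()

total = sum(sum(r.values()) for r in results)
print(total)  # 16000 bytes: 16 flows x 10 packets x 100 bytes
```

Because workers never contend for the same flow's state, adding cores adds capacity with little synchronization cost, which is the near-linear scaling claimed above.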
How Do DPDK and VPP Work Together?
DPDK bypasses the Linux kernel, performing packet processing in user space to maximize networking performance. It does this with poll-mode drivers (PMDs) running in user space, which continually check incoming packet queues for new data, achieving both high throughput and low latency. PMDs operate at the data-link layer (Layer 2).
VPP focuses on networking protocols from Layer 2 to Layer 7 and uses DPDK as its network driver. This integration combines the L2 performance of DPDK with VPP’s agility across L3 to L7:
- Direct Hardware Access: VPP uses DPDK to directly access network hardware, avoiding kernel-to-user space context switching and eliminating most kernel-related overhead.
- Direct Memory Access: VPP leverages DPDK’s memory mapping to reduce memory copies and context switches, enabling direct access to network buffers without kernel involvement.
Through this integration, VPP achieves a complete user-space network stack, significantly enhancing network processing performance. This synergy allows applications to take full advantage of DPDK’s ability to minimize the overhead associated with packet processing, resulting in faster and more efficient operations.
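The two roles can be tied together in a single loop (toy Python sketch; `rx_burst` stands in for DPDK's `rte_eth_rx_burst`, and `l3_node` for a VPP graph node — the names and the even/odd "routing" rule are illustrative only):

```python
from collections import deque

def rx_burst(q, burst=32):
    """DPDK role: poll-mode burst receive from the NIC queue."""
    out = []
    while q and len(out) < burst:
        out.append(q.popleft())
    return out

def l3_node(pkts):
    """VPP role: one graph node processing the whole vector in a single call."""
    return [p for p in pkts if p % 2 == 0]  # toy rule: "route" even-numbered packets

nic = deque(range(100))  # stand-in for a hardware RX ring
forwarded = []
while (vec := rx_burst(nic, burst=32)):
    forwarded.extend(l3_node(vec))  # the DPDK burst *is* the vector handed to the graph
print(len(forwarded))  # 50
```

The key structural point is that the burst returned by the driver layer is passed straight into the graph as one vector, so no per-packet hand-off, copy, or kernel transition sits between the two layers.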
Asterfusion: Powering Open Network Platforms with DPDK and VPP
Asterfusion leverages the power of DPDK and VPP to deliver exceptional performance on its Marvell OCTEON Tx-based open networking platforms. By providing a standard Linux OS integrated with the DPDK VPP software package, Asterfusion ensures that their solutions meet the demanding requirements of modern network infrastructures.
Since 2017, Asterfusion has been at the forefront of open network solutions, harnessing the capabilities of Marvell OCTEON products to cater to customers’ unique needs. With extensive industry experience, Asterfusion delivers tailored solutions designed to meet your specific requirements. Trust Asterfusion to enhance your network performance—contact us today to learn more!