.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) OPNFV, Intel Corporation and others.

.. OPNFV SAMPLEVNF Documentation design file.
===================================
SampleVNF Highlevel Design
===================================
This project provides a placeholder for various sample VNF (Virtual Network
Function) development, which includes example reference architectures and
optimization methods related to VNF/network services for high performance VNFs.
This project benefits other OPNFV projects such as Functest, Models and
Yardstick, enabling real-life, use-case based testing and NFVI characterization.
The sample VNFs are open source approximations* of Telco grade VNFs using
optimized VNF + NFVI infrastructure libraries, with performance characterization
of sample† traffic flows.

- \* Not a commercial product. The community is encouraged to contribute and
  close the feature gaps.
- † No vendor/proprietary workloads.

The DPDK IP Pipeline Framework provides a set of libraries to build a pipeline
application. In this document, the CG-NAT application is explained along with
its design.

This document assumes the reader possesses knowledge of DPDK concepts and the IP
Pipeline Framework. For more details, read the DPDK Getting Started Guide, DPDK
Programmers Guide and DPDK Sample Applications Guide.

These applications provide standalone, DPDK based, high performance
implementations of different Virtual Network Functions.


Common Code - L2L3 stack
------------------------
The L2L3 stack comprises a set of libraries which are commonly used by all
other VNFs. The different components of this stack are shown in the picture
below.

.. image:: l2l3-components.png

It comprises the following components:

(iii) ARP/ND & L2 adjacency Library
(iv) L3 stack components

The Interface Manager is a set of APIs which acts as a wrapper for physical
interface initialization and population. This set of APIs assists in
configuring an Ethernet device, setting up TX & RX queues, and starting the
devices. It provides various types of interfaces, such as L2 interfaces and
IPv4/IPv6 interfaces. It supports configuration (set/get) operations and status
updates (UP/DOWN) from admin or operations. It provides callback function
registration for other components that want to listen to interface status
changes. It maintains a table of all the interfaces present and provides APIs
for getting interface statistics.

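The callback-registration pattern described above can be sketched as follows.
This is a minimal Python illustration of the concept, not the actual SampleVNF
C API; all names here are hypothetical.

```python
class InterfaceManager:
    def __init__(self):
        self.interfaces = {}   # interface name -> "UP" / "DOWN"
        self.callbacks = []    # listeners interested in status changes

    def register_callback(self, cb):
        """Components that want to listen to interface status register here."""
        self.callbacks.append(cb)

    def set_status(self, name, status):
        """Admin/operational status change; notify every registered listener."""
        self.interfaces[name] = status
        for cb in self.callbacks:
            cb(name, status)


events = []
mgr = InterfaceManager()
mgr.register_callback(lambda name, status: events.append((name, status)))
mgr.set_status("eth0", "UP")
```

Any component holding a reference to the manager can react to link transitions
without the manager knowing about the component, which is the point of the
callback table.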
It provides wrapper APIs on top of DPDK's LAG (Link Aggregation) APIs. These
include creating/deleting bond interfaces and querying properties such as the
bond mode, xmit policy, link up delay and link monitor frequency.

It provides basic lock and unlock functions which should be used for
synchronization.


ARP/ND & L2 adjacency Library
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ARP/ND state machine is given in the following diagram.

.. image:: state-machine.png

This library provides APIs for handling ARP/ICMPv4 and ND/ICMPv6 packets.
It provides APIs for creating/deleting and populating an entry. It handles
ARP request/response messages and ICMPv4 echo request/response messages. It
handles ND solicitation/advertisement messages for IPv6 packets. It provides
APIs for L2 adjacency creation/deletion and retrieval based on nexthop and
port_id. It also handles gratuitous ARP.

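The L2 adjacency operations described above can be sketched as a table keyed on
the (nexthop, port_id) pair. This is an illustrative Python sketch of the
concept only; the library's real storage and function names differ.

```python
# L2 adjacency table: (nexthop_ip, port_id) -> resolved destination MAC
l2_adj = {}

def adj_add(nexthop, port_id, mac):
    """Create/populate an adjacency entry."""
    l2_adj[(nexthop, port_id)] = mac

def adj_delete(nexthop, port_id):
    """Delete an adjacency entry if present."""
    l2_adj.pop((nexthop, port_id), None)

def adj_get(nexthop, port_id):
    """Retrieve the adjacency; None means not resolved (e.g. ARP pending)."""
    return l2_adj.get((nexthop, port_id))


adj_add("192.168.1.1", 0, "00:11:22:33:44:55")
```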
Basic commands for the ARP/ND table::

    p 1 arpadd 0 <ip> <mac address>   (for adding an ARP entry)
    p 1 arpdel 0 <ip>                 (for deleting an ARP entry)
    p 1 arpreq 0 <ip>                 (for sending an ARP request)

This library provides APIs for deciding whether a packet belongs to the local
system or should be forwarded. It provides APIs for the IPv4/IPv6 local packet
out send function and for packet forwarding via the LPM lookup function.

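The local-versus-forward decision described above can be sketched as follows; a
conceptual Python illustration under assumed names, not the actual L3 stack
code.

```python
# Addresses owned by this system's interfaces (illustrative values).
local_ips = {"10.0.0.1", "10.0.1.1"}

def classify(dst_ip):
    """Deliver locally if the destination is one of our interface addresses,
    otherwise hand the packet to the forwarding (LPM lookup) path."""
    return "local" if dst_ip in local_ips else "forward"
```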

Common Code - Gateway routing
-----------------------------
The gateway common code is created to support routing functionality for both
network and directly attached interfaces. It is supported for both IPv4 and
IPv6.

The routeadd command is enhanced to support both net and host interfaces.
The net type is used to define the gateway and the host type is used for
directly attached interfaces.

The routing tables are allocated on a per-port basis, limited to MAX_PORTS.
Up to 32 route entries are supported per interface. These sizes can be changed
at compile time based on the requirement. Memory is allocated only for the
nb_ports which is configured as per the VNF application.

The next hop IP and port number are retrieved from the routing table based on
the destination IP address. The destination IP address ANDed with the mask is
looked up in the routing table for a match. The port/interface number, which is
also stored as part of the table entry, is retrieved as well.

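The masked lookup described above can be sketched as follows. This is a
conceptual Python illustration of the match step only; entries, limits and
names are illustrative (a real table is per port and capped at 32 entries).

```python
import ipaddress

# Route entries: (network, mask, next-hop IP, output port).
routes = [
    (int(ipaddress.ip_address("202.16.0.0")), 0xFFFF0000,
     "202.16.100.20", 0),
    (int(ipaddress.ip_address("172.16.0.0")), 0xFFFF0000,
     "172.16.40.20", 1),
]

def route_lookup(dst_ip):
    """AND the destination with each entry's mask and compare with the
    entry's network; return (next-hop IP, port) on a match."""
    dst = int(ipaddress.ip_address(dst_ip))
    for net, mask, nhip, port in routes:
        if (dst & mask) == net:
            return nhip, port
    return None  # no match -> default action
```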
The routing table is populated with entries when the routeadd CLI command is
executed through a script or at run time. There can be multiple routing entries
per interface.

The routeadd command will report an error if a matching entry already exists,
or if any of the parameters provided in the command are not valid, for example
if the port number is bigger than the supported number of ports/interfaces per
application.


Reference routeadd command
^^^^^^^^^^^^^^^^^^^^^^^^^^

The following are typical reference commands and syntax for adding routes using
the CLI.

::

    ;routeadd <net/host> <port #> <ipv4 nhip address in decimal> <Mask/NotApplicable>
    routeadd net 0 202.16.100.20 0xffff0000
    routeadd net 1 172.16.40.20 0xffff0000
    routeadd host 0 202.16.100.20
    routeadd host 1 172.16.40.20

    ;routeadd <net/host> <port #> <ipv6 nhip address in hex> <Depth/NotApplicable>
    routeadd net 0 fec0::6a05:caff:fe30:21b0 64
    routeadd net 1 2012::6a05:caff:fe30:2081 64
    routeadd host 0 fec0::6a05:caff:fe30:21b0
    routeadd host 1 2012::6a05:caff:fe30:2081

The following are the design requirements of the vFW.

- The firewall will examine packets and verify that they are appropriate for
  the current state of the connection. Inappropriate packets will be discarded,
  and counter(s) incremented.
- Supports both IPv4 and IPv6 traffic for TCP/UDP/ICMP.
- All packet inspection features like firewall, synproxy and connection tracker
  in this component may be turned off or on through CLI commands.
- Static filtering is done through ACL using DPDK libraries. The rules can be
  added/modified through CLI commands.
- Multiple instances of the vFW pipeline running on multiple cores should be
  supported for performance scaling.
- Should follow the DPDK IP pipeline framework.
- Should use the DPDK libraries and functionalities for better performance.
- Memory should be allocated in hugepages using DPDK RTE calls for better
  performance.

The firewall performs basic filtering of malformed packets and dynamic packet
filtering of incoming packets using the connection tracker library.
The connection data will be stored using a DPDK hash table. There will be one
entry in the hash table for each connection. The hash key will be based on the
source address/port, destination address/port, and protocol of a packet. The
hash key will be processed to allow a single entry to be used regardless of
which direction the packet is flowing (thus swapping source and destination).
The ACL is implemented as a library statically linked to the vFW, and is used
for rule based packet filtering.

TCP connections and UDP pseudo connections will be tracked separately even if
the addresses and ports are identical; including the protocol in the hash key
keeps them in separate entries.

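The direction-independent key described above can be sketched as follows. This
is a conceptual Python illustration, not the actual DPDK hash implementation:
ordering the two (address, port) endpoints before building the key makes both
directions of a connection map to the same entry, while keeping the protocol in
the key separates TCP and UDP flows with identical addresses and ports.

```python
def conn_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Build a hash key that is identical for both directions of a flow."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    lo, hi = (a, b) if a <= b else (b, a)   # canonical endpoint ordering
    return (lo, hi, proto)
```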
The input FIFO contains all the incoming packets for vFW filtering. The vFW
filter has no dependency on which component has written to the input FIFO.
Packets will be dequeued from the FIFO in bulk for processing by the vFW.
Packets will be enqueued to the output FIFO.

Software or hardware load balancing can be used for traffic distribution
across multiple worker threads. Hardware load balancing requires Ethernet
flow director support from the hardware (e.g. a Fortville X710 NIC).
The input and output FIFOs will be implemented using DPDK ring buffers.

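The bulk dequeue/enqueue flow described above can be sketched as follows, with
``collections.deque`` standing in for a DPDK ring buffer. This is a conceptual
Python illustration only; burst size and names are illustrative.

```python
from collections import deque

input_fifo = deque(range(10))   # ten dummy "packets"
output_fifo = deque()
BULK = 4                        # burst size for bulk dequeue

def process_burst():
    """Dequeue up to BULK packets, process them, enqueue to the output FIFO."""
    burst = [input_fifo.popleft() for _ in range(min(BULK, len(input_fifo)))]
    # ... vFW filtering would run on the burst here ...
    output_fifo.extend(burst)
    return len(burst)
```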
In the vFW, each component is constructed using packet framework pipelines.
It includes the Rx and Tx driver, Master pipeline, load balancer pipeline and
vFW worker pipeline components. A pipeline framework is a collection of input
ports, table(s), output ports and actions (functions).


Receive and Transmit Driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets will be received in bulk and provided to the load balancer (LB) thread.
Transmit takes packets from the worker threads in a dedicated ring and sends
them to the hardware queue.

The Master component is part of all the IP Pipeline applications. This
component does not process any packets and should be configured on Core 0,
leaving the other cores for processing of the traffic. This component is
responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for user control/debug
3. Propagating the commands from the user to the corresponding components

The ARPICMP pipeline processes the ARP and ICMP packets.

The TXRX pipelines are pass-through pipelines that forward both ingress
and egress traffic to the load balancer. They are required when the software
load balancer is used.


Load Balancer Pipeline
^^^^^^^^^^^^^^^^^^^^^^
The vFW supports both hardware and software load balancing of traffic across
multiple VNF threads. Hardware load balancing requires support from the
hardware, such as Flow Director, for steering packets to the application
through the NIC queues.

The software load balancer is used if hardware load balancing can't be used
for any reason. The TXRX pipelines along with the LOADB pipeline provide
support for software load balancing by distributing the flows to multiple vFW
worker threads.

The load balancer (HW or SW) distributes traffic based on the 5-tuple (src
addr, src port, dest addr, dest port and protocol), applying XOR logic when
distributing to active worker threads, thereby maintaining affinity of flows
to worker threads.

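The XOR-based distribution described above can be sketched as follows. This is
a conceptual Python illustration, not the SampleVNF implementation: the 5-tuple
fields are folded together with XOR and the result selects one of the active
worker threads, so every packet of a given flow lands on the same worker.

```python
import zlib

def pick_worker(src_ip, src_port, dst_ip, dst_port, proto, n_workers):
    """Fold the 5-tuple with XOR and map the result onto a worker index."""
    h = 0
    for field in (src_ip, src_port, dst_ip, dst_port, proto):
        h ^= zlib.crc32(str(field).encode())
    return h % n_workers
```

Because the fold is deterministic, repeated packets of the same flow always
resolve to the same worker, which is what flow affinity requires.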
The vFW performs the basic packet filtering and will drop invalid and
malformed packets. Dynamic packet filtering is done using the connection
tracker library. The packets are processed in bulk, and a hash table is used
to maintain the connection details.
Every TCP/UDP packet is passed through the connection tracker library to check
for a valid connection. The ACL library integrated into the firewall provides
rule based filtering.

This application implements vCGNAPT. The idea of vCGNAPT is to extend the life
of the service provider's IPv4 network infrastructure and mitigate IPv4 address
exhaustion by using address and port translation at large scale. It processes
the traffic in both directions.

It also supports connectivity between an IPv6 access network and an IPv4 data
network using IPv6-to-IPv4 address translation and vice versa.

This application provides a standalone DPDK based high performance vCGNAPT
Virtual Network Function implementation.

The vCGNAPT VNF currently supports the following functionality:

- ARP (request, response, gratuitous)
- ICMP (terminal echo, echo response, passthrough)
- UDP, TCP and ICMP protocol passthrough
- Multithread support
- Multiple physical port support
- Limiting max ports per client
- Limiting max clients per public IP address
- Live session tracking of NAT flows

The upstream path defines the traffic from private to public and the
downstream path defines the traffic from public to private. The vCGNAPT has
the same set of components to process upstream and downstream traffic.

In the vCGNAPT application, each component is constructed using the IP Pipeline
framework. It includes the Master pipeline component, the load balancer
pipeline component and the vCGNAPT worker pipeline component.

A pipeline framework is a collection of input ports, table(s), output ports
and actions (functions). In the vCGNAPT pipeline, the main sub-components are
the in-port function handler, the table and the table function handler.
vCGNAPT rules will be configured in the table, which translates egress and
ingress traffic according to the physical port on which the packet arrived.
The actions can be forwarding to the output port (either egress or ingress) or
dropping the packet.

The idea of vCGNAPT is to extend the life of the service provider's IPv4
network infrastructure and mitigate IPv4 address exhaustion by using address
and port translation at large scale. It processes the traffic in both
directions. ::

      +------------------+
      | Private consumer |  CPE ----+
      | IPv4 traffic     +-----+    |
      +------------------+          |
               +-------------------+          +------------------+
           |-> |   Private IPv4    |  vCGNAPT |  Public          |
           |-> |   access network  |  NAT44   |  IPv4 traffic    |
               +-------------------+          +------------------+
      +------------------+          |
      | Private consumer |  CPE ----+
      | IPv4 traffic     +-----+
      +------------------+

      Figure: vCGNAPT deployment in Service provider network


Components of vCGNAPT
---------------------
In vCGNAPT, each component is constructed as a packet framework. It includes
the Master pipeline component, driver, load balancer pipeline component and
vCGNAPT worker pipeline component. A pipeline framework is a collection of
input ports, table(s), output ports and actions (functions).


Receive and transmit driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets will be received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker thread in a dedicated ring and sends
them to the hardware queue.


Master pipeline
^^^^^^^^^^^^^^^
This component does not process any packets and should be configured on
Core 0, to save cores for other components which process traffic. The
component is responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
4. Handling ARP and ICMP packets


Load Balancer pipeline
^^^^^^^^^^^^^^^^^^^^^^
The load balancer is part of the multi-threaded vCGNAPT release which
distributes the flows to multiple vCGNAPT worker threads.

It distributes traffic based on the 2-tuple or 5-tuple (source address, source
port, destination address, destination port and protocol), applying XOR logic
to distribute the load to active worker threads, thereby maintaining affinity
of flows to worker threads.

The tuple can be modified/configured using the configuration file.


vCGNAPT static
^^^^^^^^^^^^^^
The vCGNAPT component performs translation of private IP & port to public IP &
port at the egress side and public IP & port to private IP & port at the
ingress side, based on the NAT rules added to the pipeline hash table. The NAT
rules are added to the hash table via user commands. The packets that have a
matching egress key or ingress key in the NAT table will be processed to
change IP & port and will be forwarded to the output port. A default action is
taken for packets that do not have a match; the default action may result in
the packet being dropped.


vCGNAPT dynamic
^^^^^^^^^^^^^^^
The vCGNAPT component performs translation of private IP & port to public IP &
port at the egress side and public IP & port to private IP & port at the
ingress side, based on the NAT rules added to the pipeline hash table. The
dynamic nature of vCGNAPT refers to the addition of NAT entries to the hash
table dynamically when a new packet arrives. The NAT rules will be added to
the hash table automatically when there is no matching entry in the table and
the packet is circulated through a software queue. The packets that have a
matching egress key or ingress key in the NAT table will be processed to
change IP & port and will be forwarded to the output port defined in the
entry.

Dynamic vCGNAPT also acts as a static one; NAT entries can be added
statically. The static NAT entries' port range must not conflict with the
dynamic NAT port range.

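The dynamic entry creation described above can be sketched as follows. This is
a conceptual Python illustration only; the public IP, port range and function
names are hypothetical. An egress packet with no matching entry gets a public
(IP, port) allocated from the dynamic range, and both egress and ingress keys
are installed so later packets in either direction hit the table. Static
entries would use a non-overlapping port range.

```python
PUBLIC_IP = "202.16.100.1"              # illustrative public address
dyn_ports = iter(range(40000, 65536))   # illustrative dynamic port range
egress_map, ingress_map = {}, {}

def translate_egress(priv_ip, priv_port):
    """Private -> public: learn a new binding on a table miss."""
    key = (priv_ip, priv_port)
    if key not in egress_map:
        pub_port = next(dyn_ports)
        egress_map[key] = (PUBLIC_IP, pub_port)
        ingress_map[(PUBLIC_IP, pub_port)] = key
    return egress_map[key]

def translate_ingress(pub_ip, pub_port):
    """Public -> private: None means no binding (default action)."""
    return ingress_map.get((pub_ip, pub_port))
```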

vCGNAPT Static Topology
-----------------------
::

    IXIA(Port 0) --> (Port 0)VNF(Port 1) --> (Port 1)IXIA

Egress --> The packets sent out from IXIA (port 0) will be CGNAPTed to IXIA (port 1).

Ingress --> The packets sent out from IXIA (port 1) will be CGNAPTed to IXIA (port 0).


vCGNAPT Dynamic Topology (UDP_REPLAY)
-------------------------------------
::

    IXIA(Port 0) --> (Port 0)VNF(Port 1) --> (Port 0)UDP_REPLAY

Egress --> The packets sent out from IXIA will be CGNAPTed to L3FWD/L4REPLAY.

Ingress --> The L4REPLAY, upon reception of packets (private to public
network), will immediately replay the traffic back to the IXIA interface
(public to private).

1. Install ISB on the L4Replay server.
2. Run UDP_Replay::

       ./UDP_Replay -c core_mask -n no_of_channels -- -p PORT_MASK --config="(port,queue,lcore)"

   e.g.::

       ./UDP_Replay -c 0xf -n 4 -- -p 0x3 --config="(0,0,1)"

This application implements an Access Control List (ACL). An ACL is typically
used for rule based policy enforcement. It restricts access to a destination
IP address/port based on various header fields, such as the source IP
address/port, destination IP address/port and protocol. It is built on top of
DPDK and uses the packet framework infrastructure.

This application provides a standalone DPDK based high performance ACL Virtual
Network Function implementation.

The ACL filter performs bulk filtering of incoming packets based on the rules
in the current ruleset, discarding any packets not permitted by the rules. The
mechanisms needed for building the rule database and performing lookups are
provided by the DPDK API: http://dpdk.org/doc/api/rte__acl_8h.html

The input FIFO contains all the incoming packets for ACL filtering. Packets
will be dequeued from the FIFO in bulk for processing by the ACL. Packets will
be enqueued to the output FIFO. The input and output FIFOs will be implemented
using DPDK ring buffers.

The DPDK ACL example at
http://dpdk.org/doc/guides/sample_app_ug/l3_forward_access_ctrl.html#figure-ipv4-acl-rule
contains a suitable syntax and parser for ACL rules.

In the ACL application, each component is constructed as a packet framework.
It includes the Master pipeline component, driver, load balancer pipeline
component and ACL worker pipeline component. A pipeline framework is a
collection of input ports, table(s), output ports and actions (functions).


Receive and transmit driver
---------------------------
Packets will be received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker thread in a dedicated ring and sends
them to the hardware queue.


Master pipeline
---------------
This component does not process any packets and should be configured on
Core 0, to save cores for other components which process traffic. The
component is responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
4. Handling ARP and ICMP packets


Load balancer pipeline
----------------------
The load balancer is part of the multi-threaded ACL release which distributes
the flows to multiple ACL worker threads.

It distributes traffic based on the 5-tuple (source address, source port,
destination address, destination port and protocol), applying XOR logic to
distribute the load to active worker threads, thereby maintaining affinity of
flows to worker threads.


ACL pipeline
------------
Visit the following links for the DPDK ACL library implementation:

- http://dpdk.org/doc/api/rte__acl_8h.html
- http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html

It provides a shadow copy of the ruleset for runtime rule configuration
support and implements policy based packet forwarding.

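The rule-based filtering described above can be sketched as follows. This is a
conceptual Python illustration of first-match 5-tuple classification, not the
DPDK ``rte_acl`` API; the rules and field names here are hypothetical.

```python
# Rules: (src_ip, dst_ip, proto, src_port, dst_port, action).
# None acts as a wildcard; the first matching rule's action is applied.
rules = [
    (None, "10.0.0.5", "TCP", None, 22, "deny"),
    (None, None, "TCP", None, None, "permit"),
]

FIELDS = ("src_ip", "dst_ip", "proto", "src_port", "dst_port")

def acl_lookup(pkt):
    """Return the action of the first matching rule; default is discard."""
    for *fields, action in rules:
        if all(f is None or pkt[k] == f for f, k in zip(fields, FIELDS)):
            return action
    return "deny"  # packets not permitted by any rule are discarded
```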
An Edge Router typically sits between two networks, such as the provider core
network and the provider access network. In the diagram below, the Customer
Edge (CE) router sits in the provider access network and the MPLS cloud
network represents the provider core network.
The Edge Router processes the traffic in both directions. The functionality of
the Edge Router varies while processing each direction of traffic. The packets
going to the core network will be filtered, classified and metered with QoS
parameters. The packets going to the access network will be shaped according
to the traffic profile.

The idea of the Edge Router application is to provide a benchmark for the
functionality of Provider Edge routers in each direction.

The DPDK IP Pipeline Framework provides a set of libraries to build a pipeline
application. The Provider Edge Router functionality delivered as a virtual
network function (VNF) is integrated with DPDK and optimized for Intel
hardware.

This document assumes the reader possesses knowledge of DPDK concepts and the
IP Pipeline Framework. For more details, read the DPDK Getting Started Guide,
DPDK Programmers Guide and DPDK Sample Applications Guide.

This application provides a standalone DPDK based high performance Provider
Edge Router Network Function implementation.

The Edge Router application processes the traffic between the customer and the
core network.

The upstream path defines the traffic from customer to core and the downstream
path defines the traffic from core to customer. The Edge Router has different
sets of components to process upstream and downstream traffic.

In the Edge Router application, each component is constructed as building
blocks in the IP Pipeline framework. As in the pipeline framework, each
component has its own input ports, table and output ports. The rules of the
component will be configured in the table, which decides the path of the
traffic and any action to be performed on the traffic. The actions can be
forwarding to the output port, forwarding to the next table, or dropping the
packet. For more details, please refer to Section 24 of the DPDK Programmers
Guide (3).

The core-to-customer traffic is referred to as downstream. For downstream
processing, the Edge Router has the following functionalities::

    ---> Packet Rx --> Routing --> Traffic Manager --> Packet Tx --->

Routing

- To identify the route based on the destination IP.
- To provide the QinQ label based on the destination IP.
- Updates the MAC address based on the route entry.
- Appends the QinQ label based on the route entry.

Traffic Manager

- To perform QoS traffic management (5-level hierarchical scheduling) based on
  the predefined set of Service Level Agreements (SLAs).
- SVLAN, CVLAN and DSCP fields are used to determine transmission priority.
- The Traffic Manager profile which contains the SLA parameters is provided as
  part of the application.

The customer-to-core traffic is referred to as upstream. For upstream
processing, the Edge Router has the following functionalities::

    ---> Packet Rx --> ACL filters --> Flow control --> Metering Policing &
         Marking --> Routing --> Queueing & Packet Tx --->

ACL filters

- To filter out unwanted packets based on the defined ACL rules.
- Source IP, destination IP, protocol, source port and destination port are
  used to derive the ACL rules.

Flow control

- To classify the packet based on the QinQ label.
- To assign a specific flow ID based on the classification.

Metering, Policing & Marking

- Two stages of QoS traffic metering and policing are applied.
- The 1st stage is performed per flow ID using the trTCM algorithm.
- The 2nd stage is performed per flow ID traffic class using the trTCM
  algorithm.
- Packets will be either dropped or marked Green, Yellow or Red based on the
  metering result.

Routing

- To identify the route based on the destination IP.
- To provide the MPLS label to the packets based on the destination IP.
- Updates the MAC address based on the route entry.
- Appends the MPLS label based on the route entry.
- Updates the packet color in the MPLS EXP field of each MPLS header.

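The trTCM marking step described above can be sketched as follows. This is a
deliberately simplified, color-blind Python illustration in the style of RFC
2698: a peak bucket and a committed bucket are checked per packet. Token
refill over time is omitted for brevity, and the bucket sizes are illustrative
values, not the application's SLA parameters.

```python
class TrTCM:
    """Simplified two-rate three-color marker (refill omitted)."""

    def __init__(self, cbs, pbs):
        self.tc = cbs   # committed bucket tokens (bytes)
        self.tp = pbs   # peak bucket tokens (bytes)

    def mark(self, pkt_len):
        if self.tp < pkt_len:        # exceeds peak rate budget -> Red
            return "red"
        self.tp -= pkt_len
        if self.tc < pkt_len:        # exceeds committed budget -> Yellow
            return "yellow"
        self.tc -= pkt_len
        return "green"               # within committed budget


meter = TrTCM(cbs=1500, pbs=3000)
```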
The vPE has downstream and upstream pipelines controlled by the Master
component. The Edge Router processes two different types of traffic through
its pipelines:

I. Downstream (Core-to-Customer)

   1. Receives TCP traffic from the core
   2. Routes the packet based on the routing rules
   3. Performs traffic scheduling based on the traffic profile

      a. QoS scheduling is performed using the token bucket algorithm.
         SVLAN, CVLAN and DSCP fields are used to determine transmission
         priority.

   4. Appends the QinQ label to each outgoing packet

II. Upstream (Customer-to-Core)

   1. Receives QinQ labelled TCP packets from the customer
   2. Removes the QinQ label
   3. Classifies the flow using the QinQ label and applies QoS metering

      a. 1st stage QoS metering is performed with the flow ID using the trTCM
         algorithm.
      b. 2nd stage QoS metering is performed with the flow ID and traffic
         class using the trTCM algorithm.
      c. The traffic class maps to the DSCP field in the packet.

   4. Routes the packet based on the routing rules
   5. Appends two MPLS labels to each outgoing packet


Master Pipeline
---------------
The Master component is part of all the IP Pipeline applications. This
component does not process any packets and should be configured on Core 0, to
save cores for other components which process traffic. The component is
responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components


Upstream and Downstream Pipelines
---------------------------------
The downstream has Firewall, Pass-through, Metering and Routing pipelines.
The upstream has Pass-through and Routing pipelines.

To run the VNF, execute the following::

    isb_root/VNFs/vPE$ ./build/ip_pipeline -p 0x3 \
        -f config/auto_combo_1_instances_1_queues_2_ports_v2.cfg \
        -s config/auto_combo_1_instances_1_queues_2_ports_v2.txt


Prox - Packet pROcessing eXecution engine
=========================================

PROX (Packet pROcessing eXecution engine) is a DPDK application that can
perform operations on packets in a highly configurable manner. The PROX
application also displays performance statistics that can be used for
performance investigations.
Intel® DPPD PROX is an application built on top of DPDK which allows creating
software architectures, such as the one depicted below, through small and
readable configuration files.

.. image:: images/prox-qo-img01.png

The figure shows that each core is executing a set of tasks. Currently, a task
can be any one of the following:

3. Basic Forwarding (no touch)
4. L2 Forwarding (change MAC)
6. Load balancing based on packet fields
7. Symmetric load balancing
8. QinQ encap/decap IPv4/IPv6

One of the example configurations distributed with the source code is a Proof
of Concept (PoC) implementation of a Broadband Network Gateway (BNG) with
Quality of Service (QoS). The software architecture for this PoC is presented
below.

.. image:: images/prox-qo-img02.png

The display shows per-task statistics through an ncurses interface.
Statistics include: estimated idleness; per-second statistics for packets
received, transmitted or dropped; per-core cache occupancy; and cycles per
packet. These statistics can help pinpoint bottlenecks in the system.
This information can then be used to optimize the configuration.
Other features include debugging support, scripting and Open vSwitch support.
A screenshot of the display is provided below.

.. image:: images/prox-screen-01.png