.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) OPNFV, Intel Corporation and others.

.. OPNFV SAMPLEVNF Documentation design file.

==========================
SampleVNF Highlevel Design
==========================
The high-level design of the VNFs and the common code is explained here.
Common Code - L2L3 stack
========================

The L2L3 stack comprises a set of libraries that are commonly used by all
the VNFs.

.. image:: images/l2l3-components.png

It comprises the following components:

* ARP/ND & L2 adjacency Library
The Interface Manager is a set of APIs that act as a wrapper for physical
interface initialization and population. These APIs assist in configuring an
Ethernet device, setting up TX and RX queues, and starting the devices. The
manager provides various types of interfaces, such as L2 interfaces and
IPv4/IPv6 interfaces. It supports configuration (set/get) operations and
status updates (UP/DOWN) from admin or operations, provides callback
registration for other components that want to listen to interface status
changes, maintains a table of all the interfaces present, and provides an API
for getting interface statistics.

It also provides wrapper APIs on top of DPDK's LAG (Link Aggregation) APIs.
This includes creating/deleting BOND interfaces and querying properties such
as the bond mode, xmit policy, link up delay and link monitor frequency.
It provides basic lock and unlock functions that should be used for
synchronization.
ARP/ND & L2 adjacency Library
-----------------------------

The ARP/ND state machine is shown in the following diagram.

.. image:: images/state-machine.png

This library provides APIs for handling ARP/ICMPv4 and ND/ICMPv6 packets.
It provides APIs for creating, deleting and populating an entry. It handles
ARP request/response messages and ICMPv4 echo request/response messages, and
it handles ND solicitation/advertisement messages for IPv6 packets. It
provides APIs for L2 adjacency creation, deletion and retrieval based on
the nexthop and port_id, and it handles Gratuitous ARP.
Basic commands for the ARP/ND table::

    p 1 arpadd 0 <ip> <mac address>    (add an ARP entry)
    p 1 arpdel 0 <ip>                  (delete an ARP entry)
    p 1 arpreq 0 <ip>                  (send an ARP request)
This library provides an API for deciding whether a packet belongs to the
local system or should be forwarded. It provides an API for the IPv4/IPv6
local packet out send function, and an API for packet forwarding via an LPM
lookup function.
Common Code - Gateway routing
=============================

Gateway common code is created to support routing functionality for both
network and directly attached interfaces. It is supported for both IPv4 and
IPv6 routes.

The routeadd command is enhanced to support both net and host interfaces.
The net type is used to define a gateway, and the host type is used for
directly attached interfaces.

The routing tables are allocated on a per-port basis, limited to MAX_PORTS.
Up to 32 route entries are supported per interface. These sizes can be
changed at compile time based on the requirement. Memory is allocated only
for the nb_ports which is configured as per the VNF application
configuration.
The next-hop IP and port number are retrieved from the routing table based on
the destination IP address. The destination IP address, ANDed with the mask,
is looked up in the routing table for a match. The port/interface number
stored as part of the table entry is also retrieved.
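
A minimal model of this masked lookup is sketched below, assuming a per-port
table of (mask, masked destination, next-hop IP, port) tuples; the names and
layout are illustrative, not the actual C structures.

```python
def route_lookup(table, dst_ip):
    """AND the destination with each entry's mask and compare it with the
    stored (pre-masked) destination; return (next-hop IP, port) on a match."""
    for mask, dest, nhip, port in table:
        if dst_ip & mask == dest:
            return nhip, port
    return None

# e.g. one entry: 202.16.0.0/16 via next hop 202.16.100.20 on port 0
table = [(0xffff0000, 0xca100000, 0xca106414, 0)]
```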
The routing table is populated with entries when the routeadd CLI command is
executed through a script or at run time. There can be multiple routing
entries per interface/port.

routeadd reports an error if a matching entry already exists or if any of the
parameters provided in the command are not valid, for example if the port
number is bigger than the supported number of ports/interfaces per
application.
Reference routeadd command
--------------------------

The following are typical reference commands and syntax for adding routes
using the CLI::

    ;routeadd <net/host> <port #> <ipv4 nhip address in decimal> <Mask/NotApplicable>
    routeadd net 0 202.16.100.20 0xffff0000
    routeadd net 1 172.16.40.20 0xffff0000
    routeadd host 0 202.16.100.20
    routeadd host 1 172.16.40.20

    ;routeadd <net/host> <port #> <ipv6 nhip address in hex> <Depth/NotApplicable>
    routeadd net 0 fec0::6a05:caff:fe30:21b0 64
    routeadd net 1 2012::6a05:caff:fe30:2081 64
    routeadd host 0 fec0::6a05:caff:fe30:21b0
    routeadd host 1 2012::6a05:caff:fe30:2081
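
The field structure of the syntax above can be sketched with a small parser;
``parse_routeadd`` is a hypothetical helper written for illustration, not
part of the actual CLI implementation.

```python
def parse_routeadd(cmd):
    """Split a routeadd command into its fields. 'net' entries carry a
    mask (IPv4) or depth (IPv6); 'host' entries do not."""
    parts = cmd.split()
    if parts[0] != 'routeadd' or parts[1] not in ('net', 'host'):
        raise ValueError('not a routeadd command')
    return {'type': parts[1],
            'port': int(parts[2]),
            'nhip': parts[3],
            'mask_or_depth': parts[4] if parts[1] == 'net' else None}
```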
The following are the design requirements of the vFW:

- The firewall will examine packets and verify that they are appropriate for
  the current state of the connection. Inappropriate packets will be
  discarded and counter(s) incremented.
- Support both IPv4 and IPv6 traffic types for TCP/UDP/ICMP.
- All packet inspection features, such as firewall, synproxy and connection
  tracker, in this component may be turned on or off through CLI commands.
- Static filtering is done through ACLs using DPDK libraries. The rules can
  be added/modified through CLI commands.
- Multiple instances of the vFW pipeline running on multiple cores should be
  supported, for scaling the performance.
- Should follow the DPDK IP pipeline framework.
- Should use the DPDK libraries and functionalities for better performance.
- The memory should be allocated in hugepages using DPDK RTE calls for better
  performance.
The firewall performs basic filtering of malformed packets and dynamic packet
filtering of incoming packets using the connection tracker library.
The connection data is stored in a DPDK hash table, with one entry per
connection. The hash key is based on the source address/port, destination
address/port, and protocol of a packet. The hash key is processed so that a
single entry is used regardless of which direction the packet is flowing
(thus swapping source and destination). The ACL is implemented as a library
statically linked to the vFW, and is used for rule-based packet filtering.

TCP connections and UDP pseudo connections are tracked separately even if
the addresses and ports are identical, since the protocol is included in the
hash key.
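
The direction-independent key can be sketched as follows; this is an
illustrative Python model (the name ``conn_key`` and the endpoint ordering
are assumptions), not the DPDK hash implementation itself.

```python
def conn_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Order the two endpoints so both directions of a flow produce the
    same key. The protocol stays in the key, so TCP and UDP pseudo
    connections with identical addresses/ports get separate entries."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    lo, hi = (a, b) if a <= b else (b, a)
    return (lo, hi, proto)
```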
The input FIFO contains all the incoming packets for vFW filtering. The vFW
filter has no dependency on which component has written to the input FIFO.
Packets are dequeued from the FIFO in bulk for processing by the vFW, and
are enqueued to the output FIFO.

Either software or hardware load balancing can be used for traffic
distribution across multiple worker threads. Hardware load balancing
requires Ethernet Flow Director support from the hardware (e.g. the
Fortville X710 NIC). The input and output FIFOs are implemented using DPDK
ring buffers.
In the vFW, each component is constructed using packet framework pipelines.
It includes the Rx and Tx driver, master pipeline, load balancer pipeline and
vFW worker pipeline components. A pipeline framework is a collection of input
ports, table(s), output ports and actions (functions).
Receive and Transmit Driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets are received in bulk and provided to the load balancer (LB) thread.
Transmit takes packets from the worker threads in a dedicated ring and sends
them to the hardware queue.
The master component is part of all the IP pipeline applications. This
component does not process any packets and should be configured on core 0,
to leave the other cores free for processing traffic. This component is
responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for user control/debug
3. Propagating the commands from the user to the corresponding components
This pipeline processes the ARPICMP packets.
The TXRX pipelines are pass-through pipelines that forward both ingress and
egress traffic to the load balancer. They are required when the software
load balancer is used.
Load Balancer Pipeline
^^^^^^^^^^^^^^^^^^^^^^
The vFW supports both hardware and software load balancing of traffic across
multiple VNF threads. Hardware load balancing requires support from the
hardware, such as Flow Director, for steering packets to the application
through a hardware queue.

The software load balancer is also supported, if hardware load balancing
can't be used for any reason. The TXRX pipelines along with the LOADB
pipeline provide support for software load balancing by distributing the
flows to multiple vFW worker threads.

The load balancer (HW or SW) distributes traffic based on the 5-tuple (src
addr, src port, dest addr, dest port and protocol), applying XOR logic to
distribute across the active worker threads, thereby maintaining an affinity
of flows to worker threads.
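
The XOR distribution can be modelled as below. This is a simplified sketch
(``worker_for`` and the bit folding are assumptions for illustration); the
real load balancer hashes in the fast path.

```python
def worker_for(src_ip, src_port, dst_ip, dst_port, proto, n_workers):
    """XOR the 5-tuple fields together and reduce to a worker index.
    XOR is commutative, so both directions of a flow pick the same worker,
    which preserves flow-to-worker affinity."""
    h = src_ip ^ dst_ip ^ src_port ^ dst_port ^ proto
    h ^= h >> 16                     # fold high bits in before reducing
    return h % n_workers
```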
The vFW performs the basic packet filtering and drops invalid and malformed
packets. Dynamic packet filtering is done using the connection tracker
library. The packets are processed in bulk, and a hash table is used to
maintain the connection details.
Every TCP/UDP packet is passed through the connection tracker library to
check for a valid connection. The ACL library integrated into the firewall
provides rule-based filtering.
This application implements vCGNAPT. The idea of vCGNAPT is to extend the
life of the service provider's IPv4 network infrastructure and mitigate IPv4
address exhaustion by using address and port translation at large scale. It
processes the traffic in both directions.

It also supports connectivity between an IPv6 access network and an IPv4
data network, using IPv6 to IPv4 address translation and vice versa.
This application provides a standalone DPDK-based high-performance vCGNAPT
Virtual Network Function implementation.

The vCGNAPT VNF currently supports the following functionality:
* ARP (request, response, gratuitous)
* ICMP (terminal echo, echo response, passthrough)
* UDP, TCP and ICMP protocol passthrough
* Multithread support
* Multiple physical port support
* Limiting max ports per client
* Limiting max clients per public IP address
* Live session tracking of NAT flows
The upstream path defines the traffic from private to public, and the
downstream path defines the traffic from public to private. The vCGNAPT has
the same set of components to process upstream and downstream traffic.
In the vCGNAPT application, each component is constructed using the IP
pipeline framework. It includes the master pipeline component, the load
balancer pipeline component and the vCGNAPT worker pipeline component.

A pipeline framework is a collection of input ports, table(s), output ports
and actions (functions). In the vCGNAPT pipeline, the main subcomponents are
the inport function handler, the table and the table function handler.
vCGNAPT rules are configured in the table, which translates egress and
ingress traffic according to the physical port on which the packet arrived.
The actions can be forwarding to the output port (either egress or ingress)
or dropping the packet.
The idea of vCGNAPT is to extend the life of the service provider's IPv4
network infrastructure and mitigate IPv4 address exhaustion by using address
and port translation at large scale. It processes the traffic in both
directions.
::

    +------------------+
    | Private consumer | CPE ----
    | IPv4 traffic     +-----+  |
    +------------------+        |
      |    +-------------------+      +------------------+
      |->  -  Private IPv4     -  vCGNAPT  -  Public          -
      |->  -  access network   -   NAT44   -  IPv4 traffic    -
      |    +-------------------+      +------------------+
    +------------------+        |
    | Private consumer - CPE ----
    | IPv4 traffic     +-----+
    +------------------+

Figure: vCGNAPT deployment in a service provider network
Components of vCGNAPT
---------------------
In vCGNAPT, each component is constructed as a packet framework. It includes
the master pipeline component, driver, load balancer pipeline component and
vCGNAPT worker pipeline component. A pipeline framework is a collection of
input ports, table(s), output ports and actions (functions).
Receive and transmit driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets are received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker thread in a dedicated ring and sends
them to the hardware queue.
This component does not process any packets and should be configured on core
0, to save cores for other components which process traffic. The component
is responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
4. ARP and ICMP handling
Load Balancer pipeline
^^^^^^^^^^^^^^^^^^^^^^
The load balancer is part of the multi-threaded vCGNAPT release, which
distributes the flows to multiple vCGNAPT worker threads.

It distributes traffic based on the 2-tuple or 5-tuple (source address,
source port, destination address, destination port and protocol), applying
XOR logic to distribute the load across the active worker threads, thereby
maintaining an affinity of flows to worker threads.

The tuple can be modified/configured using the configuration file.
The vCGNAPT component performs translation of private IP & port to public IP
& port on the egress side, and of public IP & port to private IP & port on
the ingress side, based on the NAT rules added to the pipeline hash table.
The NAT rules are added to the hash table via user commands. Packets that
have a matching egress key or ingress key in the NAT table are processed to
change the IP & port and are forwarded to the output port. Packets that do
not have a match are given a default action. The default action may result
in the drop of the packet.
The vCGNAPT component performs translation of private IP & port to public IP
& port on the egress side, and of public IP & port to private IP & port on
the ingress side, based on the NAT rules added to the pipeline hash table.
The dynamic nature of vCGNAPT refers to the addition of NAT entries to the
hash table dynamically when a new packet arrives. The NAT rules are added to
the hash table automatically when there is no matching entry in the table,
and the packet is circulated through a software queue. Packets that have a
matching egress key or ingress key in the NAT table are processed to change
the IP & port and are forwarded to the output port defined in the entry.

A dynamic vCGNAPT also acts as a static one: NAT entries can be added
statically. The static NAT entry port range must not conflict with the
dynamic NAT port range.
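
A toy model of this behaviour is sketched below, assuming an illustrative
``Napt`` class and an arbitrary boundary between the static and dynamic port
ranges; none of these names or values come from the actual implementation.

```python
import itertools

class Napt:
    """Toy NAPT table: static entries use ports below DYNAMIC_BASE, while
    dynamically created entries allocate ports from DYNAMIC_BASE upward,
    so the two ranges never conflict."""
    DYNAMIC_BASE = 5000

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}          # (priv_ip, priv_port) -> (pub_ip, pub_port)
        self._ports = itertools.count(self.DYNAMIC_BASE)

    def add_static(self, priv, pub_port):
        assert pub_port < self.DYNAMIC_BASE   # keep static out of the dynamic range
        self.table[priv] = (self.public_ip, pub_port)

    def translate(self, priv):
        if priv not in self.table:            # no match: create the entry on the fly
            self.table[priv] = (self.public_ip, next(self._ports))
        return self.table[priv]
```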
vCGNAPT Static Topology
-----------------------

IXIA(Port 0) --> (Port 0)VNF(Port 1) --> (Port 1)IXIA

Egress --> The packets sent out from IXIA (port 0) will be CGNAPTed to IXIA (port 1).
Ingress --> The packets sent out from IXIA (port 1) will be CGNAPTed to IXIA (port 0).
vCGNAPT Dynamic Topology (UDP_REPLAY)
-------------------------------------

IXIA(Port 0) --> (Port 0)VNF(Port 1) --> (Port 0)UDP_REPLAY

Egress --> The packets sent out from IXIA will be CGNAPTed to L3FWD/L4REPLAY.
Ingress --> The L4REPLAY, upon reception of packets (private to public
network), will immediately replay the traffic back to the IXIA interface
(Pub --> Priv).
After the installation of the ISB on the L4Replay server, go to /opt/isb_bin
and run the following command::

    ./UDP_Replay -c core_mask -n no_of_channels(let it be as 2) -- -p PORT_MASK --config="(port,queue,lcore)"
    eg: ./UDP_Replay -c 0xf -n 4 -- -p 0x3 --config="(0,0,1)"
This application implements an Access Control List (ACL). An ACL is typically
used for rule-based policy enforcement. It restricts access to a destination
IP address/port based on various header fields, such as source IP
address/port, destination IP address/port and protocol. It is built on top of
DPDK and uses the packet framework infrastructure.

This application provides a standalone DPDK-based high-performance ACL
Virtual Network Function implementation.
The ACL filter performs bulk filtering of incoming packets based on rules in
the current ruleset, discarding any packets not permitted by the rules. The
mechanisms needed for building the rule database and performing lookups are
provided by the DPDK API:

http://dpdk.org/doc/api/rte__acl_8h.html

The input FIFO contains all the incoming packets for ACL filtering. Packets
are dequeued from the FIFO in bulk for processing by the ACL, and are
enqueued to the output FIFO.

The input and output FIFOs are implemented using DPDK ring buffers.

The DPDK ACL example:

http://dpdk.org/doc/guides/sample_app_ug/l3_forward_access_ctrl.html

#figure-ipv4-acl-rule contains a suitable syntax and parser for ACL rules.
In the vACL, each component is constructed as a packet framework. It includes
the master pipeline component, driver, load balancer pipeline component and
ACL worker pipeline component. A pipeline framework is a collection of input
ports, table(s), output ports and actions (functions).

Receive and transmit driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets are received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker thread in a dedicated ring and sends
them to the hardware queue.
This component does not process any packets and should be configured on core
0, to save cores for other components which process traffic.

The component is responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
4. ARP and ICMP handling
The load balancer is part of the multi-threaded ACL release, which
distributes the flows to multiple ACL worker threads.

It distributes traffic based on the 5-tuple (source address, source port,
destination address, destination port and protocol), applying XOR logic to
distribute the load across the active worker threads, thereby maintaining an
affinity of flows to worker threads.
Visit the following links for the DPDK ACL library implementation:

http://dpdk.org/doc/api/rte__acl_8h.html
http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html

It provides a shadow copy for runtime rule configuration support and
implements policy-based packet forwarding.
An edge router typically sits between two networks, such as the provider core
network and the provider access network. In the diagram below, the Customer
Edge (CE) router sits in the provider access network and the MPLS cloud
network represents the provider core network.
The edge router processes the traffic in both directions. The functionality
of the edge router varies while processing each direction of traffic. The
packets going to the core network will be filtered, classified and metered
with QoS parameters. The packets going to the access network will be shaped
according to the subscription policy.
The idea of the edge router application is to provide a benchmark for the
functionality of provider edge routers in each direction.

The DPDK IP Pipeline Framework provides a set of libraries to build a
pipeline application. The Provider Edge Router functionality delivered as a
virtual network function (VNF) is integrated with DPDK and optimized for
Intel hardware architecture.

This document assumes the reader possesses knowledge of DPDK concepts and
the IP Pipeline Framework. For more details, read the DPDK Getting Started
Guide, DPDK Programmers Guide and DPDK Sample Applications Guide.
This application provides a standalone DPDK-based high-performance Provider
Edge Router network function implementation.
The edge router application processes the traffic between the customer and
the core network.

The upstream path defines the traffic from customer to core, and the
downstream path defines the traffic from core to customer. The edge router
has a different set of components to process upstream and downstream traffic.

In the edge router application, each component is constructed as a building
block in the IP pipeline framework. As in the pipeline framework, each
component has its own input ports, table and output ports. The rules of the
component are configured in the table, which decides the path of the traffic
and any action to be performed on it. The actions can be forwarding to the
output port, forwarding to the next table, or dropping the packet. For more
details, please refer to Section 24 of the DPDK Programmers Guide (3).
The core-to-customer traffic is referred to as downstream. For downstream
processing, the edge router has the following functionalities:

---> Packet Rx --> Routing --> Traffic Manager --> Packet Tx -->

To identify the route based on the destination IP.
To provide a QinQ label based on the destination IP.

Updates the MAC address based on the route entry.
Appends the QinQ label based on the route entry.

To perform QoS traffic management (5-level hierarchical scheduling) based on
the predefined set of Service Level Agreements (SLAs).
The SVLAN, CVLAN and DSCP fields are used to determine the transmission
priority. The Traffic Manager profile, which contains the SLA parameters, is
provided as part of the application.
The customer-to-core traffic is referred to as upstream. For upstream
processing, the edge router has the following functionalities:

---> Packet Rx --> ACL filters --> Flow control --> Metering Policing &
Marking --> Routing --> Queueing & Packet Tx -->

To filter unwanted packets based on the defined ACL rules.
Source IP, destination IP, protocol, source port and destination port are
used to derive the ACL rules.

To classify the packet based on the QinQ label.
To assign a specific flow ID based on the classification.

Two stages of QoS traffic metering and policing are applied: the 1st stage
is performed per flow ID using the trTCM algorithm, and the 2nd stage is
performed per flow ID traffic class using the trTCM algorithm.
Packets will be either dropped or marked green, yellow or red based on the
metering result.
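
The colour decision of such a two-rate marker can be reduced to its two token
buckets, as in the much simplified, non-authoritative sketch below. The real
implementation uses DPDK's metering library; token refill at the committed
and peak rates is omitted here, and the bucket sizes are arbitrary.

```python
class TrTcmBuckets:
    """Colour decision of a two-rate marker, reduced to its committed and
    peak token buckets (in bytes). Refill at CIR/PIR is omitted."""
    def __init__(self, committed, peak):
        self.tc, self.tp = committed, peak

    def mark(self, pkt_len):
        if self.tp < pkt_len:
            return 'red'            # exceeds the peak rate
        self.tp -= pkt_len
        if self.tc < pkt_len:
            return 'yellow'         # within peak, above committed
        self.tc -= pkt_len
        return 'green'
```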
To identify the route based on the destination IP.
To provide an MPLS label to the packets based on the destination IP.

Updates the MAC address based on the route entry.
Appends the MPLS label based on the route entry.
Updates the packet color in the MPLS EXP field in each MPLS header.
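
Carrying the metering colour in the EXP bits can be sketched by packing the
32-bit MPLS shim; the colour-to-EXP mapping below is an assumption made for
illustration, not the mapping used by the application.

```python
COLOR_TO_EXP = {'green': 0, 'yellow': 1, 'red': 2}   # assumed mapping

def mpls_shim(label, color, bottom_of_stack, ttl):
    """Pack a 32-bit MPLS shim: 20-bit label, 3-bit EXP (carrying the
    metering colour), 1-bit bottom-of-stack flag, 8-bit TTL."""
    assert label < (1 << 20) and ttl < (1 << 8)
    return (label << 12) | (COLOR_TO_EXP[color] << 9) | (bottom_of_stack << 8) | ttl
```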
The vPE has downstream and upstream pipelines controlled by the master
component. The edge router processes two different types of traffic through
its pipelines:

I. Downstream (Core-to-Customer)

   1. Receives TCP traffic from the core
   2. Routes the packet based on the routing rules
   3. Performs traffic scheduling based on the traffic profile

      a. QoS scheduling is performed using the token bucket algorithm.
         SVLAN, CVLAN and DSCP fields are used to determine the
         transmission priority.

   4. Appends a QinQ label to each outgoing packet

II. Upstream (Customer-to-Core)

    1. Receives QinQ-labelled TCP packets from the customer
    2. Removes the QinQ label
    3. Classifies the flow using the QinQ label and applies QoS metering

       a. 1st stage QoS metering is performed with the flow ID using the
          trTCM algorithm
       b. 2nd stage QoS metering is performed with the flow ID and traffic
          class using the trTCM algorithm
       c. The traffic class maps to the DSCP field in the packet

    4. Routes the packet based on the routing rules
    5. Appends two MPLS labels to each outgoing packet
The master component is part of all the IP pipeline applications. This
component does not process any packets and should be configured on core 0,
to save cores for other components which process traffic. The component is
responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
Upstream and Downstream Pipelines
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The downstream has Firewall, Pass-through, Metering and Routing pipelines.
The upstream has Pass-through and Routing pipelines.

To run the VNF, execute the following::

    isb_root/VNFs/vPE$ ./build/ip_pipeline -p 0x3 \
        -f config/auto_combo_1_instances_1_queues_2_ports_v2.cfg \
        -s config/auto_combo_1_instances_1_queues_2_ports_v2.txt
PROX - Packet pROcessing eXecution engine
=========================================

The Packet pROcessing eXecution engine (PROX) is a DPDK application.
PROX can perform operations on packets in a highly configurable manner. The
PROX application also displays performance statistics that can be used for
performance investigations.
Intel® DPPD - PROX is an application built on top of DPDK which allows
creating software architectures, such as the one depicted below, through
small and readable configuration files.

.. image:: images/prox-qo-img01.png
The figure shows that each core executes a set of tasks. Currently, a task
can be any one of the following:

3. Basic forwarding (no touch)
4. L2 forwarding (change MAC)
6. Load balancing based on packet fields
7. Symmetric load balancing
8. QinQ encap/decap IPv4/IPv6
One of the example configurations distributed with the source code is a
Proof of Concept (PoC) implementation of a Broadband Network Gateway (BNG)
with Quality of Service (QoS).
The software architecture for this PoC is presented below.

.. image:: images/prox-qo-img02.png

The display shows per-task statistics through an ncurses interface.
Statistics include: estimated idleness; per-second statistics for packets
received, transmitted or dropped; per-core cache occupancy; and cycles per
packet. These statistics can help pinpoint bottlenecks in the system.
This information can then be used to optimize the configuration.
Other features include debugging support, scripting and Open vSwitch
support. A screenshot of the display is provided below.

.. image:: images/prox-screen-01.png