.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) OPNFV, Intel Corporation and others.

.. OPNFV SAMPLEVNF Documentation design file.

==========================
SampleVNF Highlevel Design
==========================
The high level design of the VNFs and the common code is explained here.


Common Code - L2L3 stack
========================

The L2L3 stack comprises a set of libraries which are commonly used by all
the VNFs.

.. image:: images/l2l3-components.png

It comprises the following components:

* Interface Manager
* ARP/ND & L2 adjacency Library
Interface Manager
-----------------

The interface manager is a set of APIs which act as a wrapper for physical
interface initialization and population. These APIs assist in configuring an
Ethernet device, setting up TX & RX queues and starting the device. It
provides various types of interfaces, such as the L2 interface and the
IPv4/IPv6 interface. It supports configuration (set/get) operations and
status updates (UP/DOWN) from admin or operations. It provides callback
function registration for other components that want to listen to interface
status changes. It maintains a table of all the interfaces present and
provides an API for getting interface statistics.

It provides wrapper APIs on top of DPDK's LAG (Link Aggregation) APIs. This
includes creating/deleting BOND interfaces and querying properties such as the
bond mode, xmit policy, link up delay and link monitor frequency.
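The callback registration described above can be sketched as a small registry.
This is a minimal illustration only; the names (`ifm_register_cb`,
`ifm_notify`) and the fixed-size array are assumptions for the sketch, not the
actual SampleVNF interface manager API.

```c
#include <assert.h>

/* Illustrative sketch of interface-status callback registration; the real
 * SampleVNF interface manager API differs. */
#define MAX_CBS 8

enum if_state { IF_DOWN = 0, IF_UP = 1 };

typedef void (*if_status_cb)(int port_id, enum if_state state);

static if_status_cb cbs[MAX_CBS];
static int n_cbs;

/* Components register here to listen for interface status changes. */
static int ifm_register_cb(if_status_cb cb)
{
    if (n_cbs >= MAX_CBS)
        return -1;
    cbs[n_cbs++] = cb;
    return 0;
}

/* Called by the interface manager when a link changes state; every
 * registered listener is notified. */
static void ifm_notify(int port_id, enum if_state state)
{
    for (int i = 0; i < n_cbs; i++)
        cbs[i](port_id, state);
}

/* Example listener used below. */
static int last_port = -1;
static enum if_state last_state = IF_DOWN;

static void record_cb(int port_id, enum if_state state)
{
    last_port = port_id;
    last_state = state;
}
```

A component would register `record_cb` once at init time and then receive
UP/DOWN notifications for every interface.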
It provides basic lock & unlock functions which should be used for
synchronization.
ARP/ND & L2 adjacency Library
-----------------------------

The ARP/ND state machine is given in the following diagram.

.. image:: images/state-machine.png

This library provides APIs for handling ARP/ICMPv4 and ND/ICMPv6 packets.
It provides APIs for creating/deleting and populating an entry. It handles
ARP request/response messages and ICMPv4 echo request and echo response
messages. It handles ND solicitation/advertisement messages for IPv6 packets.
It provides APIs for L2 adjacency creation/deletion and retrieval based on
nexthop and port_id. It also handles gratuitous ARP.
Basic commands for the ARP/ND table:

::

  p 1 arpadd 0 <ip> <mac address>   (for adding an ARP entry)
  p 1 arpdel 0 <ip>                 (for deleting an ARP entry)
  p 1 arpreq 0 <ip>                 (for sending an ARP request)
This library provides an API for deciding whether a packet belongs to the
local system or should be forwarded. It provides an API for the IPv4/IPv6
local packet out send function, and an API for packet forwarding using an LPM
lookup function.
Common Code - Gateway routing
=============================

The gateway common code is created to support routing functionality for both
network and directly attached interfaces. It is supported for both IPv4 and
IPv6.

The routeadd command is enhanced to support both net and host interfaces.
The net type is used to define a gateway and the host type is used for
directly attached hosts.

The routing tables are allocated on a per port basis, limited to MAX_PORTS.
Up to 32 route entries are supported per interface. These sizes can be changed
at compile time based on the requirement. Memory is allocated only for the
nb_ports which is configured in the VNF application configuration.
The next hop IP address and port number are retrieved from the routing table
based on the destination IP address. The destination IP address ANDed with the
mask is looked up in the routing table for a match. The port/interface number,
which is also stored as part of the table entry, is retrieved as well.
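The masked lookup above can be sketched as a linear scan over per-port route
entries. The structure layout and the names (`route_entry`, `route_lookup`,
the demo table) are illustrative assumptions, not the SampleVNF code.

```c
#include <stdint.h>

/* Illustrative sketch of the masked route lookup described above. */
struct route_entry {
    uint32_t nw_addr;   /* network address (already masked) */
    uint32_t mask;      /* e.g. 0xffff0000 */
    uint32_t nh_ip;     /* next hop IP */
    uint8_t  port_id;   /* egress port/interface stored with the entry */
};

/* Demo entries matching the reference routeadd commands:
 * 202.16.0.0/16 -> nhip 202.16.100.20 via port 0
 * 172.16.0.0/16 -> nhip 172.16.40.20  via port 1 */
static const struct route_entry demo_tbl[] = {
    { 0xCA100000u, 0xffff0000u, 0xCA106414u, 0 },
    { 0xAC100000u, 0xffff0000u, 0xAC102814u, 1 },
};

/* The destination IP ANDed with each entry's mask is compared with the
 * stored network address; on a match the next hop IP and the port number
 * kept in the entry are returned. */
static int route_lookup(const struct route_entry *tbl, int n,
                        uint32_t dst_ip, uint32_t *nh_ip, uint8_t *port_id)
{
    for (int i = 0; i < n; i++) {
        if ((dst_ip & tbl[i].mask) == tbl[i].nw_addr) {
            *nh_ip = tbl[i].nh_ip;
            *port_id = tbl[i].port_id;
            return 0;
        }
    }
    return -1; /* no matching route */
}
```

A destination such as 202.16.255.238 masked with 0xffff0000 yields 202.16.0.0,
which matches the first entry and returns its next hop and port.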
The routing table is populated with entries when the routeadd CLI command is
executed through a script or at run time. There can be multiple routing
entries per interface/port.

routeadd will report an error if the matching entry already exists, or if any
of the parameters provided in the command are not valid, for example if the
port number is bigger than the number of supported ports/interfaces per
application.
Reference routeadd command
--------------------------

Following are typical reference commands and syntax for adding routes using
the CLI.

::

  ;routeadd <net/host> <port #> <ipv4 nhip address in decimal> <Mask/NotApplicable>
  routeadd net 0 202.16.100.20 0xffff0000
  routeadd net 1 172.16.40.20 0xffff0000
  routeadd host 0 202.16.100.20
  routeadd host 1 172.16.40.20

  ;routeadd <net/host> <port #> <ipv6 nhip address in hex> <Depth/NotApplicable>
  routeadd net 0 fec0::6a05:caff:fe30:21b0 64
  routeadd net 1 2012::6a05:caff:fe30:2081 64
  routeadd host 0 fec0::6a05:caff:fe30:21b0
  routeadd host 1 2012::6a05:caff:fe30:2081
vFW - Design
============

Requirements
------------

Following are the design requirements of the vFW.

- The firewall will examine packets and verify that they are appropriate for
  the current state of the connection. Inappropriate packets will be discarded
  and counter(s) incremented.
- Support both IPv4 and IPv6 traffic types for TCP/UDP/ICMP.
- All packet inspection features like firewall, synproxy and connection
  tracker in this component may be turned off or on through CLI commands.
- Static filtering is done through ACL using DPDK libraries. The rules can be
  added/modified through CLI commands.
- Multiple instances of the vFW pipeline running on multiple cores should be
  supported for scaling the performance.
- Should follow the DPDK IP pipeline framework.
- Should use the DPDK libraries and functionality for better performance.
- The memory should be allocated in hugepages using DPDK RTE calls for better
  performance.
High Level Design
-----------------

The firewall performs basic filtering of malformed packets and dynamic packet
filtering of incoming packets using the connection tracker library.
The connection data will be stored using a DPDK hash table. There will be one
entry in the hash table for each connection. The hash key will be based on the
source address/port, destination address/port and protocol of a packet. The
hash key will be processed to allow a single entry to be used, regardless of
which direction the packet is flowing (thus changing source and destination).
The ACL is implemented as a library statically linked to vFW, which is used
for rule based packet filtering.

TCP connections and UDP pseudo connections will be tracked separately even if
the addresses and ports are identical. Including the protocol in the hash key
ensures this.
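The direction-independent key can be sketched by always placing the lower of
the two address/port pairs first, so both directions of a flow map to the same
hash table entry. The structure and the `make_key` helper are assumptions for
the sketch, not the SampleVNF connection tracker structures.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch of a direction-independent connection key. */
struct conn_key {
    uint32_t addr_lo, addr_hi;
    uint16_t port_lo, port_hi;
    uint8_t  proto;            /* keeps TCP and UDP entries separate */
};

static struct conn_key make_key(uint32_t src, uint16_t sport,
                                uint32_t dst, uint16_t dport, uint8_t proto)
{
    struct conn_key k;
    memset(&k, 0, sizeof(k));  /* zero padding so keys compare bytewise */
    /* order the endpoints so (src,dst) and (dst,src) produce one key */
    if (src < dst || (src == dst && sport < dport)) {
        k.addr_lo = src; k.port_lo = sport;
        k.addr_hi = dst; k.port_hi = dport;
    } else {
        k.addr_lo = dst; k.port_lo = dport;
        k.addr_hi = src; k.port_hi = sport;
    }
    k.proto = proto;
    return k;
}
```

A packet from 10.0.0.1:1234 to 10.0.0.2:80 and its reply from 10.0.0.2:80 to
10.0.0.1:1234 then produce identical keys, while a UDP flow with the same
addresses and ports produces a different key because the protocol differs.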
The Input FIFO contains all the incoming packets for vFW filtering. The vFW
filter has no dependency on which component has written to the Input FIFO.
Packets will be dequeued from the FIFO in bulk for processing by the vFW.
Packets will be enqueued to the output FIFO.

Software or hardware load balancing can be used for traffic distribution
across multiple worker threads. Hardware load balancing requires Ethernet flow
director support from the hardware (e.g. Fortville X710 NIC card).
The Input and Output FIFOs will be implemented using DPDK ring buffers.
Components of vFW
-----------------

In vFW, each component is constructed using packet framework pipelines.
It includes the Rx and Tx driver, Master pipeline, load balancer pipeline and
vFW worker pipeline components. A pipeline framework is a collection of input
ports, table(s), output ports and actions (functions).
Receive and Transmit Driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Packets will be received in bulk and provided to the LoadBalancer (LB) thread.
Transmit takes packets from the worker threads in a dedicated ring and sends
them to the hardware queue.
Master Pipeline
^^^^^^^^^^^^^^^

The Master component is part of all the IP pipeline applications. This
component does not process any packets and should be configured with core 0,
to allow the other cores to process the traffic. This component is responsible
for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for user control/debug
3. Propagating the commands from the user to the corresponding components
ARPICMP Pipeline
^^^^^^^^^^^^^^^^

This pipeline processes the ARP and ICMP packets.

TXRX Pipelines
^^^^^^^^^^^^^^

The TXTX and RXRX pipelines are pass-through pipelines which forward both
ingress and egress traffic to the load balancer. This is required when the
software load balancer is used.
Load Balancer Pipeline
^^^^^^^^^^^^^^^^^^^^^^

The vFW supports both hardware and software load balancing of traffic across
multiple VNF threads. Hardware load balancing requires support from hardware
like Flow Director for steering packets to the application through hardware
queues.

The software load balancer is used when hardware load balancing can't be used
for any reason. The TXRX pipelines along with the LOADB pipeline provide
support for software load balancing by distributing the flows to multiple vFW
worker threads.

The load balancer (HW or SW) distributes traffic based on the 5-tuple (src
addr, src port, dest addr, dest port and protocol), applying XOR logic to
distribute packets to the active worker threads, thereby maintaining an
affinity of flows to worker threads.
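The XOR distribution above can be sketched as follows. The folding and the
modulo reduction are illustrative assumptions; the real load balancer may
combine the tuple differently, but the symmetry property (both directions of a
flow land on the same worker) is the point.

```c
#include <stdint.h>

/* Illustrative sketch of 5-tuple XOR distribution to worker threads. */
static unsigned lb_worker(uint32_t src_ip, uint32_t dst_ip,
                          uint16_t src_port, uint16_t dst_port,
                          uint8_t proto, unsigned n_workers)
{
    /* XOR-fold the 5-tuple into one word; XOR is symmetric in the
     * src/dst pairs, so both directions of a flow hash identically. */
    uint32_t h = src_ip ^ dst_ip ^ src_port ^ dst_port ^ proto;
    h ^= h >> 16;              /* fold high bits down */
    return h % n_workers;      /* reduce to a worker index */
}
```

Because the same worker sees both directions of a connection, per-flow state
such as the connection tracker entry never needs cross-thread locking.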
vFW Pipeline
^^^^^^^^^^^^

The vFW performs basic packet filtering and will drop invalid and malformed
packets. Dynamic packet filtering is done using the connection tracker
library. The packets are processed in bulk and a hash table is used to
maintain the connection details. Every TCP/UDP packet is passed through the
connection tracker library to validate the connection. The ACL library
integrated into the firewall provides rule based filtering.
vCGNAPT - Design
================

Introduction
------------

This application implements vCGNAPT. The idea of vCGNAPT is to extend the life
of the service provider's IPv4 network infrastructure and mitigate IPv4
address exhaustion by using address and port translation at large scale. It
processes the traffic in both directions.

It also supports connectivity from the IPv6 access network to the IPv4 data
network using IPv6 to IPv4 address translation and vice versa.

This application provides a standalone DPDK based high performance vCGNAPT
Virtual Network Function implementation.
The vCGNAPT VNF currently supports the following functionality:

* ARP (request, response, gratuitous)
* ICMP (terminal echo, echo response, passthrough)
* UDP, TCP and ICMP protocol passthrough
* Multithread support
* Multiple physical port support
* Limiting max ports per client
* Limiting max clients per public IP address
* Live session tracking to NAT flow
High Level Design
-----------------

The upstream path defines the traffic from private to public and the
downstream path defines the traffic from public to private. The vCGNAPT has
the same set of components to process upstream and downstream traffic.

In the vCGNAPT application, each component is constructed using the IP
pipeline framework. It includes the Master pipeline component, the load
balancer pipeline component and the vCGNAPT worker pipeline component.

A pipeline framework is a collection of input ports, table(s), output ports
and actions (functions). In the vCGNAPT pipeline, the main sub components are
the input port function handler, the table and the table function handler.
vCGNAPT rules will be configured in the table, which translates egress and
ingress traffic according to the physical port on which the packet arrived.
The actions can be forwarding to the output port (either egress or ingress) or
dropping the packet.
The idea of vCGNAPT is to extend the life of the service provider's IPv4
network infrastructure and mitigate IPv4 address exhaustion by using address
and port translation at large scale. It processes the traffic in both
directions.

::

  +------------------+
  | Private consumer |  CPE ----
  |   IPv4 traffic   +-----+   |
  +------------------+         |
            |   +-------------------+             +------------------+
            |-> |   Private IPv4    |   vCGNAPT   |     Public       |
            |-> |  access network   |    NAT44    |   IPv4 traffic   |
            |   +-------------------+             +------------------+
  +------------------+         |
  | Private consumer |  CPE ----
  |   IPv4 traffic   +-----+
  +------------------+

  Figure: vCGNAPT deployment in Service provider network
Components of vCGNAPT
---------------------

In vCGNAPT, each component is constructed as a packet framework. It includes
the Master pipeline component, driver, load balancer pipeline component and
vCGNAPT worker pipeline component. A pipeline framework is a collection of
input ports, table(s), output ports and actions (functions).
Receive and transmit driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Packets will be received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker thread in a dedicated ring and sends
them to the hardware queue.
Master pipeline
^^^^^^^^^^^^^^^

This component does not process any packets and should be configured with
core 0, to save cores for other components which process traffic. The
component is responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
4. Handling ARP and ICMP
Load Balancer pipeline
^^^^^^^^^^^^^^^^^^^^^^

The load balancer is part of the multi-threaded vCGNAPT release which
distributes the flows to multiple vCGNAPT worker threads.

It distributes traffic based on the 2-tuple or 5-tuple (source address, source
port, destination address, destination port and protocol), applying XOR logic
to distribute the load to the active worker threads, thereby maintaining an
affinity of flows to worker threads.

The tuple can be modified/configured using the configuration file.
vCGNAPT - Static
^^^^^^^^^^^^^^^^

The vCGNAPT component performs translation of private IP & port to public IP &
port at the egress side, and public IP & port to private IP & port at the
ingress side, based on the NAT rules added to the pipeline hash table. The NAT
rules are added to the hash table via user commands. The packets that have a
matching egress key or ingress key in the NAT table will be processed to
change the IP & port and will be forwarded to the output port. The packets
that do not have a match will be given a default action. The default action
may result in drop of the packets.
vCGNAPT - Dynamic
^^^^^^^^^^^^^^^^^

The vCGNAPT component performs translation of private IP & port to public IP &
port at the egress side, and public IP & port to private IP & port at the
ingress side, based on the NAT rules added to the pipeline hash table. The
dynamic nature of vCGNAPT refers to the addition of NAT entries to the hash
table dynamically when a new packet arrives. The NAT rules will be added to
the hash table automatically when there is no matching entry in the table and
the packet is circulated through the software queue. The packets that have a
matching egress key or ingress key in the NAT table will be processed to
change the IP & port and will be forwarded to the output port defined in the
entry.

Dynamic vCGNAPT acts as a static one too; NAT entries can also be added
statically. The static NAT entry port range must not conflict with the
dynamic NAT port range.
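The non-overlap constraint above can be expressed as a simple configuration
check. The `port_range` structure and `nat_ranges_ok` name are assumptions
for this sketch, not SampleVNF configuration fields.

```c
#include <stdint.h>

/* Illustrative check that the static NAT port range does not overlap the
 * dynamic NAT port range; both bounds are inclusive. */
struct port_range { uint16_t lo, hi; };

/* returns 1 when the two ranges are disjoint (valid), 0 on conflict */
static int nat_ranges_ok(struct port_range st, struct port_range dyn)
{
    return st.hi < dyn.lo || dyn.hi < st.lo;
}
```

Running such a check when the configuration is loaded rejects, for example, a
static range of 1024-9999 combined with a dynamic range that starts at 9000.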
vCGNAPT Static Topology
-----------------------

::

  IXIA(Port 0) --> (Port 0)VNF(Port 1) --> (Port 1)IXIA

Egress --> The packets sent out from ixia(port 0) will be CGNAPTed to
ixia(port 1).

Ingress --> The packets sent out from ixia(port 1) will be CGNAPTed to
ixia(port 0).
vCGNAPT Dynamic Topology (UDP_REPLAY)
-------------------------------------

::

  IXIA(Port 0) --> (Port 0)VNF(Port 1) --> (Port 0)UDP_REPLAY

Egress --> The packets sent out from ixia will be CGNAPTed to L3FWD/L4REPLAY.

Ingress --> The L4REPLAY, upon reception of packets (Private to Public
Network), will immediately replay back the traffic to the IXIA interface
(Pub --> Priv).

After the installation of ISB on the L4Replay server, go to /opt/isb_bin and
run the following command.

::

  ./UDP_Replay -c core_mask -n no_of_channels(let it be as 2) -- -p PORT_MASK --config="(port,queue,lcore)"
  eg: ./UDP_Replay -c 0xf -n 4 -- -p 0x3 --config="(0,0,1)"
vACL - Design
=============

Introduction
------------

This application implements an Access Control List (ACL). ACL is typically
used for rule based policy enforcement. It restricts access to a destination
IP address/port based on various header fields, such as source IP
address/port, destination IP address/port and protocol. It is built on top of
DPDK and uses the packet framework infrastructure.
This application provides a standalone DPDK based high performance ACL Virtual
Network Function implementation.
High Level Design
-----------------

The ACL filter performs bulk filtering of incoming packets based on the rules
in the current ruleset, discarding any packets not permitted by the rules. The
mechanisms needed for building the rule database and performing lookups are
provided by the DPDK API:

http://dpdk.org/doc/api/rte__acl_8h.html

The Input FIFO contains all the incoming packets for ACL filtering. Packets
will be dequeued from the FIFO in bulk for processing by the ACL. Packets will
be enqueued to the output FIFO.

The Input and Output FIFOs will be implemented using DPDK ring buffers.

The DPDK ACL example:

http://doc.dpdk.org/guides/sample_app_ug/l3_forward.html

contains a suitable syntax and parser for ACL rules (see the IPv4 ACL rule
figure in that guide).
Components of vACL
------------------

In vACL, each component is constructed as a packet framework. It includes the
Master pipeline component, driver, load balancer pipeline component and ACL
worker pipeline component. A pipeline framework is a collection of input
ports, table(s), output ports and actions (functions).
Receive and transmit driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Packets will be received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker thread in a dedicated ring and sends
them to the hardware queue.
Master pipeline
^^^^^^^^^^^^^^^

This component does not process any packets and should be configured with
core 0, to save cores for other components which process traffic.

The component is responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components
4. Handling ARP and ICMP
Load balancer pipeline
^^^^^^^^^^^^^^^^^^^^^^

The load balancer is part of the multi-threaded ACL release which distributes
the flows to multiple ACL worker threads.

It distributes traffic based on the 5-tuple (source address, source port,
destination address, destination port and protocol), applying XOR logic to
distribute the load to the active worker threads, thereby maintaining an
affinity of flows to worker threads.
ACL pipeline
^^^^^^^^^^^^

Visit the following links for the DPDK ACL library implementation:

http://dpdk.org/doc/api/rte__acl_8h.html
http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html

The ACL pipeline provides a shadow copy for runtime rule configuration support
and implements policy based packet forwarding.
vPE - Design
============

Introduction
------------

An edge router typically sits between two networks, such as the provider core
network and the provider access network. In the diagram below, the Customer
Edge (CE) router sits in the provider access network and the MPLS cloud
network represents the provider core network.
The edge router processes the traffic in both directions. The functionality of
the edge router varies while processing each direction of traffic. The packets
going to the core network will be filtered, classified and metered with QoS
parameters. The packets going to the access network will be shaped according
to the subscription policy.

The idea of the Edge Router application is to provide a benchmark for the
functionality of provider edge routers in each direction.

The DPDK IP Pipeline Framework provides a set of libraries to build a pipeline
application. The Provider Edge Router functionality delivered as a virtual
network function (VNF) is integrated with DPDK and optimized for Intel
hardware architecture.

This document assumes the reader possesses knowledge of DPDK concepts and the
IP Pipeline Framework. For more details, read the DPDK Getting Started Guide,
DPDK Programmers Guide and DPDK Sample Applications Guide.
This application provides a standalone DPDK based high performance Provider
Edge Router Network Function implementation.
High Level Design
-----------------

The Edge Router application processes the traffic between the customer and the
core network.
The upstream path defines the traffic from customer to core and the downstream
path defines the traffic from core to customer. The Edge Router has a
different set of components to process upstream and downstream traffic.

In the Edge Router application, each component is constructed as a building
block in the IP pipeline framework. As in the pipeline framework, each
component has its own input ports, table and output ports. The rules of the
component will be configured in the table, which decides the path of the
traffic and any action to be performed on the traffic. The actions can be
forwarding to the output port, forwarding to the next table or dropping the
packet. For more details, please refer to Section 24 of the DPDK Programmers
Guide (3).
The core-to-customer traffic is referred to as downstream. For downstream
processing, the Edge Router has the following functionalities:

::

  ---> Packet Rx --> Routing --> Traffic Manager --> Packet Tx --->

Routing
  To identify the route based on the destination IP.
  To provide the QinQ label based on the destination IP.

Encapsulation
  Updates the MAC address based on the route entry.
  Appends the QinQ label based on the route entry.

Traffic Manager
  To perform QoS traffic management (5-level hierarchical scheduling) based on
  the predefined set of Service Level Agreements (SLAs).
  SVLAN, CVLAN and DSCP fields are used to determine the transmission
  priority.
  The Traffic Manager profile which contains the SLA parameters is provided as
  part of the application.
The customer-to-core traffic is referred to as upstream. For upstream
processing, the Edge Router has the following functionalities:

::

  ---> Packet Rx --> ACL filters --> Flow control --> Metering Policing &
       Marking --> Routing --> Queueing & Packet Tx --->

Firewall
  To filter the unwanted packets based on the defined ACL rules.
  Source IP, destination IP, protocol, source port and destination port are
  used to derive the ACL rules.

Flow Classification
  To classify the packet based on the QinQ label.
  To assign a specific flow id based on the classification.

Metering
  Two stages of QoS traffic metering and policing are applied.
  The 1st stage is performed per flow ID using the trTCM algorithm.
  The 2nd stage is performed per flow ID traffic class using the trTCM
  algorithm.
  Packets will be either dropped or marked Green, Yellow or Red based on the
  metering result.

Routing
  To identify the route based on the destination IP.
  To provide the MPLS label to the packets based on the destination IP.

Encapsulation
  Updates the MAC address based on the route entry.
  Appends the MPLS label based on the route entry.
  Updates the packet color in the MPLS EXP field in each MPLS header.
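Writing the metering colour into the MPLS EXP field can be sketched with plain
bit manipulation. The colour-to-EXP mapping below is an assumption for the
sketch; only the bit layout (label:20, EXP:3, S:1, TTL:8) follows RFC 3032.

```c
#include <stdint.h>

/* Illustrative sketch of updating the 3-bit MPLS EXP (TC) field with the
 * metering colour; the mapping of colours to EXP values is assumed. */
enum color { GREEN = 0, YELLOW = 1, RED = 2 };

/* hdr is the 32-bit MPLS header word in host byte order:
 * label (bits 12..31) | EXP (bits 9..11) | S (bit 8) | TTL (bits 0..7) */
static uint32_t mpls_set_exp(uint32_t hdr, enum color c)
{
    hdr &= ~(uint32_t)(0x7u << 9);        /* clear EXP bits 9..11 */
    hdr |= ((uint32_t)c & 0x7u) << 9;     /* write the colour */
    return hdr;
}
```

Only the EXP bits change; the label, bottom-of-stack flag and TTL are left
untouched, which is what the encapsulation step above requires when it
rewrites each MPLS header in the stack.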
The vPE has downstream and upstream pipelines controlled by the Master
component. The edge router processes two different types of traffic through
the pipelines:

I) Downstream (Core-to-Customer)

   1. Receives TCP traffic from the core
   2. Routes the packet based on the routing rules
   3. Performs traffic scheduling based on the traffic profile

      a. QoS scheduling is performed using the token bucket algorithm.
         SVLAN, CVLAN and DSCP fields are used to determine the transmission
         priority.

   4. Appends the QinQ label to each outgoing packet.

II) Upstream (Customer-to-Core)

    1. Receives QinQ labelled TCP packets from the customer
    2. Removes the QinQ label
    3. Classifies the flow using the QinQ label and applies QoS metering

       a. 1st stage QoS metering is performed with the flow ID using the
          trTCM algorithm.
       b. 2nd stage QoS metering is performed with the flow ID and traffic
          class using the trTCM algorithm.
       c. The traffic class maps to the DSCP field in the packet.

    4. Routes the packet based on the routing rules
    5. Appends two MPLS labels to each outgoing packet.
Master Component
^^^^^^^^^^^^^^^^

The Master component is part of all the IP pipeline applications. This
component does not process any packets and should be configured with core 0,
to save cores for other components which process traffic. The component is
responsible for:

1. Initializing each component of the pipeline application in different threads
2. Providing a CLI shell for the user
3. Propagating the commands from the user to the corresponding components.
Upstream and Downstream Pipelines
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The downstream will have the Firewall, Pass-through, Metering and Routing
pipelines. The upstream will have the Pass-through and Routing pipelines.

To run the VNF, execute the following:

::

  isb_root/VNFs/vPE$ ./build/ip_pipeline -p 0x3 \
      -f config/auto_combo_1_instances_1_queues_2_ports_v2.cfg \
      -s config/auto_combo_1_instances_1_queues_2_ports_v2.txt
Prox - Packet pROcessing eXecution engine
=========================================

Introduction
------------

The Packet pROcessing eXecution Engine (PROX) is a DPDK application.
PROX can perform operations on packets in a highly configurable manner.
The PROX application also displays performance statistics that can be used
for performance investigations.
Intel® DPPD - PROX is an application built on top of DPDK which allows
creating software architectures, such as the one depicted below, through
small and readable configuration files.

.. image:: images/prox-qo-img01.png
The figure shows that each core is executing a set of tasks. Currently,
a task can be any one of the following:

3. Basic Forwarding (no touch)
4. L2 Forwarding (change MAC)
6. Load balance based on packet fields
7. Symmetric load balancing
8. QinQ encap/decap IPv4/IPv6
One of the example configurations that is distributed with the source code is
a Proof of Concept (PoC) implementation of a Broadband Network Gateway (BNG)
with Quality of Service (QoS).
The software architecture for this PoC is presented below.

.. image:: images/prox-qo-img02.png

The display shows per task statistics through an ncurses interface.
Statistics include: estimated idleness; per second statistics for packets
received, transmitted or dropped; per core cache occupancy; and cycles per
packet. These statistics can help pinpoint bottlenecks in the system.
This information can then be used to optimize the configuration.
Other features include debugging support, scripting and Open vSwitch support.
A screenshot of the display is provided below.

.. image:: images/prox-screen-01.png