1 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
2 .. http://creativecommons.org/licenses/by/4.0
3 .. (c) OPNFV, Intel Corporation and others.
5 .. OPNFV SAMPLEVNF Documentation design file.
7 ==========================
8 SampleVNF Highlevel Design
9 ==========================
The high-level design of the VNFs and the common code is explained here.
14 Common Code - L2L3 stack
15 ========================
The L2L3 stack comprises a set of libraries that are commonly used by all the VNFs.
It comprises the following components:
27 * ARP/ND & L2 adjacency Library
The interface manager is a set of APIs that acts as a wrapper for physical
interface initialization and population. These APIs assist in configuring an
Ethernet device, setting up TX and RX queues, and starting the devices. The
manager provides various types of interfaces, such as L2 interfaces and
IPv4/IPv6 interfaces. It supports configuration (set/get) operations and
status updates (UP/DOWN) from admin or operations, provides callback
registration for other components that want to listen to interface status
changes, maintains a table of all the interfaces present, and provides an API
for retrieving interface statistics.
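
As an illustration, the kind of device bring-up these wrapper APIs hide maps
onto standard DPDK ethdev calls roughly as below; the ``setup_port`` helper,
queue sizes and mempool are illustrative, not the actual SampleVNF code.

.. code-block:: c

    #include <string.h>
    #include <rte_ethdev.h>
    #include <rte_mempool.h>

    static int
    setup_port(uint16_t port_id, struct rte_mempool *mbuf_pool)
    {
        struct rte_eth_conf port_conf;
        int ret;

        memset(&port_conf, 0, sizeof(port_conf));   /* default device config */

        /* one RX queue and one TX queue on this port */
        ret = rte_eth_dev_configure(port_id, 1, 1, &port_conf);
        if (ret < 0)
            return ret;

        ret = rte_eth_rx_queue_setup(port_id, 0, 1024,
                                     rte_eth_dev_socket_id(port_id),
                                     NULL, mbuf_pool);
        if (ret < 0)
            return ret;

        ret = rte_eth_tx_queue_setup(port_id, 0, 1024,
                                     rte_eth_dev_socket_id(port_id), NULL);
        if (ret < 0)
            return ret;

        return rte_eth_dev_start(port_id);   /* brings the link UP */
    }
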
It also provides wrapper APIs on top of the DPDK LAG (Link Aggregation) APIs.
This includes creating/deleting BOND interfaces and querying properties such
as the bond mode, xmit policy, link up delay and link monitor frequency.
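
A minimal sketch of such a wrapper using the DPDK bonding PMD API directly;
the device name, bond mode and timer values are illustrative.

.. code-block:: c

    #include <rte_eth_bond.h>

    static int
    create_bond(uint16_t member0, uint16_t member1)
    {
        /* create a bonded device in balance mode on socket 0 */
        int bond_port = rte_eth_bond_create("net_bonding0",
                                            BONDING_MODE_BALANCE, 0);
        if (bond_port < 0)
            return bond_port;

        rte_eth_bond_slave_add(bond_port, member0);
        rte_eth_bond_slave_add(bond_port, member1);

        /* xmit policy and link monitoring frequency */
        rte_eth_bond_xmit_policy_set(bond_port, BALANCE_XMIT_POLICY_LAYER34);
        rte_eth_bond_link_monitoring_set(bond_port, 100 /* ms */);

        return bond_port;
    }
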
It provides basic lock and unlock functions which should be used for
synchronization.
54 ARP/ND & L2 adjacency Library
55 -----------------------------
57 The ARP/ND state machine is given in the following diagram.
59 .. image:: state-machine.png
This library provides APIs for handling ARP/ICMPv4 and ND/ICMPv6 packets. It
provides APIs for creating, deleting and populating an entry. It handles ARP
request/response messages and ICMPv4 echo request/response messages, and it
handles ND solicitation/advertisement messages for IPv6 packets. It provides
an API for L2 adjacency creation, deletion and retrieval based on nexthop and
port_id, and it handles gratuitous ARP.
70 Basic commands for ARP/ND table
73 p 1 arpadd 0 <ip> <mac address> (for adding arp entry)
74 p 1 arpdel 0 <ip> (for deleting an arp entry)
75 p 1 arpreq 0 <ip> (for sending an arp request)
This library provides an API for deciding whether a packet is destined to the
local system or should be forwarded. It provides an API for the IPv4/IPv6
local packet out (send) function, and an API for packet forwarding based on an
LPM lookup.
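
A minimal sketch of this forwarding decision using the DPDK librte_lpm API;
the table name, sizes and the ``forward_decision`` helper are illustrative.

.. code-block:: c

    #include <rte_lpm.h>

    static struct rte_lpm *lpm4;

    static void
    lpm_init(void)
    {
        struct rte_lpm_config cfg = {
            .max_rules = 1024,       /* illustrative table sizes */
            .number_tbl8s = 256,
        };
        lpm4 = rte_lpm_create("l2l3_lpm4", 0 /* socket */, &cfg);

        /* 202.16.100.0/24 reachable through next hop index 1 */
        rte_lpm_add(lpm4, (202u << 24) | (16u << 16) | (100u << 8), 24, 1);
    }

    /* Returns the next hop index on a route hit; a negative value means the
     * packet has no route and is handed to local delivery (or dropped). */
    static int
    forward_decision(uint32_t dst_ip_host_order)
    {
        uint32_t next_hop;

        if (rte_lpm_lookup(lpm4, dst_ip_host_order, &next_hop) == 0)
            return (int)next_hop;
        return -1;
    }
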
86 Common Code - Gateway routing
87 =============================
Gateway common code is created to support routing functionality for both
network and directly attached interfaces. It is supported for both IPv4 and
IPv6.
The routeadd command is enhanced to support both net and host interfaces. The
net type is used to define the gateway and the host type is used for directly
attached interfaces.
The routing tables are allocated on a per-port basis, limited to MAX_PORTS.
Up to 32 route entries are supported per interface. These sizes can be changed
at compile time based on the requirement. Memory is allocated only for the
number of ports (nb_ports) configured for the VNF application.
The next hop IP and port numbers are retrieved from the routing table based on
the destination IP address. The destination IP address, ANDed with the mask,
is looked up in the routing table for a match. The port/interface number,
which is also stored as part of the table entry, is retrieved as well.
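
A simplified sketch of this masked lookup; the structure layout, limits and
helper name are illustrative, not the actual SampleVNF data structures.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define MAX_PORTS  16   /* illustrative compile-time limits */
    #define MAX_ROUTES 32

    struct route_entry {
        uint32_t dst;       /* network address */
        uint32_t mask;      /* netmask */
        uint32_t nh_ip;     /* next hop IP */
        uint16_t out_port;  /* egress port/interface */
    };

    /* per-port routing tables, populated by the routeadd CLI command */
    static struct route_entry route_table[MAX_PORTS][MAX_ROUTES];
    static uint8_t route_count[MAX_PORTS];

    /* Fill nh_ip/out_port and return true when (dst & mask) matches. */
    static bool
    route_lookup(uint16_t port, uint32_t dst_ip,
                 uint32_t *nh_ip, uint16_t *out_port)
    {
        for (uint8_t i = 0; i < route_count[port]; i++) {
            const struct route_entry *r = &route_table[port][i];

            if ((dst_ip & r->mask) == r->dst) {
                *nh_ip = r->nh_ip;
                *out_port = r->out_port;
                return true;
            }
        }
        return false;   /* no match */
    }
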
Entries are added to the routing table when the routeadd CLI command is
executed, either through a script or at run time. There can be multiple
routing entries per interface/port.
The routeadd command will report an error if the matching entry already
exists, or if any of the parameters provided in the command are not valid, for
example if the port number is bigger than the number of ports/interfaces
supported per application.
123 Reference routeadd command
124 --------------------------
126 Following are typical reference commands and syntax for adding routes using the CLI.
130 ;routeadd <net/host> <port #> <ipv4 nhip address in decimal> <Mask/NotApplicable>
131 routeadd net 0 202.16.100.20 0xffff0000
132 routeadd net 1 172.16.40.20 0xffff0000
133 routeadd host 0 202.16.100.20
134 routeadd host 1 172.16.40.20
136 ;routeadd <net/host> <port #> <ipv6 nhip address in hex> <Depth/NotApplicable>
137 routeadd net 0 fec0::6a05:caff:fe30:21b0 64
138 routeadd net 1 2012::6a05:caff:fe30:2081 64
139 routeadd host 0 fec0::6a05:caff:fe30:21b0
140 routeadd host 1 2012::6a05:caff:fe30:2081
Following are the design requirements of the vFW.
150 - The firewall will examine packets and verify that they are appropriate for the
151 current state of the connection. Inappropriate packets will be discarded, and
152 counter(s) incremented.
- Support both IPv4 and IPv6 traffic types for TCP/UDP/ICMP.
- All packet inspection features like firewall, synproxy and connection
  tracker in this component may be turned on or off through CLI commands.
- The static filtering is done through ACL using DPDK libraries. The rules
  can be added/modified through CLI commands.
- Multiple instances of the vFW pipeline running on multiple cores should be
  supported to scale performance.
160 - Should follow the DPDK IP pipeline framework
161 - Should use the DPDK libraries and functionalities for better performance
- The memory should be allocated in hugepages using DPDK RTE calls for better
  performance.
The firewall performs basic filtering of malformed packets and dynamic packet
filtering of incoming packets using the connection tracker library.
The connection data will be stored using a DPDK hash table. There will be one
entry in the hash table for each connection. The hash key will be based on the
source address/port, destination address/port, and protocol of a packet. The
hash key will be processed to allow a single entry to be used, regardless of
which direction the packet is flowing (thus changing source and destination).
The ACL is implemented as a library statically linked to vFW, which is used
for rule-based packet filtering.
TCP connections and UDP pseudo connections will be tracked separately even if
the addresses and ports are identical. Including the protocol in the hash key
ensures this separation.
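
A minimal sketch of such a direction-independent key and its DPDK hash table;
the field and helper names and the table size are illustrative.

.. code-block:: c

    #include <rte_hash.h>
    #include <rte_jhash.h>

    /* The (addr, port) pair with the lower address is always stored first,
     * so both directions of a flow map to the same hash entry. */
    struct conn_key {
        uint32_t addr_lo, addr_hi;
        uint16_t port_lo, port_hi;
        uint8_t  proto;              /* keeps TCP and UDP entries separate */
    } __attribute__((packed));

    static void
    make_conn_key(uint32_t src, uint16_t sport, uint32_t dst, uint16_t dport,
                  uint8_t proto, struct conn_key *k)
    {
        if (src < dst || (src == dst && sport < dport)) {
            k->addr_lo = src; k->port_lo = sport;
            k->addr_hi = dst; k->port_hi = dport;
        } else {
            k->addr_lo = dst; k->port_lo = dport;
            k->addr_hi = src; k->port_hi = sport;
        }
        k->proto = proto;
    }

    static struct rte_hash *
    create_conn_table(void)
    {
        struct rte_hash_parameters p = {
            .name = "vfw_conn_table",
            .entries = 1 << 20,                  /* illustrative size */
            .key_len = sizeof(struct conn_key),
            .hash_func = rte_jhash,
            .socket_id = 0,
        };

        return rte_hash_create(&p);
    }
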
182 The Input FIFO contains all the incoming packets for vFW filtering. The vFW
183 Filter has no dependency on which component has written to the Input FIFO.
184 Packets will be dequeued from the FIFO in bulk for processing by the vFW.
185 Packets will be enqueued to the output FIFO.
Software or hardware load balancing can be used for traffic distribution
across multiple worker threads. Hardware load balancing requires Ethernet Flow
Director support from the hardware (e.g. an Intel Fortville X710 NIC).
190 The Input and Output FIFOs will be implemented using DPDK Ring Buffers.
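
A minimal sketch of these FIFOs using DPDK rings; the ring names, sizes and
helper functions are illustrative.

.. code-block:: c

    #include <rte_mbuf.h>
    #include <rte_ring.h>

    #define BURST 32

    static struct rte_ring *input_fifo, *output_fifo;

    static void
    create_fifos(void)
    {
        input_fifo  = rte_ring_create("vfw_in",  4096, 0,
                                      RING_F_SP_ENQ | RING_F_SC_DEQ);
        output_fifo = rte_ring_create("vfw_out", 4096, 0,
                                      RING_F_SP_ENQ | RING_F_SC_DEQ);
    }

    static void
    vfw_poll(void)
    {
        struct rte_mbuf *pkts[BURST];

        /* dequeue in bulk, filter, enqueue the packets that pass */
        unsigned int n = rte_ring_dequeue_burst(input_fifo, (void **)pkts,
                                                BURST, NULL);
        /* ... firewall / connection tracker processing on pkts[0..n) ... */
        rte_ring_enqueue_burst(output_fifo, (void **)pkts, n, NULL);
    }
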
In vFW, each component is constructed using packet framework pipelines.
It includes the Rx and Tx driver, the Master pipeline, the load balancer
pipeline and the vFW worker pipeline components. A pipeline framework is a
collection of input ports, table(s), output ports and actions (functions).
200 Receive and Transmit Driver
201 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets will be received in bulk and provided to the LoadBalancer (LB) thread.
Transmit takes packets from the worker threads in a dedicated ring and sends
them to the hardware queue.
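
A minimal sketch of the receive side; the port/queue ids and the ``rx_loop``
helper are illustrative.

.. code-block:: c

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>
    #include <rte_ring.h>

    #define BURST 32

    /* RX thread: read packets in bulk from the NIC and hand them to the
     * load balancer through a dedicated ring. */
    static void
    rx_loop(uint16_t port_id, struct rte_ring *to_lb)
    {
        struct rte_mbuf *pkts[BURST];

        for (;;) {
            uint16_t n = rte_eth_rx_burst(port_id, 0, pkts, BURST);

            if (n > 0)
                rte_ring_enqueue_burst(to_lb, (void **)pkts, n, NULL);
        }
    }
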
The Master component is part of all the IP Pipeline applications. This
component does not process any packets and should be configured on core 0,
leaving the other cores free to process traffic. This component is
responsible for:
211 1. Initializing each component of the Pipeline application in different threads
212 2. Providing CLI shell for the user control/debug
213 3. Propagating the commands from user to the corresponding components
This pipeline processes the ARPICMP (ARP and ICMP) packets.
The TXTX and RXRX pipelines are pass-through pipelines that forward both
ingress and egress traffic to the load balancer. They are required when the
software load balancer is used.
225 Load Balancer Pipeline
226 ^^^^^^^^^^^^^^^^^^^^^^
The vFW supports both hardware and software load balancing of traffic across
multiple VNF threads. Hardware load balancing requires support from the
hardware, such as Flow Director, for steering packets to the application
through hardware queues.

The software load balancer is also supported, for cases where hardware load
balancing cannot be used for any reason. The TXRX pipelines along with the
LOADB pipeline provide support for software load balancing by distributing the
flows to multiple vFW worker threads.
The load balancer (HW or SW) distributes traffic based on the 5-tuple (src
addr, src port, dest addr, dest port and protocol), applying an XOR logic to
distribute packets to the active worker threads, thereby maintaining an
affinity of flows to worker threads.
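
A minimal sketch of the XOR-based selection; the ``pick_worker`` helper is
illustrative. The fold is symmetric in source and destination, so both
directions of a flow select the same worker.

.. code-block:: c

    #include <stdint.h>

    static inline uint32_t
    pick_worker(uint32_t src_ip, uint32_t dst_ip, uint16_t src_port,
                uint16_t dst_port, uint8_t proto, uint32_t nb_workers)
    {
        /* XOR of the 5-tuple, symmetric in src/dst */
        uint32_t h = src_ip ^ dst_ip ^ (uint32_t)(src_port ^ dst_port) ^ proto;

        h ^= h >> 16;              /* fold the high bits down */
        return h % nb_workers;     /* index of an active worker thread */
    }
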
The vFW performs the basic packet filtering and will drop invalid and
malformed packets. The dynamic packet filtering is done using the connection
tracker library. The packets are processed in bulk, and a hash table is used
to maintain the connection details.
Every TCP/UDP packet is passed through the connection tracker library to
validate the connection. The ACL library integrated with the firewall provides
rule-based filtering.
This application implements vCGNAPT. The idea of vCGNAPT is to extend the life
of the service provider's IPv4 network infrastructure and mitigate IPv4
address exhaustion by using address and port translation at large scale. It
processes the traffic in both directions.
It also supports connectivity between an IPv6 access network and an IPv4 data
network using IPv6 to IPv4 address translation and vice versa.
268 This application provides a standalone DPDK based high performance vCGNAPT
269 Virtual Network Function implementation.
274 The vCGNAPT VNF currently supports the following functionality:
* ARP (request, response, gratuitous)
* ICMP (terminal echo, echo response, passthrough)
* UDP, TCP and ICMP protocol passthrough
* Multithread support
* Multiple physical port support
* Limiting max ports per client
* Limiting max clients per public IP address
* Live session tracking to NAT flow
The upstream path defines the traffic from Private to Public and the
downstream path defines the traffic from Public to Private. The vCGNAPT has
the same set of components to process upstream and downstream traffic.
In the vCGNAPT application, each component is constructed using the IP
Pipeline framework. It includes the Master pipeline component, the load
balancer pipeline component and the vCGNAPT worker pipeline component.
A pipeline framework is a collection of input ports, table(s), output ports
and actions (functions). In the vCGNAPT pipeline, the main sub-components are
the in-port function handler, the table and the table function handler.
vCGNAPT rules will be configured in the table, which translates egress and
ingress traffic according to the physical port on which the packet arrived.
The actions can be forwarding to the output port (either egress or ingress) or
dropping the packet.
The idea of vCGNAPT is to extend the life of the service provider's IPv4
network infrastructure and mitigate IPv4 address exhaustion by using address
and port translation at large scale. It processes the traffic in both
directions.
A simplified deployment view::

   Private consumer (IPv4 traffic) --> CPE --> Private IPv4 access network
       --> vCGNAPT (NAT44) --> Public IPv4 traffic

Figure: vCGNAPT deployment in Service provider network
334 Components of vCGNAPT
335 ---------------------
336 In vCGNAPT, each component is constructed as a packet framework. It includes
337 Master pipeline component, driver, load balancer pipeline component and
338 vCGNAPT worker pipeline component. A pipeline framework is a collection of
339 input ports, table(s), output ports and actions (functions).
341 Receive and transmit driver
342 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets will be received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker threads in a dedicated ring and sends
them to the hardware queue.
This component does not process any packets and should be configured on
core 0, to save cores for other components which process traffic. The
component is responsible for:
353 1. Initializing each component of the Pipeline application in different threads
354 2. Providing CLI shell for the user
355 3. Propagating the commands from user to the corresponding components.
356 4. ARP and ICMP are handled here.
358 Load Balancer pipeline
359 ^^^^^^^^^^^^^^^^^^^^^^^
The load balancer is part of the multi-threaded vCGNAPT release which
distributes the flows to multiple vCGNAPT worker threads.
It distributes traffic based on the 2-tuple or 5-tuple (source address, source
port, destination address, destination port and protocol), applying an XOR
logic to distribute the load to the active worker threads, thereby maintaining
an affinity of flows to worker threads.
The tuple can be modified/configured using the configuration file.
The vCGNAPT component performs translation of private IP & port to public IP &
port at the egress side, and public IP & port to private IP & port at the
ingress side, based on the NAT rules added to the pipeline hash table. The NAT
rules are added to the hash table via user commands. The packets that have a
matching egress key or ingress key in the NAT table will be processed to
change the IP & port and will be forwarded to the output port. The packets
that do not have a match will be subject to a default action. The default
action may result in the packet being dropped.
The vCGNAPT component performs translation of private IP & port to public IP &
port at the egress side, and public IP & port to private IP & port at the
ingress side, based on the NAT rules added to the pipeline hash table. The
dynamic nature of vCGNAPT refers to the addition of NAT entries to the hash
table dynamically when a new packet arrives. The NAT rules will be added to
the hash table automatically when there is no matching entry in the table and
the packet is circulated through the software queue. The packets that have a
matching egress key or ingress key in the NAT table will be processed to
change the IP & port and will be forwarded to the output port defined in the
entry.
The dynamic vCGNAPT can act as a static one too; NAT entries can also be added
statically. The static NAT entry port range must not conflict with the dynamic
NAT port range.
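
For illustration, a translation entry and the egress rewrite could be sketched
as below; the structure and helper are hypothetical, not the actual vCGNAPT
code. In the real pipeline such entries live in a DPDK hash table keyed on the
packet tuple.

.. code-block:: c

    #include <stdint.h>

    struct napt_entry {
        uint32_t priv_ip;  uint16_t priv_port;   /* private side */
        uint32_t pub_ip;   uint16_t pub_port;    /* public side  */
        uint8_t  proto;
        uint16_t out_port;               /* egress port from the entry */
    };

    /* Egress (private -> public): rewrite the source address and port.
     * IP and L4 checksums must be updated after the rewrite. */
    static void
    napt_translate_egress(const struct napt_entry *e,
                          uint32_t *src_ip, uint16_t *src_port)
    {
        *src_ip = e->pub_ip;
        *src_port = e->pub_port;
    }
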
vCGNAPT Static Topology
-----------------------
401 IXIA(Port 0)-->(Port 0)VNF(Port 1)-->(Port 1) IXIA
Egress  --> The packets sent out from IXIA (port 0) will be CGNAPTed to IXIA (port 1).
Ingress --> The packets sent out from IXIA (port 1) will be CGNAPTed to IXIA (port 0).
406 vCGNAPT Dynamic Topology (UDP_REPLAY)
407 -------------------------------------
409 IXIA(Port 0)-->(Port 0)VNF(Port 1)-->(Port 0)UDP_REPLAY
Egress  --> The packets sent out from IXIA will be CGNAPTed to L3FWD/L4REPLAY.
Ingress --> The L4REPLAY, upon reception of packets (Private to Public
Network), will immediately replay the traffic back to the IXIA interface
(Pub --> Priv).
After the installation of ISB on the L4Replay server, go to /opt/isb_bin and
run the following command.
./UDP_Replay -c core_mask -n no_of_channels (e.g. 2) -- -p PORT_MASK --config="(port,queue,lcore)"
424 eg: ./UDP_Replay -c 0xf -n 4 -- -p 0x3 --config="(0,0,1)"
432 This application implements Access Control List (ACL). ACL is typically used
433 for rule based policy enforcement. It restricts access to a destination IP
434 address/port based on various header fields, such as source IP address/port,
435 destination IP address/port and protocol. It is built on top of DPDK and uses
436 the packet framework infrastructure.
440 This application provides a standalone DPDK based high performance ACL Virtual
441 Network Function implementation.
445 The ACL Filter performs bulk filtering of incoming packets based on rules in
446 current ruleset, discarding any packets not permitted by the rules. The
447 mechanisms needed for building the rule database and performing lookups are
448 provided by the DPDK API.
450 http://dpdk.org/doc/api/rte__acl_8h.html
452 The Input FIFO contains all the incoming packets for ACL filtering. Packets
453 will be dequeued from the FIFO in bulk for processing by the ACL. Packets will
454 be enqueued to the output FIFO.
456 The Input and Output FIFOs will be implemented using DPDK Ring Buffers.
458 The DPDK ACL example:
460 http://dpdk.org/doc/guides/sample_app_ug/l3_forward_access_ctrl.html
The *figure-ipv4-acl-rule* section of that guide contains a suitable syntax
and parser for ACL rules.
466 In ACL, each component is constructed as a packet framework. It includes
467 Master pipeline component, driver, load balancer pipeline component and ACL
468 worker pipeline component. A pipeline framework is a collection of input ports,
469 table(s), output ports and actions (functions).
471 Receive and transmit driver
472 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Packets will be received in bulk and provided to the load balancer thread. The
transmit takes packets from the worker threads in a dedicated ring and sends
them to the hardware queue.
This component does not process any packets and should be configured on
core 0, to save cores for other components which process traffic.
The component is responsible for:
484 1. Initializing each component of the Pipeline application in different threads
485 2. Providing CLI shell for the user
486 3. Propagating the commands from user to the corresponding components.
487 4. ARP and ICMP are handled here.
The load balancer is part of the multi-threaded ACL release which distributes
the flows to multiple ACL worker threads.
It distributes traffic based on the 5-tuple (source address, source port,
destination address, destination port and protocol), applying an XOR logic to
distribute the load to the active worker threads, thereby maintaining an
affinity of flows to worker threads.
Visit the following links for the DPDK ACL library implementation.
505 http://dpdk.org/doc/api/rte__acl_8h.html
506 http://dpdk.org/doc/guides/prog_guide/packet_classif_access_ctrl.html
The ACL worker pipeline provides a shadow copy of the ruleset to support
runtime rule configuration, and implements policy-based packet forwarding.
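
A minimal sketch of the bulk classification step using the DPDK rte_acl API;
building the context (rte_acl_add_rules/rte_acl_build) is omitted and the
helper and burst size are illustrative.

.. code-block:: c

    #include <rte_acl.h>

    #define BURST 32

    /* data[i] must point at the first ACL field of packet i */
    static void
    acl_filter(const struct rte_acl_ctx *ctx, const uint8_t *data[BURST],
               uint32_t nb_pkts)
    {
        uint32_t results[BURST];
        uint32_t i;

        rte_acl_classify(ctx, data, results, nb_pkts, 1 /* category */);

        for (i = 0; i < nb_pkts; i++) {
            if (results[i] == 0)
                ;   /* no rule matched: apply default action (e.g. drop) */
            else
                ;   /* results[i] is the userdata of the matching rule */
        }
    }
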
An Edge Router typically sits between two networks such as the provider core
network and the provider access network. In the diagram below, the Customer
Edge (CE) Router sits in the provider access network and the MPLS cloud
network represents the provider core network.
The edge router processes the traffic in both directions. The functionality
of the Edge Router varies while processing each direction of traffic. The
packets going to the core network will be filtered, classified and metered
with QoS parameters. The packets going to the access network will be shaped
according to the traffic profile.
527 The idea of Edge Router application is to provide the benchmark for the
528 functionality of Provider Edge routers in each direction.
The DPDK IP Pipeline Framework provides a set of libraries to build a pipeline
application. The Provider Edge Router functionality delivered as a virtual
network function (VNF) is integrated with DPDK and optimized for Intel
hardware architecture.
This document assumes the reader possesses knowledge of DPDK concepts and the
IP Pipeline Framework. For more details, read the DPDK Getting Started Guide,
DPDK Programmers Guide and DPDK Sample Applications Guide.
This application provides a standalone DPDK-based high performance Provider
Edge Router Network Function implementation.
The Edge Router application processes the traffic between the Customer and the
Core network.
The Upstream path defines the traffic from Customer to Core and the downstream
path defines the traffic from Core to Customer. The Edge Router has a
different set of components to process Upstream and Downstream traffic.
In the Edge Router application, each component is constructed as a building
block in the IP Pipeline framework. As in the Pipeline framework, each
component has its own input ports, table and output ports. The rules of the
component will be configured in the table, which decides the path of the
traffic and any action to be performed on the traffic. The actions can be
forwarding to the output port, forwarding to the next table or dropping the
packet. For more details, please refer to Section 24 of the DPDK Programmers
Guide (3).
The Core-to-Customer traffic is referred to as downstream. For downstream
processing, the Edge Router has the following functionality.
564 ---> Packet Rx --> Routing --> Traffic Manager --> Packet Tx -->
567 To identify the route based on the destination IP.
568 To provide QinQ label based on the destination IP.
570 Updates the MAC address based on the route entry.
571 Appends the QinQ label based on the route entry.
573 To perform QoS traffic management (5-level hierarchical scheduling) based on
574 the predefined set of Service Level Agreements (SLAs)
575 SVLAN, CVLAN, DSCP fields are used to determine transmission priority.
The Traffic Manager Profile, which contains the SLA parameters, is provided as
part of the application.
The Customer-to-Core traffic is referred to as upstream. For upstream
processing, the Edge Router has the following functionality.
582 ---> Packet Rx --> ACL filters --> Flow control --> Metering Policing &
583 Marking --> Routing --> Queueing & Packet Tx -->
586 To filter the unwanted packets based on the defined ACL rules.
587 Source IP, Destination IP, Protocol, Source Port and Destination Port are
588 used to derive the ACL rules.
590 To classify the packet based on the QinQ label
591 To assign a specific flow id based on the classification.
Two stages of QoS traffic metering and policing are applied.
The 1st stage is performed per flow ID using the trTCM algorithm.
The 2nd stage is performed per flow ID traffic class using the trTCM algorithm.
Packets will be either dropped or marked Green, Yellow or Red based on the
metering result (a metering sketch follows this functional list).
599 To identify the route based on the destination IP
600 To provide MPLS label to the packets based on destination IP.
602 Updates the MAC address based on the route entry.
603 Appends the MPLS label based on the route entry.
Updates the packet color in the MPLS EXP field of each MPLS header.
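
For illustration, per-flow trTCM metering maps onto the classic librte_meter
API roughly as below; the rates are illustrative SLA values and the exact
signatures differ slightly in newer DPDK releases.

.. code-block:: c

    #include <rte_cycles.h>
    #include <rte_meter.h>

    static struct rte_meter_trtcm flow_meter;

    static void
    meter_init(void)
    {
        struct rte_meter_trtcm_params p = {
            .cir = 1250000, .pir = 2500000,  /* committed/peak rates, bytes/s */
            .cbs = 2048,    .pbs = 4096,     /* committed/peak burst sizes    */
        };

        rte_meter_trtcm_config(&flow_meter, &p);
    }

    /* Returns e_RTE_METER_GREEN, _YELLOW or _RED for the packet. */
    static enum rte_meter_color
    meter_packet(uint32_t pkt_len)
    {
        return rte_meter_trtcm_color_blind_check(&flow_meter, rte_rdtsc(),
                                                 pkt_len);
    }
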
The vPE has downstream and upstream pipelines controlled by the Master
component. The edge router processes two different types of traffic through
the pipelines:

I. Downstream (Core-to-Customer)
612 1. Receives TCP traffic from core
613 2. Routes the packet based on the routing rules
614 3. Performs traffic scheduling based on the traffic profile
a. QoS scheduling is performed using the token bucket algorithm
616 SVLAN, CVLAN, DSCP fields are used to determine transmission priority.
617 4. Appends QinQ label in each outgoing packet.
618 II. Upstream (Customer-to-Core)
619 1. Receives QinQ labelled TCP packets from Customer
620 2. Removes the QinQ label
3. Classifies the flow using the QinQ label and applies QoS metering
a. 1st stage QoS metering is performed with flow ID using the trTCM algorithm
b. 2nd stage QoS metering is performed with flow ID and traffic class using the trTCM algorithm
c. The traffic class maps to the DSCP field in the packet.
626 4. Routes the packet based on the routing rules
627 5. Appends two MPLS labels in each outgoing packet.
The Master component is part of all the IP Pipeline applications. This
component does not process any packets and should be configured on core 0,
to save cores for other components which process traffic. The component is
responsible for:
636 1. Initializing each component of the Pipeline application in different threads
637 2. Providing CLI shell for the user
638 3. Propagating the commands from user to the corresponding components.
640 Upstream and Downstream Pipelines
641 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
643 The downstream will have Firewall, Pass-through, Metering and Routing pipelines.
644 The upstream will have Pass-through and Routing pipelines.
646 To run the VNF, execute the following:
650 isb_root/VNFs/vPE$ ./build/ip_pipeline -p 0x3 \
651 -f config/auto_combo_1_instances_1_queues_2_ports_v2.cfg \
652 -s config/auto_combo_1_instances_1_queues_2_ports_v2.txt
655 Prox - Packet pROcessing eXecution engine
656 ==========================================
The Packet pROcessing eXecution Engine (PROX) is a DPDK application. PROX can
perform operations on packets in a highly configurable manner. The PROX
application also displays performance statistics that can be used for
performance investigations.
Intel® DPPD - PROX is an application built on top of DPDK which allows
creating software architectures, such as the one depicted below, through small
and readable configuration files.
669 .. image:: images/prox-qo-img01.png
671 The figure shows that each core is executing a set of tasks. Currently,
672 a task can be any one of the following:
676 3. Basic Forwarding (no touch)
677 4. L2 Forwarding (change MAC)
679 6. Load balance based on packet fields
680 7. Symmetric load balancing
681 8. QinQ encap/decap IPv4/IPv6
689 One of the example configurations that is distributed with the source code is a
690 Proof of Concept (PoC) implementation of a Broadband Network Gateway (BNG)
691 with Quality of Service (QoS).
692 The software architecture for this PoC is presented below.
694 .. image:: images/prox-qo-img02.png
696 The display shows per task statistics through an ncurses interface.
697 Statistics include: estimated idleness; per second statistics for packets
698 received, transmitted or dropped; per core cache occupancy; cycles per packet.
699 These statistics can help pinpoint bottlenecks in the system.
700 This information can then be used to optimize the configuration.
701 Other features include debugging support, scripting,
702 Open vSwitch support... A screenshot of the display is provided below.
704 .. image:: images/prox-screen-01.png