.. This work is licensed under a Creative Commons Attribution 4.0 International
.. License. http://creativecommons.org/licenses/by/4.0
.. (c) Xuan Jia (China Mobile)

===========================================================================
OpenRetriever Next Gen VIM & Edge Computing Scheduler Requirements Document
===========================================================================

Created by the OPNFV OpenRetriever Team

| Amar Kapadia
| Wassim Haddad
| Heikki Mahkonen
| Srinivasa Addepalli


| v1.0 5/3/17
| v1.1 5/16/17
| v1.2 7/26/17

Motivation
----------

The OpenRetriever team believes that existing and new NFV workloads can
benefit from a new VIM placement and scheduling component. We further
believe that the same requirements will be very useful for edge
computing scheduling. This document aims to capture the requirements
for this effort.

By placement and scheduling, we mean:

-  Choose which hardware node to run the VNF on, based on factors such
   as AAA, ML prediction or MANO

-  Start the VNF(s) depending on a trigger, e.g. upon receiving requests
   such as DHCP or DNS, upon a data packet, or upon a NULL trigger

We use the generic term “scheduler” to refer to the placement and
scheduling component in the rest of this document. We are not including
lifecycle management of the VNF in our definition of the scheduler.

At a high level, we believe the VIM scheduler must:

-  Support virtual machines, containers and unikernels

-  Support legacy and event-driven scheduling

   -  By legacy scheduling we mean scheduling without any trigger (see
      above), i.e. the current technique used by schedulers such as
      OpenStack Nova.
   -  By event-driven scheduling we mean scheduling with a trigger (see
      above). We do not mean that the unikernel or container that is going
      to run the VNF is already running. The instance is started and torn
      down in response to traffic. The two-step process is transparent to
      the user (a minimal sketch follows this list).
   -  More specialized higher-level schedulers and orchestration systems
      may be run on top, e.g. FaaS (similar to AWS Lambda) etc.

Serverless vs. FaaS vs. Event-Driven Terminology:

============  =================================================================
Serverless    By serverless, we mean a general PaaS concept where the user
              does not have to specify which physical or virtual compute
              resource their code snippet or function will run on. The code
              snippet/function is executed in response to an event.
FaaS          We use this term synonymously with serverless.
Event-Driven  By event-driven, we mean an entire microservice or service (as
              opposed to a code snippet) is executed in response to an event.
============  =================================================================

-  Work in distributed edge environments

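
To make the legacy vs. event-driven distinction concrete, the following
minimal sketch models the event-driven case. The names used here
(``EventDrivenScheduler``, ``on_trigger``, ``reap_idle``) are invented for
illustration and do not refer to an existing scheduler API.

.. code-block:: python

   import time
   from dataclasses import dataclass, field


   @dataclass
   class Instance:
       vnf: str
       last_seen: float = field(default_factory=time.monotonic)


   class EventDrivenScheduler:
       """Toy model: boot a VNF instance on a trigger, reap it when idle."""

       def __init__(self, idle_timeout_s=30.0):
           self.idle_timeout_s = idle_timeout_s
           self.instances = {}

       def on_trigger(self, vnf):
           # The first packet (e.g. a DNS request or TCP SYN) for an idle VNF
           # boots it; subsequent packets just refresh the idle timer.
           inst = self.instances.get(vnf)
           if inst is None:
               inst = Instance(vnf)            # stands in for a real boot call
               self.instances[vnf] = inst
           inst.last_seen = time.monotonic()
           return inst

       def reap_idle(self):
           # Tear down instances that have seen no traffic for a while.
           now = time.monotonic()
           for vnf, inst in list(self.instances.items()):
               if now - inst.last_seen > self.idle_timeout_s:
                   del self.instances[vnf]     # stands in for a real teardown


   sched = EventDrivenScheduler(idle_timeout_s=30.0)
   sched.on_trigger("dns-fw")   # first packet arrives -> instance is booted
   sched.reap_idle()            # periodic housekeeping
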

Please provide your inputs. Once we have a comprehensive list of
requirements, we will investigate what the right open source solution
should be, and how to influence that particular project.

Use cases
---------

A number of NFV use cases can benefit from a new VIM scheduler:

vCPE
~~~~

vCPE can benefit from a new scheduler in two ways:

1. uCPE devices have very few cores (4-8 is typical). Running statically
   scheduled VMs is inefficient. An event-driven scheduler would help
   optimize the hardware resources and increase capacity.

2. vCPE is a bursty NFV use case, where services are not “on” all the time.
   Legacy provisioning of virtual machines for each VNF significantly
   reduces resource utilization, which in turn negatively impacts the
   total cost of ownership (TCO). Recent Intel studies have shown that, in
   certain cases, vCPE saves 30-40% in TCO over physical functions. This
   number is hardly compelling; we believe it needs to be significantly
   higher to be of any interest. This can be accomplished by increasing
   utilization, which in turn can be achieved through event-driven
   scheduling.

IOT/MEC
~~~~~~~

IOT & multi-access edge computing
(`*MEC* <http://www.etsi.org/technologies-clusters/technologies/multi-access-edge-computing>`__)
share many of the same characteristics as the uCPE. Though serverless
functions increase resource utilization, they do not give application
developers the ability to introduce traditional security functions.
Serverless services that can be brought up on demand both increase
resource utilization and make it possible to introduce security
functions within the service. Additionally, there is a need for low
latency and high security. A new scheduler can help with these needs.

5G
~~

5G brings with it a number of the above requirements, but perhaps the one
that stands out the most is price/performance. By using containers and
unikernels, the price/performance ratio can be significantly improved.
(Containers or unikernels result in ~10x density with legacy scheduling;
higher density is possible with event-driven scheduling.) 5G also brings
the MEC and IOT needs from the prior use case.

Security
~~~~~~~~

Many traditional services are always on. Always-on services give
attackers enough time to find vulnerabilities and exploit them. Bringing
up workloads on demand and terminating them once their work is done
removes the time advantage attackers have. For example, in a three-tier
architecture of “Web”, “App” and “DB” services, the following on-demand
bring-up would reduce the attack surface:

-  On-demand bring-up of the “DB” service upon an “App” layer request.
-  On-demand bring-up of the “App” service once the “Web” layer
   authenticates the user.
-  On-demand bring-up of the “Web” service upon a “DNS” request or upon
   seeing a “SYN” packet.

Workloads can be brought down upon inactivity or via application-specific
methods. Thin services (implemented using unikernels and Clear
Containers) and fast schedulers are required to enable this kind of
security; a small sketch of such a bring-up chain follows.

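
The sketch below is illustrative only; ``bring_up`` and the trigger names
are hypothetical placeholders for whatever primitives an event-driven
scheduler would expose.

.. code-block:: python

   # Hypothetical on-demand bring-up chain for a three-tier service.
   # bring_up() stands in for whatever primitive the scheduler exposes.

   TRIGGERS = {
       "Web": {"dns-request", "tcp-syn"},   # Web boots on a DNS request or SYN
       "App": {"web-authenticated"},        # App boots once Web authenticates
       "DB": {"app-request"},               # DB boots on an App layer request
   }

   running = set()


   def bring_up(tier):
       # In a real system this would boot a unikernel/Clear Container image.
       running.add(tier)
       print(tier + " tier is now running")


   def on_event(event):
       # Boot only the tiers whose trigger matches; everything else stays
       # down, which keeps the attack surface small.
       for tier, triggers in TRIGGERS.items():
           if event in triggers and tier not in running:
               bring_up(tier)


   on_event("tcp-syn")            # first packet: only the Web tier comes up
   on_event("web-authenticated")  # successful login: the App tier comes up
   on_event("app-request")        # App needs data: the DB tier comes up
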

Detailed Requirements
---------------------

Multiple compute types
~~~~~~~~~~~~~~~~~~~~~~

====================================  ==========================================
Requirement                           Details
====================================  ==========================================
Support for virtual machines          VMs are the most common form of VNFs and are not
                                      going away anytime soon. A scheduler must be able to
                                      support VMs. In theory, the MANO software could use two
                                      VIMs: one for VMs and another for containers/unikernels.
                                      However, we believe this is a suboptimal solution, since
                                      the operational complexity doubles: the ops team has to
                                      deal with two VIM software layers. Networking
                                      coordination between the two VIM layers also becomes
                                      complex.

                                      NOTE: Bare-metal server scheduling, e.g. OpenStack
                                      Ironic, is out of scope for this document.

Support containers                    This need is clear: the future of VNFs appears to be
                                      containerized VNFs. Containers are 10x more dense than
                                      VMs and boot 10x faster. Containers will also accelerate
                                      the move to cloud-native VNFs. Some users may want nested
                                      scheduling, e.g. containers in VMs or containers in
                                      containers. Nested scheduling is out of scope for this
                                      document; we focus on a single layer of the scheduling
                                      problem and expect any other scheduling layer to be
                                      distinct and separate.

Support unikernels                    Unikernels are lightweight VMs, with the same density as
                                      containers but faster boot times. Because unikernels are
                                      VMs with an incredibly small attack surface, they have
                                      rock-solid security characteristics. Unikernels also
                                      offer higher performance than VMs. For these reasons,
                                      unikernels could play an important role in NFV. The
                                      downsides are that i) they are new, ii) they are often
                                      tied to a programming language and iii) they require a
                                      software recompile. Unikernels are an ideal fit for
                                      micro-VNFs. More specifically:

                                      -  Need VNFs to be highly secure by significantly
                                         reducing the attack surface

                                      -  Need to be able to schedule onto NFVI with
                                         high-performance, OVS-less service chaining (e.g.
                                         through shared memory) that can significantly improve
                                         performance

Colocation                            We need support for affinity/anti-affinity constraints
                                      on the VNF compute type (i.e. VM, unikernel, container).
                                      This makes colocation of different VNF compute types on
                                      the same host possible, if needed.

Support all compute types on one SFC  Since VNFs are procured from different vendors, it is
                                      possible to get a mix of compute types: VMs, containers
                                      and unikernels. It should be possible to construct a
                                      service function chain from heterogeneous compute types.

Unified API for all compute types     Even though it is theoretically possible to have
                                      different APIs for different compute types and push the
                                      problem to the MANO layer, this increases the overall
                                      complexity of the solution. For this reason, the API
                                      needs to be unified and consistent across compute types
                                      (a sketch follows this table).

Hardware awareness                    Ability to place workloads with specific hardware or
                                      underlying infrastructure capabilities (e.g. Intel
                                      EPA [1]_, FD.io, Smart NICs, Trusted Execution
                                      Environment, shared memory switching etc.).

Rich networking                       The new VIM scheduler needs to be supported by the rich
                                      networking features currently available to OpenStack
                                      Nova through OpenStack Neutron (see the document
                                      outlining K8s
                                      `*networking* <https://docs.google.com/document/d/1TW3P4c8auWwYy-w_5afIPDcGNLK3LZf0m14943eVfVg/edit?ts=5901ec88>`__
                                      requirements as an example):

                                      -  Ability to create multiple IP addresses per VNF

                                      -  Networks without cluster-wide connectivity and
                                         without visibility to each other

                                      -  Multi-tenancy: i) support traffic isolation between
                                         compute entities belonging to different tenants, ii)
                                         support overlapping IP addresses across VNFs.

                                      -  Limit services such as load balancing, service
                                         discovery etc. to certain network interfaces (see the
                                         additional
                                         `*document* <https://docs.google.com/document/d/1mNZZ2lL6PERBbt653y_hnck3O4TkQhrlIzW1cIc8dJI/edit>`__).

                                      -  L2 and L3 connectivity (?)

                                      -  Service discovery

Image repository & shared storage     -  Centralized/distributed image repository

                                      -  Support shared storage (e.g. OpenStack Cinder, K8s
                                         volumes etc.)
====================================  ==========================================

.. [1]
   Intel EPA includes DPDK, SR-IOV, CPU and NUMA pinning, Huge Pages
   etc.

[OPEN QUESTION] What subset of the Neutron functionality is required
here?
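
To make the unified-API requirement above concrete, here is a minimal
sketch of what a compute-type-agnostic scheduling request could look like;
``ComputeType``, ``ScheduleRequest`` and ``schedule`` are invented names,
not an existing VIM API.

.. code-block:: python

   from dataclasses import dataclass, field
   from enum import Enum


   class ComputeType(Enum):
       VM = "vm"
       CONTAINER = "container"
       UNIKERNEL = "unikernel"


   @dataclass
   class ScheduleRequest:
       """One request format, regardless of the compute type being booted."""
       name: str
       image: str              # image reference in the (shared) repository
       compute_type: ComputeType
       vcpus: int = 1
       memory_mb: int = 256
       networks: tuple = ()    # e.g. ("mgmt", "wan")
       affinity: dict = field(default_factory=dict)


   def schedule(request):
       # A real scheduler would pick a node based on hardware capabilities,
       # affinity rules and the compute type; here we only echo the decision.
       return "scheduled %s (%s) on node-0" % (request.name,
                                               request.compute_type.value)


   # The same call works for a VM, a container or a unikernel:
   print(schedule(ScheduleRequest("vfw", "repo/vfw:1.0", ComputeType.UNIKERNEL,
                                  networks=("mgmt", "wan"))))
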

Multiple scheduling techniques
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

=======================  ==================================================
Requirement              Details
=======================  ==================================================
Legacy scheduling        This is the current technique used by OpenStack
                         Nova and by container orchestration engines.
                         Legacy scheduling needs to be supported as-is.

Event-driven scheduling  This applies only to unikernels, since unikernels
                         are the only compute type that can boot at packet
                         RTT. Thus, the requirement is to be able to
                         schedule and boot unikernel instances in response
                         to events within 30 ms as a must-have and within
                         10 ms as a nice-to-have.

Distributed scheduling   Since services need to be brought up at packet
                         RTT, there may be a requirement to distribute the
                         scheduling across compute nodes.

Multi-stage scheduling   To enable scheduling of services at packet RTT,
                         there is a need to divide scheduling into at
                         least two stages: an initial stage, where
                         multiple service images are uploaded to candidate
                         compute nodes, and a second stage, where the
                         distributed scheduler brings up the service using
                         the locally cached images (a sketch follows this
                         table).
=======================  ==================================================

[OPEN QUESTION] What subset of the rich scheduler feature-set is
required here? (e.g. affinity, anti-affinity, understanding of dataplane
acceleration etc.)
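
A minimal sketch of the multi-stage idea described above, assuming invented
helpers ``stage_images`` and ``boot_local``; it only shows the split between
slow image pre-placement and fast, local bring-up.

.. code-block:: python

   import time


   def stage_images(service, candidates):
       """Stage 1 (slow, done ahead of time): push the service image to the
       local cache of every candidate compute node."""
       for node in candidates:
           # Placeholder for a real transfer (e.g. an image pull on the node).
           print("staging image for %s on %s" % (service, node))


   def boot_local(service):
       """Stage 2 (fast, on trigger): the distributed scheduler on the chosen
       node boots the service from its locally cached image."""
       start = time.monotonic()
       # Placeholder for the actual unikernel/container boot call.
       return (time.monotonic() - start) * 1000.0   # elapsed milliseconds


   stage_images("vfw", ["node-1", "node-2"])   # off the critical path
   elapsed_ms = boot_local("vfw")              # on the packet-RTT critical path
   print("local bring-up took %.3f ms" % elapsed_ms)
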

Highly distributed environments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are two possibilities here. A) The entire VIM will be in an edge
device and the MANO software will have to deal with 10s or 100s of
thousands of VIM instances. B) The alternative is that the VIM itself
will manage edge devices, i.e. the MANO software will deal with a
limited number of VIM instances. Both scenarios are captured below.

================  =======================================================
Requirement       Details
================  =======================================================
Small footprint   It should be possible to run the VIM scheduler in
                  1-2 cores.

Nodes across WAN  It should be possible to distribute the VIM scheduler
                  across nodes separated by long RTT delays (i.e. a WAN).
================  =======================================================

Software Survey Candidates
--------------------------

Once the survey is complete, we will evaluate the following software
stacks against those requirements. Each survey, conducted in person
and/or via documentation review, will consist of:

1. Architecture overview

2. Pros

3. Cons

4. Gap analysis

5. How gaps can be addressed

Each survey is expected to take 3-4 weeks.

========================================  ==================================================
CNCF K8s                                  Srini (talk to Xuan, Frederic, study gap analysis)
Docker Swarm
VMware Photon                             Srikanth
Intel Clear Container                     Srini
Intel Ciao                                Srini
OpenStack Nova
Mesos                                     Srikanth
Virtlet (VM scheduling by K8s)            Amar
Kubelet (VM scheduling by K8s)            Amar
Kuryr (K8s to Neutron interface)          Prem
RunV (like RunC) - can it support a VM?
Nelson distributed container framework
Nomad
========================================  ==================================================

Additional Points to Revisit
----------------------------

-  Guidance on how to create immutable infrastructure with complete
   configuration, and benefits to performance and security
-  Guidance on API - VNFM vs. VIM
280