.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0

Detailed architecture and message flows
=======================================

Within the Promise project we consider two architectural options: a
*shim-layer* based architecture and an architecture targeting full
OpenStack *integration*.

Shim-layer architecture
-----------------------

The *shim-layer architecture* uses a layer on top of OpenStack to provide
the capacity management, resource reservation, and resource allocation features.

Detailed Message Flows
^^^^^^^^^^^^^^^^^^^^^^

Note that only selected parameters for the messages are shown. Refer to
:ref:`northbound_API` and Annex :ref:`yang_schema` for a full set of message
parameters.

Resource Capacity Management
""""""""""""""""""""""""""""

.. figure:: images/figure5_new.png
   :name: figure5

   Capacity Management Scenario

:numref:`figure5` shows a detailed message flow between the Consumers and the
capacity management functional blocks inside the shim-layer. It has the
following steps:

* Step 1a: The Consumer sends a *query-capacity* request to Promise
  using a filter such as a time window or resource type. The capacity is
  looked up in the shim-layer capacity map.

* Step 1b: The shim-layer responds with information about the
  total, available, reserved, and used (allocated) capacities matching the
  filter.

* Step 2a: The Consumer can send *increase/decrease-capacity* requests
  to update the capacity available to the reservation system. This can be
  100% of the available capacity in the given provider/source or only a subset,
  i.e., it can leave some "buffer" in the actual NFVI to be
  used outside the Promise shim-layer or by a different reservation
  service instance. It can also be used to inform the reservation system
  that from a certain time in the future, additional resources can be
  reserved (e.g. due to a planned upgrade of the capacity), or that the
  available capacity will be reduced (e.g. due to a planned downtime of
  some of the resources).

* Step 2b: The shim-layer responds with an ACK/NACK message.

* Step 3a: Consumers can subscribe to capacity-change events using a
  filter.

* Step 3b: Each successful subscription is acknowledged with a
  subscription identifier.

* Step 4: The shim-layer monitors the capacity information for the
  various types of resources by periodically querying the various
  Controllers (e.g. Nova, Neutron, Cinder) or by creating event alarms in
  the VIM (e.g. with Ceilometer for OpenStack), and updates the capacity
  information in its capacity map.

* Step 5: The shim-layer notifies the Consumer of capacity changes.

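
The capacity management steps above can be sketched in Python. This is a
minimal, hypothetical in-memory model and not the Promise implementation:
the class ``CapacityMap``, its method names, and the ``vcpus`` resource type
are assumptions made for illustration. Time-window filters and the VIM
polling of step 4 are left out, and reserved and used capacity are assumed
to be tracked as disjoint amounts.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CapacityMap:
    """Hypothetical in-memory stand-in for the shim-layer capacity map.

    Assumption: 'reserved' and 'used' are tracked as disjoint amounts.
    """
    total: Dict[str, int] = field(default_factory=dict)
    reserved: Dict[str, int] = field(default_factory=dict)
    used: Dict[str, int] = field(default_factory=dict)
    subscribers: List[Callable[[dict], None]] = field(default_factory=list)

    def query_capacity(self, resource_type: str) -> Dict[str, int]:
        # Steps 1a/1b: look up the capacities for one resource type.
        total = self.total.get(resource_type, 0)
        reserved = self.reserved.get(resource_type, 0)
        used = self.used.get(resource_type, 0)
        return {"total": total, "reserved": reserved, "used": used,
                "available": total - reserved - used}

    def change_capacity(self, resource_type: str, delta: int) -> bool:
        # Steps 2a/2b: a positive delta increases, a negative delta decreases.
        # Return False (NACK) if a decrease would cut into reserved or used
        # capacity.
        if self.query_capacity(resource_type)["available"] + delta < 0:
            return False
        self.total[resource_type] = self.total.get(resource_type, 0) + delta
        self._notify(resource_type)
        return True

    def subscribe(self, callback: Callable[[dict], None]) -> int:
        # Steps 3a/3b: register a callback, return a subscription identifier.
        self.subscribers.append(callback)
        return len(self.subscribers) - 1

    def _notify(self, resource_type: str) -> None:
        # Step 5: push the new capacity figures to every subscriber.
        event = {"resource_type": resource_type,
                 **self.query_capacity(resource_type)}
        for callback in self.subscribers:
            callback(event)


events: List[dict] = []
cmap = CapacityMap()
sub_id = cmap.subscribe(events.append)
ack = cmap.change_capacity("vcpus", 16)     # increase-capacity request
cmap.reserved["vcpus"] = 4                  # pretend 4 vCPUs are reserved
nack = cmap.change_capacity("vcpus", -13)   # would cut into reserved capacity
print(cmap.query_capacity("vcpus"))
```

In this sketch only an accepted capacity change produces a notification, so
the rejected decrease leaves both the capacity map and the event list
untouched.
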
Resource Reservation
""""""""""""""""""""

.. figure:: images/figure6_new.png
   :name: figure6

   Resource Reservation for Future Use Scenario

:numref:`figure6` shows a detailed message flow between the Consumer and the
resource reservation functional blocks inside the shim-layer. It has the
following steps:

* Step 1a: The Consumer creates a resource reservation request for
  future use by setting a start and end time for the reservation as well as
  more detailed information about the resources to be reserved. The Promise
  shim-layer checks the free capacity in the given time window and, if
  sufficient capacity exists to meet the reservation request,
  marks those resources as "reserved" in its reservation map.

* Step 1b: If the reservation was successful, a reservation_id and the
  status of the reservation are returned to the Consumer. If the
  reservation cannot be met, the shim-layer may return information about
  the maximum capacity that could be reserved during the requested time
  window and/or a potential time window in which the requested amount of
  resources would be available.

* Step 2a: Reservations can be updated using an *update-reservation*
  request that provides the reservation_id and the new reservation_data. The
  Promise Reservation Manager checks the feasibility of updating the
  reservation.

* Step 2b: If the reservation was updated successfully, a
  reservation_id and the status of the reservation are returned to the
  Consumer. Otherwise, an appropriate error message is returned.

* Step 3a: A *cancel-reservation* request can be used to withdraw an
  existing reservation. Promise updates the reservation map by removing
  the reservation and the capacity map by adding the freed capacity.

* Step 3b: The response message confirms the cancellation.

* Step 4a: Consumers can also issue *query-reservation* requests to
  receive a list of reservations. An input filter can be used to narrow down
  the query, e.g., to return only reservations in a given time window.
  Promise queries its reservation map to identify reservations matching
  the input filter.

* Step 4b: The response message contains information about all
  reservations matching the input filter. It also provides information
  about the utilization in the requested time window.

* Step 5a: Consumers can subscribe to reservation-change events using
  a filter.

* Step 5b: Each successful subscription is acknowledged with a
  subscription identifier.

* Step 6a: Promise synchronizes the available and used capacity with
  the underlying VIM.

* Step 6b: In certain cases, e.g., due to a failure in the underlying
  hardware, some reservations can no longer be kept and have to be
  updated or canceled. The shim-layer identifies the affected reservations
  among its reservation records.

* Step 7: Subscribed Consumers are informed about the updated
  reservations. The notification contains the updated reservation_data and
  the new status of the reservation. It is then up to the Consumer to take
  appropriate actions to ensure that high priority reservations are
  favored over lower priority reservations.

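
The feasibility check of step 1a and the update, cancel, and query requests
of steps 2 to 4 can be sketched as follows. This is an illustrative model
under stated assumptions, not the Promise data model: ``ReservationMap`` and
its methods are hypothetical names, only a single resource type is tracked,
times are plain integers, and ``start < end`` is assumed for every window.
The check computes the peak reserved amount at any instant of the requested
window, which can only change at the window start or at the start of an
overlapping reservation.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Reservation:
    start: int    # inclusive
    end: int      # exclusive
    amount: int   # e.g. number of vCPUs


class ReservationMap:
    """Hypothetical reservation map for one resource type."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.reservations: Dict[int, Reservation] = {}
        self._ids = count(1)

    def _peak(self, start: int, end: int, ignore: Optional[int] = None) -> int:
        # Highest reserved amount at any instant in [start, end).
        others = [r for rid, r in self.reservations.items()
                  if rid != ignore and r.start < end and start < r.end]
        instants = [start] + [r.start for r in others if r.start > start]
        return max(sum(r.amount for r in others if r.start <= t < r.end)
                   for t in instants)

    def create(self, start: int, end: int, amount: int) -> Optional[int]:
        # Steps 1a/1b: reserve only if the window has enough free capacity.
        if self._peak(start, end) + amount > self.capacity:
            return None
        rid = next(self._ids)
        self.reservations[rid] = Reservation(start, end, amount)
        return rid

    def update(self, rid: int, start: int, end: int, amount: int) -> bool:
        # Steps 2a/2b: check feasibility without counting the old booking.
        if (rid not in self.reservations
                or self._peak(start, end, ignore=rid) + amount > self.capacity):
            return False
        self.reservations[rid] = Reservation(start, end, amount)
        return True

    def cancel(self, rid: int) -> bool:
        # Steps 3a/3b: removing the record frees its capacity.
        return self.reservations.pop(rid, None) is not None

    def query(self, start: int, end: int) -> List[int]:
        # Steps 4a/4b: ids of reservations overlapping the time window.
        return [rid for rid, r in self.reservations.items()
                if r.start < end and start < r.end]


rmap = ReservationMap(capacity=10)
first = rmap.create(0, 100, 6)       # accepted
clash = rmap.create(50, 150, 6)      # rejected: 12 > 10 during [50, 100)
second = rmap.create(100, 200, 6)    # accepted: does not overlap the first
print(first, clash, second)
```

A fuller model would also return the alternatives mentioned in step 1b
(maximum reservable capacity or a later feasible window) instead of a plain
rejection.
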
Resource Allocation
"""""""""""""""""""

.. figure:: images/figure7_new.png
   :name: figure7

:numref:`figure7` shows a detailed message flow between the Consumer, the
functional blocks inside the shim-layer, and the VIM. It has the following
steps:

* Step 1a: The Consumer sends a *create-instance* request providing
  information about the resources to be allocated, i.e., provider_id
  (optional if there is only one provider), name of the instance, the
  requested flavor and image, etc. If the allocation is against an
  existing reservation, the reservation_id has to be provided.

* Step 1b: If a reservation_id was provided, Promise checks that a
  reservation with that ID exists, that the reservation start time has arrived
  (i.e. the reservation is active), and that the required capacity for the
  requested flavor is within the available capacity of the reservation. If
  those conditions are met, Promise creates a record for the allocation
  (VMState="INITIALIZED") and updates its databases. If no reservation_id
  was provided in the allocation request, Promise checks whether the
  required capacity to meet the request can be provided from the available,
  non-reserved capacity. If yes, Promise creates a record for the
  allocation and updates its databases. In any other case, Promise rejects
  the *create-instance* request.

* Step 2: If the *create-instance* request was rejected,
  Promise responds with "status=rejected" and the reason for the
  rejection. This helps the Consumer to take appropriate actions, e.g.,
  to send an updated *create-instance* request. The allocation workflow
  terminates at this step and the steps below are not executed.

* Step 3a: If the *create-instance* request was accepted and a related
  allocation record has been created, the shim-layer issues a
  *createServer* request to the VIM Controller providing all information needed
  to create the server instance.

* Step 3b: The VIM Controller sends an immediate reply with an
  instance_id and starts the VIM-internal allocation process.

* Step 4: The Consumer gets an immediate response message with the
  allocation status "in progress" and the assigned instance_id.

* Step 5a+b: The Consumer subscribes to receive notifications about
  allocation events related to the requested instance. Promise responds
  with an acknowledgment including a subscribe_id.

* Step 6: In parallel to the previous step, the Promise shim-layer creates
  an alarm in Aodh to receive notifications about all changes to the
  VMState for instance_id.

* Step 7a: The VIM Controller reports all instance-related events to
  Ceilometer. After the allocation has completed or failed, it sends
  an event to Ceilometer. This triggers the OpenStack alarming service Aodh
  to notify the shim-layer of the new VMState (e.g. ACTIVE or ERROR), and the
  shim-layer updates its internal allocation records.

* Step 7b: Promise sends a notification message to the subscribed
  Consumer with information on the allocated resources including their new
  state.

* Step 8a+b: The Consumer can terminate allocated instances by
  sending a *destroy-instance* request to the shim-layer. Promise responds
  with an acknowledgment and the new status "DELETING" for the instance.

* Step 9a: Promise sends a *deleteServer* request for the instance_id
  to the VIM Controller.

* Step 10a: After the instance has been deleted, an event alarm is
  sent to the shim-layer, which updates its internal allocation records and
  capacity utilization.

* Step 10b: The shim-layer also notifies the subscribed Consumer about
  the successfully destroyed instance.

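
The request check of step 1b and the VMState changes of the later steps can
be sketched as below. The function and type names are hypothetical, capacity
is reduced to a single integer, and the terminal ``DELETED`` state and the
exact transition table are assumptions for illustration; the text above only
names ``INITIALIZED``, ``ACTIVE``, ``ERROR``, and ``DELETING``.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ReservationRecord:
    start: int       # reservation start time
    remaining: int   # reserved capacity not yet allocated


def check_request(needed: int, now: int, free_unreserved: int,
                  reservations: Dict[str, ReservationRecord],
                  reservation_id: Optional[str] = None) -> Tuple[bool, str]:
    """Step 1b: decide whether a create-instance request is accepted."""
    if reservation_id is None:
        # No reservation: serve from the available, non-reserved capacity.
        if needed <= free_unreserved:
            return True, "ok"
        return False, "insufficient non-reserved capacity"
    record = reservations.get(reservation_id)
    if record is None:
        return False, "unknown reservation"
    if now < record.start:
        return False, "reservation not active yet"
    if needed > record.remaining:
        return False, "exceeds reservation capacity"
    return True, "ok"


# Assumed VMState transitions; DELETED is a hypothetical terminal state.
TRANSITIONS = {
    "INITIALIZED": {"ACTIVE", "ERROR"},   # step 7a: allocation done or failed
    "ACTIVE": {"DELETING", "ERROR"},      # step 8: destroy-instance
    "ERROR": {"DELETING"},
    "DELETING": {"DELETED"},              # step 10a: instance deleted
}


def advance(state: str, new_state: str) -> str:
    """Return new_state if the transition is allowed, else raise ValueError."""
    if new_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state


reservations = {"r1": ReservationRecord(start=100, remaining=4)}
print(check_request(2, now=150, free_unreserved=0,
                    reservations=reservations, reservation_id="r1"))
print(check_request(2, now=50, free_unreserved=0,
                    reservations=reservations, reservation_id="r1"))
print(advance("INITIALIZED", "ACTIVE"))
```

Returning the reason together with the decision mirrors step 2, where the
rejection reason is passed back to the Consumer.
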
.. note:: This section is to be updated

In the following, the internal logic and operations of the shim-layer will be
explained in more detail, e.g. the "check request" (step 1b in
:numref:`figure7` of the allocation workflow).

Integrated architecture
-----------------------

The *integrated architecture* aims at full integration with OpenStack.
The plan is to use the existing OpenStack APIs, extended with
reservation capabilities.

The advantage of this approach is that we don't need to re-model the
complex resource structure we have for the virtual machines and the
corresponding infrastructure.

The atomic item is the virtual machine with the minimum set of resources
it requires to start up. It is important to state that
resource reservation is handled at the VM instance level as opposed to standalone
resources like CPU, memory and so forth. Because placement determines
whether the reserved resources can actually be used, it imposes the
constraint that resources be handled in groups.

The placement constraint also makes it impossible to use a quota management
system to solve the base use case described earlier in this document.

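
The following sketch illustrates the difference between an aggregate,
quota-style check and a placement-aware check. The host sizes, resource
names, and function names are made up for this example and do not come from
OpenStack or Promise.

```python
from typing import Dict, List


def fits_quota(free_per_host: List[Dict[str, int]], vm: Dict[str, int]) -> bool:
    """Quota-style check: compare the request with the summed free resources."""
    return all(sum(host[name] for host in free_per_host) >= amount
               for name, amount in vm.items())


def fits_placement(free_per_host: List[Dict[str, int]], vm: Dict[str, int]) -> bool:
    """Placement-aware check: one host must satisfy every resource of the VM."""
    return any(all(host[name] >= amount for name, amount in vm.items())
               for host in free_per_host)


# Hypothetical free resources on two hosts and one VM request.
hosts = [{"vcpus": 2, "ram_gb": 16}, {"vcpus": 6, "ram_gb": 2}]
vm = {"vcpus": 4, "ram_gb": 8}

print(fits_quota(hosts, vm))       # True: 8 vCPUs and 18 GB are free in total
print(fits_placement(hosts, vm))   # False: no single host offers both
```

The quota check accepts the request although the VM cannot be started on
either host, which is why the resources of a VM have to be reserved as a
group.
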
OpenStack had a project called Blazar, which was created to provide
resource reservation functionality in cloud environments. It uses the shelve
API of Nova, which provides a sub-optimal solution. Because this
feature blocks the reserved resources, this solution cannot be considered
final. Further work is needed to reach a more optimal stage, in which the
Nova scheduler is used to schedule resources for future
use when making reservations.

The work has two main stages to reach the final solution. The following main work items
are on the roadmap for this approach:

#. Sub-optimal solution using the shelve API of Nova through the Blazar project:

   * Fix the code base of the Blazar project:

     Due to integration difficulties the Blazar project was suspended. Since the last
     activities in that repository the OpenStack code base and environment have changed
     significantly, which means that the project's code base needs to be updated to the
     latest standards and has to be able to interact with the latest versions of the
     other OpenStack services.

   * Update the Blazar API:

     The REST API needs to be extended to contain the attributes for the reservation
     defined in this document. This activity shall include testing of the new API.

#. Use the Nova scheduler to avoid blocking the reserved resources:

   * Analyze the Nova scheduler:

     The status of the Nova scheduler and the possible interface between it and the
     resource reservation system need to be identified. It is crucial to achieve a much
     more optimal solution than what the current version of Blazar can provide. The goal is
     to be able to use the reserved resources before the reservation starts. To
     achieve this, we need the scheduler to schedule for the future,
     considering the reservation intervals that are specified in the request.

   * Define a new design based on the analysis and start the work on it:

     The design for the more optimal solution can be defined only after analyzing the
     structure and capabilities of the Nova scheduler.

   * This phase can be started in parallel with the previous one.

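
The goal of the second stage, using reserved resources before the
reservation starts, can be illustrated by comparing two ways of computing
free capacity. This is a conceptual sketch only: it does not use or describe
the Nova scheduler or Blazar interfaces, and the function names and the
``(start, end, amount)`` tuple layout are assumptions.

```python
from typing import List, Tuple

Booking = Tuple[int, int, int]   # (start, end, amount), end is exclusive


def free_if_blocked(capacity: int, reservations: List[Booking]) -> int:
    """Blocking model: every reservation holds its resources from now on."""
    return capacity - sum(amount for _, _, amount in reservations)


def free_in_window(capacity: int, reservations: List[Booking],
                   start: int, end: int) -> int:
    """Interval-aware model: only reservations overlapping [start, end) count."""
    overlapping = [r for r in reservations if r[0] < end and start < r[1]]
    # The reserved amount can only rise at the window start or at the start
    # of an overlapping reservation, so checking those instants is enough.
    instants = [start] + [s for s, _, _ in overlapping if s > start]
    peak = max(sum(a for s, e, a in overlapping if s <= t < e)
               for t in instants)
    return capacity - peak


reservations = [(100, 200, 8)]   # 8 vCPUs reserved for the interval [100, 200)

print(free_if_blocked(10, reservations))          # 2
print(free_in_window(10, reservations, 0, 100))   # 10
print(free_in_window(10, reservations, 50, 150))  # 2
```

With the interval-aware check, a workload that ends before time 100 can use
all ten vCPUs, whereas the blocking model would offer only two.
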
Detailed Message Flows
^^^^^^^^^^^^^^^^^^^^^^

.. note:: to be specified