1 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
2 .. http://creativecommons.org/licenses/by/4.0
3 .. (c) Open Platform for NFV Project, Inc. and its contributors
4
5 *********************
6 OPNFV Fuel User Guide
7 *********************
8
9 Abstract
10 ========
11
12 This document contains details about using the OPNFV Fuel ``Gambia`` release after
13 it was deployed. For details on how to deploy OpenStack, check
14 the installation instructions in the :ref:`fuel_userguide_references` section.
15
16 This is a unified document for both the ``x86_64`` and ``aarch64``
17 architectures. All information is common to both architectures
18 except where explicitly stated otherwise.
19
20 Network Overview
21 ================
22
23 Fuel uses several networks to deploy and administer the cloud:
24
25 +------------------+----------------------------------------------------------+
26 | Network name     | Description                                              |
27 |                  |                                                          |
28 +==================+==========================================================+
29 | **PXE/admin**    | Used for booting the nodes via PXE and/or Salt           |
30 |                  | control network                                          |
31 +------------------+----------------------------------------------------------+
32 | **mcpcontrol**   | Used to provision the infrastructure hosts (Salt & MaaS) |
33 +------------------+----------------------------------------------------------+
34 | **management**   | Used for internal communication between                  |
35 |                  | OpenStack components                                     |
36 +------------------+----------------------------------------------------------+
37 | **internal**     | Used for VM data communication within the                |
38 |                  | cloud deployment                                         |
39 +------------------+----------------------------------------------------------+
40 | **public**       | Used to provide Virtual IPs for public endpoints         |
41 |                  | that are used to connect to OpenStack services APIs.     |
42 |                  | Used by Virtual machines to access the Internet          |
43 +------------------+----------------------------------------------------------+
44
45 These networks, except ``mcpcontrol``, can be Linux bridges configured
46 on the Jumpserver before the deployment.
47 If they don't exist at deploy time, they will be created by the deploy scripts
48 as ``libvirt``-managed networks.
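
For example, one of these bridges could be pre-created on the Jumpserver before
the deployment (a minimal sketch; the bridge name ``admin_br`` and its member
interface ``eth1`` are lab-specific examples and should match the ``IDF``):

.. code-block:: console

    jenkins@jumpserver:~$ sudo brctl addbr admin_br
    jenkins@jumpserver:~$ sudo brctl addif admin_br eth1
    jenkins@jumpserver:~$ sudo ip link set admin_br up
    jenkins@jumpserver:~$ brctl show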
49
50 Network ``mcpcontrol``
51 ~~~~~~~~~~~~~~~~~~~~~~
52
53 ``mcpcontrol`` is a virtual network, managed by libvirt. Its only purpose is to
54 provide a simple method of assigning an arbitrary ``INSTALLER_IP`` to the Salt
55 master node (``cfg01``), to maintain backwards compatibility with old OPNFV
56 Fuel behavior. Normally, end-users only need to change the ``INSTALLER_IP`` if
57 the default CIDR (``10.20.0.0/24``) overlaps with existing lab networks.
58
59 ``mcpcontrol`` has both NAT and DHCP enabled, so the Salt master (``cfg01``)
60 and the MaaS VM (``mas01``, when present) are assigned the predefined IPs ``.2``
61 and ``.3`` respectively, while the jumpserver bridge port gets ``.1``.
62
63 +------------------+---------------------------+-----------------------------+
64 | Host             | Offset in IP range        | Default address             |
65 +==================+===========================+=============================+
66 | ``jumpserver``   | 1st                       | ``10.20.0.1``               |
67 +------------------+---------------------------+-----------------------------+
68 | ``cfg01``        | 2nd                       | ``10.20.0.2``               |
69 +------------------+---------------------------+-----------------------------+
70 | ``mas01``        | 3rd                       | ``10.20.0.3``               |
71 +------------------+---------------------------+-----------------------------+
72
73 This network is limited to the ``jumpserver`` host and does not require any
74 manual setup.
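
If the default ``mcpcontrol`` CIDR overlaps with an existing lab network, a
different ``INSTALLER_IP`` can be passed when launching the deployment (a
sketch; the address below is only an example):

.. code-block:: console

    jenkins@jumpserver:~/fuel$ INSTALLER_IP=10.30.0.2 ci/deploy.sh -l <lab_name> \
                                                                   -p <pod_name> \
                                                                   -s <scenario>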
75
76 Network ``PXE/admin``
77 ~~~~~~~~~~~~~~~~~~~~~
78
79 .. TIP::
80
81     ``PXE/admin`` does not usually use an IP range offset in ``IDF``.
82
83 .. NOTE::
84
85     During ``MaaS`` commissioning phase, IP addresses are handed out by
86     ``MaaS``'s DHCP.
87
88 .. WARNING::
89
90     Default addresses in the table below correspond to a ``PXE/admin`` CIDR of
91     ``192.168.11.0/24`` (the usual value used in OPNFV labs).
92
93     This is defined in ``IDF`` and can easily be changed to something else.
94
95 .. TODO: detail MaaS DHCP range start/end
96
97 +------------------+-----------------------+---------------------------------+
98 | Host             | Offset in IP range    | Default address                 |
99 +==================+=======================+=================================+
100 | ``jumpserver``   | 1st                   | ``192.168.11.1``                |
101 |                  |                       | (manual assignment)             |
102 +------------------+-----------------------+---------------------------------+
103 | ``cfg01``        | 2nd                   | ``192.168.11.2``                |
104 +------------------+-----------------------+---------------------------------+
105 | ``mas01``        | 3rd                   | ``192.168.11.3``                |
106 +------------------+-----------------------+---------------------------------+
107 | ``prx01``,       | 4th,                  | ``192.168.11.4``,               |
108 | ``prx02``        | 5th                   | ``192.168.11.5``                |
109 +------------------+-----------------------+---------------------------------+
110 | ``gtw01``,       | ...                   | ``...``                         |
111 | ``gtw02``,       |                       |                                 |
112 | ``gtw03``        |                       |                                 |
113 +------------------+-----------------------+---------------------------------+
114 | ``kvm01``,       |                       |                                 |
115 | ``kvm02``,       |                       |                                 |
116 | ``kvm03``        |                       |                                 |
117 +------------------+-----------------------+---------------------------------+
118 | ``dbs01``,       |                       |                                 |
119 | ``dbs02``,       |                       |                                 |
120 | ``dbs03``        |                       |                                 |
121 +------------------+-----------------------+---------------------------------+
122 | ``msg01``,       |                       |                                 |
123 | ``msg02``,       |                       |                                 |
124 | ``msg03``        |                       |                                 |
125 +------------------+-----------------------+---------------------------------+
126 | ``mdb01``,       |                       |                                 |
127 | ``mdb02``,       |                       |                                 |
128 | ``mdb03``        |                       |                                 |
129 +------------------+-----------------------+---------------------------------+
130 | ``ctl01``,       |                       |                                 |
131 | ``ctl02``,       |                       |                                 |
132 | ``ctl03``        |                       |                                 |
133 +------------------+-----------------------+---------------------------------+
134 | ``odl01``,       |                       |                                 |
135 | ``odl02``,       |                       |                                 |
136 | ``odl03``        |                       |                                 |
137 +------------------+-----------------------+---------------------------------+
138 | ``mon01``,       |                       |                                 |
139 | ``mon02``,       |                       |                                 |
140 | ``mon03``,       |                       |                                 |
141 | ``log01``,       |                       |                                 |
142 | ``log02``,       |                       |                                 |
143 | ``log03``,       |                       |                                 |
144 | ``mtr01``,       |                       |                                 |
145 | ``mtr02``,       |                       |                                 |
146 | ``mtr03``        |                       |                                 |
147 +------------------+-----------------------+---------------------------------+
148 | ``cmp001``,      |                       |                                 |
149 | ``cmp002``,      |                       |                                 |
150 | ``...``          |                       |                                 |
151 +------------------+-----------------------+---------------------------------+
152
153 Network ``management``
154 ~~~~~~~~~~~~~~~~~~~~~~
155
156 .. TIP::
157
158     ``management`` often has an IP range offset defined in ``IDF``.
159
160 .. WARNING::
161
162     Default addresses in the table below correspond to a ``management`` IP range of
163     ``172.16.10.10-172.16.10.254`` (one of the commonly used values in OPNFV
164     labs). This is defined in ``IDF`` and can easily be changed to something
165     else. Since the ``jumpserver`` address is manually assigned, this is
166     usually not subject to the IP range restriction in ``IDF``.
167
168 +------------------+-----------------------+---------------------------------+
169 | Host             | Offset in IP range    | Default address                 |
170 +==================+=======================+=================================+
171 | ``jumpserver``   | N/A                   | ``172.16.10.1``                 |
172 |                  |                       | (manual assignment)             |
173 +------------------+-----------------------+---------------------------------+
174 | ``cfg01``        | 1st                   | ``172.16.10.2``                 |
175 |                  |                       | (IP range ignored for now)      |
176 +------------------+-----------------------+---------------------------------+
177 | ``mas01``        | 2nd                   | ``172.16.10.12``                |
178 +------------------+-----------------------+---------------------------------+
179 | ``prx``          | 3rd,                  | ``172.16.10.13``,               |
180 |                  |                       |                                 |
181 | ``prx01``,       | 4th,                  | ``172.16.10.14``,               |
182 | ``prx02``        | 5th                   | ``172.16.10.15``                |
183 +------------------+-----------------------+---------------------------------+
184 | ``gtw01``,       | ...                   | ``...``                         |
185 | ``gtw02``,       |                       |                                 |
186 | ``gtw03``        |                       |                                 |
187 +------------------+-----------------------+---------------------------------+
188 | ``kvm``,         |                       |                                 |
189 |                  |                       |                                 |
190 | ``kvm01``,       |                       |                                 |
191 | ``kvm02``,       |                       |                                 |
192 | ``kvm03``        |                       |                                 |
193 +------------------+-----------------------+---------------------------------+
194 | ``dbs``,         |                       |                                 |
195 |                  |                       |                                 |
196 | ``dbs01``,       |                       |                                 |
197 | ``dbs02``,       |                       |                                 |
198 | ``dbs03``        |                       |                                 |
199 +------------------+-----------------------+---------------------------------+
200 | ``msg``,         |                       |                                 |
201 |                  |                       |                                 |
202 | ``msg01``,       |                       |                                 |
203 | ``msg02``,       |                       |                                 |
204 | ``msg03``        |                       |                                 |
205 +------------------+-----------------------+---------------------------------+
206 | ``mdb``,         |                       |                                 |
207 |                  |                       |                                 |
208 | ``mdb01``,       |                       |                                 |
209 | ``mdb02``,       |                       |                                 |
210 | ``mdb03``        |                       |                                 |
211 +------------------+-----------------------+---------------------------------+
212 | ``ctl``,         |                       |                                 |
213 |                  |                       |                                 |
214 | ``ctl01``,       |                       |                                 |
215 | ``ctl02``,       |                       |                                 |
216 | ``ctl03``        |                       |                                 |
217 +------------------+-----------------------+---------------------------------+
218 | ``odl``,         |                       |                                 |
219 |                  |                       |                                 |
220 | ``odl01``,       |                       |                                 |
221 | ``odl02``,       |                       |                                 |
222 | ``odl03``        |                       |                                 |
223 +------------------+-----------------------+---------------------------------+
224 | ``mon``,         |                       |                                 |
225 |                  |                       |                                 |
226 | ``mon01``,       |                       |                                 |
227 | ``mon02``,       |                       |                                 |
228 | ``mon03``,       |                       |                                 |
229 |                  |                       |                                 |
230 | ``log``,         |                       |                                 |
231 |                  |                       |                                 |
232 | ``log01``,       |                       |                                 |
233 | ``log02``,       |                       |                                 |
234 | ``log03``,       |                       |                                 |
235 |                  |                       |                                 |
236 | ``mtr``,         |                       |                                 |
237 |                  |                       |                                 |
238 | ``mtr01``,       |                       |                                 |
239 | ``mtr02``,       |                       |                                 |
240 | ``mtr03``        |                       |                                 |
241 +------------------+-----------------------+---------------------------------+
242 | ``cmp001``,      |                       |                                 |
243 | ``cmp002``,      |                       |                                 |
244 | ``...``          |                       |                                 |
245 +------------------+-----------------------+---------------------------------+
246
247 Network ``internal``
248 ~~~~~~~~~~~~~~~~~~~~
249
250 .. TIP::
251
252     ``internal`` does not usually use an IP range offset in ``IDF``.
253
254 .. WARNING::
255
256     Default addresses in the table below correspond to an ``internal`` CIDR of
257     ``10.1.0.0/24`` (the usual value used in OPNFV labs).
258     This is defined in ``IDF`` and can easily be changed to something else.
259
260 +------------------+------------------------+--------------------------------+
261 | Host             | Offset in IP range     | Default address                |
262 +==================+========================+================================+
263 | ``jumpserver``   | N/A                    | ``10.1.0.1``                   |
264 |                  |                        | (manual assignment, optional)  |
265 +------------------+------------------------+--------------------------------+
266 | ``gtw01``,       | 1st,                   | ``10.1.0.2``,                  |
267 | ``gtw02``,       | 2nd,                   | ``10.1.0.3``,                  |
268 | ``gtw03``        | 3rd                    | ``10.1.0.4``                   |
269 +------------------+------------------------+--------------------------------+
270 | ``cmp001``,      | 4th,                   | ``10.1.0.5``,                  |
271 | ``cmp002``,      | 5th,                   | ``10.1.0.6``,                  |
272 | ``...``          | ...                    | ``...``                        |
273 +------------------+------------------------+--------------------------------+
274
275 Network ``public``
276 ~~~~~~~~~~~~~~~~~~
277
278 .. TIP::
279
280     ``public`` often has an IP range offset defined in ``IDF``.
281
282 .. WARNING::
283
284     Default addresses in the table below correspond to a ``public`` IP range of
285     ``172.30.10.100-172.30.10.254`` (one of the values commonly used in OPNFV
286     labs). This is defined in ``IDF`` and can easily be changed to something
287     else. Since the ``jumpserver`` address is manually assigned, this is
288     usually not subject to the IP range restriction in ``IDF``.
289
290 +------------------+------------------------+--------------------------------+
291 | Host             | Offset in IP range     | Default address                |
292 +==================+========================+================================+
293 | ``jumpserver``   | N/A                    | ``172.30.10.72``               |
294 |                  |                        | (manual assignment, optional)  |
295 +------------------+------------------------+--------------------------------+
296 | ``prx``,         | 1st,                   | ``172.30.10.101``,             |
297 |                  |                        |                                |
298 | ``prx01``,       | 2nd,                   | ``172.30.10.102``,             |
299 | ``prx02``        | 3rd                    | ``172.30.10.103``              |
300 +------------------+------------------------+--------------------------------+
301 | ``gtw01``,       | 4th,                   | ``172.30.10.104``,             |
302 | ``gtw02``,       | 5th,                   | ``172.30.10.105``,             |
303 | ``gtw03``        | 6th                    | ``172.30.10.106``              |
304 +------------------+------------------------+--------------------------------+
305 | ``ctl01``,       | ...                    | ``...``                        |
306 | ``ctl02``,       |                        |                                |
307 | ``ctl03``        |                        |                                |
308 +------------------+------------------------+--------------------------------+
309 | ``odl``,         |                        |                                |
310 +------------------+------------------------+--------------------------------+
311 | ``cmp001``,      |                        |                                |
312 | ``cmp002``,      |                        |                                |
313 | ``...``          |                        |                                |
314 +------------------+------------------------+--------------------------------+
315
316 Accessing the Salt Master Node (``cfg01``)
317 ==========================================
318
319 The Salt Master node (``cfg01``) runs an ``sshd`` server listening on
320 ``0.0.0.0:22``.
321
322 To log in as the ``ubuntu`` user, use the RSA private key ``/var/lib/opnfv/mcp.rsa``:
323
324 .. code-block:: console
325
326     jenkins@jumpserver:~$ ssh -o StrictHostKeyChecking=no \
327                               -i /var/lib/opnfv/mcp.rsa \
328                               -l ubuntu 10.20.0.2
329     ubuntu@cfg01:~$
330
331 .. NOTE::
332
333     User ``ubuntu`` has sudo rights.
334
335 .. TIP::
336
337     The Salt master IP (``10.20.0.2``) is not hardcoded; it is configurable via
338     the ``INSTALLER_IP`` environment variable during deployment.
339
340 .. TIP::
341
342     Starting with the ``Gambia`` release, ``cfg01`` is containerized, so this
343     also works (from ``jumpserver`` only):
344
345 .. code-block:: console
346
347     jenkins@jumpserver:~$ docker exec -it fuel bash
348     root@cfg01:~$
349
350 Accessing Cluster Nodes
351 =======================
352
353 Logging in to cluster nodes is possible from the Jumpserver, the Salt Master node, etc.
354
355 .. code-block:: console
356
357     jenkins@jumpserver:~$ ssh -i /var/lib/opnfv/mcp.rsa ubuntu@192.168.11.52
358
359 .. TIP::
360
361     ``/etc/hosts`` on ``cfg01`` has all the cluster hostnames, which can be
362     used instead of IP addresses.
363
364 .. code-block:: console
365
366     root@cfg01:~$ ssh -i ~/fuel/mcp/scripts/mcp.rsa ubuntu@ctl01
367
368 Debugging ``MaaS`` Commissioning/Deployment Issues
369 ==================================================
370
371 One of the most common issues when setting up a new POD is ``MaaS`` failing to
372 commission/deploy the nodes, usually timing out after a couple of retries.
373
374 Such failures might indicate a misconfiguration in the ``PDF``/``IDF``, an
375 incorrect ``TOR`` switch configuration, or even faulty hardware.
376
377 Here are a couple of pointers for isolating the problem.
378
379 Accessing the ``MaaS`` Dashboard
380 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
381
382 The ``MaaS`` web-based dashboard is available at
383 ``http://<mas01 IP address>:5240/MAAS``, e.g.
384 ``http://172.16.10.12:5240/MAAS``.
385
386 The administrator credentials are ``opnfv``/``opnfv_secret``.
387
388 .. NOTE::
389
390     The ``mas01`` VM does not automatically get assigned an IP address in the
391     public network segment. If the ``MaaS`` dashboard should be accessible from
392     the public network, such an address can be added manually to the last
393     NIC of the ``mas01`` VM (which is already connected to the public
394     network bridge).
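
A minimal sketch of adding such an address from the ``jumpserver`` (the
interface name ``ens6`` and the address below are examples only and should be
adjusted to the actual NIC and the lab's public network):

.. code-block:: console

    jenkins@jumpserver:~$ ssh -i /var/lib/opnfv/mcp.rsa ubuntu@192.168.11.3
    ubuntu@mas01:~$ sudo ip addr add 172.30.10.92/24 dev ens6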
395
396 Ensure Commission/Deploy Timeouts Are Not Too Small
397 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
398
399 Some hardware takes longer to boot or to run the initial scripts during the
400 commissioning/deployment phases. If that's the case, ``MaaS`` will time out
401 waiting for the process to finish. ``MaaS`` logs will reflect that, and the
402 issue is usually easy to observe on the nodes' serial console: if the node
403 PXE-boots the OS live image and starts executing cloud-init/curtin
404 hooks without reporting critical errors, but is then powered down/shut off,
405 most likely the timeout was hit.
406
407 To access the serial console of a node, see your board manufacturer's
408 documentation. On some modern hardware the physical serial connector has been
409 replaced by a vendor-specific software-based interface.
410
411 If the board supports ``SOL`` (Serial Over LAN) over ``IPMI`` lanplus protocol,
412 a simpler solution to hook to the serial console is to use ``ipmitool``.
413
414 .. TIP::
415
416     Early boot stage output might not be shown over ``SOL``, but only over
417     the video console provided by the (vendor-specific) interface.
418
419 .. code-block:: console
420
421     jenkins@jumpserver:~$ ipmitool -H <host BMC IP> -U <user> -P <pass> \
422                                    -I lanplus sol activate
423
424 To work around such timeouts, simply set a larger timeout in the ``IDF``.
425
426 Check Jumpserver Network Configuration
427 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
428
429 .. code-block:: console
430
431     jenkins@jumpserver:~$ brctl show
432     jenkins@jumpserver:~$ ifconfig -a
433
434 +-----------------------+------------------------------------------------+
435 | Configuration item    | Expected behavior                              |
436 +=======================+================================================+
437 | IP addresses assigned | IP addresses should be assigned to the bridge  |
438 | to bridge ports       | itself, not to individual bridge ports         |
439 +-----------------------+------------------------------------------------+
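
For example, a correctly configured ``PXE/admin`` bridge should show its IP
address on the bridge device itself (a sketch; the bridge name ``admin_br``
and the address are lab-specific):

.. code-block:: console

    jenkins@jumpserver:~$ ip addr show admin_br | grep 'inet '
        inet 192.168.11.1/24 brd 192.168.11.255 scope global admin_br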
440
441 Check Network Connectivity Between Nodes on the Jumpserver
442 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
443
444 ``cfg01`` is a Docker container running on the ``jumpserver``, connected to
445 Docker networks (created by docker-compose automatically on container up),
446 which in turn are connected using veth pairs to their ``libvirt`` managed
447 counterparts.
448
449 For example, the ``mcpcontrol`` network(s) should look like the output below.
450
451 .. code-block:: console
452
453     jenkins@jumpserver:~$ brctl show mcpcontrol
454     bridge name   bridge id           STP enabled   interfaces
455     mcpcontrol    8000.525400064f77   yes           mcpcontrol-nic
456                                                     veth_mcp0
457                                                     vnet8
458
459     jenkins@jumpserver:~$ docker network ls
460     NETWORK ID    NAME                              DRIVER   SCOPE
461     81a0fdb3bd78  docker-compose_docker-mcpcontrol  macvlan  local
462     [...]
463
464     jenkins@jumpserver:~$ docker network inspect docker-compose_mcpcontrol
465     [
466         {
467             "Name": "docker-compose_mcpcontrol",
468             [...]
469             "Options": {
470                 "parent": "veth_mcp1"
471             },
472         }
473     ]
474
475 Before investigating the rest of the cluster networking configuration, the
476 first thing to check is that ``cfg01`` has network connectivity to other
477 jumpserver-hosted nodes, e.g. ``mas01``, and to the jumpserver itself
478 (provided that the jumpserver has an IP address in that particular network
479 segment).
480
481 .. code-block:: console
482
483     jenkins@jumpserver:~$ docker exec -it fuel bash
484     root@cfg01:~# ifconfig -a | grep inet
485         inet addr:10.20.0.2     Bcast:0.0.0.0  Mask:255.255.255.0
486         inet addr:172.16.10.2   Bcast:0.0.0.0  Mask:255.255.255.0
487         inet addr:192.168.11.2  Bcast:0.0.0.0  Mask:255.255.255.0
488
489 For each network of interest (``mcpcontrol``, ``mgmt``, ``PXE/admin``), check
490 that ``cfg01`` can ping the jumpserver IP in that network segment, as well as
491 the ``mas01`` IP in that network.
492
493 .. NOTE::
494
495     ``mcpcontrol`` is set up at VM bringup, so it should always be available,
496     while the other networks are configured by Salt as part of the
497     ``virtual_init`` STATE file.
498
499 .. code-block:: console
500
501     root@cfg01:~# ping -c1 10.20.0.1  # mcpcontrol jumpserver IP
502     root@cfg01:~# ping -c1 10.20.0.3  # mcpcontrol mas01 IP
503
504 .. TIP::
505
506     The ``mcpcontrol`` CIDR is configurable via the ``INSTALLER_IP`` env var
507     during deployment. However, IP offsets inside that segment are fixed:
508     ``.1`` for the jumpserver, ``.2`` for ``cfg01`` and ``.3`` for the
509     ``mas01`` node.
510
511 .. code-block:: console
512
513     root@cfg01:~# salt 'mas*' pillar.item --out yaml \
514                   _param:infra_maas_node01_deploy_address \
515                   _param:infra_maas_node01_address
516     mas01.mcp-ovs-noha.local:
517       _param:infra_maas_node01_address: 172.16.10.12
518       _param:infra_maas_node01_deploy_address: 192.168.11.3
519
520     root@cfg01:~# ping -c1 192.168.11.1  # PXE/admin jumpserver IP
521     root@cfg01:~# ping -c1 192.168.11.3  # PXE/admin mas01 IP
522     root@cfg01:~# ping -c1 172.16.10.1   # mgmt jumpserver IP
523     root@cfg01:~# ping -c1 172.16.10.12  # mgmt mas01 IP
524
525 .. TIP::
526
527     Jumpserver IP addresses for the ``PXE/admin``, ``mgmt`` and ``public``
528     bridges are user-chosen and manually set, so the snippets above should be
529     adjusted accordingly if an IP other than ``.1`` was chosen in each
530     CIDR.
531
532 Alternatively, a quick ``nmap`` scan would work just as well.
533
534 .. code-block:: console
535
536     root@cfg01:~# apt update && apt install -y nmap
537     root@cfg01:~# nmap -sn 10.20.0.0/24     # expected: cfg01, mas01, jumpserver
538     root@cfg01:~# nmap -sn 192.168.11.0/24  # expected: cfg01, mas01, jumpserver
539     root@cfg01:~# nmap -sn 172.16.10.0/24   # expected: cfg01, mas01, jumpserver
540
541 Check ``DHCP`` Reaches Cluster Nodes
542 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
543
544 One common symptom observed during failed commissioning is that ``DHCP`` does
545 not work as expected between cluster nodes (baremetal nodes in the cluster; or
546 virtual machines on the jumpserver in case of ``hybrid`` deployments) and
547 the ``MaaS`` node.
548
549 To confirm or rule out this possibility, monitor the serial console output of
550 one (or more) cluster nodes during ``MaaS`` commissioning. If the node is
551 properly configured to attempt PXE boot, yet it times out waiting for an IP
552 address from ``mas01`` ``DHCP``, it's worth checking that ``DHCP`` packets
553 reach the ``jumpserver``, respectively the ``mas01`` VM.
554
555 .. code-block:: console
556
557     jenkins@jumpserver:~$ sudo apt update && sudo apt install -y dhcpdump
558     jenkins@jumpserver:~$ sudo dhcpdump -i admin_br
559
560 .. TIP::
561
562     If ``DHCP`` requests are present, but no replies are sent, ``iptables``
563     might be interfering on the jumpserver.
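
A quick way to check for such interference is to inspect the ``iptables``
rules and their packet counters on the jumpserver (a sketch):

.. code-block:: console

    jenkins@jumpserver:~$ sudo iptables -L FORWARD -n -v
    jenkins@jumpserver:~$ sudo iptables -t nat -L -n -v | grep -e ':67' -e ':68'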
564
565 Check ``MaaS`` Logs
566 ~~~~~~~~~~~~~~~~~~~
567
568 If networking looks fine, yet nodes still fail to commission and/or deploy,
569 ``MaaS`` logs might offer more details about the failure:
570
571 * ``/var/log/maas/maas.log``
572 * ``/var/log/maas/rackd.log``
573 * ``/var/log/maas/regiond.log``
574
575 .. TIP::
576
577     If the problem is with the cluster node and not on the ``MaaS`` server,
578     the node's kernel logs usually contain useful information.
579     These are saved via rsyslog on the ``mas01`` node in
580     ``/var/log/maas/rsyslog``.
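
These logs can also be inspected remotely from ``cfg01``, e.g. via Salt
(a sketch):

.. code-block:: console

    root@cfg01:~# salt 'mas*' cmd.run 'tail -n 100 /var/log/maas/maas.log'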
581
582 Recovering Failed Deployments
583 =============================
584
585 The first deploy attempt might fail for various reasons. If the problem
586 is not systemic (i.e. fixing it will not introduce incompatible configuration
587 changes, like setting a different ``INSTALLER_IP``), the environment can
588 safely be reused and the deployment process can pick up from where it left off.
589
590 Leveraging these mechanisms requires a minimum understanding of how the
591 deploy process works, at least for manual ``STATE`` runs.
592
593 Automatic (re)deploy
594 ~~~~~~~~~~~~~~~~~~~~
595
596 OPNFV Fuel's ``deploy.sh`` script offers a dedicated argument for this, ``-f``,
597 which will skip executing the first ``N`` ``STATE`` files, where ``N`` is the
598 number of ``-f`` occurrences in the argument list.
599
600 .. TIP::
601
602     The list of ``STATE`` files to be executed for a specific environment
603     depends on the OPNFV scenario chosen, deployment type (``virtual``,
604     ``baremetal`` or ``hybrid``) and the presence/absence of a ``VCP``
605     (virtualized control plane).
606
607 e.g.: Let's consider a ``baremetal`` environment, with ``VCP`` and a simple
608 scenario ``os-nosdn-nofeature-ha``, where ``deploy.sh`` failed while executing
609 the ``openstack_ha`` ``STATE`` file.
610
611 The simplest redeploy approach (which usually works for **any** combination of
612 deployment type/VCP/scenario) is to issue the same deploy command as the
613 original attempt used, then add a single ``-f``:
614
615 .. code-block:: console
616
617     jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
618                                             -s <scenario> [...] \
619                                             -f # skips running the virtual_init STATE file
620
621 All ``STATE`` files are re-entrant, so the above is equivalent (but a little
622 slower) to skipping all ``STATE`` files before the ``openstack_ha`` one, like:
623
624 .. code-block:: console
625
626     jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
627                                             -s <scenario> [...] \
628                                             -ffff # skips virtual_init, maas, baremetal_init, virtual_control_plane
629
630 .. TIP::
631
632     For fine tuning the infrastructure setup steps executed during deployment,
633     see also the ``-e`` and ``-P`` deploy arguments.
634
635 .. NOTE::
636
637     On rare occasions, the cluster cannot be idempotently redeployed (e.g.
638     broken MySQL/Galera cluster), in which case some cleanup is due before
639     (re)running the ``STATE`` files. See the ``-E`` deploy argument: used once
640     (``-E``), it only erases the ``VCP`` VMs; used twice (``-EE``), it forces
641     a ``MaaS`` node deletion, followed by a redeployment of all
642     baremetal nodes.
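
For example, to erase the ``VCP`` VMs and force ``MaaS`` to delete and redeploy
all baremetal nodes before re-running the ``STATE`` files (a sketch):

.. code-block:: console

    jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
                                            -s <scenario> [...] -EE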
643
644 Manual ``STATE`` Run
645 ~~~~~~~~~~~~~~~~~~~~
646
647 Instead of leveraging the full ``deploy.sh``, one could execute the ``STATE``
648 files one by one (or partially) from the ``cfg01`` node.
649
650 However, this requires a better understanding of how the list of ``STATE``
651 files to be executed is constructed for a specific scenario, depending on the
652 deployment type and on whether the cluster has baremetal nodes, implemented in:
653
654 * ``mcp/config/scenario/defaults.yaml.j2``
655 * ``mcp/config/scenario/<scenario-name>.yaml``
656
657 e.g.: For the example presented above (baremetal with ``VCP``,
658 ``os-nosdn-nofeature-ha``), the list of ``STATE`` files would be:
659
660 * ``virtual_init``
661 * ``maas``
662 * ``baremetal_init``
663 * ``virtual_control_plane``
664 * ``openstack_ha``
665 * ``networks``
666
667 To execute one (or more) of the remaining ``STATE`` files after a failure:
668
669 .. code-block:: console
670
671     jenkins@jumpserver:~$ docker exec -it fuel bash
672     root@cfg01:~$ cd ~/fuel/mcp/config/states
673     root@cfg01:~/fuel/mcp/config/states$ ./openstack_ha
674     root@cfg01:~/fuel/mcp/config/states$ CI_DEBUG=true ./networks
675
676 For even finer granularity, one can also run the commands in a ``STATE`` file
677 one by one manually, e.g. if the execution failed while applying the ``rabbitmq``
678 sls:
679
680 .. code-block:: console
681
682     root@cfg01:~$ salt -I 'rabbitmq:server' state.sls rabbitmq
683
684 Exploring the Cloud with Salt
685 =============================
686
687 To gather information about the cloud, Salt commands can be used.
688 Salt is based on a master-minion model, where the Salt master pushes
689 configuration to the minions, which execute the requested actions.
690
691 For example, tell Salt to execute a ping to ``8.8.8.8`` on all the nodes:
692
693 .. code-block:: console
694
695     root@cfg01:~$ salt "*" network.ping 8.8.8.8
696                        ^^^                       target
697                            ^^^^^^^^^^^^          function to execute
698                                         ^^^^^^^  argument passed to the function
699
700 .. TIP::
701
702     Complex target filters can be used, such as compound queries or node roles.
703
704 For more information about Salt see the :ref:`fuel_userguide_references`
705 section.
706
707 Some examples are listed below. Note that these commands are issued from the
708 Salt master as the ``root`` user.
709
710 View the IPs of All the Components
711 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
712
713 .. code-block:: console
714
715     root@cfg01:~$ salt "*" network.ip_addrs
716     cfg01.mcp-odl-ha.local:
717        - 10.20.0.2
718        - 172.16.10.100
719     mas01.mcp-odl-ha.local:
720        - 10.20.0.3
721        - 172.16.10.3
722        - 192.168.11.3
723     .........................
724
725 View the Interfaces of All the Components and Put the Output in a ``yaml`` File
726 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
727
728 .. code-block:: console
729
730     root@cfg01:~$ salt "*" network.interfaces --out yaml --output-file interfaces.yaml
731     root@cfg01:~# cat interfaces.yaml
732     cfg01.mcp-odl-ha.local:
733      enp1s0:
734        hwaddr: 52:54:00:72:77:12
735        inet:
736        - address: 10.20.0.2
737          broadcast: 10.20.0.255
738          label: enp1s0
739          netmask: 255.255.255.0
740        inet6:
741        - address: fe80::5054:ff:fe72:7712
742          prefixlen: '64'
743          scope: link
744        up: true
745     .........................
746
747 View Installed Packages on MaaS Node
748 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
749
750 .. code-block:: console
751
752     root@cfg01:~# salt "mas*" pkg.list_pkgs
753     mas01.mcp-odl-ha.local:
754         ----------
755         accountsservice:
756             0.6.40-2ubuntu11.3
757         acl:
758             2.2.52-3
759         acpid:
760             1:2.0.26-1ubuntu2
761         adduser:
762             3.113+nmu3ubuntu4
763         anerd:
764             1
765     .........................
766
767 Execute Any Linux Command on All Nodes (e.g. ``ls /var/log``)
768 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
769
770 .. code-block:: console
771
772     root@cfg01:~# salt "*" cmd.run 'ls /var/log'
773     cfg01.mcp-odl-ha.local:
774        alternatives.log
775        apt
776        auth.log
777        boot.log
778        btmp
779        cloud-init-output.log
780        cloud-init.log
781     .........................
782
783 Execute Any Linux Command on Nodes Using Compound Queries Filter
784 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
785
786 .. code-block:: console
787
788     root@cfg01:~# salt -C '* and cfg01*' cmd.run 'ls /var/log'
789     cfg01.mcp-odl-ha.local:
790        alternatives.log
791        apt
792        auth.log
793        boot.log
794        btmp
795        cloud-init-output.log
796        cloud-init.log
797     .........................
798
799 Execute Any Linux Command on Nodes Using Role Filter
800 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
801
802 .. code-block:: console
803
804     root@cfg01:~# salt -I 'nova:compute' cmd.run 'ls /var/log'
805     cmp001.mcp-odl-ha.local:
806        alternatives.log
807        apache2
808        apt
809        auth.log
810        btmp
811        ceilometer
812        cinder
813        cloud-init-output.log
814        cloud-init.log
815     .........................
816
817 Accessing OpenStack
818 ===================
819 
820 Once the deployment is complete, the OpenStack CLI is accessible from the
821 controller VMs (``ctl01`` ... ``ctl03``).
822 
823 OpenStack credentials are at ``/root/keystonercv3``.
824
825 .. code-block:: console
826
827     root@ctl01:~# source keystonercv3
828     root@ctl01:~# openstack image list
829     +--------------------------------------+-----------------------------------------------+--------+
830     | ID                                   | Name                                          | Status |
831     +======================================+===============================================+========+
832     | 152930bf-5fd5-49c2-b3a1-cae14973f35f | CirrosImage                                   | active |
833     | 7b99a779-78e4-45f3-9905-64ae453e3dcb | Ubuntu16.04                                   | active |
834     +--------------------------------------+-----------------------------------------------+--------+
835
836 The OpenStack Dashboard, Horizon, is available at ``http://<proxy public VIP>``.
837 The administrator credentials are ``admin``/``opnfv_secret``.
838
839 .. figure:: img/horizon_login.png
840     :width: 60%
841     :align: center
842
843 A full list of IPs/services is available at ``<proxy public VIP>:8090`` for
844 ``baremetal`` deploys.
845
846 .. figure:: img/salt_services_ip.png
847     :width: 60%
848     :align: center
849
850 Guest Operating System Support
851 ==============================
852
853 There are a number of possibilities regarding the guest operating systems
854 which can be spawned on the nodes.
855 The current system spawns the ``VCP`` virtual machines on the KVM nodes and the
856 VMs requested by users on the OpenStack compute nodes. Currently, the system
857 supports the following ``UEFI`` images for the guests:
858
859 +------------------+-------------------+--------------------+
860 | OS name          | ``x86_64`` status | ``aarch64`` status |
861 +==================+===================+====================+
862 | Ubuntu 17.10     | untested          | Full support       |
863 +------------------+-------------------+--------------------+
864 | Ubuntu 16.04     | Full support      | Full support       |
865 +------------------+-------------------+--------------------+
866 | Ubuntu 14.04     | untested          | Full support       |
867 +------------------+-------------------+--------------------+
868 | Fedora atomic 27 | untested          | Full support       |
869 +------------------+-------------------+--------------------+
870 | Fedora cloud 27  | untested          | Full support       |
871 +------------------+-------------------+--------------------+
872 | Debian           | untested          | Full support       |
873 +------------------+-------------------+--------------------+
874 | Centos 7         | untested          | Not supported      |
875 +------------------+-------------------+--------------------+
876 | Cirros 0.3.5     | Full support      | Full support       |
877 +------------------+-------------------+--------------------+
878 | Cirros 0.4.0     | Full support      | Full support       |
879 +------------------+-------------------+--------------------+
880
881 The above table covers only ``UEFI`` images and implies ``OVMF``/``AAVMF``
882 firmware on the host. An ``x86_64`` deployment also supports ``non-UEFI``
883 images; however, that choice depends on the underlying hardware and is up
884 to the administrator to make.
885
886 The images for the above operating systems can be found on their respective
887 websites.
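
As a sketch, a downloaded ``UEFI`` cloud image could be registered in Glance as
follows (the file and image names are examples only):

.. code-block:: console

    root@ctl01:~# openstack image create --disk-format qcow2 \
                                         --container-format bare \
                                         --file ./xenial-server-cloudimg-amd64-uefi1.img \
                                         Ubuntu16.04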
888
889 OpenStack Storage
890 =================
891
892 OpenStack Cinder is the project behind block storage in OpenStack, and OPNFV
893 Fuel supports LVM out of the box.
894
895 By default ``x86_64`` supports 2 additional block storage devices, while
896 ``aarch64`` supports only one.
897
898 More devices can be supported if the OS image used has additional
899 properties allowing block storage devices to be spawned as ``SCSI`` drives.
900 To do this, add the properties below to the Glance image:
901
902 .. code-block:: console
903
904     root@ctl01:~$ openstack image set --property hw_disk_bus='scsi' \
905                                       --property hw_scsi_model='virtio-scsi' \
906                                       <image>
907
908 The choice of which bus to use for the storage drives is an important
909 one. ``virtio-blk`` is the default choice for OPNFV Fuel, which attaches the
910 drives as ``/dev/vdX``. However, since we want to be able to attach a
911 larger number of volumes to the virtual machines, we recommend switching to
912 ``SCSI`` drives, which are attached as ``/dev/sdX`` instead.
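
As a short sketch (volume and server names are examples only), a volume created
and attached after setting the image properties above shows up as ``/dev/sdX``
inside the guest:

.. code-block:: console

    root@ctl01:~# openstack volume create --size 10 test_volume
    root@ctl01:~# openstack server add volume test_vm test_volume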
913
914 ``virtio-scsi`` is a little worse in terms of performance, but the ability to
915 add a larger number of drives, combined with added features like ZFS, Ceph et
916 al., leads us to suggest the use of ``virtio-scsi`` in OPNFV Fuel for both
917 architectures.
918
919 More details regarding the differences and performance of ``virtio-blk`` vs
920 ``virtio-scsi`` are beyond the scope of this manual but can be easily found
921 in other sources online like `VirtIO SCSI`_ or `VirtIO performance`_.
922
923 Additional configuration options for images in OpenStack can be found in
924 the OpenStack Glance documentation.
925
926 OpenStack Endpoints
927 ===================
928
929 For each OpenStack service three endpoints are created: ``admin``, ``internal``
930 and ``public``.
931
932 .. code-block:: console
933
934     ubuntu@ctl01:~$ openstack endpoint list --service keystone
935     +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
936     | ID                               | Region    | Service Name | Service Type | Enabled | Interface | URL                          |
937     +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
938     | 008fec57922b4e9e8bf02c770039ae77 | RegionOne | keystone     | identity     | True    | internal  | http://172.16.10.26:5000/v3  |
939     | 1a1f3c3340484bda9ef7e193f50599e6 | RegionOne | keystone     | identity     | True    | admin     | http://172.16.10.26:35357/v3 |
940     | b0a47d42d0b6491b995d7e6230395de8 | RegionOne | keystone     | identity     | True    | public    | https://10.0.15.2:5000/v3    |
941     +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
942
943 MCP sets up all OpenStack services to talk to each other over unencrypted
944 connections on the internal management network. All admin/internal endpoints
945 use plain http, while the public endpoints are https connections terminated
946 via nginx at the ``VCP`` proxy VMs.
947
948 To access the public endpoints, an SSL certificate has to be provided. For
949 convenience, the installation script will copy the required certificate
950 to the ``cfg01`` node at ``/etc/ssl/certs/os_cacert``.
951
952 Copy the certificate from the ``cfg01`` node to the client that will access
953 the https endpoints and place it under ``/etc/ssl/certs/``.
954 The SSL connection will then be established automatically.
955
956 .. code-block:: console
957
958     jenkins@jumpserver:~$ ssh -o StrictHostKeyChecking=no -i /var/lib/opnfv/mcp.rsa -l ubuntu 10.20.0.2 \
959       "cat /etc/ssl/certs/os_cacert" | sudo tee /etc/ssl/certs/os_cacert
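
To verify the setup, the public Keystone endpoint from the example above can be
queried over https (a sketch; replace the address with the actual public VIP):

.. code-block:: console

    jenkins@jumpserver:~$ curl --cacert /etc/ssl/certs/os_cacert https://10.0.15.2:5000/v3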
960
961 Reclass Model Viewer Tutorial
962 =============================
963
964 In order to get a better understanding of the ``reclass`` model Fuel uses, the
965 `reclass-doc`_ tool can be used to visualise it.
966
967 To avoid installing packages on the ``jumpserver`` or another host, the
968 ``cfg01`` Docker container can be used. Since the ``fuel`` git repository
969 located on the ``jumpserver`` is already mounted inside the ``cfg01`` container,
970 the results can be visualized using a web browser on the ``jumpserver`` at the
971 end of the procedure.
972
973 .. code-block:: console
974
975     jenkins@jumpserver:~$ docker exec -it fuel bash
976     root@cfg01:~$ apt-get update
977     root@cfg01:~$ apt-get install -y npm nodejs
978     root@cfg01:~$ npm install -g reclass-doc
979     root@cfg01:~$ ln -s /usr/bin/nodejs /usr/bin/node
980     root@cfg01:~$ reclass-doc --output ~/fuel/mcp/reclass/modeler \
981                                        ~/fuel/mcp/reclass
982
983 The generated documentation should be available on the ``jumpserver`` inside
984 the ``fuel`` git repo, at subpath ``mcp/reclass/modeler/index.html``.
985
986 .. figure:: img/reclass_doc.png
987     :width: 60%
988     :align: center
989
990 .. _fuel_userguide_references:
991
992 References
993 ==========
994
995 #. :ref:`OPNFV Fuel Installation Instruction <fuel-installation>`
996 #. `Saltstack Documentation`_
997 #. `Saltstack Formulas`_
998 #. `VirtIO performance`_
999 #. `VirtIO SCSI`_
1000
1001 .. _`Saltstack Documentation`: https://docs.saltstack.com/en/latest/topics/
1002 .. _`Saltstack Formulas`: https://salt-formulas.readthedocs.io/en/latest/
1003 .. _`VirtIO performance`: https://mpolednik.github.io/2017/01/23/virtio-blk-vs-virtio-scsi/
1004 .. _`VirtIO SCSI`: https://www.ovirt.org/develop/release-management/features/storage/virtio-scsi/
1005 .. _`reclass-doc`: https://github.com/jirihybek/reclass-doc