.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) Open Platform for NFV Project, Inc. and its contributors

*********************
OPNFV Fuel User Guide
*********************

Abstract
========

This document contains details about using the OPNFV Fuel ``Gambia`` release
after it was deployed. For details on how to deploy OpenStack, check
the installation instructions in the :ref:`fuel_userguide_references` section.

This is a unified documentation for both ``x86_64`` and ``aarch64``
architectures. All information is common for both architectures
except when explicitly stated.

Network Overview
================

Fuel uses several networks to deploy and administer the cloud:

+------------------+----------------------------------------------------------+
| Network name     | Description                                              |
|                  |                                                          |
+==================+==========================================================+
| **PXE/admin**    | Used for booting the nodes via PXE and/or as the Salt    |
|                  | control network                                          |
+------------------+----------------------------------------------------------+
| **mcpcontrol**   | Used to provision the infrastructure hosts (Salt & MaaS) |
+------------------+----------------------------------------------------------+
| **management**   | Used for internal communication between                  |
|                  | OpenStack components                                     |
+------------------+----------------------------------------------------------+
| **internal**     | Used for VM data communication within the                |
|                  | cloud deployment                                         |
+------------------+----------------------------------------------------------+
| **public**       | Used to provide Virtual IPs for public endpoints         |
|                  | that are used to connect to OpenStack services APIs.     |
|                  | Used by virtual machines to access the Internet          |
+------------------+----------------------------------------------------------+

These networks - except ``mcpcontrol`` - can be Linux bridges configured
on the Jumpserver before the deployment.
If they do not exist at deploy time, they will be created by the scripts as
``libvirt`` managed networks.

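For example, a ``PXE/admin`` bridge could be created manually on the
jumpserver before deployment. The bridge name (``admin_br``) and its address
below are illustrative, lab-specific choices, not requirements:

.. code-block:: console

    jenkins@jumpserver:~$ sudo brctl addbr admin_br
    jenkins@jumpserver:~$ sudo ip addr add 192.168.11.1/24 dev admin_br
    jenkins@jumpserver:~$ sudo ip link set admin_br up
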
Network ``mcpcontrol``
~~~~~~~~~~~~~~~~~~~~~~

``mcpcontrol`` is a virtual network, managed by libvirt. Its only purpose is to
provide a simple method of assigning an arbitrary ``INSTALLER_IP`` to the Salt
master node (``cfg01``), to maintain backwards compatibility with old OPNFV
Fuel behavior. Normally, end-users only need to change the ``INSTALLER_IP`` if
the default CIDR (``10.20.0.0/24``) overlaps with existing lab networks.

``mcpcontrol`` has both NAT and DHCP enabled, so the Salt master (``cfg01``)
and the MaaS VM (``mas01``, when present) get assigned predefined IPs (``.2``
and ``.3``, respectively), while the jumpserver bridge port gets ``.1``.

+------------------+---------------------------+-----------------------------+
| Host             | Offset in IP range        | Default address             |
+==================+===========================+=============================+
| ``jumpserver``   | 1st                       | ``10.20.0.1``               |
+------------------+---------------------------+-----------------------------+
| ``cfg01``        | 2nd                       | ``10.20.0.2``               |
+------------------+---------------------------+-----------------------------+
| ``mas01``        | 3rd                       | ``10.20.0.3``               |
+------------------+---------------------------+-----------------------------+

This network is limited to the ``jumpserver`` host and does not require any
manual setup.

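Since ``mcpcontrol`` is a ``libvirt`` managed network, it can be inspected
with standard ``libvirt`` tooling on the jumpserver, e.g.:

.. code-block:: console

    jenkins@jumpserver:~$ virsh net-list --all
    jenkins@jumpserver:~$ virsh net-dumpxml mcpcontrol
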
Network ``PXE/admin``
~~~~~~~~~~~~~~~~~~~~~

.. TIP::

    ``PXE/admin`` does not usually use an IP range offset in ``IDF``.

.. NOTE::

    During the ``MaaS`` commissioning phase, IP addresses are handed out by
    ``MaaS``'s DHCP.

.. NOTE::

    Default addresses in the table below correspond to a ``PXE/admin`` CIDR of
    ``192.168.11.0/24`` (the usual value used in OPNFV labs).

    This is defined in ``IDF`` and can easily be changed to something else.

.. TODO: detail MaaS DHCP range start/end

+------------------+-----------------------+---------------------------------+
| Host             | Offset in IP range    | Default address                 |
+==================+=======================+=================================+
| ``jumpserver``   | 1st                   | ``192.168.11.1``                |
|                  |                       | (manual assignment)             |
+------------------+-----------------------+---------------------------------+
| ``cfg01``        | 2nd                   | ``192.168.11.2``                |
+------------------+-----------------------+---------------------------------+
| ``mas01``        | 3rd                   | ``192.168.11.3``                |
+------------------+-----------------------+---------------------------------+
| ``prx01``,       | 4th,                  | ``192.168.11.4``,               |
| ``prx02``        | 5th                   | ``192.168.11.5``                |
+------------------+-----------------------+---------------------------------+
| ``gtw01``,       | ...                   | ``...``                         |
| ``gtw02``,       |                       |                                 |
| ``gtw03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``kvm01``,       |                       |                                 |
| ``kvm02``,       |                       |                                 |
| ``kvm03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``dbs01``,       |                       |                                 |
| ``dbs02``,       |                       |                                 |
| ``dbs03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``msg01``,       |                       |                                 |
| ``msg02``,       |                       |                                 |
| ``msg03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``mdb01``,       |                       |                                 |
| ``mdb02``,       |                       |                                 |
| ``mdb03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``ctl01``,       |                       |                                 |
| ``ctl02``,       |                       |                                 |
| ``ctl03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``odl01``,       |                       |                                 |
| ``odl02``,       |                       |                                 |
| ``odl03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``mon01``,       |                       |                                 |
| ``mon02``,       |                       |                                 |
| ``mon03``,       |                       |                                 |
| ``log01``,       |                       |                                 |
| ``log02``,       |                       |                                 |
| ``log03``,       |                       |                                 |
| ``mtr01``,       |                       |                                 |
| ``mtr02``,       |                       |                                 |
| ``mtr03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``cmp001``,      |                       |                                 |
| ``cmp002``,      |                       |                                 |
| ``...``          |                       |                                 |
+------------------+-----------------------+---------------------------------+

Network ``management``
~~~~~~~~~~~~~~~~~~~~~~

.. TIP::

    ``management`` often has an IP range offset defined in ``IDF``.

.. NOTE::

    Default addresses in the table below correspond to a ``management`` CIDR
    of ``172.16.10.0/24`` (one of the commonly used values in OPNFV labs).
    This is defined in ``IDF`` and can easily be changed to something else.

.. WARNING::

    Default addresses in the table below correspond to a ``management`` IP
    range of ``172.16.10.10-172.16.10.254`` (one of the commonly used values
    in OPNFV labs). This is defined in ``IDF`` and can easily be changed to
    something else. Since the ``jumpserver`` address is manually assigned, it
    is usually not subject to the IP range restriction in ``IDF``.

+------------------+-----------------------+---------------------------------+
| Host             | Offset in IP range    | Default address                 |
+==================+=======================+=================================+
| ``jumpserver``   | N/A                   | ``172.16.10.1``                 |
|                  |                       | (manual assignment)             |
+------------------+-----------------------+---------------------------------+
| ``cfg01``        | 1st                   | ``172.16.10.2``                 |
|                  |                       | (IP range ignored for now)      |
+------------------+-----------------------+---------------------------------+
| ``mas01``        | 2nd                   | ``172.16.10.12``                |
+------------------+-----------------------+---------------------------------+
| ``prx``          | 3rd,                  | ``172.16.10.13``,               |
|                  |                       |                                 |
| ``prx01``,       | 4th,                  | ``172.16.10.14``,               |
| ``prx02``        | 5th                   | ``172.16.10.15``                |
+------------------+-----------------------+---------------------------------+
| ``gtw01``,       | ...                   | ``...``                         |
| ``gtw02``,       |                       |                                 |
| ``gtw03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``kvm``,         |                       |                                 |
|                  |                       |                                 |
| ``kvm01``,       |                       |                                 |
| ``kvm02``,       |                       |                                 |
| ``kvm03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``dbs``,         |                       |                                 |
|                  |                       |                                 |
| ``dbs01``,       |                       |                                 |
| ``dbs02``,       |                       |                                 |
| ``dbs03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``msg``,         |                       |                                 |
|                  |                       |                                 |
| ``msg01``,       |                       |                                 |
| ``msg02``,       |                       |                                 |
| ``msg03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``mdb``,         |                       |                                 |
|                  |                       |                                 |
| ``mdb01``,       |                       |                                 |
| ``mdb02``,       |                       |                                 |
| ``mdb03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``ctl``,         |                       |                                 |
|                  |                       |                                 |
| ``ctl01``,       |                       |                                 |
| ``ctl02``,       |                       |                                 |
| ``ctl03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``odl``,         |                       |                                 |
|                  |                       |                                 |
| ``odl01``,       |                       |                                 |
| ``odl02``,       |                       |                                 |
| ``odl03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``mon``,         |                       |                                 |
|                  |                       |                                 |
| ``mon01``,       |                       |                                 |
| ``mon02``,       |                       |                                 |
| ``mon03``,       |                       |                                 |
|                  |                       |                                 |
| ``log``,         |                       |                                 |
|                  |                       |                                 |
| ``log01``,       |                       |                                 |
| ``log02``,       |                       |                                 |
| ``log03``,       |                       |                                 |
|                  |                       |                                 |
| ``mtr``,         |                       |                                 |
|                  |                       |                                 |
| ``mtr01``,       |                       |                                 |
| ``mtr02``,       |                       |                                 |
| ``mtr03``        |                       |                                 |
+------------------+-----------------------+---------------------------------+
| ``cmp001``,      |                       |                                 |
| ``cmp002``,      |                       |                                 |
| ``...``          |                       |                                 |
+------------------+-----------------------+---------------------------------+

Network ``internal``
~~~~~~~~~~~~~~~~~~~~

.. TIP::

    ``internal`` does not usually use an IP range offset in ``IDF``.

.. NOTE::

    Default addresses in the table below correspond to an ``internal`` CIDR
    of ``10.1.0.0/24`` (the usual value used in OPNFV labs).
    This is defined in ``IDF`` and can easily be changed to something else.

+------------------+------------------------+--------------------------------+
| Host             | Offset in IP range     | Default address                |
+==================+========================+================================+
| ``jumpserver``   | N/A                    | ``10.1.0.1``                   |
|                  |                        | (manual assignment, optional)  |
+------------------+------------------------+--------------------------------+
| ``gtw01``,       | 1st,                   | ``10.1.0.2``,                  |
| ``gtw02``,       | 2nd,                   | ``10.1.0.3``,                  |
| ``gtw03``        | 3rd                    | ``10.1.0.4``                   |
+------------------+------------------------+--------------------------------+
| ``cmp001``,      | 4th,                   | ``10.1.0.5``,                  |
| ``cmp002``,      | 5th,                   | ``10.1.0.6``,                  |
| ``...``          | ...                    | ``...``                        |
+------------------+------------------------+--------------------------------+

Network ``public``
~~~~~~~~~~~~~~~~~~

.. TIP::

    ``public`` often has an IP range offset defined in ``IDF``.

.. NOTE::

    Default addresses in the table below correspond to a ``public`` CIDR of
    ``172.30.10.0/24`` (one of the values used in OPNFV labs).
    This is defined in ``IDF`` and can easily be changed to something else.

.. WARNING::

    Default addresses in the table below correspond to a ``public`` IP range
    of ``172.30.10.100-172.30.10.254`` (one of the values used in OPNFV
    labs). This is defined in ``IDF`` and can easily be changed to something
    else. Since the ``jumpserver`` address is manually assigned, it is
    usually not subject to the IP range restriction in ``IDF``.

+------------------+------------------------+--------------------------------+
| Host             | Offset in IP range     | Default address                |
+==================+========================+================================+
| ``jumpserver``   | N/A                    | ``172.30.10.72``               |
|                  |                        | (manual assignment, optional)  |
+------------------+------------------------+--------------------------------+
| ``prx``,         | 1st,                   | ``172.30.10.101``,             |
|                  |                        |                                |
| ``prx01``,       | 2nd,                   | ``172.30.10.102``,             |
| ``prx02``        | 3rd                    | ``172.30.10.103``              |
+------------------+------------------------+--------------------------------+
| ``gtw01``,       | 4th,                   | ``172.30.10.104``,             |
| ``gtw02``,       | 5th,                   | ``172.30.10.105``,             |
| ``gtw03``        | 6th                    | ``172.30.10.106``              |
+------------------+------------------------+--------------------------------+
| ``ctl01``,       | ...                    | ``...``                        |
| ``ctl02``,       |                        |                                |
| ``ctl03``        |                        |                                |
+------------------+------------------------+--------------------------------+
| ``odl``          |                        |                                |
+------------------+------------------------+--------------------------------+
| ``cmp001``,      |                        |                                |
| ``cmp002``,      |                        |                                |
| ``...``          |                        |                                |
+------------------+------------------------+--------------------------------+

Accessing the Salt Master Node (``cfg01``)
==========================================

The Salt Master node (``cfg01``) runs an ``sshd`` server listening on
``0.0.0.0:22``.

To log in as the ``ubuntu`` user, use the RSA private key
``/var/lib/opnfv/mcp.rsa``:

.. code-block:: console

    jenkins@jumpserver:~$ ssh -o StrictHostKeyChecking=no \
                              -i /var/lib/opnfv/mcp.rsa \
                              -l ubuntu 10.20.0.2
    ubuntu@cfg01:~$

.. NOTE::

    User ``ubuntu`` has sudo rights.

.. TIP::

    The Salt master IP (``10.20.0.2``) is not hard set; it is configurable via
    ``INSTALLER_IP`` during deployment.

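For example, a different Salt master IP could be selected by exporting the
environment variable before running the deploy script (the value below is
illustrative):

.. code-block:: console

    jenkins@jumpserver:~/fuel$ export INSTALLER_IP=10.30.0.2
    jenkins@jumpserver:~/fuel$ ci/deploy.sh [...]
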
.. TIP::

    Starting with the ``Gambia`` release, ``cfg01`` is containerized, so this
    also works (from the ``jumpserver`` only):

.. code-block:: console

    jenkins@jumpserver:~$ docker exec -it fuel bash
    root@cfg01:~$

Accessing Cluster Nodes
=======================

Logging in to cluster nodes is possible from the jumpserver, the Salt master
and so on.

.. code-block:: console

    jenkins@jumpserver:~$ ssh -i /var/lib/opnfv/mcp.rsa ubuntu@192.168.11.52

.. TIP::

    ``/etc/hosts`` on ``cfg01`` has all the cluster hostnames, which can be
    used instead of IP addresses.

.. code-block:: console

    root@cfg01:~$ ssh -i ~/fuel/mcp/scripts/mcp.rsa ubuntu@ctl01

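As a quick sanity check, the available hostnames can be listed directly (the
``grep`` pattern below is just an example):

.. code-block:: console

    root@cfg01:~$ grep -E 'ctl|cmp|gtw' /etc/hosts
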
Debugging ``MaaS`` Commissioning/Deployment Issues
==================================================

One of the most common issues when setting up a new POD is ``MaaS`` failing to
commission/deploy the nodes, usually timing out after a couple of retries.

Such failures might indicate a misconfiguration in ``PDF``/``IDF``, a ``TOR``
switch misconfiguration or even faulty hardware.

Here are a couple of pointers for isolating the problem.

Accessing the ``MaaS`` Dashboard
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``MaaS`` web-based dashboard is available at
``http://<mas01 IP address>:5240/MAAS``, e.g.
``http://172.16.10.12:5240/MAAS``.

The administrator credentials are ``opnfv``/``opnfv_secret``.

.. NOTE::

    The ``mas01`` VM does not automatically get assigned an IP address in the
    public network segment. If the ``MaaS`` dashboard should be accessible
    from the public network, such an address can be manually added to the
    last VM NIC interface in ``mas01`` (which is already connected to the
    public network bridge).

Ensure Commission/Deploy Timeouts Are Not Too Small
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some hardware takes longer to boot or to run the initial scripts during the
commissioning/deployment phases. If that is the case, ``MaaS`` will time out
waiting for the process to finish. ``MaaS`` logs will reflect that, and the
issue is usually easy to observe on the nodes' serial console - if the node
PXE-boots the OS live image and starts executing cloud-init/curtin hooks
without any critical errors, but is then powered down/shut off, most likely
the timeout was hit.

To access the serial console of a node, see your board manufacturer's
documentation. Some modern hardware no longer has a physical serial
connector, which is usually replaced by a vendor-specific software-based
interface.

If the board supports ``SOL`` (Serial Over LAN) over the ``IPMI`` lanplus
protocol, a simple way of hooking into the serial console is ``ipmitool``.

.. TIP::

    Early boot stage output might not be shown over ``SOL``, but only over
    the video console provided by the (vendor-specific) interface.

.. code-block:: console

    jenkins@jumpserver:~$ ipmitool -H <host BMC IP> -U <user> -P <pass> \
                                   -I lanplus sol activate

To work around such timeouts, simply set a larger timeout in the ``IDF``.

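A minimal sketch of what the corresponding ``IDF`` entry could look like is
shown below; the exact key names depend on the Pharos ``IDF`` schema version
used by your lab, so treat them as an assumption to verify against your
``IDF``:

.. code-block:: yaml

    idf:
      fuel:
        maas:
          # Timeouts (in minutes) for the MaaS commissioning/deployment
          # phases; key names are assumptions based on common OPNFV lab IDFs.
          timeout_comissioning: 10
          timeout_deploying: 15
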
Check Jumpserver Network Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    jenkins@jumpserver:~$ brctl show
    jenkins@jumpserver:~$ ifconfig -a

+-----------------------+------------------------------------------------+
| Configuration item    | Expected behavior                              |
+=======================+================================================+
| IP addresses assigned | IP addresses should be assigned to the bridge, |
| to bridge ports       | and not to individual bridge ports             |
+-----------------------+------------------------------------------------+

Check Network Connectivity Between Nodes on the Jumpserver
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``cfg01`` is a Docker container running on the ``jumpserver``, connected to
Docker networks (created by docker-compose automatically on container up),
which in turn are connected using veth pairs to their ``libvirt`` managed
counterparts.

For example, the ``mcpcontrol`` network(s) should look similar to the output
below.

.. code-block:: console

    jenkins@jumpserver:~$ brctl show mcpcontrol
    bridge name   bridge id           STP enabled   interfaces
    mcpcontrol    8000.525400064f77   yes           mcpcontrol-nic
                                                    veth_mcp0
                                                    vnet8

    jenkins@jumpserver:~$ docker network ls
    NETWORK ID    NAME                              DRIVER   SCOPE
    81a0fdb3bd78  docker-compose_docker-mcpcontrol  macvlan  local
    [...]

    jenkins@jumpserver:~$ docker network inspect docker-compose_mcpcontrol
    [
        {
            "Name": "docker-compose_mcpcontrol",
            [...]
            "Options": {
                "parent": "veth_mcp1"
            },
        }
    ]

Before investigating the rest of the cluster networking configuration, the
first thing to check is that ``cfg01`` has network connectivity to other
jumpserver-hosted nodes, e.g. ``mas01``, and to the jumpserver itself
(provided that the jumpserver has an IP address in that particular network
segment).

.. code-block:: console

    jenkins@jumpserver:~$ docker exec -it fuel bash
    root@cfg01:~# ifconfig -a | grep inet
        inet addr:10.20.0.2     Bcast:0.0.0.0  Mask:255.255.255.0
        inet addr:172.16.10.2   Bcast:0.0.0.0  Mask:255.255.255.0
        inet addr:192.168.11.2  Bcast:0.0.0.0  Mask:255.255.255.0

For each network of interest (``mcpcontrol``, ``mgmt``, ``PXE/admin``), check
that ``cfg01`` can ping the jumpserver IP in that network segment, as well as
the ``mas01`` IP in that network.

.. NOTE::

    ``mcpcontrol`` is set up at VM bringup, so it should always be available,
    while the other networks are configured by Salt as part of the
    ``virtual_init`` STATE file.

.. code-block:: console

    root@cfg01:~# ping -c1 10.20.0.1  # mcpcontrol jumpserver IP
    root@cfg01:~# ping -c1 10.20.0.3  # mcpcontrol mas01 IP

.. TIP::

    The ``mcpcontrol`` CIDR is configurable via the ``INSTALLER_IP`` env var
    during deployment. However, IP offsets inside that segment are hard set
    to ``.1`` for the jumpserver, ``.2`` for ``cfg01`` and ``.3`` for the
    ``mas01`` node.

.. code-block:: console

    root@cfg01:~# salt 'mas*' pillar.item --out yaml \
                  _param:infra_maas_node01_deploy_address \
                  _param:infra_maas_node01_address
    mas01.mcp-ovs-noha.local:
      _param:infra_maas_node01_address: 172.16.10.12
      _param:infra_maas_node01_deploy_address: 192.168.11.3

    root@cfg01:~# ping -c1 192.168.11.1  # PXE/admin jumpserver IP
    root@cfg01:~# ping -c1 192.168.11.3  # PXE/admin mas01 IP
    root@cfg01:~# ping -c1 172.16.10.1   # mgmt jumpserver IP
    root@cfg01:~# ping -c1 172.16.10.12  # mgmt mas01 IP

.. TIP::

    Jumpserver IP addresses for the ``PXE/admin``, ``mgmt`` and ``public``
    bridges are user-chosen and manually set, so the above snippets should be
    adjusted accordingly if the user chose an IP other than ``.1`` in each
    CIDR.

Alternatively, a quick ``nmap`` scan works just as well.

.. code-block:: console

    root@cfg01:~# apt update && apt install -y nmap
    root@cfg01:~# nmap -sn 10.20.0.0/24     # expected: cfg01, mas01, jumpserver
    root@cfg01:~# nmap -sn 192.168.11.0/24  # expected: cfg01, mas01, jumpserver
    root@cfg01:~# nmap -sn 172.16.10.0/24   # expected: cfg01, mas01, jumpserver

Check ``DHCP`` Reaches Cluster Nodes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One common symptom observed during failed commissioning is that ``DHCP`` does
not work as expected between cluster nodes (baremetal nodes in the cluster,
or virtual machines on the jumpserver in case of ``hybrid`` deployments) and
the ``MaaS`` node.

To confirm or rule out this possibility, monitor the serial console output of
one (or more) cluster nodes during ``MaaS`` commissioning. If the node is
properly configured to attempt PXE boot, yet it times out waiting for an IP
address from the ``mas01`` ``DHCP``, it is worth checking that ``DHCP``
packets reach the ``jumpserver`` and the ``mas01`` VM.

.. code-block:: console

    jenkins@jumpserver:~$ sudo apt update && sudo apt install -y dhcpdump
    jenkins@jumpserver:~$ sudo dhcpdump -i admin_br

.. TIP::

    If ``DHCP`` requests are present, but no replies are sent, ``iptables``
    might be interfering on the jumpserver.

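To check for interfering firewall rules on the jumpserver, the ``iptables``
rules and counters can be inspected (a generic sketch; the chain layout
varies per setup):

.. code-block:: console

    jenkins@jumpserver:~$ sudo iptables -L FORWARD -n -v
    jenkins@jumpserver:~$ sudo iptables -t nat -L -n -v
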
Check ``MaaS`` Logs
~~~~~~~~~~~~~~~~~~~

If networking looks fine, yet nodes still fail to commission and/or deploy,
``MaaS`` logs might offer more details about the failure:

* ``/var/log/maas/maas.log``
* ``/var/log/maas/rackd.log``
* ``/var/log/maas/regiond.log``

.. TIP::

    If the problem is with the cluster node and not on the ``MaaS`` server,
    the node's kernel logs usually contain useful information.
    These are saved via rsyslog on the ``mas01`` node in
    ``/var/log/maas/rsyslog``.

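Since ``mas01`` is reachable via Salt, the logs listed above can also be
inspected remotely from ``cfg01``, e.g.:

.. code-block:: console

    root@cfg01:~# salt 'mas*' cmd.run 'tail -n 50 /var/log/maas/maas.log'
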
Recovering Failed Deployments
=============================

The first deploy attempt might fail due to various reasons. If the problem
is not systemic (i.e. fixing it will not introduce incompatible configuration
changes, like setting a different ``INSTALLER_IP``), the environment is safe
to be reused and the deployment process can pick up from where it left off.

Leveraging these mechanisms requires a minimum understanding of how the
deploy process works, at least for manual ``STATE`` runs.

Automatic (re)deploy
~~~~~~~~~~~~~~~~~~~~

OPNFV Fuel's ``deploy.sh`` script offers a dedicated argument for this, ``-f``,
which will skip executing the first ``N`` ``STATE`` files, where ``N`` is the
number of ``-f`` occurrences in the argument list.

.. TIP::

    The list of ``STATE`` files to be executed for a specific environment
    depends on the OPNFV scenario chosen, the deployment type (``virtual``,
    ``baremetal`` or ``hybrid``) and the presence/absence of a ``VCP``
    (virtualized control plane).

e.g.: Let's consider a ``baremetal`` environment, with ``VCP`` and a simple
scenario ``os-nosdn-nofeature-ha``, where ``deploy.sh`` failed executing the
``openstack_ha`` ``STATE`` file.

The simplest redeploy approach (which usually works for **any** combination of
deployment type/VCP/scenario) is to issue the same deploy command as the
original attempt used, adding a single ``-f``:

.. code-block:: console

    jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
                                            -s <scenario> [...] \
                                            -f # skips running the virtual_init STATE file

All ``STATE`` files are re-entrant, so the above is equivalent (but a little
slower) to skipping all ``STATE`` files before the ``openstack_ha`` one, like:

.. code-block:: console

    jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
                                            -s <scenario> [...] \
                                            -ffff # skips virtual_init, maas, baremetal_init, virtual_control_plane

.. TIP::

    For fine tuning the infrastructure setup steps executed during deployment,
    see also the ``-e`` and ``-P`` deploy arguments.

.. NOTE::

    On rare occasions, the cluster cannot idempotently be redeployed (e.g.
    broken MySQL/Galera cluster), in which case some cleanup is due before
    (re)running the ``STATE`` files. See the ``-E`` deploy argument, which
    allows either forcing a ``MaaS`` node deletion, then redeployment of all
    baremetal nodes, if used twice (``-EE``); or only erasing the ``VCP`` VMs
    if used only once (``-E``).

Manual ``STATE`` Run
~~~~~~~~~~~~~~~~~~~~

Instead of leveraging the full ``deploy.sh``, one could execute the ``STATE``
files one by one (or partially) from ``cfg01``.

However, this requires a better understanding of how the list of ``STATE``
files to be executed is constructed for a specific scenario, depending on the
deployment type and on the cluster having baremetal nodes, implemented in:

* ``mcp/config/scenario/defaults.yaml.j2``
* ``mcp/config/scenario/<scenario-name>.yaml``

e.g.: For the example presented above (baremetal with ``VCP``,
``os-nosdn-nofeature-ha``), the list of ``STATE`` files would be:

* ``virtual_init``
* ``maas``
* ``baremetal_init``
* ``virtual_control_plane``
* ``openstack_ha``
* ``networks``

To execute one (or more) of the remaining ``STATE`` files after a failure:

.. code-block:: console

    jenkins@jumpserver:~$ docker exec -it fuel bash
    root@cfg01:~$ cd ~/fuel/mcp/config/states
    root@cfg01:~/fuel/mcp/config/states$ ./openstack_ha
    root@cfg01:~/fuel/mcp/config/states$ CI_DEBUG=true ./networks

For even finer granularity, one can also run the commands in a ``STATE`` file
one by one manually, e.g. if the execution failed applying the ``rabbitmq``
sls:

.. code-block:: console

    root@cfg01:~$ salt -I 'rabbitmq:server' state.sls rabbitmq

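Before applying a state manually, it can be useful to verify which minions
match the target expression, e.g.:

.. code-block:: console

    root@cfg01:~$ salt -I 'rabbitmq:server' test.ping
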
Exploring the Cloud with Salt
=============================

To gather information about the cloud, the Salt commands can be used.
Salt is based on a master-minion architecture, where the Salt master pushes
configuration to the minions and triggers actions on them.

For example, to tell Salt to execute a ping to ``8.8.8.8`` on all the nodes:

.. code-block:: console

    root@cfg01:~$ salt "*" network.ping 8.8.8.8
                       ^^^                       target
                           ^^^^^^^^^^^^          function to execute
                                        ^^^^^^^  argument passed to the function

.. TIP::

    Complex target filters are supported, e.g. compound queries or node roles.

For more information about Salt, see the :ref:`fuel_userguide_references`
section.

Some examples are listed below. Note that these commands are issued from the
Salt master as the ``root`` user.

View the IPs of All the Components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    root@cfg01:~$ salt "*" network.ip_addrs
    cfg01.mcp-odl-ha.local:
       - 10.20.0.2
       - 172.16.10.100
    mas01.mcp-odl-ha.local:
       - 10.20.0.3
       - 172.16.10.3
       - 192.168.11.3
    .........................

View the Interfaces of All the Components and Put the Output in a ``yaml`` File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    root@cfg01:~$ salt "*" network.interfaces --out yaml --output-file interfaces.yaml
    root@cfg01:~# cat interfaces.yaml
    cfg01.mcp-odl-ha.local:
     enp1s0:
       hwaddr: 52:54:00:72:77:12
       inet:
       - address: 10.20.0.2
         broadcast: 10.20.0.255
         label: enp1s0
         netmask: 255.255.255.0
       inet6:
       - address: fe80::5054:ff:fe72:7712
         prefixlen: '64'
         scope: link
       up: true
    .........................

View Installed Packages on the MaaS Node
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    root@cfg01:~# salt "mas*" pkg.list_pkgs
    mas01.mcp-odl-ha.local:
        ----------
        accountsservice:
            0.6.40-2ubuntu11.3
        acl:
            2.2.52-3
        acpid:
            1:2.0.26-1ubuntu2
        adduser:
            3.113+nmu3ubuntu4
        anerd:
            1
    .........................

Execute Any Linux Command on All Nodes (e.g. ``ls /var/log``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    root@cfg01:~# salt "*" cmd.run 'ls /var/log'
    cfg01.mcp-odl-ha.local:
       alternatives.log
       apt
       auth.log
       boot.log
       btmp
       cloud-init-output.log
       cloud-init.log
    .........................

Execute Any Linux Command on Nodes Using Compound Queries Filter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    root@cfg01:~# salt -C '* and cfg01*' cmd.run 'ls /var/log'
    cfg01.mcp-odl-ha.local:
       alternatives.log
       apt
       auth.log
       boot.log
       btmp
       cloud-init-output.log
       cloud-init.log
    .........................

Execute Any Linux Command on Nodes Using Role Filter
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    root@cfg01:~# salt -I 'nova:compute' cmd.run 'ls /var/log'
    cmp001.mcp-odl-ha.local:
       alternatives.log
       apache2
       apt
       auth.log
       btmp
       ceilometer
       cinder
       cloud-init-output.log
       cloud-init.log
    .........................

Accessing OpenStack
===================

Once the deployment is complete, the OpenStack CLI is accessible from the
controller VMs (``ctl01`` ... ``ctl03``).

The OpenStack credentials are at ``/root/keystonercv3``.

.. code-block:: console

    root@ctl01:~# source keystonercv3
    root@ctl01:~# openstack image list
    +--------------------------------------+-----------------------------------------------+--------+
    | ID                                   | Name                                          | Status |
    +--------------------------------------+-----------------------------------------------+--------+
    | 152930bf-5fd5-49c2-b3a1-cae14973f35f | CirrosImage                                   | active |
    | 7b99a779-78e4-45f3-9905-64ae453e3dcb | Ubuntu16.04                                   | active |
    +--------------------------------------+-----------------------------------------------+--------+

The OpenStack Dashboard, Horizon, is available at ``http://<proxy public VIP>``.
The administrator credentials are ``admin``/``opnfv_secret``.

.. figure:: img/horizon_login.png
    :width: 60%
    :align: center

A full list of IPs/services is available at ``<proxy public VIP>:8090`` for
``baremetal`` deploys.

.. figure:: img/salt_services_ip.png
    :width: 60%
    :align: center

Guest Operating System Support
==============================

There are a number of possibilities regarding the guest operating systems
that can be spawned on the nodes.
The current system spawns virtual machines for the ``VCP`` on the KVM nodes,
as well as user-requested VMs on the OpenStack compute nodes. Currently, the
system supports the following ``UEFI`` images for the guests:

+------------------+-------------------+--------------------+
| OS name          | ``x86_64`` status | ``aarch64`` status |
+==================+===================+====================+
| Ubuntu 17.10     | untested          | Full support       |
+------------------+-------------------+--------------------+
| Ubuntu 16.04     | Full support      | Full support       |
+------------------+-------------------+--------------------+
| Ubuntu 14.04     | untested          | Full support       |
+------------------+-------------------+--------------------+
| Fedora atomic 27 | untested          | Full support       |
+------------------+-------------------+--------------------+
| Fedora cloud 27  | untested          | Full support       |
+------------------+-------------------+--------------------+
| Debian           | untested          | Full support       |
+------------------+-------------------+--------------------+
| CentOS 7         | untested          | Not supported      |
+------------------+-------------------+--------------------+
| Cirros 0.3.5     | Full support      | Full support       |
+------------------+-------------------+--------------------+
| Cirros 0.4.0     | Full support      | Full support       |
+------------------+-------------------+--------------------+

The above table covers only ``UEFI`` images and implies ``OVMF``/``AAVMF``
firmware on the host. An ``x86_64`` deployment also supports ``non-UEFI``
images; however, that choice is up to the underlying hardware and the
administrator to make.

The images for the above operating systems can be found on their respective
websites.

OpenStack Storage
=================

OpenStack Cinder is the project behind block storage in OpenStack, and OPNFV
Fuel supports LVM out of the box.

By default, ``x86_64`` supports 2 additional block storage devices, while
``aarch64`` supports only one.

More devices can be supported if the OS image created has additional
properties allowing block storage devices to be spawned as ``SCSI`` drives.
To do this, add the following properties to the image:

.. code-block:: console

    root@ctl01:~$ openstack image set --property hw_disk_bus='scsi' \
                                      --property hw_scsi_model='virtio-scsi' \
                                      <image>

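To verify that the properties were applied (illustrative):

.. code-block:: console

    root@ctl01:~$ openstack image show <image> -c properties
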
The choice regarding which bus to use for the storage drives is an important
one. ``virtio-blk`` is the default choice for OPNFV Fuel, which attaches the
drives in ``/dev/vdX``. However, since we want to be able to attach a
larger number of volumes to the virtual machines, we recommend switching to
``SCSI`` drives, which are attached in ``/dev/sdX`` instead.

``virtio-scsi`` is slightly worse in terms of performance, but the ability to
add a larger number of drives, combined with added features like ZFS, Ceph et
al., leads us to suggest the use of ``virtio-scsi`` in OPNFV Fuel for both
architectures.

More details regarding the differences and performance of ``virtio-blk`` vs
``virtio-scsi`` are beyond the scope of this manual, but can easily be found
in other sources online, like `VirtIO SCSI`_ or `VirtIO performance`_.

Additional information on configuring images in OpenStack can be found in
the OpenStack Glance documentation.

OpenStack Endpoints
===================

For each OpenStack service, three endpoints are created: ``admin``,
``internal`` and ``public``.

.. code-block:: console

    ubuntu@ctl01:~$ openstack endpoint list --service keystone
    +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
    | ID                               | Region    | Service Name | Service Type | Enabled | Interface | URL                          |
    +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
    | 008fec57922b4e9e8bf02c770039ae77 | RegionOne | keystone     | identity     | True    | internal  | http://172.16.10.26:5000/v3  |
    | 1a1f3c3340484bda9ef7e193f50599e6 | RegionOne | keystone     | identity     | True    | admin     | http://172.16.10.26:35357/v3 |
    | b0a47d42d0b6491b995d7e6230395de8 | RegionOne | keystone     | identity     | True    | public    | https://10.0.15.2:5000/v3    |
    +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+

MCP sets up all OpenStack services to talk to each other over unencrypted
connections on the internal management network. All admin/internal endpoints
use plain http, while the public endpoints are https connections terminated
via nginx at the ``VCP`` proxy VMs.

To access the public endpoints, an SSL certificate has to be provided. For
convenience, the installation script will copy the required certificate
to the ``cfg01`` node at ``/etc/ssl/certs/os_cacert``.

Copy the certificate from the ``cfg01`` node to the client that will access
the https endpoints and place it under ``/etc/ssl/certs/``.
The SSL connection will then be established automatically.

.. code-block:: console

    jenkins@jumpserver:~$ ssh -o StrictHostKeyChecking=no -i /var/lib/opnfv/mcp.rsa -l ubuntu 10.20.0.2 \
      "cat /etc/ssl/certs/os_cacert" | sudo tee /etc/ssl/certs/os_cacert

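If the client does not pick up the certificate automatically, it can also be
passed explicitly via the standard OpenStack client environment variable:

.. code-block:: console

    jenkins@jumpserver:~$ export OS_CACERT=/etc/ssl/certs/os_cacert
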
Reclass Model Viewer Tutorial
=============================

In order to get a better understanding of the ``reclass`` model Fuel uses,
the `reclass-doc`_ tool can be used to visualize it.

To avoid installing packages on the ``jumpserver`` or another host, the
``cfg01`` Docker container can be used. Since the ``fuel`` git repository
located on the ``jumpserver`` is already mounted inside the ``cfg01``
container, the results can be visualized using a web browser on the
``jumpserver`` at the end of the procedure.

.. code-block:: console

    jenkins@jumpserver:~$ docker exec -it fuel bash
    root@cfg01:~$ apt-get update
    root@cfg01:~$ apt-get install -y npm nodejs
    root@cfg01:~$ npm install -g reclass-doc
    root@cfg01:~$ ln -s /usr/bin/nodejs /usr/bin/node
    root@cfg01:~$ reclass-doc --output ~/fuel/mcp/reclass/modeler \
                                       ~/fuel/mcp/reclass

The generated documentation should be available on the ``jumpserver`` inside
the ``fuel`` git repo subpath ``mcp/reclass/modeler/index.html``.

.. figure:: img/reclass_doc.png
    :width: 60%
    :align: center

.. _fuel_userguide_references:

References
==========

#. :ref:`OPNFV Fuel Installation Instruction <fuel-installation>`
#. `Saltstack Documentation`_
#. `Saltstack Formulas`_
#. `VirtIO performance`_
#. `VirtIO SCSI`_

.. _`Saltstack Documentation`: https://docs.saltstack.com/en/latest/topics/
.. _`Saltstack Formulas`: https://salt-formulas.readthedocs.io/en/latest/
.. _`VirtIO performance`: https://mpolednik.github.io/2017/01/23/virtio-blk-vs-virtio-scsi/
.. _`VirtIO SCSI`: https://www.ovirt.org/develop/release-management/features/storage/virtio-scsi/
.. _`reclass-doc`: https://github.com/jirihybek/reclass-doc