1 .. This work is licensed under a Creative Commons Attribution 4.0 International License.
2 .. http://creativecommons.org/licenses/by/4.0
3 .. (c) Open Platform for NFV Project, Inc. and its contributors
4
5 *********************
6 OPNFV Fuel User Guide
7 *********************
8
9 Abstract
10 ========
11
This document contains details about using the OPNFV Fuel ``Hunter`` release
after it has been deployed. For details on how to deploy OpenStack, check
the installation instructions in the :ref:`fuel_userguide_references` section.
15
This is a unified document for both ``x86_64`` and ``aarch64``
architectures. All information is common to both architectures
except when explicitly stated otherwise.
19
20 Network Overview
21 ================
22
23 Fuel uses several networks to deploy and administer the cloud:
24
25 +------------------+----------------------------------------------------------+
26 | Network name     | Description                                              |
27 |                  |                                                          |
28 +==================+==========================================================+
29 | **PXE/admin**    | Used for booting the nodes via PXE and/or Salt           |
30 |                  | control network                                          |
31 +------------------+----------------------------------------------------------+
32 | **mcpcontrol**   | Docker network used to provision the infrastructure      |
33 |                  | hosts (Salt & MaaS)                                      |
34 +------------------+----------------------------------------------------------+
35 | **management**   | Used for internal communication between                  |
36 |                  | OpenStack components                                     |
37 +------------------+----------------------------------------------------------+
38 | **internal**     | Used for VM data communication within the                |
39 |                  | cloud deployment                                         |
40 +------------------+----------------------------------------------------------+
41 | **public**       | Used to provide Virtual IPs for public endpoints         |
42 |                  | that are used to connect to OpenStack services APIs.     |
43 |                  | Used by Virtual machines to access the Internet          |
44 +------------------+----------------------------------------------------------+
45
These networks - except ``mcpcontrol`` - can be Linux bridges configured
on the jumpserver before the deployment starts.
If they do not exist at deploy time, they will be created by the deploy
scripts as ``libvirt`` managed networks (``mcpcontrol`` is the exception,
being handled by Docker using the ``bridge`` driver).
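
For reference, a pre-existing Linux bridge on the jumpserver might be set up
along the lines below before launching the deployment (the bridge name,
physical NIC and IP address are illustrative and lab-specific):

.. code-block:: console

    jenkins@jumpserver:~$ sudo brctl addbr admin_br
    jenkins@jumpserver:~$ sudo brctl addif admin_br eth1
    jenkins@jumpserver:~$ sudo ip address add 192.168.11.1/24 dev admin_br
    jenkins@jumpserver:~$ sudo ip link set admin_br up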
51
52 Network ``mcpcontrol``
53 ~~~~~~~~~~~~~~~~~~~~~~
54
55 ``mcpcontrol`` is a virtual network, managed by Docker. Its only purpose is to
56 provide a simple method of assigning an arbitrary ``INSTALLER_IP`` to the Salt
57 master node (``cfg01``), to maintain backwards compatibility with old OPNFV
58 Fuel behavior. Normally, end-users only need to change the ``INSTALLER_IP`` if
59 the default CIDR (``10.20.0.0/24``) overlaps with existing lab networks.
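
In that case, a different ``INSTALLER_IP`` (and implicitly a different
``mcpcontrol`` CIDR) can be requested by exporting the environment variable
before launching the deployment; a minimal sketch (the address below is only
an example):

.. code-block:: console

    jenkins@jumpserver:~/fuel$ export INSTALLER_IP=10.30.0.2
    jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
                                            -s <scenario> [...]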
60
``mcpcontrol`` uses the Docker ``bridge`` driver, so the Salt master
(``cfg01``) and the MaaS container (``mas01``, when present) are assigned
predefined IPs (``.2`` and ``.3`` respectively, while the jumpserver
gets ``.1``).
64
65 +------------------+---------------------------+-----------------------------+
66 | Host             | Offset in IP range        | Default address             |
67 +==================+===========================+=============================+
68 | ``jumpserver``   | 1st                       | ``10.20.0.1``               |
69 +------------------+---------------------------+-----------------------------+
70 | ``cfg01``        | 2nd                       | ``10.20.0.2``               |
71 +------------------+---------------------------+-----------------------------+
72 | ``mas01``        | 3rd                       | ``10.20.0.3``               |
73 +------------------+---------------------------+-----------------------------+
74
75 This network is limited to the ``jumpserver`` host and does not require any
76 manual setup.
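
Should it need inspecting, the ``mcpcontrol`` Docker network can be examined
from the ``jumpserver`` with the standard Docker tooling (the exact network
name is assigned by ``docker-compose`` and may differ, hence the ``grep``):

.. code-block:: console

    jenkins@jumpserver:~$ docker network ls | grep mcpcontrol
    jenkins@jumpserver:~$ docker network inspect <network name from above>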
77
78 Network ``PXE/admin``
79 ~~~~~~~~~~~~~~~~~~~~~
80
81 .. TIP::
82
83     ``PXE/admin`` does not usually use an IP range offset in ``IDF``.
84
85 .. NOTE::
86
    During the ``MaaS`` commissioning phase, IP addresses are handed out by
    the ``MaaS`` DHCP server.
89
90 .. WARNING::
91
    Default addresses in the table below correspond to a ``PXE/admin`` CIDR of
    ``192.168.11.0/24`` (the value usually used in OPNFV labs).
94
95     This is defined in ``IDF`` and can easily be changed to something else.
96
97 .. TODO: detail MaaS DHCP range start/end
98
99 +------------------+-----------------------+---------------------------------+
100 | Host             | Offset in IP range    | Default address                 |
101 +==================+=======================+=================================+
102 | ``jumpserver``   | 1st                   | ``192.168.11.1``                |
103 |                  |                       | (manual assignment)             |
104 +------------------+-----------------------+---------------------------------+
105 | ``cfg01``        | 2nd                   | ``192.168.11.2``                |
106 +------------------+-----------------------+---------------------------------+
107 | ``mas01``        | 3rd                   | ``192.168.11.3``                |
108 +------------------+-----------------------+---------------------------------+
109 | ``prx01``,       | 4th,                  | ``192.168.11.4``,               |
110 | ``prx02``        | 5th                   | ``192.168.11.5``                |
111 +------------------+-----------------------+---------------------------------+
112 | ``gtw01``,       | ...                   | ``...``                         |
113 | ``gtw02``,       |                       |                                 |
114 | ``gtw03``        |                       |                                 |
115 +------------------+-----------------------+---------------------------------+
116 | ``kvm01``,       |                       |                                 |
117 | ``kvm02``,       |                       |                                 |
118 | ``kvm03``        |                       |                                 |
119 +------------------+-----------------------+---------------------------------+
120 | ``dbs01``,       |                       |                                 |
121 | ``dbs02``,       |                       |                                 |
122 | ``dbs03``        |                       |                                 |
123 +------------------+-----------------------+---------------------------------+
124 | ``msg01``,       |                       |                                 |
125 | ``msg02``,       |                       |                                 |
126 | ``msg03``        |                       |                                 |
127 +------------------+-----------------------+---------------------------------+
128 | ``mdb01``,       |                       |                                 |
129 | ``mdb02``,       |                       |                                 |
130 | ``mdb03``        |                       |                                 |
131 +------------------+-----------------------+---------------------------------+
132 | ``ctl01``,       |                       |                                 |
133 | ``ctl02``,       |                       |                                 |
134 | ``ctl03``        |                       |                                 |
135 +------------------+-----------------------+---------------------------------+
136 | ``odl01``,       |                       |                                 |
137 | ``odl02``,       |                       |                                 |
138 | ``odl03``        |                       |                                 |
139 +------------------+-----------------------+---------------------------------+
140 | ``mon01``,       |                       |                                 |
141 | ``mon02``,       |                       |                                 |
142 | ``mon03``,       |                       |                                 |
143 | ``log01``,       |                       |                                 |
144 | ``log02``,       |                       |                                 |
145 | ``log03``,       |                       |                                 |
146 | ``mtr01``,       |                       |                                 |
147 | ``mtr02``,       |                       |                                 |
148 | ``mtr03``        |                       |                                 |
149 +------------------+-----------------------+---------------------------------+
150 | ``cmp001``,      |                       |                                 |
151 | ``cmp002``,      |                       |                                 |
152 | ``...``          |                       |                                 |
153 +------------------+-----------------------+---------------------------------+
154
155 Network ``management``
156 ~~~~~~~~~~~~~~~~~~~~~~
157
158 .. TIP::
159
160     ``management`` often has an IP range offset defined in ``IDF``.
161
162 .. WARNING::
163
    Default addresses in the table below correspond to a ``management`` IP
    range of ``172.16.10.10-172.16.10.254`` (one of the values commonly used
    in OPNFV labs). This is defined in ``IDF`` and can easily be changed to
    something else. Since the ``jumpserver`` address is manually assigned, it
    is usually not subject to the IP range restriction in ``IDF``.
169
170 +------------------+-----------------------+---------------------------------+
171 | Host             | Offset in IP range    | Default address                 |
172 +==================+=======================+=================================+
173 | ``jumpserver``   | N/A                   | ``172.16.10.1``                 |
174 |                  |                       | (manual assignment)             |
175 +------------------+-----------------------+---------------------------------+
176 | ``cfg01``        | 1st                   | ``172.16.10.11``                |
177 +------------------+-----------------------+---------------------------------+
178 | ``mas01``        | 2nd                   | ``172.16.10.12``                |
179 +------------------+-----------------------+---------------------------------+
180 | ``prx``          | 3rd,                  | ``172.16.10.13``,               |
181 |                  |                       |                                 |
182 | ``prx01``,       | 4th,                  | ``172.16.10.14``,               |
183 | ``prx02``        | 5th                   | ``172.16.10.15``                |
184 +------------------+-----------------------+---------------------------------+
185 | ``gtw01``,       | ...                   | ``...``                         |
186 | ``gtw02``,       |                       |                                 |
187 | ``gtw03``        |                       |                                 |
188 +------------------+-----------------------+---------------------------------+
189 | ``kvm``,         |                       |                                 |
190 |                  |                       |                                 |
191 | ``kvm01``,       |                       |                                 |
192 | ``kvm02``,       |                       |                                 |
193 | ``kvm03``        |                       |                                 |
194 +------------------+-----------------------+---------------------------------+
195 | ``dbs``,         |                       |                                 |
196 |                  |                       |                                 |
197 | ``dbs01``,       |                       |                                 |
198 | ``dbs02``,       |                       |                                 |
199 | ``dbs03``        |                       |                                 |
200 +------------------+-----------------------+---------------------------------+
201 | ``msg``,         |                       |                                 |
202 |                  |                       |                                 |
203 | ``msg01``,       |                       |                                 |
204 | ``msg02``,       |                       |                                 |
205 | ``msg03``        |                       |                                 |
206 +------------------+-----------------------+---------------------------------+
207 | ``mdb``,         |                       |                                 |
208 |                  |                       |                                 |
209 | ``mdb01``,       |                       |                                 |
210 | ``mdb02``,       |                       |                                 |
211 | ``mdb03``        |                       |                                 |
212 +------------------+-----------------------+---------------------------------+
213 | ``ctl``,         |                       |                                 |
214 |                  |                       |                                 |
215 | ``ctl01``,       |                       |                                 |
216 | ``ctl02``,       |                       |                                 |
217 | ``ctl03``        |                       |                                 |
218 +------------------+-----------------------+---------------------------------+
219 | ``odl``,         |                       |                                 |
220 |                  |                       |                                 |
221 | ``odl01``,       |                       |                                 |
222 | ``odl02``,       |                       |                                 |
223 | ``odl03``        |                       |                                 |
224 +------------------+-----------------------+---------------------------------+
225 | ``mon``,         |                       |                                 |
226 |                  |                       |                                 |
227 | ``mon01``,       |                       |                                 |
228 | ``mon02``,       |                       |                                 |
229 | ``mon03``,       |                       |                                 |
230 |                  |                       |                                 |
231 | ``log``,         |                       |                                 |
232 |                  |                       |                                 |
233 | ``log01``,       |                       |                                 |
234 | ``log02``,       |                       |                                 |
235 | ``log03``,       |                       |                                 |
236 |                  |                       |                                 |
237 | ``mtr``,         |                       |                                 |
238 |                  |                       |                                 |
239 | ``mtr01``,       |                       |                                 |
240 | ``mtr02``,       |                       |                                 |
241 | ``mtr03``        |                       |                                 |
242 +------------------+-----------------------+---------------------------------+
243 | ``cmp001``,      |                       |                                 |
244 | ``cmp002``,      |                       |                                 |
245 | ``...``          |                       |                                 |
246 +------------------+-----------------------+---------------------------------+
247
248 Network ``internal``
249 ~~~~~~~~~~~~~~~~~~~~
250
251 .. TIP::
252
253     ``internal`` does not usually use an IP range offset in ``IDF``.
254
255 .. WARNING::
256
    Default addresses in the table below correspond to an ``internal`` CIDR of
    ``10.1.0.0/24`` (the value usually used in OPNFV labs).
259     This is defined in ``IDF`` and can easily be changed to something else.
260
261 +------------------+------------------------+--------------------------------+
262 | Host             | Offset in IP range     | Default address                |
263 +==================+========================+================================+
264 | ``jumpserver``   | N/A                    | ``10.1.0.1``                   |
265 |                  |                        | (manual assignment, optional)  |
266 +------------------+------------------------+--------------------------------+
267 | ``gtw01``,       | 1st,                   | ``10.1.0.2``,                  |
268 | ``gtw02``,       | 2nd,                   | ``10.1.0.3``,                  |
269 | ``gtw03``        | 3rd                    | ``10.1.0.4``                   |
270 +------------------+------------------------+--------------------------------+
271 | ``cmp001``,      | 4th,                   | ``10.1.0.5``,                  |
272 | ``cmp002``,      | 5th,                   | ``10.1.0.6``,                  |
273 | ``...``          | ...                    | ``...``                        |
274 +------------------+------------------------+--------------------------------+
275
276 Network ``public``
277 ~~~~~~~~~~~~~~~~~~
278
279 .. TIP::
280
281     ``public`` often has an IP range offset defined in ``IDF``.
282
283 .. WARNING::
284
    Default addresses in the table below correspond to a ``public`` IP range
    of ``172.30.10.100-172.30.10.254`` (one of the values commonly used in
    OPNFV labs). This is defined in ``IDF`` and can easily be changed to
    something else. Since the ``jumpserver`` address is manually assigned, it
    is usually not subject to the IP range restriction in ``IDF``.
290
291 +------------------+------------------------+--------------------------------+
292 | Host             | Offset in IP range     | Default address                |
293 +==================+========================+================================+
294 | ``jumpserver``   | N/A                    | ``172.30.10.72``               |
295 |                  |                        | (manual assignment, optional)  |
296 +------------------+------------------------+--------------------------------+
297 | ``prx``,         | 1st,                   | ``172.30.10.101``,             |
298 |                  |                        |                                |
299 | ``prx01``,       | 2nd,                   | ``172.30.10.102``,             |
300 | ``prx02``        | 3rd                    | ``172.30.10.103``              |
301 +------------------+------------------------+--------------------------------+
302 | ``gtw01``,       | 4th,                   | ``172.30.10.104``,             |
303 | ``gtw02``,       | 5th,                   | ``172.30.10.105``,             |
304 | ``gtw03``        | 6th                    | ``172.30.10.106``              |
305 +------------------+------------------------+--------------------------------+
306 | ``ctl01``,       | ...                    | ``...``                        |
307 | ``ctl02``,       |                        |                                |
308 | ``ctl03``        |                        |                                |
309 +------------------+------------------------+--------------------------------+
310 | ``odl``,         |                        |                                |
311 +------------------+------------------------+--------------------------------+
312 | ``cmp001``,      |                        |                                |
313 | ``cmp002``,      |                        |                                |
314 | ``...``          |                        |                                |
315 +------------------+------------------------+--------------------------------+
316
317 Accessing the Salt Master Node (``cfg01``)
318 ==========================================
319
The Salt Master node (``cfg01``) runs an ``sshd`` server listening on
``0.0.0.0:22``.

To log in as the ``ubuntu`` user, use the RSA private key
``/var/lib/opnfv/mcp.rsa``:
324
325 .. code-block:: console
326
327     jenkins@jumpserver:~$ ssh -o StrictHostKeyChecking=no \
328                               -i /var/lib/opnfv/mcp.rsa \
329                               -l ubuntu 10.20.0.2
330     ubuntu@cfg01:~$
331
332 .. NOTE::
333
334     User ``ubuntu`` has sudo rights.
335
336 .. TIP::
337
    The Salt master IP (``10.20.0.2``) is not hard-set; it is configurable via
    the ``INSTALLER_IP`` environment variable during deployment.
340
341 .. TIP::
342
343     Starting with the ``Gambia`` release, ``cfg01`` is containerized, so this
344     also works (from ``jumpserver`` only):
345
346 .. code-block:: console
347
348     jenkins@jumpserver:~$ docker exec -it fuel bash
349     root@cfg01:~$
350
351 Accessing the MaaS Node (``mas01``)
352 ===================================
353
Starting with the ``Hunter`` release, the MaaS node (``mas01``) is
containerized and no longer runs an ``sshd`` server. To access it (from the
``jumpserver`` only):
357
358 .. code-block:: console
359
360     jenkins@jumpserver:~$ docker exec -it maas bash
361     root@mas01:~$
362
363 Accessing Cluster Nodes
364 =======================
365
Logging in to the cluster nodes is possible from the jumpserver, the Salt
Master node etc., e.g.:
367
368 .. code-block:: console
369
370     jenkins@jumpserver:~$ ssh -i /var/lib/opnfv/mcp.rsa ubuntu@192.168.11.52
371
372 .. TIP::
373
    ``/etc/hosts`` on ``cfg01`` has all the cluster hostnames, which can be
    used instead of IP addresses.

    ``/root/.ssh/config`` on ``cfg01`` configures the default user
    (``ubuntu``) and key (``/root/fuel/mcp/scripts/mcp.rsa``), so a plain
    ``ssh <hostname>`` is enough:
379
380 .. code-block:: console
381
382     root@cfg01:~$ ssh ctl01
383
Debugging ``MaaS`` Commissioning/Deployment Issues
==================================================
386
387 One of the most common issues when setting up a new POD is ``MaaS`` failing to
388 commission/deploy the nodes, usually timing out after a couple of retries.
389
Such failures might indicate a misconfiguration in ``PDF``/``IDF``, an issue
in the ``TOR`` switch configuration or even faulty hardware.
392
393 Here are a couple of pointers for isolating the problem.
394
395 Accessing the ``MaaS`` Dashboard
396 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
397
The ``MaaS`` web-based dashboard is available at
``http://<jumpserver IP address>:5240/MAAS``.
400
401 The administrator credentials are ``opnfv``/``opnfv_secret``.
402
403 Ensure Commission/Deploy Timeouts Are Not Too Small
404 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
405
Some hardware takes longer to boot or to run the initial scripts during the
commissioning/deployment phases. If that's the case, ``MaaS`` will time out
waiting for the process to finish. ``MaaS`` logs will reflect that, and the
issue is usually easy to observe on the nodes' serial console: if the node
seems to PXE-boot the OS live image and starts executing cloud-init/curtin
hooks without reporting critical errors, but is then powered down/shut off,
most likely the timeout was hit.
413
To access the serial console of a node, see the board manufacturer's
documentation. Some modern hardware no longer has a physical serial
connector; it is usually replaced by a vendor-specific software-based
interface.
417
If the board supports ``SOL`` (Serial Over LAN) via the ``IPMI`` ``lanplus``
protocol, a simpler way of hooking into the serial console is ``ipmitool``.
420
421 .. TIP::
422
423     Early boot stage output might not be shown over ``SOL``, but only over
424     the video console provided by the (vendor-specific) interface.
425
426 .. code-block:: console
427
428     jenkins@jumpserver:~$ ipmitool -H <host BMC IP> -U <user> -P <pass> \
429                                    -I lanplus sol activate
430
To work around such timeouts, set larger commissioning/deployment timeout
values in the ``IDF``.
432
433 Check Jumpserver Network Configuration
434 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
435
436 .. code-block:: console
437
438     jenkins@jumpserver:~$ brctl show
439     jenkins@jumpserver:~$ ifconfig -a
440
441 +-----------------------+------------------------------------------------+
442 | Configuration item    | Expected behavior                              |
443 +=======================+================================================+
444 | IP addresses assigned | IP addresses should be assigned to the bridge, |
445 | to bridge ports       | and not to individual bridge ports             |
446 +-----------------------+------------------------------------------------+
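
If an address ended up on a bridge port instead of the bridge itself, it can
be moved manually; a minimal sketch (interface names and the address below
are illustrative):

.. code-block:: console

    jenkins@jumpserver:~$ sudo ip address del 192.168.11.1/24 dev eth1
    jenkins@jumpserver:~$ sudo ip address add 192.168.11.1/24 dev admin_br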
447
448 Check Network Connectivity Between Nodes on the Jumpserver
449 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
450
``cfg01`` is a Docker container running on the ``jumpserver``, connected to
Docker networks (created automatically by ``docker-compose`` when the
container is brought up), which in turn are connected using veth pairs to
their ``libvirt`` managed counterparts (or to manually created bridges).

For example, the ``mgmt`` network(s) should look like the following for a
``virtual`` deployment.
458
459 .. code-block:: console
460
461     jenkins@jumpserver:~$ brctl show mgmt
462     bridge name   bridge id           STP enabled   interfaces
463     mgmt          8000.525400064f77   yes           mgmt-nic
464                                                     veth_mcp2
465                                                     vnet8
466
467     jenkins@jumpserver:~$ docker network ls
468     NETWORK ID    NAME                              DRIVER   SCOPE
469     81a0fdb3bd78  docker-compose_mgmt               macvlan  local
470     [...]
471
472     jenkins@jumpserver:~$ docker network inspect docker-compose_mgmt
473     [
474         {
475             "Name": "docker-compose_mgmt",
476             [...]
477             "Options": {
478                 "parent": "veth_mcp3"
479             },
480         }
481     ]
482
Before investigating the rest of the cluster networking configuration, the
first thing to check is that ``cfg01`` has network connectivity to the other
jumpserver-hosted nodes, e.g. ``mas01``, and to the jumpserver itself
(provided that the jumpserver has an IP address in that particular network
segment).
488
489 .. code-block:: console
490
491     jenkins@jumpserver:~$ docker exec -it fuel bash
492     root@cfg01:~# ifconfig -a | grep inet
493         inet addr:10.20.0.2     Bcast:0.0.0.0  Mask:255.255.255.0
494         inet addr:172.16.10.2   Bcast:0.0.0.0  Mask:255.255.255.0
495         inet addr:192.168.11.2  Bcast:0.0.0.0  Mask:255.255.255.0
496
497 For each network of interest (``mgmt``, ``PXE/admin``), check
498 that ``cfg01`` can ping the jumpserver IP in that network segment.
499
500 .. NOTE::
501
502     ``mcpcontrol`` is set up at container bringup, so it should always be
503     available, while the other networks are configured by Salt as part of the
504     ``virtual_init`` STATE file.
505
506 .. code-block:: console
507
508     root@cfg01:~# ping -c1 10.20.0.1  # mcpcontrol jumpserver IP
509     root@cfg01:~# ping -c1 10.20.0.3  # mcpcontrol mas01 IP
510
511 .. TIP::
512
    The ``mcpcontrol`` CIDR is configurable via the ``INSTALLER_IP`` env var
    during deployment. However, IP offsets inside that segment are hard-set
    to ``.1`` for the jumpserver, ``.2`` for ``cfg01`` and ``.3`` for the
    ``mas01`` node.
517
518 .. code-block:: console
519
520     root@cfg01:~# salt 'mas*' pillar.item --out yaml \
521                   _param:infra_maas_node01_deploy_address \
522                   _param:infra_maas_node01_address
523     mas01.mcp-ovs-noha.local:
524       _param:infra_maas_node01_address: 172.16.10.12
525       _param:infra_maas_node01_deploy_address: 192.168.11.3
526
527     root@cfg01:~# ping -c1 192.168.11.1  # PXE/admin jumpserver IP
528     root@cfg01:~# ping -c1 192.168.11.3  # PXE/admin mas01 IP
529     root@cfg01:~# ping -c1 172.16.10.1   # mgmt jumpserver IP
530     root@cfg01:~# ping -c1 172.16.10.12  # mgmt mas01 IP
531
532 .. TIP::
533
    Jumpserver IP addresses for the ``PXE/admin``, ``mgmt`` and ``public``
    bridges are user-chosen and manually set, so the above snippets should be
    adjusted accordingly if an IP other than ``.1`` in each CIDR was chosen.
538
539 Alternatively, a quick ``nmap`` scan would work just as well.
540
541 .. code-block:: console
542
543     root@cfg01:~# apt update && apt install -y nmap
544     root@cfg01:~# nmap -sn 10.20.0.0/24     # expected: cfg01, mas01, jumpserver
545     root@cfg01:~# nmap -sn 192.168.11.0/24  # expected: cfg01, mas01, jumpserver
546     root@cfg01:~# nmap -sn 172.16.10.0/24   # expected: cfg01, mas01, jumpserver
547
548 Check ``DHCP`` Reaches Cluster Nodes
549 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
550
551 One common symptom observed during failed commissioning is that ``DHCP`` does
552 not work as expected between cluster nodes (baremetal nodes in the cluster; or
553 virtual machines on the jumpserver in case of ``hybrid`` deployments) and
554 the ``MaaS`` node.
555
To confirm or rule out this possibility, monitor the serial console output of
one (or more) cluster nodes during ``MaaS`` commissioning. If the node is
properly configured to attempt PXE boot, yet it times out waiting for an IP
address from the ``mas01`` ``DHCP`` server, it is worth checking whether the
``DHCP`` packets reach the ``jumpserver`` and, further, the ``mas01``
container.
561
562 .. code-block:: console
563
564     jenkins@jumpserver:~$ sudo apt update && sudo apt install -y dhcpdump
565     jenkins@jumpserver:~$ sudo dhcpdump -i admin_br
566
567 .. TIP::
568
569     If ``DHCP`` requests are present, but no replies are sent, ``iptables``
570     might be interfering on the jumpserver.
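
A quick way to look for rules that might drop or fail to forward that traffic
is to dump the relevant ``iptables`` chains on the jumpserver (a sketch; the
chains of interest depend on the local firewall setup):

.. code-block:: console

    jenkins@jumpserver:~$ sudo iptables -L FORWARD -n -v
    jenkins@jumpserver:~$ sudo iptables -t nat -L -n -v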
571
572 Check ``MaaS`` Logs
573 ~~~~~~~~~~~~~~~~~~~
574
575 If networking looks fine, yet nodes still fail to commission and/or deploy,
576 ``MaaS`` logs might offer more details about the failure:
577
578 * ``/var/log/maas/maas.log``
579 * ``/var/log/maas/rackd.log``
580 * ``/var/log/maas/regiond.log``
581
582 .. TIP::
583
    If the problem is with the cluster node and not with the ``MaaS`` server,
    the node's kernel logs usually contain useful information.
    These are saved via rsyslog on the ``mas01`` node in
    ``/var/log/maas/rsyslog``.
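
Since ``mas01`` is a container, these logs are most easily inspected from the
``jumpserver``, e.g. (picking one of the log files listed above):

.. code-block:: console

    jenkins@jumpserver:~$ docker exec -it maas tail -f /var/log/maas/rackd.log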
588
589 Recovering Failed Deployments
590 =============================
591
The first deploy attempt might fail due to various reasons. If the problem
is not systemic (i.e. fixing it will not introduce incompatible configuration
changes, like setting a different ``INSTALLER_IP``), the environment can
safely be reused and the deployment process can pick up from where it left
off.
596
597 Leveraging these mechanisms requires a minimum understanding of how the
598 deploy process works, at least for manual ``STATE`` runs.
599
600 Automatic (re)deploy
601 ~~~~~~~~~~~~~~~~~~~~
602
603 OPNFV Fuel's ``deploy.sh`` script offers a dedicated argument for this, ``-f``,
604 which will skip executing the first ``N`` ``STATE`` files, where ``N`` is the
605 number of ``-f`` occurrences in the argument list.
606
607 .. TIP::
608
609     The list of ``STATE`` files to be executed for a specific environment
610     depends on the OPNFV scenario chosen, deployment type (``virtual``,
611     ``baremetal`` or ``hybrid``) and the presence/absence of a ``VCP``
612     (virtualized control plane).
613
e.g.: Let's consider a ``baremetal`` environment, with ``VCP`` and a simple
scenario (``os-nosdn-nofeature-ha``), where ``deploy.sh`` failed while
executing the ``openstack_ha`` ``STATE`` file.
617
The simplest redeploy approach (which usually works for **any** combination of
deployment type/VCP/scenario) is to issue the same deploy command as the
original attempt, adding a single ``-f``:
621
622 .. code-block:: console
623
624     jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
625                                             -s <scenario> [...] \
626                                             -f # skips running the virtual_init STATE file
627
All ``STATE`` files are re-entrant, so the above is equivalent to (but a
little slower than) skipping all ``STATE`` files before the ``openstack_ha``
one, like:
630
631 .. code-block:: console
632
633     jenkins@jumpserver:~/fuel$ ci/deploy.sh -l <lab_name> -p <pod_name> \
634                                             -s <scenario> [...] \
635                                             -ffff # skips virtual_init, maas, baremetal_init, virtual_control_plane
636
637 .. TIP::
638
639     For fine tuning the infrastructure setup steps executed during deployment,
640     see also the ``-e`` and ``-P`` deploy arguments.
641
642 .. NOTE::
643
    On rare occasions, the cluster cannot be redeployed idempotently (e.g. a
    broken MySQL/Galera cluster), in which case some cleanup is due before
    (re)running the ``STATE`` files. See the ``-E`` deploy argument: used once
    (``-E``), it only erases the ``VCP`` VMs; used twice (``-EE``), it forces
    a ``MaaS`` node deletion, followed by a redeployment of all baremetal
    nodes.
650
651 Manual ``STATE`` Run
652 ~~~~~~~~~~~~~~~~~~~~
653
Instead of leveraging the full ``deploy.sh``, one could execute the ``STATE``
files one by one (or partially) from ``cfg01``.

However, this requires a better understanding of how the list of ``STATE``
files to be executed is constructed for a specific scenario; it depends on
the deployment type and on whether the cluster has baremetal nodes, and is
implemented in:
660
661 * ``mcp/config/scenario/defaults.yaml.j2``
662 * ``mcp/config/scenario/<scenario-name>.yaml``
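
These files can be reviewed directly on ``cfg01``, e.g. for the
``os-nosdn-nofeature-ha`` scenario used below (paths as listed above):

.. code-block:: console

    root@cfg01:~$ cd ~/fuel/mcp/config/scenario
    root@cfg01:~/fuel/mcp/config/scenario$ cat defaults.yaml.j2 \
                                               os-nosdn-nofeature-ha.yaml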
663
664 e.g.: For the example presented above (baremetal with ``VCP``,
665 ``os-nosdn-nofeature-ha``), the list of ``STATE`` files would be:
666
667 * ``virtual_init``
668 * ``maas``
669 * ``baremetal_init``
670 * ``virtual_control_plane``
671 * ``openstack_ha``
672 * ``networks``
673
674 To execute one (or more) of the remaining ``STATE`` files after a failure:
675
676 .. code-block:: console
677
678     jenkins@jumpserver:~$ docker exec -it fuel bash
679     root@cfg01:~$ cd ~/fuel/mcp/config/states
680     root@cfg01:~/fuel/mcp/config/states$ ./openstack_ha
681     root@cfg01:~/fuel/mcp/config/states$ CI_DEBUG=true ./networks
682
683 For even finer granularity, one can also run the commands in a ``STATE`` file
684 one by one manually, e.g. if the execution failed applying the ``rabbitmq``
685 sls:
686
687 .. code-block:: console
688
689     root@cfg01:~$ salt -I 'rabbitmq:server' state.sls rabbitmq
690
691 Exploring the Cloud with Salt
692 =============================
693
To gather information about the cloud, Salt commands can be used.
Salt is based on a master-minion model, where the Salt master pushes
configuration to the minions and instructs them to execute actions.

For example, to tell Salt to execute a ping to ``8.8.8.8`` on all the nodes:
699
700 .. code-block:: console
701
702     root@cfg01:~$ salt "*" network.ping 8.8.8.8
703                        ^^^                       target
704                            ^^^^^^^^^^^^          function to execute
705                                         ^^^^^^^  argument passed to the function
706
707 .. TIP::
708
    Complex filters can be applied to the target, such as compound queries or
    node roles.
710
711 For more information about Salt see the :ref:`fuel_userguide_references`
712 section.
713
Some examples are listed below. Note that these commands are issued from the
Salt Master node (``cfg01``) as the ``root`` user.
716
717 View the IPs of All the Components
718 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
719
720 .. code-block:: console
721
722     root@cfg01:~$ salt "*" network.ip_addrs
723     cfg01.mcp-odl-ha.local:
724        - 10.20.0.2
725        - 172.16.10.100
726     mas01.mcp-odl-ha.local:
727        - 10.20.0.3
728        - 172.16.10.3
729        - 192.168.11.3
730     .........................
731
732 View the Interfaces of All the Components and Put the Output in a ``yaml`` File
733 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
734
735 .. code-block:: console
736
737     root@cfg01:~$ salt "*" network.interfaces --out yaml --output-file interfaces.yaml
738     root@cfg01:~# cat interfaces.yaml
739     cfg01.mcp-odl-ha.local:
740      enp1s0:
741        hwaddr: 52:54:00:72:77:12
742        inet:
743        - address: 10.20.0.2
744          broadcast: 10.20.0.255
745          label: enp1s0
746          netmask: 255.255.255.0
747        inet6:
748        - address: fe80::5054:ff:fe72:7712
749          prefixlen: '64'
750          scope: link
751        up: true
752     .........................
753
754 View Installed Packages on MaaS Node
755 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
756
757 .. code-block:: console
758
759     root@cfg01:~# salt "mas*" pkg.list_pkgs
760     mas01.mcp-odl-ha.local:
761         ----------
762         accountsservice:
763             0.6.40-2ubuntu11.3
764         acl:
765             2.2.52-3
766         acpid:
767             1:2.0.26-1ubuntu2
768         adduser:
769             3.113+nmu3ubuntu4
770         anerd:
771             1
772     .........................
773
774 Execute Any Linux Command on All Nodes (e.g. ``ls /var/log``)
775 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
776
777 .. code-block:: console
778
779     root@cfg01:~# salt "*" cmd.run 'ls /var/log'
780     cfg01.mcp-odl-ha.local:
781        alternatives.log
782        apt
783        auth.log
784        boot.log
785        btmp
786        cloud-init-output.log
787        cloud-init.log
788     .........................
789
790 Execute Any Linux Command on Nodes Using Compound Queries Filter
791 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
792
793 .. code-block:: console
794
795     root@cfg01:~# salt -C '* and cfg01*' cmd.run 'ls /var/log'
796     cfg01.mcp-odl-ha.local:
797        alternatives.log
798        apt
799        auth.log
800        boot.log
801        btmp
802        cloud-init-output.log
803        cloud-init.log
804     .........................
805
806 Execute Any Linux Command on Nodes Using Role Filter
807 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
808
809 .. code-block:: console
810
811     root@cfg01:~# salt -I 'nova:compute' cmd.run 'ls /var/log'
812     cmp001.mcp-odl-ha.local:
813        alternatives.log
814        apache2
815        apt
816        auth.log
817        btmp
818        ceilometer
819        cinder
820        cloud-init-output.log
821        cloud-init.log
822     .........................
823
Accessing OpenStack
===================
826
Once the deployment is complete, the OpenStack CLI is accessible from the
controller VMs (``ctl01`` ... ``ctl03``).

OpenStack credentials are located at ``/root/keystonercv3``.
831
832 .. code-block:: console
833
834     root@ctl01:~# source keystonercv3
835     root@ctl01:~# openstack image list
836     +--------------------------------------+-----------------------------------------------+--------+
837     | ID                                   | Name                                          | Status |
    +--------------------------------------+-----------------------------------------------+--------+
839     | 152930bf-5fd5-49c2-b3a1-cae14973f35f | CirrosImage                                   | active |
840     | 7b99a779-78e4-45f3-9905-64ae453e3dcb | Ubuntu16.04                                   | active |
841     +--------------------------------------+-----------------------------------------------+--------+
842
843 The OpenStack Dashboard, Horizon, is available at ``http://<proxy public VIP>``.
844 The administrator credentials are ``admin``/``opnfv_secret``.
845
846 .. figure:: img/horizon_login.png
847     :width: 60%
848     :align: center
849
850 A full list of IPs/services is available at ``<proxy public VIP>:8090`` for
851 ``baremetal`` deploys.
852
853 .. figure:: img/salt_services_ip.png
854     :width: 60%
855     :align: center
856
857 Guest Operating System Support
858 ==============================
859
There are a number of possibilities regarding the guest operating systems
that can be spawned on the nodes.
The current system spawns the ``VCP`` virtual machines on the KVM nodes and
the VMs requested by users on the OpenStack compute nodes. Currently the
system supports the following ``UEFI`` images for the guests:
865
866 +------------------+-------------------+--------------------+
867 | OS name          | ``x86_64`` status | ``aarch64`` status |
868 +==================+===================+====================+
869 | Ubuntu 17.10     | untested          | Full support       |
870 +------------------+-------------------+--------------------+
871 | Ubuntu 16.04     | Full support      | Full support       |
872 +------------------+-------------------+--------------------+
873 | Ubuntu 14.04     | untested          | Full support       |
874 +------------------+-------------------+--------------------+
875 | Fedora atomic 27 | untested          | Full support       |
876 +------------------+-------------------+--------------------+
877 | Fedora cloud 27  | untested          | Full support       |
878 +------------------+-------------------+--------------------+
879 | Debian           | untested          | Full support       |
880 +------------------+-------------------+--------------------+
881 | Centos 7         | untested          | Not supported      |
882 +------------------+-------------------+--------------------+
883 | Cirros 0.3.5     | Full support      | Full support       |
884 +------------------+-------------------+--------------------+
885 | Cirros 0.4.0     | Full support      | Full support       |
886 +------------------+-------------------+--------------------+
887
The above table covers only ``UEFI`` images and implies ``OVMF``/``AAVMF``
firmware on the host. An ``x86_64`` deployment also supports ``non-UEFI``
images; however, that choice is left to the underlying hardware and the
administrator.
892
The images for the above operating systems can be found on their respective
websites.
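
As an illustration, a guest image can be uploaded to Glance using the
OpenStack CLI from a controller node (the image file name below is only an
example and has to be downloaded beforehand):

.. code-block:: console

    root@ctl01:~# source keystonercv3
    root@ctl01:~# openstack image create --disk-format qcow2 \
                                         --container-format bare \
                                         --file cirros-0.4.0-x86_64-disk.img \
                                         CirrosImage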
895
896 OpenStack Storage
897 =================
898
OpenStack Cinder is the project behind block storage in OpenStack, and OPNFV
Fuel supports LVM out of the box.

By default, ``x86_64`` supports 2 additional block storage devices, while
``aarch64`` supports only one.
904
More devices can be supported if the guest OS image has additional properties
that allow block storage devices to be attached as ``SCSI`` drives.
To do this, add the properties below to the image:
908
909 .. code-block:: console
910
911     root@ctl01:~$ openstack image set --property hw_disk_bus='scsi' \
912                                       --property hw_scsi_model='virtio-scsi' \
913                                       <image>
914
The choice regarding which bus to use for the storage drives is an important
one. ``virtio-blk`` is the default choice for OPNFV Fuel, which attaches the
drives as ``/dev/vdX``. However, since we want to be able to attach a larger
number of volumes to the virtual machines, we recommend switching to ``SCSI``
drives, which are attached as ``/dev/sdX`` instead.

``virtio-scsi`` is slightly slower in terms of performance, but the ability
to attach a larger number of drives, combined with added features like ZFS,
Ceph et al., leads us to suggest the use of ``virtio-scsi`` in OPNFV Fuel for
both architectures.
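
For completeness, a minimal sketch of exercising this from a controller node
(the volume name, size and server name are only examples); with the image
properties set as above, the volume shows up inside the guest as a
``/dev/sdX`` device:

.. code-block:: console

    root@ctl01:~# source keystonercv3
    root@ctl01:~# openstack volume create --size 1 test_volume
    root@ctl01:~# openstack server add volume <server name or ID> test_volume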
925
926 More details regarding the differences and performance of ``virtio-blk`` vs
927 ``virtio-scsi`` are beyond the scope of this manual but can be easily found
928 in other sources online like `VirtIO SCSI`_ or `VirtIO performance`_.
929
Additional options for configuring images in OpenStack can be found in
the OpenStack Glance documentation.
932
933 OpenStack Endpoints
934 ===================
935
936 For each OpenStack service three endpoints are created: ``admin``, ``internal``
937 and ``public``.
938
939 .. code-block:: console
940
941     ubuntu@ctl01:~$ openstack endpoint list --service keystone
942     +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
943     | ID                               | Region    | Service Name | Service Type | Enabled | Interface | URL                          |
944     +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
945     | 008fec57922b4e9e8bf02c770039ae77 | RegionOne | keystone     | identity     | True    | internal  | http://172.16.10.26:5000/v3  |
946     | 1a1f3c3340484bda9ef7e193f50599e6 | RegionOne | keystone     | identity     | True    | admin     | http://172.16.10.26:35357/v3 |
947     | b0a47d42d0b6491b995d7e6230395de8 | RegionOne | keystone     | identity     | True    | public    | https://10.0.15.2:5000/v3    |
948     +----------------------------------+-----------+--------------+--------------+---------+-----------+------------------------------+
949
MCP sets up all OpenStack services to talk to each other over unencrypted
connections on the internal management network. All admin/internal endpoints
use plain ``http``, while the public endpoints are ``https`` connections
terminated via ``nginx`` at the ``VCP`` proxy VMs.
954
To access the public endpoints, an SSL certificate has to be provided. For
convenience, the installation script will copy the required certificate
to the ``cfg01`` node at ``/etc/ssl/certs/os_cacert``.

Copy the certificate from the ``cfg01`` node to the client that will access
the ``https`` endpoints and place it under ``/etc/ssl/certs/``.
The SSL connection will then be established automatically.
962
963 .. code-block:: console
964
965     jenkins@jumpserver:~$ ssh -o StrictHostKeyChecking=no -i /var/lib/opnfv/mcp.rsa -l ubuntu 10.20.0.2 \
966       "cat /etc/ssl/certs/os_cacert" | sudo tee /etc/ssl/certs/os_cacert
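
Once the certificate is in place on the client, the public endpoints can be
verified, for instance with ``curl`` (the URL below is only an example; use
the public Keystone endpoint reported by ``openstack endpoint list``):

.. code-block:: console

    jenkins@jumpserver:~$ curl --cacert /etc/ssl/certs/os_cacert \
                               https://10.0.15.2:5000/v3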
967
968 Reclass Model Viewer Tutorial
969 =============================
970
In order to get a better understanding of the ``reclass`` model Fuel uses,
the `reclass-doc`_ tool can be used to visualise it.

To avoid installing packages on the ``jumpserver`` or another host, the
``cfg01`` Docker container can be used. Since the ``fuel`` git repository
located on the ``jumpserver`` is already mounted inside the ``cfg01``
container, the results can be visualized using a web browser on the
``jumpserver`` at the end of the procedure.
979
980 .. code-block:: console
981
982     jenkins@jumpserver:~$ docker exec -it fuel bash
983     root@cfg01:~$ apt-get update
984     root@cfg01:~$ apt-get install -y npm nodejs
985     root@cfg01:~$ npm install -g reclass-doc
986     root@cfg01:~$ ln -s /usr/bin/nodejs /usr/bin/node
987     root@cfg01:~$ reclass-doc --output ~/fuel/mcp/reclass/modeler \
988                                        ~/fuel/mcp/reclass
989
The generated documentation should be available on the ``jumpserver``, inside
the ``fuel`` git repository, at the subpath ``mcp/reclass/modeler/index.html``.
992
993 .. figure:: img/reclass_doc.png
994     :width: 60%
995     :align: center
996
997 .. _fuel_userguide_references:
998
999 References
1000 ==========
1001
1002 #. :ref:`OPNFV Fuel Installation Instruction <fuel-installation>`
1003 #. `Saltstack Documentation`_
1004 #. `Saltstack Formulas`_
1005 #. `VirtIO performance`_
1006 #. `VirtIO SCSI`_
1007
1008 .. _`Saltstack Documentation`: https://docs.saltstack.com/en/latest/topics/
1009 .. _`Saltstack Formulas`: https://salt-formulas.readthedocs.io/en/latest/
1010 .. _`VirtIO performance`: https://mpolednik.github.io/2017/01/23/virtio-blk-vs-virtio-scsi/
1011 .. _`VirtIO SCSI`: https://www.ovirt.org/develop/release-management/features/storage/virtio-scsi/
1012 .. _`reclass-doc`: https://github.com/jirihybek/reclass-doc