fuel.git
5 years ago[maas] Fix permissions on (partial) redeploy 81/67081/6
Alexandru Avadanii [Thu, 21 Feb 2019 17:06:42 +0000 (18:06 +0100)]
[maas] Fix permissions on (partial) redeploy

When redeploying a cluster only (keeping the infrastructure containers
from a previous deploy), some things need to be adjusted:
- /entrypoint.sh exec permission;
- /etc/maas uid/gid re-align on new (fresh) deploy;
- account for different location of /usr/sbin/tcpdump apparmor profile
  for CentOS jumpservers;

Change-Id: If51db0bc95eff1a497e1df5d457e26a7b902aa5a
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[fdio] Bump compute RAM defaults for virtual PODs 04/67804/4
Alexandru Avadanii [Wed, 8 May 2019 20:17:23 +0000 (22:17 +0200)]
[fdio] Bump compute RAM defaults for virtual PODs

Hugepage count has been recently bumped for virtual PODs via IDF
changes in Pharos, so align our FDio scenarios with the new RAM
requirements.

While at it, fix wrong pod_config template evaluation by moving it
after the templated scenario files are expanded, since pod_config
relies on scenario node definition.

Also, configure VPP to use decimal interface names by default to
align with Pharos macro for the VPP interface name string.

Change-Id: Ib3a89c294a3a2755567fdbe07e3be2b8ca1a5714
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoMerge "[docs] Update documentation for Hunter"
Alexandru Avadanii [Tue, 7 May 2019 18:04:41 +0000 (18:04 +0000)]
Merge "[docs] Update documentation for Hunter"

5 years ago[docs] Update documentation for Hunter 73/67773/4
Cristina Pauna [Mon, 6 May 2019 11:00:30 +0000 (14:00 +0300)]
[docs] Update documentation for Hunter

Updated the documentation for the Hunter release plus one minor
change of wording in the deploy script as we no longer install
just OpenStack.

Change-Id: I853f5536b0f4a89a8c20af0a9650372690ef7c99
Signed-off-by: Cristina Pauna <cristina.pauna@enea.com>
5 years agoMerge "[dpdk] Get back to shared memory model"
Michael Polenchuk [Tue, 7 May 2019 07:53:55 +0000 (07:53 +0000)]
Merge "[dpdk] Get back to shared memory model"

5 years agoMerge "[virtual] Parameterize scenarios based on PDF/IDF"
Alexandru Avadanii [Mon, 6 May 2019 13:32:24 +0000 (13:32 +0000)]
Merge "[virtual] Parameterize scenarios based on PDF/IDF"

5 years ago[dpdk] Get back to shared memory model 24/67724/2
Michael Polenchuk [Tue, 30 Apr 2019 09:03:11 +0000 (13:03 +0400)]
[dpdk] Get back to shared memory model

The per port model potentially requires more memory (a resource
that is limited in labs) to support the same number of ports and
the same configuration as the shared memory model.

Set linux:network:openvswitch:per_port_memory explicitly to true
to enable per port mempools support for DPDK devices.

Change-Id: I130885afc50e7a047f8835113d370840827ad718
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoPatch dhcp agent to avoid unwanted rescheduling 74/67674/4
Michael Polenchuk [Tue, 23 Apr 2019 10:42:07 +0000 (14:42 +0400)]
Patch dhcp agent to avoid unwanted rescheduling

Change-Id: Id49f26a2615e2fc06e94eeaf2e9200e83625e6c9
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[ha] Decouple openstack services by roles 81/67681/2
Michael Polenchuk [Wed, 24 Apr 2019 10:53:59 +0000 (14:53 +0400)]
[ha] Decouple openstack services by roles

Deploy the OpenStack API services based on roles to
prevent issues with absent database tables since db_sync
runs only on the nodes with primary role.

Change-Id: I04cf3ce0dd59afd93b8a0dfcf060fbd7e7411c82
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[iec] Copy full contents of IEC git repo 77/67677/1
Alexandru Avadanii [Tue, 23 Apr 2019 15:18:49 +0000 (17:18 +0200)]
[iec] Copy full contents of IEC git repo

Previously we only synced the scripts subdir, but going forward
we will need the full contents of the IEC repo on all cluster nodes.

Change-Id: I88edd4885875048d50d28c1eac9fd413dc2b6ffb
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agomcpcontrol: Avoid duplicate ip rules 13/67613/1
Alexandru Avadanii [Thu, 18 Apr 2019 15:16:32 +0000 (17:16 +0200)]
mcpcontrol: Avoid duplicate ip rules

Executing deploy.sh multiple times led to duplicating the ip rules.

Change-Id: Iad5886a851970f166996226fa3d115a93113c6db
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
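One way to keep repeated deploy.sh runs from piling up rules, sketched here under assumptions rather than taken from the change itself, is to look for an equivalent entry in the `ip rule show` listing before adding a new one. The helper names, the 172.17.0.0/16 source network and table 200 are hypothetical.

```shell
# Sketch only: helper names, networks and table ids are illustrative.

# rule_present RULES FROM TABLE
# Succeeds if RULES (text in `ip rule show` format) already contains a
# rule that selects source FROM and looks up routing table TABLE.
rule_present() {
  local rules="$1" from="$2" table="$3"
  printf '%s\n' "$rules" | awk -v from="$from" -v table="$table" '
    {
      f = 0; t = 0
      for (i = 1; i < NF; i++) {
        if ($i == "from"   && $(i + 1) == from)  f = 1
        if ($i == "lookup" && $(i + 1) == table) t = 1
      }
      if (f && t) found = 1
    }
    END { exit !found }
  '
}

# add_rule_once FROM TABLE
# Adds the rule only when it is missing (needs root; not called here).
add_rule_once() {
  local from="$1" table="$2"
  if ! rule_present "$(ip rule show)" "$from" "$table"; then
    ip rule add from "$from" lookup "$table"
  fi
}
```

Calling `add_rule_once 172.17.0.0/16 200` twice would then leave a single rule in place.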
5 years agomcpcontrol: policy based routing for INSTALLER_IP 86/67586/1
Alexandru Avadanii [Sun, 14 Apr 2019 23:58:03 +0000 (01:58 +0200)]
mcpcontrol: policy based routing for INSTALLER_IP

To bypass Docker 'bridge'-backed network isolation, we previously
added an extra routing hop, which broke access from inside the
'mcpcontrol' Docker network (typically 10.20.0.0/24) to its
bridge address (10.20.0.1), leading to DNS issues on Salt Master.

This change leverages policy based routing to only add the extra
routing hop for connections originating from the default Docker
bridge network ('docker0'). Note that other Docker networks
using the 'bridge' driver are still isolated from 'mcpcontrol'.

Fixes: d9b44acb

Change-Id: Ib92901c3278ae9b815f28f26d4c26f82bcadacd6
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
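The policy based routing described above can be illustrated roughly as follows. This is a sketch under assumptions, not the code from the change: the function only prints the two `ip` commands, and the docker0 subnet (172.17.0.0/16), the Salt Master address, the PXE bridge hop and the table id are all hypothetical values.

```shell
# Sketch only: prints the commands instead of running them.

# pbr_commands SRC_NET DEST VIA TABLE
# Route traffic to DEST via the extra hop VIA, but only for packets
# whose source address is in SRC_NET; all other traffic keeps using
# the main routing table.
pbr_commands() {
  local src_net="$1" dest="$2" via="$3" table="$4"
  printf 'ip route add %s via %s table %s\n' "$dest" "$via" "$table"
  printf 'ip rule add from %s lookup %s\n' "$src_net" "$table"
}

# Example with hypothetical addresses:
# pbr_commands 172.17.0.0/16 10.20.0.2 192.168.11.1 200
```

Because the route lives in its own table and the rule matches only on the source network, hosts inside 'mcpcontrol' never see the extra hop.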
5 years agoMerge "[odl] Disable timeout for learnt flows of snat"
Michael Polenchuk [Fri, 12 Apr 2019 14:32:36 +0000 (14:32 +0000)]
Merge "[odl] Disable timeout for learnt flows of snat"

5 years ago[baremetal] Tune up dpdk options 40/67540/2
Michael Polenchuk [Thu, 11 Apr 2019 13:42:49 +0000 (17:42 +0400)]
[baremetal] Tune up dpdk options

Optimized for LF-POD2, as the NIC assigned to the private/dpdk
interface and the pinned cores reside on NUMA node #0. Core #11 is
for DPDK, the remaining four cores are for PMDs.

Change-Id: Icca701bc1a66f3672b8511e0245c82ca29788a8b
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[odl] Disable timeout for learnt flows of snat 88/67488/2
Michael Polenchuk [Fri, 5 Apr 2019 08:46:16 +0000 (12:46 +0400)]
[odl] Disable timeout for learnt flows of snat

Set timeout value for snat punts to zero to turn
off the rate limiting and installation of learnt flows.

Change-Id: I79dad8fd0f925bfc11d7dc1678c3a414dc35fa56
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "route mcpcontrol via PXE br to bypass isolation"
Michael Polenchuk [Fri, 12 Apr 2019 08:39:30 +0000 (08:39 +0000)]
Merge "route mcpcontrol via PXE br to bypass isolation"

5 years agoroute mcpcontrol via PXE br to bypass isolation 41/67541/1
Alexandru Avadanii [Thu, 11 Apr 2019 14:04:03 +0000 (16:04 +0200)]
route mcpcontrol via PXE br to bypass isolation

Recent virsh/Docker network rework changed mcpcontrol (previously
a virsh-managed network) into a Docker-controlled network using
the 'bridge' driver.
As a consequence, Docker now isolates traffic from 'mcpcontrol'
network from the default Docker bridge network ('docker0') using
iptables rules that check input/output interfaces.
Yardstick (and any other Docker container hooked via 'docker0')
will not be able to ssh into Salt master due to this isolation.

One possible workaround would be to explicitly ACCEPT traffic
from 'docker0' going to Salt master. However, this is only
properly supported starting with Docker 17.06, while most CI hosts
and end users are still using 17.05 or older.
In older Docker releases, the DOCKER-USER iptables chain was not
available, so injecting custom iptables rules and making them
persistent is not only complicated, it's also prone to subtle errors.

Another way to bypass the iptables rules is to route the packets
coming from our new Docker network via another bridge before
letting them find their way into 'docker0'.
This change adds a new route for the Salt master host (note that
MaaS container will not benefit from this) via the PXE bridge on
the jumphost (which can be either a real Linux bridge for baremetal
deployments or a virsh-managed network); adding one extra network
hop for each packet going between our 'mcpcontrol' Docker network
and 'docker0', effectively bypassing the Docker-enforced iptables
DROP.

Change-Id: Id8ac7a638c778887b361c9b64c320664c88f59fd
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[ha] Take out class with backports repo 25/67525/3
Michael Polenchuk [Wed, 10 Apr 2019 13:41:52 +0000 (17:41 +0400)]
[ha] Take out class with backports repo

* update system reclass
* rectify telemetry redis options

Change-Id: I6dca1ae52e7f7d73a90e53fceddca8e86872651b
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "Setup repository with backports"
Michael Polenchuk [Wed, 10 Apr 2019 08:38:52 +0000 (08:38 +0000)]
Merge "Setup repository with backports"

5 years agoMerge "[VCP VMs] AArch64: Switch seeding back to qemu-nbd"
Alexandru Avadanii [Tue, 9 Apr 2019 11:55:15 +0000 (11:55 +0000)]
Merge "[VCP VMs] AArch64: Switch seeding back to qemu-nbd"

5 years agoSetup repository with backports 91/67491/3
Michael Polenchuk [Fri, 5 Apr 2019 13:24:39 +0000 (17:24 +0400)]
Setup repository with backports

Change-Id: I791436f512dea6c6bc61133c4122ac872950af8e
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[VCP VMs] AArch64: Switch seeding back to qemu-nbd 99/67499/1
Alexandru Avadanii [Mon, 1 Apr 2019 14:25:49 +0000 (16:25 +0200)]
[VCP VMs] AArch64: Switch seeding back to qemu-nbd

Upstream change [1] switched from old qemu-nbd preseeding of VCP VMs
to using a cloud-init + configuration drive. This breaks on AArch64
with "IDE controllers are unsupported for this QEMU binary or machine
type", so switch back to using qemu-nbd.

[1] https://github.com/Mirantis/reclass-system-salt-model/commit/c0e4807

Change-Id: I0dfeb638d408343c76a73fafa503048a79ce1f6e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[virtual] Parameterize scenarios based on PDF/IDF 62/67162/3
Alexandru Avadanii [Thu, 28 Feb 2019 14:46:19 +0000 (15:46 +0100)]
[virtual] Parameterize scenarios based on PDF/IDF

NOTE: only os-nosdn-nofeature-noha is parameterized for now.

- move config drive & disk creation from prepare_vms to create_vms;
- make default disk size(s) configurable based on scenario defaults
  and vPDF;
  * compute nodes require 2 disks to be defined in vPDF, since the
    pillar reclass model assumes /dev/vdb is reserved for cinder;
  * if multiple disks are defined in vPDF, they are created and
    attached accordingly (only ctl01 and cmp nodes are parameterized
    in this change; only for the os-nosdn-nofeature-noha scenario);
- vCPU specifications are deduced based on vPDF (sockets, cores);
  * threads/core is hard set to 2 since vPDF does not have a key
    for it;
  * NUMA resources are distributed evenly based on the number of
    sockets configured in PDF;
  * no less than the minimum requirement for a scenario is allocated
    (e.g. if PDF specifies 2 cores, but the scenario requires at
    least 4 cores, the larger value will be used);
- RAM is deduced based on PDF (but no less than the minimum req is
  allocated, e.g. if PDF specifies 2GB RAM for computes, but the
  scenario requires at least 8GB, the larger value will be used);

Change-Id: I97188aa2a1006865b8429eb6483e10c76795f7d2
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
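The "no less than the scenario minimum" rule and the fixed 2 threads per core from the commit above can be sketched as two small helpers. The function names are hypothetical, and treating the PDF core count as per-socket is an assumption.

```shell
# Sketch only: helper names are illustrative.

# at_least PDF_VALUE SCENARIO_MIN
# Prints the larger of the two integers, so a PDF value below the
# scenario minimum is raised to that minimum.
at_least() {
  local pdf_value="$1" scenario_min="$2"
  if [ "$pdf_value" -ge "$scenario_min" ]; then
    echo "$pdf_value"
  else
    echo "$scenario_min"
  fi
}

# vcpu_count SOCKETS CORES
# Total vCPUs, with threads per core hard set to 2 as in the commit
# (assumes CORES is the per-socket core count).
vcpu_count() {
  local sockets="$1" cores="$2" threads=2
  echo $(( sockets * cores * threads ))
}
```

With the examples from the message, `at_least 2 4` selects 4 cores and `at_least 2 8` selects 8 GB of RAM.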
5 years ago[dpdk] Rise up available memory on computes 90/67490/1
Michael Polenchuk [Fri, 5 Apr 2019 12:55:01 +0000 (16:55 +0400)]
[dpdk] Rise up available memory on computes

There is not enough memory (default 4k pages) left for services
like libvirt, which then cannot fork child processes.

Change-Id: I44d8efd7cafb52a7c823c02738c1d321017aa7a3
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoDefine stub for cinder service in keystone 81/67481/1
Michael Polenchuk [Thu, 4 Apr 2019 13:24:54 +0000 (17:24 +0400)]
Define stub for cinder service in keystone

Required only for Rally validation in cinder scenarios;
it provides no useful functionality for the cluster itself.

Change-Id: Idc4d62cbbc9974972e9d492b5a419342077e3d9a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[noha] Deploy dhcp/metadata agents on computes 72/67472/1
Michael Polenchuk [Wed, 3 Apr 2019 11:33:26 +0000 (15:33 +0400)]
[noha] Deploy dhcp/metadata agents on computes

Sometimes an instance doesn't get an IP address from the DHCP server,
which resides only on the gateway node, so run additional dhcp/metadata
agents on compute nodes to handle tenant networks in place.

Change-Id: If1d74af665cf8db64b09f846fac7192f76abdb25
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[dpdk] Enable per port memory model 57/67457/4
Michael Polenchuk [Mon, 1 Apr 2019 15:04:00 +0000 (19:04 +0400)]
[dpdk] Enable per port memory model

The per port memory model provides a more transparent memory usage model
and avoids pool exhaustion due to competing memory requirements for
interfaces. (http://docs.openvswitch.org/en/latest/topics/dpdk/memory/)

Change-Id: I5add0f49cdcdf2fc3d24affee10a275abe3ca46a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[akraino] Add IEC K8-calico scenarios 11/67011/7
Alexandru Avadanii [Mon, 11 Feb 2019 11:04:59 +0000 (11:04 +0000)]
[akraino] Add IEC K8-calico scenarios

- bump Pharos git submodule to allow PODs with fewer nodes;
- add `k8-calico-iec-noha` scenario definition for Akraino
  IEC basic configuration;
- add `k8-calico-iec-vcp-noha` scenario definition for Akraino
  IEC nested (virtualized control plane) configuration;
- add `akraino_iec` state, which will leverage the Akraino IEC
  bootstrap scripts from [1];
- replace system.reboot salt call with cmd.run 'reboot' as it's more
  reliable;
- use kernel 4.15 for AArch64 K8 IEC scenarios;

NOTE: These scenarios will not be released in OPNFV since they don't
rely on Salt formulas but instead on Akraino IEC scripts to install K8s.

[1] https://gerrit.akraino.org/r/#/q/project:iec

Change-Id: I4e538e0563d724cd3fd5c4d462ddc22d0c739402
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoBring in kubernetes scenario 95/67195/10
Michael Polenchuk [Thu, 7 Mar 2019 14:57:49 +0000 (18:57 +0400)]
Bring in kubernetes scenario

Change-Id: I2b41ce2e275bb053fa2590654ea7fa432b0c857f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoRectify system reclass after update 96/67396/3
Michael Polenchuk [Tue, 26 Mar 2019 09:08:39 +0000 (13:08 +0400)]
Rectify system reclass after update

* add opendaylight password (removed from system level)
* get updated ovn system class w/o mysql settings
* enable ceilometer user back (removed along with outdated service/endpoints)
* adjust check interval of haproxy for noha scenarios since there is
  only one backend for services, i.e. failover ain't expected

Change-Id: Iedee290e1cfcf838998bd44dc09a729d143974ac
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[fdio] salt-formula-neutron: Fix VPP support patch"
Michael Polenchuk [Wed, 27 Mar 2019 08:38:54 +0000 (08:38 +0000)]
Merge "[fdio] salt-formula-neutron: Fix VPP support patch"

5 years ago[fdio] salt-formula-neutron: Fix VPP support patch 76/67376/1
Alexandru Avadanii [Mon, 25 Mar 2019 15:00:18 +0000 (16:00 +0100)]
[fdio] salt-formula-neutron: Fix VPP support patch

After Rocky support was added upstream to salt-formula-neutron, our
FDIO patch continued to be applied only for Queens, so refresh the
patch by switching to Rocky.

Change-Id: If0bbb9c4ec674d386ceade00ef8fe936482fb49c
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoUpdate system reclass 39/67339/4
Michael Polenchuk [Fri, 22 Mar 2019 13:46:02 +0000 (17:46 +0400)]
Update system reclass

Change-Id: I745a838b1f2f294b6c455700509ddf4b0264446f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoRevert "Fix race condition with nova privsep utime" 06/67306/2
Michael Polenchuk [Tue, 19 Mar 2019 18:04:55 +0000 (18:04 +0000)]
Revert "Fix race condition with nova privsep utime"

This reverts commit ac56d7b14f46b05f497b3dca4b6a4b0bfedd83e2.
The original patch has been merged (https://review.openstack.org/643011)

Change-Id: I3a7cd825f371e375d36256143b4b8c91f90ee26e
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[lib] nbd: Explicitly map partitions 83/67283/2
Alexandru Avadanii [Mon, 18 Mar 2019 15:11:50 +0000 (16:11 +0100)]
[lib] nbd: Explicitly map partitions

Certain kernels (e.g. 4.4.0-101+ in Ubuntu) no longer automatically
ack the partition table update after `kpartx -a /dev/nbdX`, see [1].

To avoid another dependency on `parted` packages, use `partx` from
`util-linux`, which is already installed as a dependency of e2fsprogs.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1743026

Change-Id: Ibd993fe210c1a11814e89a66759568d4d117d613
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
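A rough sketch of the explicit mapping step, assuming the image is attached with `qemu-nbd --connect` and the partition table is then pushed to the kernel with `partx -u`. The helper names, the `DRY_RUN` switch and the paths are made up for illustration; the actual library code may use different options.

```shell
# Sketch only: helper names and the DRY_RUN switch are illustrative.

# run CMD [ARGS...]
# Executes the command, or just prints it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# attach_image IMAGE NBD_DEVICE
# Connects IMAGE to NBD_DEVICE, then explicitly asks the kernel to
# (re)read the partition table instead of relying on it happening
# automatically. Needs root and the nbd kernel module when run for real.
attach_image() {
  local image="$1" dev="$2"
  run qemu-nbd --connect="$dev" "$image"
  run partx -u "$dev"
}
```

Running it with `DRY_RUN=1` only prints the two commands, which is a convenient way to review them on a machine without an nbd device.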
5 years agoSmooth down telemetry services 56/67256/1
Michael Polenchuk [Thu, 14 Mar 2019 15:08:39 +0000 (19:08 +0400)]
Smooth down telemetry services

* update gnocchi to 4.3
* remove outdated ceilometer api

Change-Id: I7adaf3ddc76d93531b6b0997b684672b80f2992f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[lib] Create veths using systemd opnfv-fuel units 80/67180/1
Alexandru Avadanii [Tue, 5 Mar 2019 15:49:23 +0000 (16:49 +0100)]
[lib] Create veths using systemd opnfv-fuel units

Create 2 systemd services on the jumphost that will handle veth
pairs creation, respectively adding them to virsh/real bridges.
This allows us to set docker containers restart policy to 'always',
enabling persistent Salt Master/MaaS containers across jumphost
reboots.

NOTE: libvirt creates virtual networks async, hence the need for
retrying hooking veths to them.

JIRA: FUEL-406

Change-Id: I1ca033cb5eb854b577b57bb2387a58bd9605a5bb
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
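Since libvirt brings its networks up asynchronously, the hook-up step has to tolerate bridges that do not exist yet. A generic retry helper along these lines would cover that; it is a sketch with hypothetical names, not the code shipped in the systemd units.

```shell
# Sketch only: a generic retry helper, names are illustrative.

# retry ATTEMPTS DELAY CMD [ARGS...]
# Runs CMD until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between tries. Returns 0 on the first success, 1 otherwise.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (hypothetical interface and bridge names, needs root):
# retry 10 1 ip link set veth_mcp0 master mcpcontrol
```

A bounded attempt count keeps the unit from hanging forever when a bridge never appears.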
5 years agoTurn off meltdown/spectre patches 69/67169/2
Michael Polenchuk [Mon, 4 Mar 2019 08:49:58 +0000 (12:49 +0400)]
Turn off meltdown/spectre patches

Change-Id: Id75ffe4db808a4ec250ba8b86c5d49f1206c3784
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoTune up nova/neutron intervals 18/67118/3
Michael Polenchuk [Tue, 26 Feb 2019 14:09:14 +0000 (18:09 +0400)]
Tune up nova/neutron intervals

Also re-align resources for virtual scenarios.

Change-Id: Id0d55407fd5b1720a24e30c364219f8b08e89d06
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoFix race condition with nova privsep utime 15/67115/1
Michael Polenchuk [Tue, 26 Feb 2019 10:52:06 +0000 (14:52 +0400)]
Fix race condition with nova privsep utime

Bug: https://bugs.launchpad.net/nova/+bug/1809123
Change-Id: I14622c21826aeeddac6ea7bf7f9d116cd3e68cfb
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[cfg01] Reduce mine_interval to 15 min"
Michael Polenchuk [Tue, 26 Feb 2019 07:03:04 +0000 (07:03 +0000)]
Merge "[cfg01] Reduce mine_interval to 15 min"

5 years ago[lib] Add fatal validation of old kernel on Ubuntu 86/67086/1
Alexandru Avadanii [Fri, 22 Feb 2019 15:31:24 +0000 (16:31 +0100)]
[lib] Add fatal validation of old kernel on Ubuntu

As reported in [1], kernel 4.4 seems to break nested virtualization,
so add a fatal check against it.

[1] https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1797332

Change-Id: I0aef8a7340dd82bfeb2e58c9642623b9ec13dca5
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[cfg01] Reduce mine_interval to 15 min 66/67066/2
Alexandru Avadanii [Mon, 18 Feb 2019 22:08:30 +0000 (23:08 +0100)]
[cfg01] Reduce mine_interval to 15 min

Some PODs are fast enough to get past installing, syncing and using
MaaS to provision the OS on the baremetal nodes before the 1h mine
refresh.

Since mine.update operation is fast enough to go unnoticed and we
only collect IP addresses, grains and pem entries, schedule it every
15 minutes.

Due to reclass class inheritance, we can't easily override this via
pillar data, so handle it via entrypoint.sh.

Change-Id: I0d8ed2da838ad09c94e9327d0131d3e239de4f08
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
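Handling the override from entrypoint.sh could look roughly like the helper below, which rewrites or appends Salt's `mine_interval` option (in minutes) in a minion config file. The function name and the config path in the example are assumptions, and `sed -i -E` assumes GNU sed.

```shell
# Sketch only: helper name and config path are illustrative.

# set_mine_interval CONF MINUTES
# Sets `mine_interval: MINUTES` in CONF, replacing an existing
# (possibly commented-out) entry or appending a new one.
set_mine_interval() {
  local conf="$1" minutes="$2"
  if grep -qE '^#?[[:space:]]*mine_interval:' "$conf"; then
    sed -i -E "s/^#?[[:space:]]*mine_interval:.*/mine_interval: ${minutes}/" "$conf"
  else
    echo "mine_interval: ${minutes}" >> "$conf"
  fi
}

# Example (hypothetical path):
# set_mine_interval /etc/salt/minion.d/minion.conf 15
```

Editing the file directly sidesteps the reclass inheritance problem mentioned in the message, at the cost of keeping the value outside pillar data.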
5 years agoInstall missing gnocchi dependencies 82/67082/1
Michael Polenchuk [Fri, 22 Feb 2019 08:45:52 +0000 (12:45 +0400)]
Install missing gnocchi dependencies

Change-Id: Ifc4fff90551344c69295990b220f0778967887a4
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[baremetal] Containerize MaaS"
Alexandru Avadanii [Tue, 19 Feb 2019 15:17:25 +0000 (15:17 +0000)]
Merge "[baremetal] Containerize MaaS"

5 years agoMerge "[cfg01] Schedule x509.get_pem_entries mine update"
Alexandru Avadanii [Fri, 15 Feb 2019 13:06:46 +0000 (13:06 +0000)]
Merge "[cfg01] Schedule x509.get_pem_entries mine update"

5 years ago[cfg01] Schedule x509.get_pem_entries mine update 33/67033/1
Alexandru Avadanii [Fri, 15 Feb 2019 00:23:20 +0000 (01:23 +0100)]
[cfg01] Schedule x509.get_pem_entries mine update

Previously, Salt Master CA mine was only sent once, during
salt.minion.ca state execution at cfg01 bringup / bootstrap.

This causes possible issues with:
- Salt Master container restart (mine data is lost);
- UNH Lab deployment (unknown root cause, might be related to XFS and
  overlay2 being used with Docker on CentOS);

To bypass this issue, make x509.get_pem_entries module send mine data
at the default mine interval (60 minutes).

Change-Id: I5f6334ae18f5af6cbe0a164791603b67f0a3668f
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[baremetal] Containerize MaaS 99/66799/7
Alexandru Avadanii [Thu, 7 Feb 2019 18:51:04 +0000 (19:51 +0100)]
[baremetal] Containerize MaaS

- replace mas01 VM with a Docker container;
- drop `mcpcontrol` virsh-managed network, including special handling
  previously required for it across all scripts;
- drop infrastructure VMs handling from scripts, the only VMs we still
  handle are cluster VMs for virtual and/or hybrid deployments;
- drop SSH server from mas01;
- stop running linux state on mas01, as all prerequisites are properly
  handled during Docker build or via entrypoint.sh - for completeness,
  we still keep pillar data in sync with the actual contents of mas01
  configuration, so running the state manually would still work;
- make port 5240 available on the jumpserver for MaaS dashboard access;
- docs: update diagrams and text to reflect the new changes;

Change-Id: I6d9424995e9a90c530fd7577edf401d552bab929
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoRise up salt's gather job timeout 25/67025/1
Michael Polenchuk [Thu, 14 Feb 2019 10:33:38 +0000 (14:33 +0400)]
Rise up salt's gather job timeout

While the minions are working on their jobs, the CLI waits for the
initial timeout period (timeout) to expire. When that happens,
the CLI sends the first "find_job" query. This kicks off the
gather_job_timeout timer. Sometimes a minion doesn't respond to the
request within the gather_job_timeout period (default is 10s), so raise
this value to give the minion a chance to report its actual status.

Change-Id: Ic3756b82fdeb17718870ab30e9578263d25309f7
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[odl] Settle the broken pkg->config dependency"
Michael Polenchuk [Mon, 11 Feb 2019 11:32:22 +0000 (11:32 +0000)]
Merge "[odl] Settle the broken pkg->config dependency"

5 years agoMerge "[docker] Add MaaS container build support"
Alexandru Avadanii [Mon, 11 Feb 2019 10:42:24 +0000 (10:42 +0000)]
Merge "[docker] Add MaaS container build support"

5 years ago[odl] Settle the broken pkg->config dependency 99/66899/1
Michael Polenchuk [Fri, 8 Feb 2019 09:39:16 +0000 (13:39 +0400)]
[odl] Settle the broken pkg->config dependency

Change-Id: I3bbe3e4be520ccac198654bb4a7d493aa8450023
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[docker] Add MaaS container build support 95/66895/1
Alexandru Avadanii [Thu, 7 Feb 2019 18:50:33 +0000 (19:50 +0100)]
[docker] Add MaaS container build support

Change-Id: I7709c9ca9e701b656447154919eb084a710f49af
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[odl] Disable PaxOsgi logging by default 73/66873/3
Michael Polenchuk [Thu, 7 Feb 2019 08:36:56 +0000 (12:36 +0400)]
[odl] Disable PaxOsgi logging by default

The PaxOsgi logging has a performance impact
(i.e. it puts pressure on the Java GC).

Change-Id: Ic0bc2c0d1cfac195a04d1cfa90fa7fa47fc37612
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[odl/noha] Make vif_plugging non-fatal"
Michael Polenchuk [Tue, 5 Feb 2019 07:34:08 +0000 (07:34 +0000)]
Merge "[odl/noha] Make vif_plugging non-fatal"

5 years agoMerge "[cfg01] Use ssh config to set default user & key"
Michael Polenchuk [Tue, 5 Feb 2019 07:32:32 +0000 (07:32 +0000)]
Merge "[cfg01] Use ssh config to set default user & key"

5 years agoMerge "[fdio] Fix VPP package pinning"
Alexandru Avadanii [Mon, 4 Feb 2019 21:33:27 +0000 (21:33 +0000)]
Merge "[fdio] Fix VPP package pinning"

5 years ago[fdio] Fix VPP package pinning 53/66853/1
Alexandru Avadanii [Mon, 4 Feb 2019 15:08:06 +0000 (16:08 +0100)]
[fdio] Fix VPP package pinning

Previously, Ubuntu ignored the VPP pinning with:

N: Ignoring file 'fdio.ubuntu' in directory '/etc/apt/preferences.d/'
as it has an invalid filename extension

Change-Id: I5ee60c1715bea3b4180b55125dc72962a70c2754
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
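The warning above comes from APT's naming rule for `/etc/apt/preferences.d/`: as documented in apt_preferences(5), a file is only read if it has either no extension or the `pref` extension, and its name uses only alphanumerics, hyphens, underscores and periods. A small check along these lines (the function name is hypothetical) makes the rule explicit:

```shell
# Sketch only: mirrors the naming rule from apt_preferences(5).

# is_valid_apt_pref_name NAME
# Succeeds if APT would read a preferences.d file called NAME.
is_valid_apt_pref_name() {
  local name="$1"
  case "$name" in
    '' | *[!A-Za-z0-9_.-]*) return 1 ;;  # empty or illegal character
  esac
  case "$name" in
    *.pref) return 0 ;;  # explicit "pref" extension
    *.*)    return 1 ;;  # any other extension is ignored by APT
    *)      return 0 ;;  # no extension at all
  esac
}
```

Under this rule `fdio.ubuntu` is rejected, while `fdio.pref` or plain `fdio` would be honoured.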
5 years ago[cfg01] Use ssh config to set default user & key 51/66851/1
Alexandru Avadanii [Mon, 4 Feb 2019 15:03:06 +0000 (16:03 +0100)]
[cfg01] Use ssh config to set default user & key

Change-Id: I7486569568207f7652f8bdfcf1060ce51a9dbb0e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[submodule] Bump Pharos for arm-pod10 cmp change 49/66849/1
Alexandru Avadanii [Mon, 4 Feb 2019 14:08:20 +0000 (15:08 +0100)]
[submodule] Bump Pharos for arm-pod10 cmp change

Change-Id: Ia7f8845017333e54db110bca5b3715702948b76b
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[odl/noha] Make vif_plugging non-fatal 93/66793/2
Michael Polenchuk [Thu, 31 Jan 2019 12:34:28 +0000 (16:34 +0400)]
[odl/noha] Make vif_plugging non-fatal

In order to mitigate issues with the live migration procedure, make
the VIF plugging event non-fatal for nova-compute. Also align the max
memory value for the ODL controller instance.

Change-Id: I0d00cc97c652eef3bd3404fac4715e2e7f2f02c7
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[deploy] Allow only operating system install"
Cristina Pauna [Wed, 30 Jan 2019 10:29:53 +0000 (10:29 +0000)]
Merge "[deploy] Allow only operating system install"

5 years agoMerge "[fdio] Pin VPP packages to 18.07-release"
Cristina Pauna [Wed, 30 Jan 2019 08:43:03 +0000 (08:43 +0000)]
Merge "[fdio] Pin VPP packages to 18.07-release"

5 years ago[deploy] Allow only operating system install 77/66777/3
Alexandru Avadanii [Tue, 29 Jan 2019 17:05:44 +0000 (18:05 +0100)]
[deploy] Allow only operating system install

Extend one of the existing deployment arguments to allow the
installation of only the operating system and infrastructure networks,
skipping cloud setup.

Change-Id: Ibc5d0f324ed15b66f809839cfce49a0324b6fe4d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoMerge "[ovs] Start ovs services before networking"
Alexandru Avadanii [Tue, 29 Jan 2019 16:08:07 +0000 (16:08 +0000)]
Merge "[ovs] Start ovs services before networking"

5 years ago[fdio] Pin VPP packages to 18.07-release 71/66771/3
Alexandru Avadanii [Tue, 29 Jan 2019 15:15:50 +0000 (16:15 +0100)]
[fdio] Pin VPP packages to 18.07-release

VPP 18.10 has a weird bug triggered by certain packets, e.g. from
inside a guest VM on a compute node, these behave differently:
$ udhcpc -x hostname:1234567890123456789012  # works
$ udhcpc -x hostname:12345678901234567890123 # confuses VPP on gtw01

To avoid this bug, pin VPP to the previous release, which does not
exhibit the issue.

Change-Id: I8c1e085731909d4b9296e8b09608887a4b5bfdd6
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
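For illustration, a pin of this kind is an APT preferences stanza. The helper below only prints one; the `vpp*` package glob, the `18.07*` version glob and priority 1001 are assumptions chosen for the example, not values confirmed by the change.

```shell
# Sketch only: prints an APT pin stanza, values are illustrative.

# apt_pin_stanza PACKAGE_GLOB VERSION_GLOB PRIORITY
apt_pin_stanza() {
  local package_glob="$1" version_glob="$2" priority="$3"
  printf 'Package: %s\nPin: version %s\nPin-Priority: %s\n' \
    "$package_glob" "$version_glob" "$priority"
}

# Example (hypothetical target file, needs root):
# apt_pin_stanza 'vpp*' '18.07*' 1001 > /etc/apt/preferences.d/fdio.pref
```

A priority above 1000 lets APT select the pinned version even when a newer one is already installed, which is what a rollback from 18.10 would need.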
5 years ago[fdio] Increase VIF plug-in timeout 77/66677/1
Alexandru Avadanii [Sun, 27 Jan 2019 19:15:02 +0000 (20:15 +0100)]
[fdio] Increase VIF plug-in timeout

Baremetal clusters might benefit from having a little more time
to plug in the VIFs.

Change-Id: I9406a0ef24de2177827b3acd27b7c60b293a4572
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[ovs] Start ovs services before networking 49/66649/5
Alexandru Avadanii [Fri, 25 Jan 2019 20:28:27 +0000 (21:28 +0100)]
[ovs] Start ovs services before networking

Fix broken systemd service unit dependencies:
- OVS should start before networking service;
- OVS ports & bridges should not be automatically ifup-ed by
  networking service to avoid races, so drop 'auto' for both
  (OVS ports are automatically handled when part of an OVS bridge);
- explicitly ifup OVS bridges as part of networking service, but
  after all Linux interfaces have been handled;
- use 'allow-ovs br-prv' to let OVS handle br-prv and avoid another
  race condition;

While at it, fix some other related issues:
- make OVS service start after DPDK service (if present);
- bump OVS-DPDK compute VMs RAM, since switching from MTU 1500
  to jumbo frames for virtual PODs a while ago failed to do so [1];
- avoid creating conflicting reclass linux.network.interfaces entries
  for OVS ports by using their name (drop 'ovs_port_' prefix):
  * for untagged networks they will override existing common defs;
  * for tagged networks, they will create separate entries;
- DPDK scenarios: make gtw01 br-prv members OVS ports to avoid race
  conditions after node reboot by letting OVS handle them;

[1] https://developers.redhat.com/blog/2018/03/16/\
    ovs-dpdk-hugepage-memory/

Change-Id: I0266ba67f3849b6f7e331a758146b331730bae55
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
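The resulting ifupdown layout can be pictured with the helper below, which prints an `/etc/network/interfaces` fragment in the style of the Open vSwitch Debian integration: `allow-ovs` for the bridge, `allow-<bridge>` for its port, and no `auto` line for either. The function name and the port name in the example are hypothetical, and the real files are generated from reclass data rather than by a script like this.

```shell
# Sketch only: prints an ifupdown fragment, names are illustrative.

# ovs_bridge_stanza BRIDGE PORT
# Emits stanzas that leave both BRIDGE and PORT to OVS instead of
# having the networking service bring them up via "auto".
ovs_bridge_stanza() {
  local bridge="$1" port="$2"
  cat <<EOF
allow-ovs ${bridge}
iface ${bridge} inet manual
  ovs_type OVSBridge
  ovs_ports ${port}

allow-${bridge} ${port}
iface ${port} inet manual
  ovs_bridge ${bridge}
  ovs_type OVSPort
EOF
}

# Example (hypothetical port name):
# ovs_bridge_stanza br-prv ens4
```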
5 years agoMerge "Enable back auto for ports network script"
Michael Polenchuk [Thu, 24 Jan 2019 11:55:07 +0000 (11:55 +0000)]
Merge "Enable back auto for ports network script"

5 years agoMerge "[fdio] Make VIF timeout non-fatal"
Alexandru Avadanii [Thu, 24 Jan 2019 10:53:27 +0000 (10:53 +0000)]
Merge "[fdio] Make VIF timeout non-fatal"

5 years agoEnable back auto for ports network script 91/66591/5
Michael Polenchuk [Wed, 23 Jan 2019 11:36:57 +0000 (15:36 +0400)]
Enable back auto for ports network script

The ovs port remains in down state after reboot if "auto" is off.
Also turn off no_wait option for odl-noha scenarios.

Change-Id: I0121b3190869528e5f2e9985f9e9299ac6c6724e
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[fdio] Make VIF timeout non-fatal 99/66599/1
Alexandru Avadanii [Thu, 24 Jan 2019 00:25:17 +0000 (01:25 +0100)]
[fdio] Make VIF timeout non-fatal

The first VMs spawned still exhibit the race condition described in
the ticket, so apply the same workaround proposed during the Fraser
release cycle in FDS.

JIRA: FDS-156

Change-Id: I3b2b1ed7b5711daf81b5f4a263e4dbee9f502259
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[docs] Update Gambia 7.2 release date 93/66593/1
Alexandru Avadanii [Wed, 23 Jan 2019 17:34:21 +0000 (18:34 +0100)]
[docs] Update Gambia 7.2 release date

Change-Id: I27d13cafcfa45f70413695dbb6fe29e5bb222a3e
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoPass domain name properly for heat stack user 55/66555/2
Michael Polenchuk [Tue, 22 Jan 2019 08:53:07 +0000 (12:53 +0400)]
Pass domain name properly for heat stack user

Change-Id: I74c1c85310e2012e664764b6129fc4a52faaf106
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[uefi_cleanup] Use grain targeting"
Alexandru Avadanii [Mon, 21 Jan 2019 18:55:44 +0000 (18:55 +0000)]
Merge "[uefi_cleanup] Use grain targeting"

5 years agoMerge "[noha] baremetal: Fix undef armband_repo_version"
Michael Polenchuk [Mon, 21 Jan 2019 10:03:09 +0000 (10:03 +0000)]
Merge "[noha] baremetal: Fix undef armband_repo_version"

5 years ago[uefi_cleanup] Use grain targeting 05/66505/1
Alexandru Avadanii [Sat, 19 Jan 2019 20:10:49 +0000 (21:10 +0100)]
[uefi_cleanup] Use grain targeting

Alternating HA and no-HA scenario deployments on baremetal requires
non-hostname targeting for UEFI cleanup (e.g. ctl01/gtw01/kvm01).
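
A minimal sketch of why hostname matching breaks here and grain matching
does not. Salt supports grain matching natively (e.g. `salt -G` on the CLI);
the Python below only imitates the selection logic. The choice of the
`virtual` grain and the per-node grain values are assumptions made for
illustration, not taken from the change itself.

```python
# Hypothetical inventories: minion name -> grains. Only the hostnames come
# from the commit messages; the grain values are illustrative assumptions.
HA_POD = {
    "kvm01": {"virtual": "physical"},
    "kvm02": {"virtual": "physical"},
    "kvm03": {"virtual": "physical"},
    "ctl01": {"virtual": "kvm"},  # controller runs as a VM in the HA layout
}
NOHA_POD = {
    "ctl01": {"virtual": "physical"},
    "gtw01": {"virtual": "physical"},
}


def match_hostname(minions, prefix):
    """Select minions whose name starts with the given prefix."""
    return sorted(name for name in minions if name.startswith(prefix))


def match_grain(minions, grain, value):
    """Select minions by a grain value, independent of their hostname."""
    return sorted(
        name for name, grains in minions.items() if grains.get(grain) == value
    )


# Hostname targeting tuned for the HA layout finds nothing on a no-HA pod.
noha_by_name = match_hostname(NOHA_POD, "kvm")
# Grain targeting selects the baremetal nodes in both layouts.
noha_by_grain = match_grain(NOHA_POD, "virtual", "physical")
ha_by_grain = match_grain(HA_POD, "virtual", "physical")
```

The same selector works for both layouts because it keys on a node
property rather than on a naming scheme that differs between scenarios.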

Change-Id: I9f0e967b500856b65a69ea0ab6ea13e15b327d8b
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoMerge "Sync heat domain name with keystone defined"
Alexandru Avadanii [Thu, 17 Jan 2019 17:38:09 +0000 (17:38 +0000)]
Merge "Sync heat domain name with keystone defined"

5 years ago[submodule] Bump Pharos for arm-pod10 cmp NIC sync 57/66457/1
Alexandru Avadanii [Thu, 17 Jan 2019 14:38:06 +0000 (15:38 +0100)]
[submodule] Bump Pharos for arm-pod10 cmp NIC sync

Change-Id: I177598d4d20539e50aab5f283e8d10022a4f1a14
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoSync heat domain name with keystone defined 55/66455/1
Michael Polenchuk [Thu, 17 Jan 2019 13:38:23 +0000 (17:38 +0400)]
Sync heat domain name with keystone defined

Change-Id: Ibf88f179af2570a707ade78f772342b7da23b74f
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years ago[noha] baremetal: Fix undef armband_repo_version 43/66443/1
Alexandru Avadanii [Wed, 16 Jan 2019 20:27:28 +0000 (21:27 +0100)]
[noha] baremetal: Fix undef armband_repo_version

Change-Id: I0e56261fc2fc2a0a3f164531c72d88f7c46f5ca1
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[submodule] Bump Pharos for arm-pod10 NIC reorder 25/66425/2
Alexandru Avadanii [Wed, 16 Jan 2019 12:38:27 +0000 (13:38 +0100)]
[submodule] Bump Pharos for arm-pod10 NIC reorder

Change-Id: I79d3167432d48500346d5c8294d447c54e0cb6be
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoMerge "Align patches"
Michael Polenchuk [Wed, 16 Jan 2019 11:31:44 +0000 (11:31 +0000)]
Merge "Align patches"

5 years agoAlign patches 65/66365/2
Michael Polenchuk [Mon, 14 Jan 2019 13:18:50 +0000 (17:18 +0400)]
Align patches

* patch is merged into oslo-templates
* rocky repo key name now follows the same scheme as the others
* jinja package is updated to fix an incorrectly quoted value
  [https://github.com/saltstack/salt/issues/46594]

Change-Id: Ia6359cf89579b4d892ae40c4d087168edcd86ebb
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMake shutdown only on physical nodes 21/66421/2
Michael Polenchuk [Wed, 16 Jan 2019 10:03:48 +0000 (14:03 +0400)]
Make shutdown only on physical nodes

Change-Id: If167e7a6bdcdccd6b6df43bd5cac54250abec61a
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[baremetal] Shutdown nodes from previous deploy"
Alexandru Avadanii [Tue, 15 Jan 2019 13:52:05 +0000 (13:52 +0000)]
Merge "[baremetal] Shutdown nodes from previous deploy"

5 years ago[odl] Set conntrack as netvirt nat mode 37/66337/3
Michael Polenchuk [Fri, 11 Jan 2019 10:30:52 +0000 (14:30 +0400)]
[odl] Set conntrack as netvirt nat mode

The conntrack-based SNAT uses the Linux netfilter framework to
perform NAPT and track connections. The first packet of a flow is
passed to netfilter to be translated to the external IP. The
following packets use netfilter for further inbound and
outbound translation.
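
The behaviour described above can be sketched as a small connection
table. This is a simplified model under stated assumptions, not the
netfilter or OpenDaylight implementation: the class name, the starting
port (1024) and the tuple layout are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterator, Optional, Tuple

# (internal src IP, internal src port, remote IP, remote port)
Flow = Tuple[str, int, str, int]


@dataclass
class ConntrackNat:
    """Toy NAPT table: one external IP, one external port per tracked flow."""

    external_ip: str
    # Assumption: external ports are handed out sequentially from 1024.
    _ports: Iterator[int] = field(default_factory=lambda: count(1024))
    _out: Dict[Flow, int] = field(default_factory=dict)
    _in: Dict[Tuple[str, int, int], Tuple[str, int]] = field(default_factory=dict)

    def outbound(self, src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int) -> Tuple[str, int]:
        """Translate an outbound packet; the first packet creates the entry."""
        flow = (src_ip, src_port, dst_ip, dst_port)
        if flow not in self._out:
            ext_port = next(self._ports)
            self._out[flow] = ext_port
            # Reverse entry lets replies find the internal endpoint again.
            self._in[(dst_ip, dst_port, ext_port)] = (src_ip, src_port)
        return self.external_ip, self._out[flow]

    def inbound(self, remote_ip: str, remote_port: int,
                ext_port: int) -> Optional[Tuple[str, int]]:
        """Translate a reply back, or return None for an untracked flow."""
        return self._in.get((remote_ip, remote_port, ext_port))


nat = ConntrackNat("203.0.113.10")
first = nat.outbound("10.0.0.5", 40000, "198.51.100.1", 80)
later = nat.outbound("10.0.0.5", 40000, "198.51.100.1", 80)
reply = nat.inbound("198.51.100.1", 80, first[1])
```

Only the first packet allocates state; later packets in either direction
are translated by a table lookup.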

Change-Id: I1090b4fe041f8d9533aa4ce1964284a4a5c073ce
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[centos] Update altarch kernel URL"
Michael Polenchuk [Mon, 14 Jan 2019 08:57:33 +0000 (08:57 +0000)]
Merge "[centos] Update altarch kernel URL"

5 years agoMerge "[patch] Drop reclass.system patch for repo arch"
Michael Polenchuk [Mon, 14 Jan 2019 08:55:39 +0000 (08:55 +0000)]
Merge "[patch] Drop reclass.system patch for repo arch"

5 years agoMerge "[noha] Fix gtw private NIC name in j2 templates"
Michael Polenchuk [Mon, 14 Jan 2019 08:46:11 +0000 (08:46 +0000)]
Merge "[noha] Fix gtw private NIC name in j2 templates"

5 years ago[baremetal] Shutdown nodes from previous deploy 51/66351/1
Alexandru Avadanii [Sun, 13 Jan 2019 17:49:07 +0000 (18:49 +0100)]
[baremetal] Shutdown nodes from previous deploy

When noha scenarios are scheduled on the same CI POD currently
running a previously deployed HA scenario, one baremetal node
might remain unused (kvm03), connect to the new Salt master and
interfere with the deployment.

To prevent that, shut down all baremetal nodes at the beginning of the
deployment.

Change-Id: Ia9bad8b5d8348433cefac9aa76eca0de664f187d
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[centos] Update altarch kernel URL 27/66127/2
Alexandru Avadanii [Mon, 31 Dec 2018 19:43:51 +0000 (20:43 +0100)]
[centos] Update altarch kernel URL

CentOS recently moved its kernel source RPM from the altarch subdir
to the same directory as the x86_64 kernel sources, so update
our script accordingly.

Change-Id: I88010eabdfc15d6a79350dface29258cc37c4b95
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[patch] Drop reclass.system patch for repo arch 21/66321/2
Alexandru Avadanii [Thu, 10 Jan 2019 14:47:00 +0000 (15:47 +0100)]
[patch] Drop reclass.system patch for repo arch

MCP repos no longer publish arm64 metadata, so drop our patch that
selected arm64 metadata on arm64 systems.
Instead, let it default to 'deb [arch=amd64]', which will allow
arm64 systems to fetch amd64 metadata and inherently fetch all
arch-independent packages from the same repos.

While at it, switch to 'rocky-armband' repos on arm64 systems.

Change-Id: I07fda895f5162bfa576c62336cbb4d74e985f37a
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[noha] Fix gtw private NIC name in j2 templates 43/66343/1
Alexandru Avadanii [Fri, 11 Jan 2019 18:38:41 +0000 (19:38 +0100)]
[noha] Fix gtw private NIC name in j2 templates

Change-Id: Ic266864913dcac021b3e12f426e1c8a60c23fe87
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years ago[patch] Avoid ifup run if noifupdown is turned on 11/66111/8
Michael Polenchuk [Tue, 25 Dec 2018 10:28:47 +0000 (14:28 +0400)]
[patch] Avoid ifup run if noifupdown is turned on

Also handle the noifupdown option for all cmd.run states
that call ifup explicitly.

Change-Id: Ie855a0810bcfe4a856cf9d29bd0755643d71ff4d
Signed-off-by: Michael Polenchuk <mpolenchuk@mirantis.com>
5 years agoMerge "[docs] Update documentation for Gambia 7.2"
Alexandru Avadanii [Thu, 10 Jan 2019 14:25:51 +0000 (14:25 +0000)]
Merge "[docs] Update documentation for Gambia 7.2"

5 years agoMerge "[state] Fold aarch64 conditions"
Alexandru Avadanii [Thu, 10 Jan 2019 14:07:23 +0000 (14:07 +0000)]
Merge "[state] Fold aarch64 conditions"

5 years ago[docs] Update documentation for Gambia 7.2 17/66317/1
Cristina Pauna [Thu, 10 Jan 2019 12:32:45 +0000 (14:32 +0200)]
[docs] Update documentation for Gambia 7.2

Change-Id: I180f668b297ad97dd95bd9201005410fe7a62b4c
Signed-off-by: Cristina Pauna <cristina.pauna@enea.com>
5 years ago[state] Fold aarch64 conditions 09/66309/1
Alexandru Avadanii [Wed, 9 Jan 2019 14:47:47 +0000 (15:47 +0100)]
[state] Fold aarch64 conditions

The armband formula already has checks in place to run only on
nodes with the expected arch, so remove the duplicate condition
in state files.

Change-Id: I05b26368a2d97422830a692e09242bc50e4eb1db
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
5 years agoBring in FDIO (VPP+DPDK) scenario 71/66271/5
Alexandru Avadanii [Thu, 8 Nov 2018 18:06:46 +0000 (19:06 +0100)]
Bring in FDIO (VPP+DPDK) scenario

- cmp, gtw: bump RAM allocation to accommodate hugepages/VPP;
  for now we overcommit, gtw01 resources can probably be lowered;
- submodule: add salt-formula-neutron so we can locally patch it;
- repo:
  * FD.IO repos for VPP packages;
  * networking-vpp PPA for python-networking-vpp Neutron driver;
- use vpp-router for L3, disable neutron-l3-agent;
- baremetal_init: apply repo config before network (otherwise UCA
  repo is missing when trying to install DPDK on baremetal nodes);
- arm64: iommu.passthrough=1 is required on ThunderX for VPP on
  newer kernels;

Design quirks:
- vpp service runs as 'neutron' user, which does not exist at the
  time VPP is installed and initially started, hence the need to
  restart it before starting the vpp-agent service;
- gtw01 node has DPDK, yet to configure it via IDF we use the
  compute-specific OVS-targeted parameters like
  `compute_ovs_dpdk_socket_mem`, which is a bit misleading;
- vpp-agent requires ml2_conf.ini on ALL compute AND network nodes
  to parse per-node physnet-to-real interface names;
- vpp process is bound to core '1' (not parameterized via IDF);

Change-Id: I659f7dbebcab7b154e7b1fb829cd7159b4372ec8
Signed-off-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>