=======================================
Overcloud Container Design/Architecture
=======================================

This document describes the changes made to implement container deployments in
Apex.

 * OOO container architecture
 * Upstream vs. Downstream deployment
 * Apex container deployment overview

OOO container architecture
--------------------------

Typically in OOO each OpenStack service is represented by a TripleO Heat
Template stored under the puppet/services directory in the THT code base. For
containers, new templates have been created in the docker/services directory,
covering most of the previously defined puppet services. In almost all cases
these docker templates inherit their puppet template counterpart and then
build on it to provide OOO docker specific settings.

Container configuration in OOO is still done via puppet, and the resulting
config files are copied into a host directory to be mounted later into the
service container during deployment. The docker template contains the docker
specific settings for the service, including which files to mount into the
container and which puppet resources to execute. Note that the puppet code is
still stored locally on the host, while the service python code is stored in
the container image.
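
As a rough illustration of the flow described above, the following Python
sketch models how config files generated by puppet on the host could be turned
into bind-mount arguments for a service container. The host directory
``CONFIG_ROOT``, the function name ``build_volume_args`` and the read-only
mount option are assumptions made for this example; they are not taken from
the THT templates themselves.

```python
from pathlib import PurePosixPath

# Hypothetical host directory where puppet-generated config files are copied.
CONFIG_ROOT = PurePosixPath("/var/lib/config-data")


def build_volume_args(service, config_files):
    """Return docker '--volume' arguments that mount each generated config
    file from the host into the same path inside the service container."""
    args = []
    for container_path in config_files:
        # e.g. /etc/nova/nova.conf -> <CONFIG_ROOT>/nova/etc/nova/nova.conf
        host_path = CONFIG_ROOT / service / container_path.lstrip("/")
        # Mounted read-only (an assumption): the container only consumes
        # the files, puppet on the host is what generates them.
        args += ["--volume", f"{host_path}:{container_path}:ro"]
    return args
```

For example, ``build_volume_args("nova", ["/etc/nova/nova.conf"])`` yields a
single ``--volume`` pair that maps the host copy of nova.conf onto
/etc/nova/nova.conf inside the container.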

RDO has its own registry, which stores the per-service Docker images used in
deployments. A container image is usually just a CentOS 7 container with the
relevant service RPM installed.

In addition, Ceph is no longer deployed with puppet. puppet-ceph was previously
used to configure Ceph on the overcloud, but it has been replaced with
Ceph-Ansible. During container deployment, the undercloud calls a Mistral
workflow to initiate a Ceph-Ansible playbook that downloads the Ceph Daemon
container image to the overcloud and configures it.

Upstream vs. Downstream deployment
----------------------------------

In Apex we typically build artifacts and then deploy from them. This worked in
the past because we usually modify disk images (qcow2s) with files or patches
and distribute them as RPMs. With containers, however, space becomes an issue.
Each container image ranges in size from 800 MB to over 2 GB, which makes it
infeasible to download all of the possible images and store them in a disk
image for distribution.

Therefore, for container deployments the only option is to deploy using
upstream. This means that only upstream undercloud/overcloud images are pulled
at deploy time, and the required containers are pulled with docker into the
undercloud during deployment. For upstream deployments, the modified time of
the RDO images is checked and the images are cached locally, to avoid
unnecessary downloads of artifacts. Also, the optional '--no-fetch' argument
may be provided at deploy time to skip pulling any new images, as long as
previous artifacts are cached locally.
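
The caching behaviour described above can be sketched in Python as follows.
This is a minimal illustration, not the Apex implementation: the function
name ``needs_download``, its parameters, and the choice to raise an error
when '--no-fetch' is given without a cached copy are all assumptions.

```python
import os


def needs_download(cache_path, remote_mtime, no_fetch=False):
    """Decide whether an upstream image should be downloaded again.

    remote_mtime is the modified time of the upstream image, expressed as a
    Unix timestamp so it can be compared with the cached file's mtime.
    """
    cached = os.path.exists(cache_path)
    if no_fetch:
        # '--no-fetch' only makes sense when a previous artifact is cached.
        if not cached:
            raise FileNotFoundError(
                f"--no-fetch given but {cache_path} is not cached")
        return False
    if not cached:
        return True
    # Re-download only when upstream is newer than the cached copy.
    return remote_mtime > os.path.getmtime(cache_path)
```

With this shape, a deploy that finds an unchanged upstream image reuses the
local copy, and a deploy run with '--no-fetch' never contacts upstream at all.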

Apex container deployment
-------------------------

For deploying containers with Apex, a new deploy setting is available,
'containers'. When this setting is used together with '--upstream', the
following happens:

1. The upstream RDO images for undercloud/overcloud are checked and
   downloaded if necessary.
2. The undercloud VM is installed and configured as in a normal deployment.
3. The overcloud image prep method is called; it has been modified to handle
   patches and containers. The method now returns the set of container
   images that are going to be patched. An image may need patching because,
   for example, the OpenDaylight version changed, or because the deploy
   settings include overcloud patches that specify a python path.
4. During the overcloud image prep, a new directory called 'containers' is
   created in the Apex tmp dir. It contains a sub-directory for each docker
   image that is being patched (for example, 'containers/nova-api').
5. A Dockerfile is created inside each directory created in step 4. It holds
   the Dockerfile operations that rebuild the container with patches or any
   other required changes. Several container images may be used for the
   different services of one OS project. For example, there is a separate
   image for each nova service (nova-api, nova-conductor, nova-compute).
   Therefore a lookup is done to find all of the container images that a
   given patch, say a nova patch, would apply to, and a directory and
   Dockerfile are created for each image. All of this is tar'ed and
   compressed into an archive which will be copied to the undercloud.
6. Next, the deployment is checked to see whether a Ceph device was provided
   in the Apex settings. If not, a persistent loop device is created in the
   overcloud image to serve as the storage backend for Ceph OSDs. Apex
   previously used a directory, '/srv/data', as the backend for the OSDs,
   but that is no longer supported with Ceph-Ansible.
7. The deployment command is then created as usual, with minor changes to
   add the docker.yaml and docker-ha.yaml files, which are required to deploy
   containers.
8. Next, a new playbook, 'prepare_overcloud_containers.yaml', is executed.
   It includes several steps:

   a. The previously archived docker image patches are copied to the
      undercloud and unpacked.
   b. The 'overcloud_containers' and 'sdn_containers' image files are
      prepared. These are simply yaml files that indicate which docker
      images to pull and where to store them, which in our case is a local
      docker registry.
   c. The docker images are pulled and stored in the local registry. The
      local registry provides a static source of images that does not change
      every time a user deploys, which allows for more control and
      predictability in deployments.
   d. Next, the images in the local registry are cross-checked against the
      images that were previously collected as requiring patches. Any image
      that exists in the local registry and also requires changes is rebuilt
      with the docker build command, tagged with 'apex', and pushed into the
      local registry. The tag helps the user distinguish which containers
      have been modified by Apex, in case debugging requires comparing
      upstream docker images with the Apex modifications.
   e. New OOO image files are then created to tell OOO that the docker
      images to use for deployment are the ones in the local registry. The
      images modified by Apex are referenced with the 'apex' tag.
   f. The relevant Ceph Daemon Docker image is pulled and pushed into the
      local registry for deployment.

9. At this point the OOO deployment command is initiated as in regular Apex
   deployments. In Step 1 of the OOO deployment, each container is started
   on the overcloud and puppet is executed in it to gather the configuration
   files. As a result, Step 1 takes longer than it did in non-containerized
   deployments. Following this step, the containers are brought up in their
   regular step order, mounting the previously generated configuration
   files.
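
Steps 4, 5 and 8d above can be sketched in Python as follows. This is a
simplified illustration under stated assumptions, not the Apex code: the
``PROJECT_IMAGES`` lookup table, the function names, the archive name
'docker_patches.tar.gz' and the exact Dockerfile instructions are all
hypothetical.

```python
import os
import tarfile

# Hypothetical lookup from an OpenStack project to the container images
# that ship its python code; a real lookup would cover many more services.
PROJECT_IMAGES = {
    "nova": ["nova-api", "nova-conductor", "nova-compute"],
}


def write_patch_contexts(tmp_dir, project, base_tag, patch_files, python_path):
    """Create containers/<image>/Dockerfile for every image of a project and
    return the path of a gzip-compressed archive of the whole tree."""
    containers_dir = os.path.join(tmp_dir, "containers")
    for image in PROJECT_IMAGES[project]:
        image_dir = os.path.join(containers_dir, image)
        os.makedirs(image_dir, exist_ok=True)
        # Illustrative Dockerfile: start from the pulled image and apply
        # each patch under the python path given in the deploy settings.
        lines = [f"FROM {image}:{base_tag}", "USER root"]
        for patch in patch_files:
            lines.append(f"COPY {patch} {python_path}/")
            lines.append(
                f"RUN patch -p1 -d {python_path} < {python_path}/{patch}")
        with open(os.path.join(image_dir, "Dockerfile"), "w") as handle:
            handle.write("\n".join(lines) + "\n")
    archive = os.path.join(tmp_dir, "docker_patches.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(containers_dir, arcname="containers")
    return archive


def images_to_rebuild(registry_images, patched_images):
    """Map each image that is both in the local registry and marked for
    patching to the 'apex' tagged name it would be pushed under."""
    return {name: f"{name}:apex"
            for name in sorted(set(registry_images) & set(patched_images))}
```

A nova patch would therefore produce three build contexts, one per nova
image, while ``images_to_rebuild`` only selects the ones that were actually
pulled into the local registry.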