.. This work is licensed under a Creative Commons Attribution 4.0 International License.
.. http://creativecommons.org/licenses/by/4.0

Doctor Configuration
====================

OPNFV installers deploy most components of the Doctor framework, including
OpenStack Nova, Neutron and Cinder (Doctor Controller) and OpenStack
Ceilometer and Aodh (Doctor Notifier), except the Doctor Monitor.

After the major components of OPNFV are deployed, you can set up the Doctor
functions by following the instructions in this section. You can also find
detailed steps for all supported installers under
`doctor/doctor_tests/installer`_.

.. _doctor/doctor_tests/installer: https://git.opnfv.org/doctor/tree/doctor_tests/installer

Doctor Inspector
----------------

You need to configure one of the Doctor Inspectors below. You can also find
detailed steps for all supported Inspectors under
`doctor/doctor_tests/inspector`_.

.. _doctor/doctor_tests/inspector: https://git.opnfv.org/doctor/tree/doctor_tests/inspector

Sample Inspector
''''''''''''''''

Sample Inspector is intended to show the minimum functions of a Doctor
Inspector.

Sample Inspector is suggested to be placed on one of the controller nodes,
but it can be put on any host from which it can reach and access the
OpenStack controller services (e.g. Nova, Neutron).

Make sure the OpenStack environment variables are set properly, so that
Sample Inspector can issue admin actions such as forcing down a compute host
and updating the state of a VM.
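
For example, with an Apex deployment the admin credentials can be taken from
the ``overcloudrc`` file; the variables below are illustrative values only:

.. code-block:: bash

    # Hypothetical values - replace with your deployment's admin credentials
    export OS_AUTH_URL=http://192.0.2.10:5000/v3
    export OS_USERNAME=admin
    export OS_PASSWORD=secret
    export OS_PROJECT_NAME=admin
    export OS_USER_DOMAIN_NAME=Default
    export OS_PROJECT_DOMAIN_NAME=Default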

Then, you can configure Sample Inspector as follows:

.. code-block:: bash

    git clone https://gerrit.opnfv.org/gerrit/doctor
    cd doctor/doctor_tests/inspector
    INSPECTOR_PORT=12345
    python sample.py $INSPECTOR_PORT > inspector.log 2>&1 &
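
To verify that the Inspector is up and listening, a quick sanity check
(assuming the port chosen above and standard tooling):

.. code-block:: bash

    tail inspector.log
    ss -ltn | grep 12345    # the port passed as $INSPECTOR_PORT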

Congress
''''''''

OpenStack `Congress`_ is a Governance as a Service (previously Policy as a
Service) project. Congress can implement the Doctor Inspector, as it can
inspect a fault situation and propagate errors to other entities.

.. _Congress: https://governance.openstack.org/tc/reference/projects/congress.html

Congress is deployed by the OPNFV Apex installer. You need to enable the
doctor datasource driver and set policy rules. With the example configuration
below, Congress will force down the nova-compute service when it receives a
fault event for that compute host. It will also set the state of all VMs
running on that host from ACTIVE to ERROR.

.. code-block:: bash

    openstack congress datasource create doctor "doctor"

    openstack congress datasource create --config api_version=$NOVA_MICRO_VERSION \
        --config username=$OS_USERNAME --config tenant_name=$OS_TENANT_NAME \
        --config password=$OS_PASSWORD --config auth_url=$OS_AUTH_URL \
        nova "nova21"

    openstack congress policy rule create \
        --name host_down classification \
        'host_down(host) :-
            doctor:events(hostname=host, type="compute.host.down", status="down")'

    openstack congress policy rule create \
        --name active_instance_in_host classification \
        'active_instance_in_host(vmid, host) :-
            nova:servers(id=vmid, host_name=host, status="ACTIVE")'

    openstack congress policy rule create \
        --name host_force_down classification \
        'execute[nova:services.force_down(host, "nova-compute", "True")] :-
            host_down(host)'

    openstack congress policy rule create \
        --name error_vm_states classification \
        'execute[nova:servers.reset_state(vmid, "error")] :-
            host_down(host),
            active_instance_in_host(vmid, host)'
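
Once the rules are in place, you can list them and, after a fault event has
arrived, inspect the derived table rows (a quick check using the standard
Congress CLI; the table is empty until an event is received):

.. code-block:: bash

    openstack congress policy rule list classification
    openstack congress policy row list classification host_down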

Vitrage
'''''''

OpenStack `Vitrage`_ is an RCA (Root Cause Analysis) service for organizing,
analyzing and expanding OpenStack alarms and events. Vitrage can implement
the Doctor Inspector, as it receives a notification that a host is down and
calls the Nova force-down API. In addition, it raises alarms on the instances
running on that host.

.. _Vitrage: https://wiki.openstack.org/wiki/Vitrage

Vitrage is not yet deployed by the OPNFV installers. It can be installed
either on top of a devstack environment or on top of a real OpenStack
environment. See `Vitrage Installation`_.

.. _`Vitrage Installation`: https://docs.openstack.org/developer/vitrage/installation-and-configuration.html
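
For the devstack option, a minimal sketch of the change to devstack's
``local.conf`` (assuming the Vitrage devstack plugin from the upstream
repository):

.. code-block:: bash

    [[local|localrc]]
    enable_plugin vitrage https://opendev.org/openstack/vitrage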

The Doctor SB API and a Doctor datasource were implemented in Vitrage in the
Ocata release. The Doctor datasource is enabled by default.

After Vitrage is installed and configured, a few additional steps are needed
to support the Doctor use case:

1. Make sure that 'aodh' and 'doctor' are included in the list of datasource
   types in /etc/vitrage/vitrage.conf:

   .. code-block:: ini

      [datasources]
      types = aodh,doctor,nova.host,nova.instance,nova.zone,static,cinder.volume,neutron.network,neutron.port,heat.stack

2. Enable the Vitrage Nova notifier. Set the following in
   /etc/vitrage/vitrage.conf:

   .. code-block:: ini

      [DEFAULT]
      notifiers = nova

3. Add a template that is responsible for calling Nova force-down if Vitrage
   receives a 'compute.host.down' alarm. Copy `template`_ and place it under
   /etc/vitrage/templates.

   .. _template: https://github.com/openstack/vitrage/blob/master/etc/vitrage/templates.sample/host_down_scenarios.yaml

4. Restart the vitrage-graph and vitrage-notifier services, for example as
   sketched below.
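
How to restart them depends on the deployment; on a systemd-based host the
following sketch would do (assuming the services are registered under these
unit names; on devstack, restart the corresponding devstack@ units instead):

.. code-block:: bash

    sudo systemctl restart vitrage-graph vitrage-notifier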

Doctor Monitors
---------------

Doctor Monitors are suggested to be placed on one of the controller nodes,
but they can be put on any host that can reach both the target compute host
and the Doctor Inspector. You need to configure a Monitor for each compute
host, one by one. You can also find detailed steps for all supported Monitors
under `doctor/doctor_tests/monitor`_.

.. _doctor/doctor_tests/monitor: https://git.opnfv.org/doctor/tree/doctor_tests/monitor

Sample Monitor
''''''''''''''

You can configure the Sample Monitor as follows (example for an Apex
deployment):

.. code-block:: bash

    git clone https://gerrit.opnfv.org/gerrit/doctor
    cd doctor/doctor_tests/monitor
    INSPECTOR_PORT=12345
    COMPUTE_HOST='overcloud-novacompute-1.localdomain.com'
    COMPUTE_IP=192.30.9.5
    sudo python sample.py "$COMPUTE_HOST" "$COMPUTE_IP" \
        "http://127.0.0.1:$INSPECTOR_PORT/events" > monitor.log 2>&1 &

OpenStack components
--------------------

In OPNFV, the installers let you configure all OpenStack components as needed
for Doctor testing. Below is a sample of the required configuration
modifications.

Ceilometer
''''''''''

The maintenance use case needs new event definitions to be added to
/etc/ceilometer/event_definitions.yaml:

.. code-block:: yaml

   - event_type: maintenance.scheduled
     traits:
       actions_at:
         fields: payload.maintenance_at
         type: datetime
       allowed_actions:
         fields: payload.allowed_actions
       host_id:
         fields: payload.host_id
       instances:
         fields: payload.instances
       metadata:
         fields: payload.metadata
       project_id:
         fields: payload.project_id
       reply_url:
         fields: payload.reply_url
       session_id:
         fields: payload.session_id
       state:
         fields: payload.state
   - event_type: maintenance.host
     traits:
       host:
         fields: payload.host
       project_id:
         fields: payload.project_id
       session_id:
         fields: payload.session_id
       state:
         fields: payload.state

The maintenance and fault management use cases both need the following
publishers to be added to /etc/ceilometer/event_pipeline.yaml:

.. code-block:: yaml

   - notifier://
   - notifier://?topic=alarm.all

Nova
''''

For the maintenance use case, CPU overcommit should be disabled in
/etc/nova/nova.conf:

.. code-block:: ini

   cpu_allocation_ratio=1.0
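
With these definitions and publishers in place, a consumer such as the Doctor
maintenance test can subscribe to maintenance events through Aodh event
alarms. A hypothetical example (the alarm name and callback URL are
placeholders):

.. code-block:: bash

    aodh alarm create --name maintenance_scheduled --type event \
        --event-type maintenance.scheduled \
        --alarm-action http://127.0.0.1:12348/maintenance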