.. This work is licensed under a Creative Commons Attribution 4.0 International
.. License.
.. http://creativecommons.org/licenses/by/4.0


===================================
Test Results for yardstick-opnfv-ha
===================================

.. toctree::
   :maxdepth: 2

Details
=======
Two test cases, TC019 and TC025, cover high availability (HA) testing of the
OPNFV platform. Both test cases were executed in CMCC's lab on a 3+2 HA
deployment installed with the Arno SR1 release of Fuel.


TC019
-----
This test case verifies the high availability of an OpenStack service, namely
"nova-api", on a controller node.
There is one attacker, "kill-process", which kills all "nova-api" processes,
and there are two monitors: "openstack-cmd", which monitors the "nova-api"
service through the OpenStack command "nova image-list", and "process", which
checks whether the "nova-api" process is running. See the test case
description document for details.

Overview of test results
------------------------
The service_outage_time of "nova image-list" is 0 seconds, while the
process_recover_time of "nova-api" is 300 seconds, which equals the running
time of this test case. This means the "nova-api" process did not
automatically recover.

  36 Detailed test results
  37 ---------------------
  38 All "nova-api" process on the selected controller node was killed, and results
  39 of two monitors were collected. Specifically, the results of "nova image-list"
  40 request were collected from compute node and the status of "nova-api" process
  41 were collected from the selected controller node.
  42
Each monitor ran in a single process for about 300 seconds, with no waiting
time between consecutive monitor runs. The "nova image-list" monitor ran 127
times, that is, one OpenStack command request about every 2.36 seconds. The
"nova-api" process check ran 141 times, which gives a time resolution of about
2.13 seconds.
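
The request periods quoted above follow from dividing the running time by the
number of monitor runs. A minimal sketch of that arithmetic, assuming back-to-back
runs over the whole 300 seconds (the function name ``request_period`` is
illustrative and not part of Yardstick):

.. code-block:: python

   def request_period(running_time_s, run_count):
       """Average time between consecutive monitor runs, in seconds."""
       if run_count <= 0:
           raise ValueError("run_count must be positive")
       return running_time_s / run_count

   # Values reported for TC019.
   print(round(request_period(300, 127), 2))  # "nova image-list": 2.36
   print(round(request_period(300, 141), 2))  # "nova-api" process check: 2.13
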

The outage time of each monitor, named "service_outage_time" for the
"openstack-cmd" monitor and "process_recover_time" for the "process" monitor,
is defined as the duration from the begin time of the first failed request to
the end time of the last failed request.
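
This definition can be sketched in a few lines of Python. The record layout, a
``(begin_time, end_time, succeeded)`` tuple per request, and the function name
``outage_time`` are assumptions made for illustration; they are not taken from
the Yardstick implementation.

.. code-block:: python

   def outage_time(requests):
       """Duration from the begin of the first failed request to the end of
       the last failed request; 0.0 when no request failed.

       requests: iterable of (begin_time, end_time, succeeded) tuples,
       with times in seconds.
       """
       failures = [(begin, end) for begin, end, ok in requests if not ok]
       if not failures:
           return 0.0
       first_begin = min(begin for begin, _ in failures)
       last_end = max(end for _, end in failures)
       return last_end - first_begin

   # Hypothetical sample: two failed requests between 2.0 s and 7.0 s.
   samples = [
       (0.0, 2.0, True),
       (2.0, 4.5, False),
       (4.5, 7.0, False),
       (7.0, 9.0, True),
   ]
   print(outage_time(samples))  # 5.0

With this definition, a run in which every request succeeds yields 0 and a run
in which every check fails yields roughly the full monitoring time.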

All "nova image-list" requests succeeded, so the service_outage_time of
"nova image-list" is 0 seconds. The "nova-api" processes were not running at
any "process" check, so the process_recover_time of "nova-api" is 300 seconds.

Rationale for decisions
-----------------------
The service_outage_time is 0 seconds, which means the failover time of the
OpenStack service is less than 2.36 seconds, the period of each request.
However, the process_recover_time equals the test case running time, which
means the process was not automatically recovered, so this test case fails.
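
The decision above can be expressed as a comparison of each measured time
against a limit. The sketch below is illustrative only: the function name
``ha_verdict`` and the limit values (5 and 20 seconds) are hypothetical and are
not stated in this report.

.. code-block:: python

   def ha_verdict(measured, limits):
       """Return ("PASS", []) if every measured time is within its limit,
       otherwise ("FAIL", [names of the metrics that exceeded their limit])."""
       failed = [name for name, value in measured.items()
                 if value > limits[name]]
       return ("PASS" if not failed else "FAIL", failed)

   # Measured values from TC019; the limits are assumed for illustration.
   tc019 = {"service_outage_time": 0, "process_recover_time": 300}
   limits = {"service_outage_time": 5, "process_recover_time": 20}
   print(ha_verdict(tc019, limits))  # ('FAIL', ['process_recover_time'])
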


TC025
-----
This test case verifies the high availability of the controller node. When one
of the controller nodes shuts down abnormally, the services it provides should
remain available.
There is one attacker, "kill-process", which kills all "nova-api" processes,
and there are two "openstack-cmd" monitors, one monitoring the OpenStack
command "nova image-list" and the other monitoring "neutron router-list".
See the test case description document for details.

Overview of test results
------------------------
The service_outage_time of both "nova image-list" and "neutron router-list"
was 0 seconds.

Detailed test results
---------------------
A selected controller node was shut down, and the results of the two monitors
were collected from a compute node.

The results returned by the "nova image-list" and "neutron router-list"
requests from the compute node were collected, and the times of the failed
requests were then used to compute the service_outage_time of the
corresponding service.

Each monitor ran in a single process for about 300 seconds, with no waiting
time between consecutive monitor runs. The "nova image-list" monitor ran 49
times, that is, one OpenStack command request about every 6.12 seconds. The
"neutron router-list" monitor ran 28 times, which gives a time resolution of
about 10.71 seconds.

The "service_outage_time" of both monitors is defined as the duration from the
begin time of the first failed request to the end time of the last failed
request.

All "nova image-list" and "neutron router-list" requests succeeded, so the
service_outage_time of both monitors was 0 seconds.

Rationale for decisions
-----------------------
The service_outage_time of all monitors is 0 seconds, which means no request
failed during the running time of this test case, so this test case passes.


Conclusions and recommendations
-------------------------------
TC019 shows that the killed process is not automatically recovered, which
should be improved.

There are several points of improvement for HA testing:

a) Run the test cases in different environments deployed by different
   installers, such as compass4nfv, apex and joid, with different versions.
b) The period of each request is rather long; a more accurate test method is
   needed.
c) More test cases with different faults and different monitors are needed.