.. This work is licensed under a Creative Commons Attribution 4.0 International
.. License.
.. http://creativecommons.org/licenses/by/4.0

===================================
Test Results for yardstick-opnfv-ha
===================================

There are two test cases, TC019 and TC025, for high availability (HA) testing
of the OPNFV platform. Both test cases were executed in CMCC's lab on a 3+2 HA
deployment installed with the Arno SR1 release of Fuel.

This test case (TC019) verifies the high availability of an OpenStack service,
in this case "nova-api", on a controller node.
There is one attacker, "kill-process", which kills all "nova-api" processes,
and there are two monitors: "openstack-cmd", which monitors the "nova-api"
service through the OpenStack command "nova image-list", and "process", which
checks whether the "nova-api" process is running. Please see the test case
description document for details.

Overview of test results
------------------------
The service_outage_time of "nova image-list" is 0 seconds, while the
process_recover_time of "nova-api" is 300 seconds, which equals the running
time of this test case. This means the "nova-api" process was not
automatically recovered.

All "nova-api" processes on the selected controller node were killed, and the
results of the two monitors were collected. Specifically, the results of the
"nova image-list" requests were collected from a compute node, and the status
of the "nova-api" process was collected from the selected controller node.

Each monitor ran in a single process. Each monitor ran for about 300 seconds,
with no waiting time between consecutive runs. The "nova image-list" monitor
ran 127 times, i.e. one OpenStack command request every 2.36 seconds on
average, while the "nova-api" process check ran 141 times, giving a time
resolution of about 2.13 seconds.
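
The back-to-back monitoring described above can be sketched as follows. This
is a minimal illustration and not the Yardstick implementation: the names
``run_monitor``, ``check`` and the injectable ``clock`` are hypothetical and
introduced here only for the example.

```python
import time


def run_monitor(check, duration, clock=time.monotonic):
    """Call check() back to back until `duration` seconds have elapsed.

    `check` is any callable returning True on success, for example a
    wrapper around "nova image-list" (hypothetical, not provided here).
    Returns one (begin, end, ok) tuple per request.
    """
    records = []
    start = clock()
    # No sleep between iterations: the next request starts as soon as
    # the previous one has returned.
    while clock() - start < duration:
        begin = clock()
        ok = bool(check())
        end = clock()
        records.append((begin, end, ok))
    return records
```

With this structure the number of requests in a run depends only on how long
each request takes, which is why the two monitors report different counts for
the same running time.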

The outage time of each monitor, named "service_outage_time" for the
"openstack-cmd" monitor and "process_recover_time" for the "process" monitor,
is defined as the duration from the begin time of the first failed request to
the end time of the last failed request.
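
The definition above can be written as a short function. This is a sketch
under the assumption that each request is recorded as a ``(begin, end, ok)``
tuple in chronological order; the function name and the record layout are
illustrative and not taken from Yardstick.

```python
def outage_time(records):
    """Duration from the begin of the first failed request to the end
    of the last failed request; 0.0 when no request failed.

    `records` is a chronological list of (begin, end, ok) tuples.
    """
    failures = [(begin, end) for begin, end, ok in records if not ok]
    if not failures:
        return 0.0
    first_begin = failures[0][0]
    last_end = failures[-1][1]
    return last_end - first_begin


# Illustrative data only, not measurements from this test run.
all_ok = [(0.0, 2.0, True), (2.0, 4.0, True)]
some_failed = [(0.0, 2.0, True), (2.0, 4.0, False),
               (4.0, 6.0, False), (6.0, 8.0, True)]
print(outage_time(all_ok))       # 0.0
print(outage_time(some_failed))  # 4.0
```

Note that under this definition any successful requests that fall between the
first and the last failure are still counted as part of the outage.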

All "nova image-list" requests succeeded, so the service_outage_time of
"nova image-list" is 0 seconds. The "nova-api" processes were not running in
any of the "process" checks, so the process_recover_time of "nova-api" is
300 seconds.

Rationale for decisions
-----------------------
The service_outage_time is 0 seconds, which means the failover time of the
OpenStack service is less than 2.36 seconds, the period of each request.
However, the process_recover_time equals the running time of the test case,
which means the process was not automatically recovered, so this test case
failed.
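
The reasoning above can be expressed as a small check. The pass rule below is
an assumption inferred from this rationale; the actual thresholds are defined
in the test case description document, which is not reproduced here, and the
function name is hypothetical.

```python
def tc019_verdict(service_outage_time, process_recover_time, run_time):
    """Return "pass" or "fail" under an assumed rule (not the official SLA)."""
    # Assumption: the service check passes only if no request failed.
    service_ok = service_outage_time == 0
    # Assumption: a recover time that reaches the running time means the
    # process never came back while the monitor was running.
    process_recovered = process_recover_time < run_time
    return "pass" if service_ok and process_recovered else "fail"


print(tc019_verdict(0, 300, 300))  # fail
```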

This test case (TC025) verifies the high availability of the controller nodes.
When one of the controller nodes shuts down abnormally, the services it
provides should remain available.
There is one attacker, which shuts down the selected controller node,
and two "openstack-cmd" monitors, one monitoring the OpenStack command
"nova image-list" and the other monitoring "neutron router-list".
Please see the test case description document for details.

Overview of test results
------------------------
The service_outage_time of both "nova image-list" and "neutron router-list"
is 0 seconds.

A selected controller node was shut down, and the results of the two monitors
were collected from a compute node.

The results returned by the "nova image-list" and "neutron router-list"
requests issued from the compute node were collected, and the times of the
failed requests were used to calculate the service_outage_time of the
corresponding service.

Each monitor ran in a single process. Each monitor ran for about 300 seconds,
with no waiting time between consecutive runs. The "nova image-list" monitor
ran 49 times, i.e. one OpenStack command request every 6.12 seconds on
average, while the "neutron router-list" monitor ran 28 times, giving a time
resolution of about 10.71 seconds.
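
The request periods quoted in this document follow directly from the running
time and the request counts. The snippet below is only a worked calculation;
it assumes a running time of exactly 300 seconds, whereas the document gives
it as "about 300 seconds", and the variable names are illustrative.

```python
RUN_TIME = 300  # seconds, approximate running time of each monitor

# Request counts reported in this document.
REQUEST_COUNTS = {
    ("TC019", "nova image-list"): 127,
    ("TC019", "nova-api process"): 141,
    ("TC025", "nova image-list"): 49,
    ("TC025", "neutron router-list"): 28,
}


def mean_period(run_time, request_count):
    """Average time between two consecutive requests, in seconds."""
    return run_time / request_count


for (test_case, monitor), count in REQUEST_COUNTS.items():
    print(f"{test_case} {monitor}: {mean_period(RUN_TIME, count):.2f} s")
```

The four printed values are 2.36, 2.13, 6.12 and 10.71 seconds. A reported
outage of 0 seconds therefore only shows that any outage was shorter than
roughly one such period.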

The "service_outage_time" of the two monitors is defined as the duration from
the begin time of the first failed request to the end time of the last failed
request.

All "nova image-list" and "neutron router-list" requests succeeded, so the
service_outage_time of both monitors is 0 seconds.

Rationale for decisions
-----------------------
Because the service_outage_time of all monitors is 0 seconds, there were no
failed requests during the running time of this test case, so this test case
passed.

Conclusions and recommendations
-------------------------------
TC019 shows that the killed process is not automatically recovered.

There are several improvement points for the HA test:

a) Run the test cases in different environments deployed by different
   installers, such as compass4nfv, apex and joid, with different versions.
b) The period of each request is rather long, so more accurate testing is
   needed.
c) More test cases with different faults and different monitors are needed.