Merge "Use TRex release v2.41 to support both x86 and aarch64"
[yardstick.git] / docs / testing / user / userguide / opnfv_yardstick_tc092.rst
.. This work is licensed under a Creative Commons Attribution 4.0 International
.. License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) OPNFV, Ericsson and others.

*************************************
Yardstick Test Case Description TC092
*************************************

+-----------------------------------------------------------------------------+
|SDN Controller resilience in HA configuration                                |
+--------------+--------------------------------------------------------------+
|test case id  | OPNFV_YARDSTICK_TC092: SDN controller resilience and high    |
|              | availability in HA configuration                             |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|test purpose  | This test validates SDN controller node high availability by |
|              | verifying there is no impact on the data plane connectivity  |
|              | when one SDN controller fails in an HA configuration, i.e.   |
|              | all existing configured network services (DHCP, ARP, L2,     |
|              | L3VPN, Security Groups) should continue to operate between   |
|              | the existing VMs while one SDN controller instance is        |
|              | offline and rebooting.                                       |
|              |                                                              |
|              | The test also validates that network service operations such |
|              | as creating a new VM in an existing or new L2 network remain |
|              | operational while one instance of the SDN controller is      |
|              | offline and recovering from the failure.                     |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|test method   | This test case:                                              |
|              |  1. fails one instance of an SDN controller cluster running  |
|              |     in an HA configuration on the OpenStack controller node  |
|              |                                                              |
|              |  2. checks that the already configured L2 connectivity       |
|              |     between existing VMs is not impacted                     |
|              |                                                              |
|              |  3. verifies that the system never loses the ability to      |
|              |     execute virtual network operations, even when the        |
|              |     failed SDN controller is still recovering                |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|attackers     | In this test case, an attacker called “kill-process” is      |
|              | needed. This attacker includes three parameters:             |
|              |  1. ``fault_type``: which is used for finding the attacker's |
|              |     scripts. It should be set to 'kill-process' in this test |
|              |                                                              |
|              |  2. ``process_name``: should be set to the name of the SDN   |
|              |     controller process                                       |
|              |                                                              |
|              |  3. ``host``: which is the name of a control node where the  |
|              |     OpenDaylight process is running                          |
|              |                                                              |
|              | example:                                                     |
|              |   - ``fault_type``: “kill-process”                           |
|              |   - ``process_name``: “opendaylight-karaf” (TBD)             |
|              |   - ``host``: node1                                          |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|monitors      | In this test case, the following monitors are needed:        |
|              |  1. ``ping_same_network_l2``: monitors ping traffic between  |
|              |     the VMs in the same neutron network                      |
|              |                                                              |
|              |  2. ``ping_external_snat``: monitors ping traffic from the   |
|              |     VMs to external destinations (e.g. google.com)           |
|              |                                                              |
|              |  3. ``SDN controller process monitor``: a monitor checking   |
|              |     the state of a specified SDN controller process. It      |
|              |     measures the recovery time of the given process.         |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|operations    | In this test case, the following operations are needed:      |
|              |  1. "nova-create-instance-in_network": create a VM instance  |
|              |     in one of the existing neutron networks.                 |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|metrics       | In this test case, there are two metrics:                    |
|              |  1. process_recover_time: which indicates the maximum        |
|              |     time (seconds) from the process being killed until it    |
|              |     is recovered                                             |
|              |                                                              |
|              |  2. packet_drop: the number of packets dropped, as measured  |
|              |     by the monitors using pktgen.                            |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|test tool     | Developed by the project. Please see folder:                 |
|              | "yardstick/benchmark/scenarios/availability/ha_tools"        |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|references    | TBD                                                          |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|configuration | This test case needs two configuration files:                |
|              |  1. test case file: opnfv_yardstick_tc092.yaml               |
|              |     - Attackers: see above “attackers” description           |
|              |     - Monitors: see above “monitors” description             |
|              |       - waiting_time: which is the time (seconds) from       |
|              |         the process being killed to stopping the monitors    |
|              |     - SLA: see above “metrics” description                   |
|              |                                                              |
|              |  2. POD file: pod.yaml. The POD configuration should be      |
|              |     recorded in pod.yaml first; the “host” item in this      |
|              |     test case uses the node name defined in pod.yaml.        |
|              |                                                              |
|              | An illustrative sketch of both files is shown below the      |
|              | table.                                                       |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|test sequence | Description and expected result                              |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|pre-action    |  1. The OpenStack cluster is set up with an SDN controller   |
|              |     running in a three-node cluster configuration.           |
|              |                                                              |
|              |  2. One or more neutron networks are created with two or     |
|              |     more VMs attached to each of the neutron networks.       |
|              |                                                              |
|              |  3. The neutron networks are attached to a neutron router    |
|              |     which is attached to an external network towards the     |
|              |     DCGW.                                                    |
|              |                                                              |
|              |  4. The master node of the SDN controller cluster is known.  |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 1        | Start IP connectivity monitors:                              |
|              |  1. Check the L2 connectivity between the VMs in the same    |
|              |     neutron network.                                         |
|              |                                                              |
|              |  2. Check the external connectivity of the VMs.              |
|              |                                                              |
|              | Each monitor runs in an independent process.                 |
|              |                                                              |
|              | Result: The monitor info will be collected.                  |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 2        | Start attacker:                                              |
|              | SSH to the VIM node and kill the SDN controller process      |
|              | running on the master node identified in the pre-action.     |
|              |                                                              |
|              | Result: One SDN controller service will be shut down         |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 3        | Restart the SDN controller.                                  |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 4        | Create a new VM in the existing neutron network while the    |
|              | SDN controller is offline or still recovering.               |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 5        | Stop IP connectivity monitors after a period of time         |
|              | specified by “waiting_time”                                  |
|              |                                                              |
|              | Result: The monitor info will be aggregated                  |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 6        | Verify the IP connectivity monitor result                    |
|              |                                                              |
|              | Result: IP connectivity monitor should not have any packet   |
|              | drop failures reported                                       |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 7        | Verify that process_recover_time, which indicates the        |
|              | maximum time (seconds) from the process being killed until   |
|              | it is recovered, is within the SLA. This step blocks until   |
|              | either the process has recovered or a timeout occurs.        |
|              |                                                              |
|              | Result: process_recover_time is within SLA limits; if not,   |
|              | the test case fails and stops.                               |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 8        | Start IP connectivity monitors for the new VM:               |
|              |  1. Check the L2 connectivity from the existing VMs to the   |
|              |     new VM in the neutron network.                           |
|              |                                                              |
|              |  2. Check connectivity from one VM to an external host on    |
|              |     the Internet to verify SNAT functionality.               |
|              |                                                              |
|              | Result: The monitor info will be collected.                  |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 9        | Stop IP connectivity monitors after a period of time         |
|              | specified by “waiting_time”                                  |
|              |                                                              |
|              | Result: The monitor info will be aggregated                  |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|step 10       | Verify the IP connectivity monitor result                    |
|              |                                                              |
|              | Result: IP connectivity monitor should not have any packet   |
|              | drop failures reported                                       |
|              |                                                              |
+--------------+--------------------------------------------------------------+
|test verdict  | Fails only if SLA is not passed, or if there is a test case  |
|              | execution problem.                                           |
|              |                                                              |
+--------------+--------------------------------------------------------------+

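The sketch below illustrates how the attacker, the SDN controller process
monitor and the corresponding SLA described above could be expressed in
``opnfv_yardstick_tc092.yaml``, together with a matching node entry in
``pod.yaml``. It is only a minimal sketch that follows the layout of the
simpler yardstick HA test cases; the scenario type, process name, node name,
timing values and node credentials are illustrative placeholders, not the
definitive configuration of this test case::

    scenarios:
    -
      # Scenario type of the simpler HA test cases; TC092 may use a
      # different type that also defines the ping monitors and the
      # "nova-create-instance-in_network" operation described above.
      type: ServiceHA
      options:
        attackers:
          - fault_type: "kill-process"      # attacker described above
            process_name: "opendaylight"    # placeholder SDN controller process
            host: node1                     # node name taken from pod.yaml
        monitors:
          - monitor_type: "process"         # SDN controller process monitor
            process_name: "opendaylight"
            host: node1
            monitor_time: 30                # placeholder waiting_time (seconds)
            sla:
              max_recover_time: 30          # limit on process_recover_time
      nodes:
        node1: node1.LF
      runner:
        type: Iteration
        iterations: 1
      sla:
        action: monitor

A matching, equally illustrative node entry in ``pod.yaml`` could look like::

    nodes:
    -
      name: node1
      role: Controller
      ip: 192.168.10.10                     # placeholder management IP
      user: root
      key_filename: /root/.ssh/id_rsa       # placeholder SSH credentials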