.. This work is licensed under a Creative Commons Attribution 4.0 International
.. License.
.. http://creativecommons.org/licenses/by/4.0
.. (c) OPNFV, Ericsson and others.

*************************************
Yardstick Test Case Description TC092
*************************************

+-----------------------------------------------------------------------------+
|SDN Controller resilience in HA configuration                                |
+--------------+--------------------------------------------------------------+
|test case id  | OPNFV_YARDSTICK_TC092: SDN controller resilience and high    |
|              | availability (HA) configuration                              |
+--------------+--------------------------------------------------------------+
|test purpose  | This test validates SDN controller node high availability by |
|              | verifying that there is no impact on data plane connectivity |
|              | when one SDN controller fails in an HA configuration, i.e.   |
|              | all existing configured network services (DHCP, ARP, L2,     |
|              | L3VPN, Security Groups) should continue to operate between   |
|              | the existing VMs while one SDN controller instance is        |
|              | offline and rebooting.                                       |
|              |                                                              |
|              | The test also validates that network service operations such |
|              | as creating a new VM in an existing or new L2 network remain |
|              | operational while one instance of the SDN controller is      |
|              | offline and recovers from the failure.                       |
+--------------+--------------------------------------------------------------+
|test method   | This test case:                                              |
|              | 1. fails one instance of an SDN controller cluster running   |
|              |    in an HA configuration on the OpenStack controller node   |
|              |                                                              |
|              | 2. checks that the already configured L2 connectivity        |
|              |    between existing VMs is not impacted                      |
|              |                                                              |
|              | 3. verifies that the system never loses the ability to       |
|              |    execute virtual network operations, even when the         |
|              |    failed SDN controller is still recovering                 |
+--------------+--------------------------------------------------------------+
|attackers     | In this test case, an attacker called “kill-process” is      |
|              | needed. This attacker includes three parameters:             |
|              |                                                              |
|              | 1. ``fault_type``: which is used for finding the attacker's  |
|              |    scripts. It should be set to 'kill-process' in this test  |
|              |    case.                                                     |
|              |                                                              |
|              | 2. ``process_name``: should be set to the SDN controller     |
|              |    process name                                              |
|              |                                                              |
|              | 3. ``host``: which is the name of a control node where the   |
|              |    opendaylight process is running                           |
|              |                                                              |
|              | e.g.                                                         |
|              |                                                              |
|              | - ``fault_type``: “kill-process”                             |
|              | - ``process_name``: “opendaylight-karaf” (TBD)               |
|              | - ``host``: node1                                            |
+--------------+--------------------------------------------------------------+
|monitors      | In this test case, the following monitors are needed:        |
|              | 1. ``ping_same_network_l2``: monitor pinging traffic         |
|              |    between the VMs in the same neutron network               |
|              |                                                              |
|              | 2. ``ping_external_snat``: monitor pinging traffic from the  |
|              |    VMs to external destinations (e.g. google.com)            |
|              |                                                              |
|              | 3. ``SDN controller process monitor``: a monitor checking    |
|              |    the state of a specified SDN controller process. It       |
|              |    measures the recovery time of the given process.          |
+--------------+--------------------------------------------------------------+
|operations    | In this test case, the following operations are needed:      |
|              | 1. "nova-create-instance-in_network": create a VM instance   |
|              |    in one of the existing neutron networks.                  |
+--------------+--------------------------------------------------------------+
|metrics       | In this test case, there are two metrics:                    |
|              | 1. process_recover_time: which indicates the maximum         |
|              |    time (seconds) from the process being killed to the       |
|              |    process being recovered                                   |
|              |                                                              |
|              | 2. packet_drop: the number of packets dropped by the         |
|              |    monitors using pktgen.                                    |
+--------------+--------------------------------------------------------------+
|test tool     | Developed by the project. Please see folder:                 |
|              | "yardstick/benchmark/scenarios/availability/ha_tools"        |
+--------------+--------------------------------------------------------------+
|configuration | This test case needs two configuration files:                |
|              | 1. test case file: opnfv_yardstick_tc092.yaml                |
|              |                                                              |
|              |    - Attackers: see above “attackers” description            |
|              |    - Monitors: see above “monitors” description              |
|              |    - waiting_time: which is the time (seconds) to wait       |
|              |      after the process is killed before stopping the         |
|              |      monitors                                                |
|              |    - SLA: see above “metrics” description                    |
|              |                                                              |
|              | 2. POD file: pod.yaml. The POD configuration should be       |
|              |    recorded in pod.yaml first. The “host” item in this test  |
|              |    case will use the node name in the pod.yaml.              |
+--------------+--------------------------------------------------------------+
|test sequence | Description and expected result                              |
+--------------+--------------------------------------------------------------+
|pre-action    | 1. The OpenStack cluster is set up with an SDN controller    |
|              |    running in a three node cluster configuration.            |
|              |                                                              |
|              | 2. One or more neutron networks are created with two or      |
|              |    more VMs attached to each of the neutron networks.        |
|              |                                                              |
|              | 3. The neutron networks are attached to a neutron router     |
|              |    which is attached to an external network towards the      |
|              |    Internet.                                                 |
|              |                                                              |
|              | 4. The master node of the SDN controller cluster is known.   |
+--------------+--------------------------------------------------------------+
|step 1        | Start IP connectivity monitors:                              |
|              | 1. Check the L2 connectivity between the VMs in the same     |
|              |    neutron network.                                          |
|              |                                                              |
|              | 2. Check the external connectivity of the VMs.               |
|              |                                                              |
|              | Each monitor runs in an independent process.                 |
|              |                                                              |
|              | Result: The monitor info will be collected.                  |
+--------------+--------------------------------------------------------------+
|step 2        | Start attacker:                                              |
|              | SSH to the VIM node and kill the SDN controller process      |
|              | on the node identified in pre-action step 4.                 |
|              |                                                              |
|              | Result: One SDN controller service will be shut down.        |
+--------------+--------------------------------------------------------------+
|step 3        | Restart the SDN controller.                                  |
+--------------+--------------------------------------------------------------+
|step 4        | Create a new VM in the existing Neutron network while the    |
|              | SDN controller is offline or still recovering.               |
+--------------+--------------------------------------------------------------+
|step 5        | Stop IP connectivity monitors after a period of time         |
|              | specified by “waiting_time”.                                 |
|              |                                                              |
|              | Result: The monitor info will be aggregated.                 |
+--------------+--------------------------------------------------------------+
|step 6        | Verify the IP connectivity monitor result.                   |
|              |                                                              |
|              | Result: The IP connectivity monitor should not have any      |
|              | packet drop failures reported.                               |
+--------------+--------------------------------------------------------------+
|step 7        | Verify that process_recover_time, which indicates the        |
|              | maximum time (seconds) from the process being killed to      |
|              | recovered, is within the SLA. This step blocks until either  |
|              | the process has recovered or a timeout occurs.               |
|              |                                                              |
|              | Result: process_recover_time is within the SLA limits; if    |
|              | not, the test case fails and stops.                          |
+--------------+--------------------------------------------------------------+
|step 8        | Start IP connectivity monitors for the new VM:               |
|              | 1. Check the L2 connectivity from the existing VMs to the    |
|              |    new VM in the Neutron network.                            |
|              |                                                              |
|              | 2. Check connectivity from one VM to an external host on     |
|              |    the Internet to verify SNAT functionality.                |
|              |                                                              |
|              | Result: The monitor info will be collected.                  |
+--------------+--------------------------------------------------------------+
|step 9        | Stop IP connectivity monitors after a period of time         |
|              | specified by “waiting_time”.                                 |
|              |                                                              |
|              | Result: The monitor info will be aggregated.                 |
+--------------+--------------------------------------------------------------+
|step 10       | Verify the IP connectivity monitor result.                   |
|              |                                                              |
|              | Result: The IP connectivity monitor should not have any      |
|              | packet drop failures reported.                               |
+--------------+--------------------------------------------------------------+
|test verdict  | Fails only if the SLA is not met, or if there is a test      |
|              | case execution problem.                                      |
+--------------+--------------------------------------------------------------+
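
The attacker, monitor and SLA settings described above come together in the
scenario section of the test case file. The fragment below is only an
illustrative sketch of how such a definition could look; the key names follow
the conventions of Yardstick availability scenarios, and the concrete process
name, node name and SLA value are assumptions that must match the actual
deployment and the pod.yaml::

  ---
  # Illustrative sketch only: the process name, node name and SLA value
  # are assumptions and must match the deployment and the pod.yaml.
  scenarios:
  -
    type: "GeneralHA"
    options:
      attackers:
        - fault_type: "kill-process"
          process_name: "opendaylight-karaf"  # SDN controller process (TBD)
          host: node1                         # node name from pod.yaml
      monitors:
        - monitor_type: "process"             # yields process_recover_time
          process_name: "opendaylight-karaf"
          host: node1
          sla:
            max_recover_time: 30              # seconds, assumed SLA value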