X-Git-Url: https://gerrit.opnfv.org/gerrit/gitweb?p=doctor.git;a=blobdiff_plain;f=docs%2Frelease%2Fscenarios%2Fmaintenance%2Fmaintenance.rst;fp=docs%2Fdevelopment%2Foverview%2Ffunctest_scenario%2Fdoctor-scenario-in-functest.rst;h=ecfe76b12f7031db0f02f6ce5d63dfc638228c96;hp=4505dd8fbc1d5f27fb3f687ee8485c7d4d7f870a;hb=72a1f8c92f1692f1ea8dcb5bc706ec9939c30e0a;hpb=6ff11513a0d3728c79033af623c79dd6df7c621e
diff --git a/docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst b/docs/release/scenarios/maintenance/maintenance.rst
similarity index 51%
rename from docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst
rename to docs/release/scenarios/maintenance/maintenance.rst
index 4505dd8f..ecfe76b1 100644
--- a/docs/development/overview/functest_scenario/doctor-scenario-in-functest.rst
+++ b/docs/release/scenarios/maintenance/maintenance.rst
@@ -2,142 +2,6 @@
 .. http://creativecommons.org/licenses/by/4.0
 
 
-
-Platform overview
-"""""""""""""""""
-
-Doctor platform provides these features since `Danube Release `_:
-
-* Immediate Notification
-* Consistent resource state awareness for compute host down
-* Valid compute host status given to VM owner
-
-These features enable high availability of Network Services on top of
-the virtualized infrastructure. Immediate notification allows VNF managers
-(VNFM) to process recovery actions promptly once a failure has occurred.
-Same framework can also be utilized to have VNFM awareness about
-infrastructure maintenance.
-
-Consistency of resource state is necessary to execute recovery actions
-properly in the VIM.
-
-Ability to query host status gives VM owner the possibility to get
-consistent state information through an API in case of a compute host
-fault.
-
-The Doctor platform consists of the following components:
-
-* OpenStack Compute (Nova)
-* OpenStack Networking (Neutron)
-* OpenStack Telemetry (Ceilometer)
-* OpenStack Alarming (AODH)
-* Doctor Sample Inspector, OpenStack Congress or OpenStack Vitrage
-* Doctor Sample Monitor or any monitor supported by Congress or Vitrage
-
-.. note::
-   Doctor Sample Monitor is used in Doctor testing. However in real
-   implementation like Vitrage, there are several other monitors supported.
-
-You can see an overview of the Doctor platform and how components interact in
-:numref:`figure-p1`.
-
-.. figure:: ./images/Fault-management-design.png
-   :name: figure-p1
-   :width: 100%
-
-   Doctor platform and typical sequence
-
-Detailed information on the Doctor architecture can be found in the Doctor
-requirements documentation:
-http://artifacts.opnfv.org/doctor/docs/requirements/05-implementation.html
-
-Running test cases
-""""""""""""""""""
-
-Functest will call the "doctor_tests/main.py" in Doctor to run the test job.
-Doctor testing can also be triggered by tox on OPNFV installer jumphost. Tox
-is normally used for functional, module and coding style testing in Python
-project.
-
-Currently, 'Apex', 'MCP' and 'local' installer are supported.
-
-
-Fault management use case
-"""""""""""""""""""""""""
-
-* A consumer of the NFVI wants to receive immediate notifications about faults
-  in the NFVI affecting the proper functioning of the virtual resources.
-  Therefore, such faults have to be detected as quickly as possible, and, when
-  a critical error is observed, the affected consumer is immediately informed
-  about the fault and can switch over to the STBY configuration.
-
-The faults to be monitored (and at which detection rate) will be configured by
-the consumer. Once a fault is detected, the Inspector in the Doctor
-architecture will check the resource map maintained by the Controller, to find
-out which virtual resources are affected and then update the resources state.
-The Notifier will receive the failure event requests sent from the Controller,
-and notify the consumer(s) of the affected resources according to the alarm
-configuration.
-
-Detailed workflow information is as follows:
-
-* Consumer(VNFM): (step 0) creates resources (network, server/instance) and an
-  event alarm on state down notification of that server/instance or Neutron
-  port.
-
-* Monitor: (step 1) periodically checks nodes, such as ping from/to each
-  dplane nic to/from gw of node, (step 2) once it fails to send out event
-  with "raw" fault event information to Inspector
-
-* Inspector: when it receives an event, it will (step 3) mark the host down
-  ("mark-host-down"), (step 4) map the PM to VM, and change the VM status to
-  down. In network failure case, also Neutron port is changed to down.
-
-* Controller: (step 5) sends out instance update event to Ceilometer. In network
-  failure case, also Neutron port is changed to down and corresponding event is
-  sent to Ceilometer.
-
-* Notifier: (step 6) Ceilometer transforms and passes the events to AODH,
-  (step 7) AODH will evaluate events with the registered alarm definitions,
-  then (step 8) it will fire the alarm to the "consumer" who owns the
-  instance
-
-* Consumer(VNFM): (step 9) receives the event and (step 10) recreates a new
-  instance
-
-Fault management test case
-""""""""""""""""""""""""""
-
-Functest will call the 'doctor-test' command in Doctor to run the test job.
-
-The following steps are executed:
-
-Firstly, get the installer ip according to the installer type. Then ssh to
-the installer node to get the private key for accessing to the cloud. As
-'fuel' installer, ssh to the controller node to modify nova and ceilometer
-configurations.
-
-Secondly, prepare image for booting VM, then create a test project and test
-user (both default to doctor) for the Doctor tests.
-
-Thirdly, boot a VM under the doctor project and check the VM status to verify
-that the VM is launched completely. Then get the compute host info where the VM
-is launched to verify connectivity to the target compute host. Get the consumer
-ip according to the route to compute ip and create an alarm event in Ceilometer
-using the consumer ip.
-
-Fourthly, the Doctor components are started, and, based on the above preparation,
-a failure is injected to the system, i.e. the network of compute host is
-disabled for 3 minutes. To ensure the host is down, the status of the host
-will be checked.
-
-Finally, the notification time, i.e. the time between the execution of step 2
-(Monitor detects failure) and step 9 (Consumer receives failure notification)
-is calculated.
-
-According to the Doctor requirements, the Doctor test is successful if the
-notification time is below 1 second.
-
 Maintenance use case
 """"""""""""""""""""
 
@@ -249,7 +113,8 @@ After all computes are maintained, `admin tool` can send `MAINTENANCE_COMPLETE`
 to tell maintenance/upgrade is now complete. For `app manager` this means he
 can scale back to full capacity.
 
-This is the current sample implementation and test case. Real life
-implementation is started in OpenStack Fenix project and there we should
-eventually address requirements more deeply and update the test case with Fenix
-implementation.
+There is currently sample implementation on VNFM and test case. In
+infrastructure side there is sample implementation of 'admin_tool' and
+there is also support for the OpenStack Fenix that extends the use case to
+support 'ETSI FEAT03' for VNFM interaction and to optimize the whole
+infrastructure maintenance and upgrade.