Configuring automatic evacuation using shared storage

  1. Open the configuration file cloud_manager.conf and adjust the following settings:

    Section [host_tasks]:

    • allow_evacuate_host = True (enables or disables host evacuation; the default value is True)
    • evacuation_retries = 2 (the number of attempts to evacuate instances from a compute node; the default value is 2)

    Note

    Also make sure that the deny_evacuate parameter does not list nodes that should be evacuated automatically. Nodes specified in this parameter are excluded from automatic evacuation.

    Section [node_tracker]:

    • enabled = True (enables checking the status of compute nodes; the default value is True)
    • max_down_hosts = N, where N ≥ 1 (the maximum allowed number of compute nodes in the down status, not counting backup nodes. If this number is exceeded, automatic evacuation is not performed for any node. Negative values are not allowed. The default value is 0, which means that automatic evacuation is not performed)
    • mutex = 3 (the number of attempts to determine the status of a hypervisor that has switched to the down status before the handler is started; the default value is 3)
    • loop_time = 30 (the interval between compute node status checks, in seconds; the default value is 30)

    Section [extra_availability_check]:

    • enabled = True (enables or disables the additional check of compute node availability through storage; the default value is False)
    • delay = 60 (the delay before retrying to read the compute node state file, in seconds; the default value is 60)
    • attempts = 2 (the number of attempts to read the file; the default value is 2)
    • instance_rate = 100 (the required percentage of running instances; the default value is 100)

    The instance_rate parameter is used to validate the decision to evacuate instances.

    If the compute node state file was written after the node switched to the DOWN status, the percentage of running instances on the node is evaluated.

    If the percentage of running instances on the compute node is lower than the percentage specified in the configuration file, the DOWN status is considered incorrect and is handled according to the standard algorithm.
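
    Taken together, the settings above might look as follows in cloud_manager.conf. This is a minimal sketch, assuming the file uses the usual INI layout of [section] headers followed by key = value lines with # comments. The value max_down_hosts = 1 is only an illustrative choice; all other values are the ones listed above.

    ```ini
    [host_tasks]
    # Allow host evacuation and try each evacuation up to two times.
    allow_evacuate_host = True
    evacuation_retries = 2

    [node_tracker]
    # Check compute node status every 30 seconds.
    enabled = True
    loop_time = 30
    # Number of status checks for a down hypervisor before the handler starts.
    mutex = 3
    # Must be at least 1; the default of 0 means no automatic evacuation.
    # The value 1 is an illustrative choice, not a recommendation.
    max_down_hosts = 1

    [extra_availability_check]
    # Off by default; turn on the additional availability check through storage.
    enabled = True
    delay = 60
    attempts = 2
    instance_rate = 100
    ```

    With max_down_hosts = 1, automatic evacuation is skipped for all nodes as soon as more than one non-backup compute node is in the down status.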

  2. Restart the CloudManager module services for the configuration file changes to take effect:

    sudo -i    # switch to superuser mode
    systemctl restart aos-cloud-manager-*
    
  3. In Dashboard, create an availability storage (Administrator - Infrastructure - Availability Storages) and assign it to the required hypervisor:

    Figure: Assigning storage to hypervisor

  4. Create an instance, selecting the hypervisor to which the availability storage is assigned.

As a result, when automatic evacuation is triggered, instances are evacuated using the shared storage.