Availability storages

Availability storages provide an additional check of whether a Nova-compute status change is correct, and of the percentage of running instances on a compute node, when the node transitions from the up status to the down status. Once an availability storage has been created and assigned, the Agent module collects data about the state of the instances on the host and writes it to a file in the storage. The storage contains a directory for the compute node and a directory for the host on which CloudManager is installed (the controller). If an additional availability check is configured in the configuration file of the CloudManager module, then when the status of a compute node changes from up to down and no operation has been performed on the node with the power management tools, an additional availability check is triggered through the storage shared by the compute node and the host on which CloudManager is installed. The check is carried out in stages:

  • The CloudManager module attempts to read data from the file in a storage that is also connected to the compute node, where the Agent module collects the instance state data;
  • If the attempt to read the data is unsuccessful, CloudManager tries to get the data from the other storages connected to the virtual host;
  • If the data could not be read from any of the availability storages, the data is re-read from all availability storages connected to the compute node after a delay. The delay time is determined by the value of the DELAY parameter in the configuration file. Once the number of read attempts exceeds the value of the ATTEMPTS parameter in the configuration file, the compute node is considered to have been turned off incorrectly and is processed according to the standard scenario;
  • If the data was read successfully but the file was written before the node transitioned from up to down, attempts are made to read data from the other availability storages connected to the compute node. If the data was read successfully and the file was written after the compute node transitioned from up to down, the percentage of running instances on the host is calculated;
  • If the percentage of running instances on the node is greater than or equal to the value specified in the configuration file, the status change of the compute node from up to down is considered correct. The node is not added to the list of lost nodes, and no additional actions are required for it;
  • If the percentage of running instances on the node is less than the value from the configuration file, the status change of the node is considered invalid. The compute node is marked as lost and is counted when the number of nodes in the down status is compared with the value of the MAX_DOWN_HOSTS parameter in the configuration file.
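
The staged check described above can be illustrated with a minimal Python sketch. It is not the CloudManager implementation: the names StateRecord, Verdict, find_fresh_record and check_node, the record fields, and the threshold_percent argument are hypothetical and were chosen only for illustration. The sketch also makes assumptions that the description does not spell out: a pass in which every storage returns only outdated data is retried in the same way as a pass with no readable data, the loop makes exactly ATTEMPTS passes, and a node with no instances counts as 0 percent.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Sequence


class Verdict(Enum):
    STANDARD_SCENARIO = "standard_scenario"  # no usable data after all attempts
    NOT_LOST = "not_lost"                    # percentage >= configured value
    LOST = "lost"                            # percentage < configured value


@dataclass
class StateRecord:
    """Hypothetical content of the file written by the Agent module."""
    written_at: float  # time the file was written (epoch seconds)
    running: int       # instances reported as running
    total: int         # all instances on the node


# A reader returns the record from one storage, or None if the read failed.
Reader = Callable[[], Optional[StateRecord]]


def find_fresh_record(readers: Sequence[Reader],
                      down_at: float) -> Optional[StateRecord]:
    """Try each storage in turn; accept only data written after the
    node went from up to down."""
    for read in readers:
        record = read()
        if record is not None and record.written_at > down_at:
            return record
    return None


def check_node(readers: Sequence[Reader], down_at: float,
               threshold_percent: float, attempts: int, delay: float,
               sleep: Callable[[float], None] = time.sleep) -> Verdict:
    """Run the staged check; attempts and delay stand for the ATTEMPTS
    and DELAY parameters of the configuration file."""
    for attempt in range(attempts):
        record = find_fresh_record(readers, down_at)
        if record is not None:
            # Assumption: a node without instances counts as 0 percent.
            percent = 100.0 * record.running / record.total if record.total else 0.0
            return Verdict.NOT_LOST if percent >= threshold_percent else Verdict.LOST
        if attempt + 1 < attempts:
            sleep(delay)  # wait before re-reading all storages
    return Verdict.STANDARD_SCENARIO
```

In this sketch each storage is represented by a callable, so a failed read and an outdated file are handled in one place. A node that receives the LOST verdict would then be counted against MAX_DOWN_HOSTS, which the sketch leaves out.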