Setting up environment

Manual installation and configuration of Sanlock in the OpenStack cloud

Currently, OpenStack Cinder allows connecting iSCSI block devices; a separate LUN and a separate connection on the iSCSI initiator side are created for each such device.

LVM and sanlock provide an alternative solution: a LUN is connected once to all compute nodes, the block device for an instance is allocated by creating a logical volume on that LUN, and, thanks to sanlock locks, the block device is attached to a specific instance or (with multiple connections) to several instances.

LUN connection

At this point, you need a dedicated LUN on your NAS. For authorization, access to the dedicated LUN is granted by initiator name. The initiator name must be unique for each server; it is stored in the file /etc/iscsi/initiatorname.iscsi and by default contains a random combination of characters.

If the directory /etc/iscsi does not exist, then the package open-iscsi must be installed:

apt-get update
apt-get install open-iscsi -y

Submit the names of all initiators to the system administrator or enter them in the settings of the dedicated LUN.
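
The current initiator name can be read directly from this file; the value shown below is only an example, the actual IQN will differ on every server:

cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1993-08.org.debian:01:8a2b3c4d5e6f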

Getting available targets

The next step is getting available targets from the network storage:

iscsiadm -m discovery -t sendtargets -p <san_ip>:<port>

Where: <san_ip> is the name or IP address of the NAS; <port> is the TCP port for connecting to the NAS (3260 by default; in this case it is optional).

This command returns the list of targets available to the initiator registered on this server. If the network storage has several addresses, obtain the list of available targets from each address.
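
The output of the discovery command looks approximately as follows (the address and target name here are placeholders):

iscsiadm -m discovery -t sendtargets -p 192.168.10.20
192.168.10.20:3260,1 iqn.2015-11.com.example:storage.lun1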

Attaching the target

A target is connected with the command:

iscsiadm -m node --targetname <target_name> -p <san_ip>:<port> --login

Where: <target_name> is the name of the target obtained in the previous step; <san_ip>:<port> are the name and port of the NAS.

Target names for the same LUN may be the same or different for different storage addresses. In any case, connect the LUN using the target name that is available at the given NAS address.

Repeat the command for each NAS name or address and for each target provided at that address.

As a result, one or more new block devices with names of the form /dev/sdX should appear in the system.
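
The new devices and the iSCSI sessions they belong to can be listed, for example, with:

lsblk
iscsiadm -m session -P 3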

Installing multipath

  1. Installing package:

    apt-get install -y multipath-tools
    
  2. Loading kernel modules:

    modprobe dm-multipath
    modprobe dm-round-robin
    
  3. Creating configuration file /etc/multipath.conf:

    defaults {
            polling_interval        10
            path_selector           "round-robin 0"
            path_grouping_policy    failover
            prio                    const
            path_checker            readsector0
            rr_min_io               100
            max_fds                 8192
            rr_weight               priorities
            failback                immediate
            no_path_retry           10
            user_friendly_names     yes
            find_multipaths         no
            verbosity               2
    }
    
    blacklist {
           devnode "^(ram|nvme|drbd|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
           devnode "^hd[a-z]"
           devnode "^vd[a-z]"
           devnode "^rbd*"
    }
    
    devices {
            device {
                    vendor "IET"
                    product "VIRTUAL-DISK"
            }
    }
    
  4. Starting the service:

    systemctl start multipathd
    
  5. Enabling the service to start automatically when the operating system boots:

    systemctl enable multipathd
    
  6. Checking that multipath has combined the paths to the network drive:

    multipath -ll
    sanlock (3600c0ff00028976702fc3c6001000000) dm-0 HP,MSA 2040 SAN
    size=279G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=50 status=active
    | `- 7:0:0:1 sdf 8:80 active ready running
    `-+- policy='round-robin 0' prio=10 status=enabled
     `- 6:0:0:1 sdd 8:48 active ready running
    
  7. In the command output above, the block device has the non-standard name sanlock (instead of mpathX) and is available in the system as /dev/mapper/sanlock. This is done for convenience by adding the following lines to the configuration file /etc/multipath.conf:

    multipaths {
            multipath {
                    wwid                  3600c0ff00028976702fc3c6001000000
                    alias                 sanlock
            }
    }
    

    Redefining the name in this way ensures correct access to the block device, since after the next system boot the standard name /dev/mapper/mpathX may change if there are several network volumes in the system.
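
    After changing /etc/multipath.conf, the configuration must be re-read for the alias to take effect; the WWID for the multipaths section is taken from the output of multipath -ll shown above. One way to apply and verify the change:

    systemctl reload multipathd
    multipath -ll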

Setting up LVM and sanlock

  1. Installing packages:

    apt-get install lvm2-lockd sanlock lvm2
    
  2. Starting the service:

    systemctl restart sanlock
    

Setting up LVM

  1. Change the following parameters in the file /etc/lvm/lvm.conf:

    ...
    filter = [ "a|^/dev/mapper/sanlock$|", "r|.*/|" ]
    ...
    use_lvmlockd = 1
    ...
    

    The filter parameter restricts the LVM scan for volumes to the /dev/mapper/sanlock device only. If you use LVM on other block devices, they must be specified in the filter explicitly; this also keeps LVM from examining the LVM metadata inside the logical block devices used by instances.

    The parameter use_lvmlockd = 1 enables interaction with the lvmlockd service.

  2. Change the following parameter in the file /etc/lvm/lvmlocal.conf:

    ...
    host_id = 11
    ...
    

    This parameter must have a unique numeric value for each system and must be between 1 and 2000. The value is used when creating locks.

  3. Restarting the services:

    systemctl restart lvmlockd lvmlocks
    

Setting up LVM and sanlock is complete.
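
Before continuing, you can verify that the lock daemons are running, for example:

systemctl status sanlock lvmlockd
sanlock client status

At this point, sanlock client status only needs to show the running daemon; lockspaces will appear after the shared volume group is created and started.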

Creating logical volume group

The logical volume group is created once, on any of the servers to which the LUN is connected.

  1. Preparing a block device:

    pvcreate /dev/mapper/sanlock
    
  2. Creating logical volume group:

    vgcreate --shared --lock-type sanlock vol /dev/mapper/sanlock
    

    This creates the shared volume group vol, which uses sanlock locks, on the block device /dev/mapper/sanlock.

  3. Switching the volume group (VG) to lock mode is performed with the command:

    vgchange --lock-start vol
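
    Whether the group was created as shared can then be checked, for example, with:

    vgs vol

    A shared volume group shows the letter s in the sixth position of the Attr column (for example, wz--ns).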
    

Creating an environment for correct system startup

At this stage, system settings are created so that services start in the required sequence when the operating system boots, namely:

  • sanlock;
  • starting the logical volume group with lock support;
  • starting aos-agent;
  • starting the OpenStack services that use the logical volume group.

  1. Modifying the startup of aos-agent:

    mkdir -p /etc/systemd/system/aos-agent.service.d
    
  2. Creating file /etc/systemd/system/aos-agent.service.d/override.conf:

    [Unit]
    After=start-lvm-lock.service sanlock.service lvmlocks.service
    
    [Service]
    ExecStartPre=/bin/sleep 10
    
  3. Modifying the startup of nova-compute (on cloud compute nodes):

    mkdir -p /etc/systemd/system/nova-compute.service.d
    
  4. Creating file /etc/systemd/system/nova-compute.service.d/override.conf:

    [Unit]
    After=neutron-openvswitch-agent.service start-lvm-lock.service aos-agent.service
    
    [Service]
    ExecStartPre=/bin/sleep 60
    
  5. Modifying the startup of cinder-volume (on the nodes where this service is installed):

    mkdir -p /etc/systemd/system/cinder-volume.service.d
    
  6. Creating file /etc/systemd/system/cinder-volume.service.d/override.conf:

    [Unit]
    After=neutron-openvswitch-agent.service start-lvm-lock.service aos-agent.service
    
    [Service]
    ExecStartPre=/bin/sleep 60
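
    systemd re-reads drop-in files only after a daemon reload (or a reboot), so once all the override.conf files have been created, run:

    systemctl daemon-reload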
    

Creating the start-lvm-lock.service service that starts the logical volume group(s)

  1. Creating file /usr/local/bin/lvm-lock.sh:

    #!/bin/bash
    # Wait until the volume group vol becomes visible to LVM.
    while ! vgs --noheadings -o vg_name vol >/dev/null 2>&1
    do
        sleep 5
    done

    # Wait until every physical volume of the group appears as a block device.
    for pv in $(pvs --noheadings -o pv_name,vg_name | awk '$2 == "vol" {print $1}')
    do
        while ! [ -e "$pv" ]
        do
            sleep 5
        done
    done

    # Switch the volume group to lock mode.
    vgchange --lock-start vol

    Make the file executable:

    chmod +x /usr/local/bin/lvm-lock.sh
    

    This script waits for the block devices on which the logical volume group is located to appear in the system, and then switches the volume group to lock mode.

    If there are several shared VGs in the system, duplicate this code for each VG, replacing vol with the name of the other VG.

  2. Creating the service file /etc/systemd/system/start-lvm-lock.service:

    [Unit]
    After=lvmlockd.service lvmlocks.service sanlock.service iscsi.service iscsid.service
    
    [Service]
    Type=oneshot
    ExecStart=/usr/local/bin/lvm-lock.sh
    
    [Install]
    WantedBy=multi-user.target
    
  3. Starting service:

    systemctl start start-lvm-lock.service
    
  4. Enabling the automatic start of the service when the operating system boots:

    systemctl enable start-lvm-lock.service
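
    Whether the lockspace actually starts after a reboot can be checked, for example, with:

    journalctl -u start-lvm-lock.service
    sanlock client status

    The output of sanlock client status should contain a lockspace line of the form s lvm_...:...:/dev/mapper/vol-lvmlock:0 (see the diagnostics in the error-correction section below).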
    

Creating a new type of Cinder volumes located on shared LVM

  1. On the node where the cinder-volume service is running, edit the file /etc/cinder/cinder.conf:

    [DEFAULT]
    rpc_response_timeout = 2888
    …
    enabled_backends = sanlock
    default_volume_type = sanlock
    ...
    [sanlock]
    volume_driver = cinder_sharedlvm_driver.driver.SharedLVMDriver
    agent_transport_url = amqp://aos:password@controller:5672/aos
    volume_group = vol
    lvm_type = default
    lvm_mirrors = 0
    target_protocol = iscsi
    target_helper = tgtadm
    volume_backend_name=sanlock
    volume_clear=none
    volume_clear_size=100
    agent_response_timeout = 2888
    
    
    [nova]
    token_auth_url = http://controller:5000
    auth_section = keystone_authtoken
    auth_type = password
    
  2. Add the new backend to the enabled_backends parameter (in the example it is the only one).

  3. Redefine the default_volume_type parameter if needed or if it was not previously defined.

  4. Replace the RabbitMQ login and password with the values used in your cloud. Change the name controller to the name of your cloud controller.

  5. Creating volume type sanlock:

    cinder type-create sanlock
    
  6. Binding volumes created by the backend named sanlock to the sanlock type:

    openstack volume type set sanlock --property volume_backend_name=sanlock
    
  7. Creating a consistency group:

    cinder consisgroup-create --name sanlock sanlock
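
    After editing /etc/cinder/cinder.conf, the cinder-volume service must be restarted so that the new backend is picked up. The setup can then be checked by creating a small test volume of the new type (the volume name and size below are arbitrary):

    systemctl restart cinder-volume
    openstack volume type list
    openstack volume create --type sanlock --size 1 test-sanlock
    openstack volume show test-sanlock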
    

Correction of possible errors

The causes of errors during operation can vary, but one way or another they are related to loss of access to the lock partition, as a result of which the logical volume group switches to operation without locks, and this makes the OpenStack services inoperable.

To prevent such errors, it is recommended to review the iSCSI connection settings, in particular the timeouts, so that LVM does not receive errors from the network side about block device unavailability during short-term failures in the local network.

If an error does occur, the lock operating mode must be restored.

If the compute node has no logical disks currently in use, then to restore operation of the volume group in locked mode, execute the command:

vgchange --lock-start vol

where vol is the name of the logical volume group.

However, the presence of active locks will prevent this command from completing successfully.

Diagnosing the presence of active locks:

sanlock client status
daemon b83efaba-0b75-4b30-9b61-9d78195468c3.cn-193980.
p -1 helper
p -1 listener
p 1010 lvmlockd
p -1 status
s lvm_sanlock:212:/dev/mapper/vol-lvmlock:0
r lvm_sanlock:i2gjby-VEc6-APGD-x9if-ldk2-oxxI-XOwExz:/dev/mapper/vol-lvmlock:77594624:4 p 1010
r lvm_sanlock:PDHtUW-8i1O-ECi4-cxmj-pSAF-hZs2-MVqmcF:/dev/mapper/vol-lvmlock:70254592:2 p 1010

In this case, the system uses two logical volumes, information about which is stored on the device /dev/mapper/vol-lvmlock.

To deactivate the locks, you must first stop using the disks.

Getting the list of running instances:

virsh list
Id   Name                State
-----------------------------------
2    instance-00000005   running
4    instance-00000007   running
6    instance-0000000d   running

Determining the list of available logical volumes:

lvs -a
 LV                                          VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
 [lvmlock]                                   vol     -wi-ao---- 256,00m
 volume-0d6c493f-697f-48d5-a2dd-8d7aa2c5542c vol     -wi-ao----  60,00g
 volume-697bd726-e4ae-4c6f-8180-f7dee51421cf vol     -wi-------  60,00g
 volume-fe496e6c-0926-4f81-8439-753f42822019 vol     -wi-ao----  30,00g

Determining which instances are using the logical volumes:

virsh dumpxml instance-00000005
   <disk type='block' device='disk'>
     <driver name='qemu' type='raw' cache='none'/>
     <source dev='/dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019'/>
     <backingStore/>
     <target dev='vda' bus='virtio'/>
     <serial>fe496e6c-0926-4f81-8439-753f42822019</serial>
     <alias name='virtio-disk0'/>
     <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
   </disk>

If a logical volume of the shared group is specified in the configuration, this instance must be shut down. Save the name of the logical volume in use; it will be needed later to restore the instance.

Shutting down the instance:

virsh destroy instance-00000005

After shutting down all instances, remove the locks from the volumes and deactivate them:

lvchange -aln /dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019

The volume name for this command was obtained earlier from the diagnostic results.

This command must be executed for all volumes activated in the system.
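
If many logical volumes are active, the deactivation can be scripted; the sketch below assumes that all instances using volumes of the vol group have already been stopped:

# locally deactivate every logical volume of the vol group
for lv in $(lvs --noheadings -o lv_path vol)
do
    lvchange -aln "$lv"
done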

After deactivating all logical volumes, restart the lock services and re-enable the use of locks for the group:

systemctl restart sanlock lvmlockd
vgchange --lock-start vol

The deactivated volumes (and their locks) must be restored before resuming the instances:

lvchange -aey /dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019

Some logical volumes may be used by several instances simultaneously; such volumes must be activated in shared (non-exclusive) mode:

lvchange -asy /dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019

After restoring all locks, you can start the stopped instances:

virsh start instance-00000005

Repeat this operation for each previously stopped instance.

At the end it is recommended to restart nova-compute:

systemctl restart nova-compute