Setting up environment¶
Manual installation and configuration of Sanlock in the OpenStack cloud¶
Currently, OpenStack Cinder allows attaching iSCSI block devices; a separate LUN and a connection on the iSCSI initiator side are created for each such device.
LVM and sanlock provide an alternative solution in which a single LUN is attached once to all compute nodes, a block device for an instance is allocated by creating a logical volume on that LUN, and, thanks to sanlock locks, the block device is attached to a specific instance or (with multiple attachments) several instances.
LUN connection¶
At this point, you need a dedicated LUN on your NAS. For authorization, access to the dedicated LUN is granted by initiator name. The initiator name must be unique for each server; it is stored in the file /etc/iscsi/initiatorname.iscsi and by default contains a random combination of characters.
If the directory /etc/iscsi does not exist, the open-iscsi package must be installed:
apt-get update
apt-get install open-iscsi -y
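To find out the initiator name assigned to the current node, simply print the file mentioned above:
cat /etc/iscsi/initiatorname.iscsi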
Submit the names of all initiators to the system administrator or enter them in the settings of the dedicated LUN.
Getting available targets¶
The next step is to get the available targets from the network storage:
iscsiadm -m discovery -t sendtargets -p <san_ip>:<port>
Where:
<san_ip> is the name or IP address of the NAS;
<port> is the TCP port for connecting to the NAS (3260 by default; in this case it is optional).
This command returns the list of targets available to the initiator registered on this server. If the network storage has several addresses, you need to get the list of available targets from each address.
Attaching the target¶
The target is attached with the command:
iscsiadm -m node --targetname <target_name> -p <san_ip>:<port> --login
Where:
<target_name> is the name of the target obtained in the previous step;
<san_ip>:<port> is the name and port of the NAS.
Target names for the same LUN may be the same or different for different storage addresses. In any case, the LUN must be attached using the target name that is available at the given NAS name.
Repeat the command for each NAS name or address and for each target provided for that storage name.
As a result, one or more new block devices with names of the form /dev/sdX should appear in the system.
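One way to verify that the devices have appeared is to list the active iSCSI sessions and the attached disks; both commands below are standard open-iscsi and util-linux tools:
iscsiadm -m session
lsblk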
Installing multipath¶
Installing the package:
apt-get install -y multipath-tools
Loading kernel modules:
modprobe dm-multipath
modprobe dm-round-robin
Creating the configuration file /etc/multipath.conf:

defaults {
    polling_interval 10
    path_selector "round-robin 0"
    path_grouping_policy failover
    prio const
    path_checker readsector0
    rr_min_io 100
    max_fds 8192
    rr_weight priorities
    failback immediate
    no_path_retry 10
    user_friendly_names yes
    find_multipaths no
    verbosity 2
}

blacklist {
    devnode "^(ram|nvme|drbd|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
    devnode "^vd[a-z]"
    devnode "^rbd*"
}

devices {
    device {
        vendor "IET"
        product "VIRTUAL-DISK"
    }
}
Starting the service:
systemctl start multipathd
Enabling the service to start automatically when the operating system boots:
systemctl enable multipathd
Checking that multipath has combined access to the network drives:

multipath -ll
sanlock (3600c0ff00028976702fc3c6001000000) dm-0 HP,MSA 2040 SAN
size=279G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=50 status=active
| `- 7:0:0:1 sdf 8:80 active ready running
`-+- policy='round-robin 0' prio=10 status=enabled
  `- 6:0:0:1 sdd 8:48 active ready running
In the output above, the block device has the non-standard name sanlock (instead of mpathX) and is available in the system as /dev/mapper/sanlock. This is done for convenience by adding the following lines to the configuration file /etc/multipath.conf:

multipaths {
    multipath {
        wwid 3600c0ff00028976702fc3c6001000000
        alias sanlock
    }
}
Redefining the name in this way ensures consistent access to the block device, since the standard name /dev/mapper/mpathX may change on the next boot of the system if there are several network volumes in the system.
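For the alias to take effect without rebooting, the multipath maps can be reloaded; one possible way, assuming multipath-tools is already running:
multipath -r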
Setting up LVM and sanlock¶
Installing packages:
apt-get install lvm2-lockd sanlock lvm2
Starting the service:
systemctl restart sanlock
Setting up LVM¶
Change the parameters in the file /etc/lvm/lvm.conf:

...
filter = [ "a|^/dev/mapper/sanlock$|", "r|.*/|" ]
...
use_lvmlockd = 1
...
The filter parameter restricts the search for logical volumes to the /dev/mapper/sanlock device only. If you use LVM on other block devices, you must list them explicitly; this prevents LVM from considering LVM metadata inside the logical block devices that instances work with.

The parameter use_lvmlockd = 1 enables interaction with the lvmlockd service.
Change in the file /etc/lvm/lvmlocal.conf:

...
host_id = 11
...
This parameter must have a unique numeric value on each host and must be between 1 and 2000. The value is used when creating locks.
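To double-check the values that LVM actually picked up, the configuration can be queried directly; this assumes a reasonably recent lvm2 that ships the lvmconfig tool:
lvmconfig global/use_lvmlockd local/host_id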
Restarting the services:
systemctl restart lvmlockd lvmlocks
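As a quick sanity check, the sanlock daemon can be queried; at this stage no lockspace lines (s ...) are expected yet, since no shared VG has been started:
sanlock client status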
Setting up LVM and sanlock is complete.
Creating logical volume group¶
The logical volume group is created once, on any of the servers to which the LUN is attached.
Preparing a block device:
pvcreate /dev/mapper/sanlock
Creating logical volume group:
vgcreate --shared --lock-type sanlock vol /dev/mapper/sanlock
This creates the volume group vol with the shared type, using sanlock locks, on the block device /dev/mapper/sanlock.

Switching the volume group (VG) to lock mode is performed with the command:
vgchange --lock-start vol
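To confirm that the group really uses sanlock locks and that the lockspace has been started, the lock fields can be added to the vgs report (field names as documented for lvmlockd-enabled LVM):
vgs -o+locktype,lockargs vol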
Creating an environment for correct system startup¶
At this stage we configure the system so that services start in the desired sequence when the operating system boots, namely:
- sanlock;
- starting group of logical volumes with support for locks;
- starting aos-agent;
- starting OpenStack services which use logical volume group.
Modifying the launch of aos-agent:

mkdir -p /etc/systemd/system/aos-agent.service.d
Creating the file /etc/systemd/system/aos-agent.service.d/override.conf:

[Unit]
After=start-lvm-lock.service sanlock.service lvmlocks.service

[Service]
ExecStartPre=/bin/sleep 10
Modifying the launch of nova-compute (for cloud compute nodes):

mkdir -p /etc/systemd/system/nova-compute.service.d
Creating the file /etc/systemd/system/nova-compute.service.d/override.conf:

[Unit]
After=neutron-openvswitch-agent.service start-lvm-lock.service aos-agent.service

[Service]
ExecStartPre=/bin/sleep 60
Modifying the launch of cinder-volume (on nodes on which this service is installed):

mkdir -p /etc/systemd/system/cinder-volume.service.d
Creating the file /etc/systemd/system/cinder-volume.service.d/override.conf:

[Unit]
After=neutron-openvswitch-agent.service start-lvm-lock.service aos-agent.service

[Service]
ExecStartPre=/bin/sleep 60
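After creating or editing these override files, systemd must re-read the unit definitions before the changes take effect:
systemctl daemon-reload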
Creating the start-lvm-lock.service service that starts the logical volume group(s)¶
Creating the file /usr/local/bin/lvm-lock.sh:

#!/bin/bash
# Wait until the volume group "vol" becomes visible to LVM.
count=$(vgs -a | grep " vol " | wc -l)
while [ "$count" -eq "0" ]; do
    sleep 5
    count=$(vgs -a | grep " vol " | wc -l)
done
# Wait until every physical volume backing the group appears in the system.
pv_vol=$(pvs -o UUID,NAME,vg_name | grep " vol " | awk '{print $2;}' | sed "s/\n//g")
for pv in $pv_vol
do
    while ! [ -e $pv ]
    do
        sleep 5
    done
done
# Switch the group to lock mode.
vgchange --lock-start vol

Making the script executable:

chmod +x /usr/local/bin/lvm-lock.sh
This script waits until the block devices holding the logical volume group appear in the system, and then switches the volume group to lock mode.
If there are several shared VGs in the system, you need to duplicate this code for each VG, replacing vol with the name of the other VG (a generic variant is sketched at the end of this subsection).

Creating the file of the /etc/systemd/system/start-lvm-lock.service service:

[Unit]
After=lvmlockd.service lvmlocks.service sanlock.service iscsi.service iscsid.service

[Service]
Type=oneshot
ExecStart=/usr/local/bin/lvm-lock.sh

[Install]
WantedBy=multi-user.target
Starting the service:
systemctl start start-lvm-lock.service
Enabling the automatic start of the service when the operating system boots:
systemctl enable start-lvm-lock.service
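For reference, a possible generic variant of /usr/local/bin/lvm-lock.sh that starts locking for every shared VG instead of the single vol group might look as follows; this is only a sketch, relying on the locktype report field of vgs as documented for lvmlockd-enabled LVM:

#!/bin/bash
# List every VG whose lock type is sanlock.
shared_vgs() {
    vgs --noheadings -o vg_name,locktype 2>/dev/null | awk '$2 == "sanlock" {print $1}'
}

# Wait until at least one shared VG becomes visible to LVM.
until [ -n "$(shared_vgs)" ]; do
    sleep 5
done

# Start the lockspace of every shared VG.
for vg in $(shared_vgs); do
    vgchange --lock-start "$vg"
done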
Correcting possible errors¶
The causes of errors during operation can vary, but one way or another they are related to losing access to the lock partition; as a result, the logical volume group switches to operating without locks, which makes the OpenStack services inoperable.
To prevent such errors, it is recommended to review the iSCSI connection settings, in particular the timeouts, so that LVM does not receive errors from the network side about unavailable block devices during short-term failures in the local network.
If an error does occur, the lock operating mode must be restored.
If the compute node has no logical volumes currently in use, then to restore the volume group to lock mode you must execute the command:
vgchange --lock-start vol
where vol is the logical volume group name.
However, the presence of active locks will prevent this command from completing successfully.
Diagnosing the presence of active locks:
sanlock client status
daemon b83efaba-0b75-4b30-9b61-9d78195468c3.cn-193980.
p -1 helper
p -1 listener
p 1010 lvmlockd
p -1 status
s lvm_sanlock:212:/dev/mapper/vol-lvmlock:0
r lvm_sanlock:i2gjby-VEc6-APGD-x9if-ldk2-oxxI-XOwExz:/dev/mapper/vol-lvmlock:77594624:4 p 1010
r lvm_sanlock:PDHtUW-8i1O-ECi4-cxmj-pSAF-hZs2-MVqmcF:/dev/mapper/vol-lvmlock:70254592:2 p 1010
In this case, the system uses two logical volumes, information about which is stored in the /dev/mapper/vol-lvmlock section.
To deactivate the locks, you must stop using the disks.
Getting the list of running instances:
virsh list
Id Name State
-----------------------------------
2 instance-00000005 running
4 instance-00000007 running
6 instance-0000000d running
Determining the list of available logical volumes:
lvs -a
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
[lvmlock] vol -wi-ao---- 256,00m
volume-0d6c493f-697f-48d5-a2dd-8d7aa2c5542c vol -wi-ao---- 60,00g
volume-697bd726-e4ae-4c6f-8180-f7dee51421cf vol -wi------- 60,00g
volume-fe496e6c-0926-4f81-8439-753f42822019 vol -wi-ao---- 30,00g
Determining which instances are using logical volumes:
virsh dumpxml instance-00000005
<disk type='block' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<source dev='/dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019'/>
<backingStore/>
<target dev='vda' bus='virtio'/>
<serial>fe496e6c-0926-4f81-8439-753f42822019</serial>
<alias name='virtio-disk0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
If a logical volume of the shared group is specified in the configuration, this instance must be shut off. Save the name of the logical volume in use; it will be needed later to restore the instance.
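To quickly see only the block devices attached to an instance, the disk sources can be filtered out of the XML; the instance name below is taken from the example above:
virsh dumpxml instance-00000005 | grep 'source dev'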
Shutting off the instance:
virsh destroy instance-00000005
After turning off all instances, you need to remove locks from volumes and deactivate them:
lvchange -aln /dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019
The volume name for this command was obtained earlier from the diagnostic results.
This command must be executed for all volumes activated in the system.
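All active volumes of the group can also be deactivated in one pass; this is only a sketch, assuming the group is named vol and relying on the standard lv_path and lv_attr report fields (the fifth attribute character is "a" for active volumes, and hidden volumes such as [lvmlock] are not listed without -a):

# Deactivate every currently active logical volume in the "vol" group.
for lv in $(lvs --noheadings -o lv_path,lv_attr vol | awk 'substr($2,5,1) == "a" {print $1}')
do
    lvchange -aln "$lv"
done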
After deactivating all logical volumes, you must restart lock services and enable the use of locks for the group:
systemctl restart sanlock lvmlockd
vgchange --lock-start vol
The deactivated locks must be restored before resuming the instances:
lvchange -aey /dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019
Some logical volumes can be used by several instances simultaneously; such volumes must be reactivated non-exclusively:
lvchange -asy /dev/vol/volume-fe496e6c-0926-4f81-8439-753f42822019
After restoring all locks, you can start the stopped instances:
virsh start instance-00000005
Repeat this operation for each previously stopped instance.
At the end, it is recommended to restart nova-compute:
systemctl restart nova-compute