[Slackbuilds-users] Libvirt package - small change to /etc/rc.d/rc.libvirt

Tue May 30 19:09:05 UTC 2017

On 24/05/17 13:49, Sebastian Arcus wrote:
> 
> On 05/05/17 15:50, Robby Workman wrote:
>> On Thu, 4 May 2017 12:55:03 +0100
>> Sebastian Arcus <s.arcus at open-t.co.uk> wrote:
>>
>>> I have a small change to suggest for the /etc/rc.d/rc.libvirt script.
>>> The script at the moment does a 'virsh shutdown <vm-name>' on all
>>> running guests, and then, after waiting only 40 seconds, it destroys
>>> all guests which are still running. I think in most circumstances
>>> this is very likely to lead to corrupted guests because:
>>>
>>> 1. Many guests will take longer than 40 seconds to shut down,
>>> especially if they are Windows guests.
>>> 2. Some Windows guests might be trying to install updates on
>>> shutdown, which can take 10, 20 or even 60 minutes.
>>> 3. If there are a number of guests running and the host is busy
>>> trying to shut them all down at the same time, even Linux guests will
>>> still be in the process of shutting down after 40 seconds.
>>> 4. If it is a setup where users connect to guests remotely, it is
>>> possible that people are in the middle of doing work which would be
>>> lost.
>>>
>>> Would it maybe be safer to do a 'virsh managedsave <vm-name>' on
>>> every running guest instead? I can see many advantages to this
>>> option, including:
>>>
>>> 1. It is much less likely for users who might use the guests remotely
>>> to lose work, as the machine state is suspended.
>>>
>>> 2. Guests stopped with 'managedsave' will start fine with the 'start'
>>> command - so no need to change any settings anywhere else, and the
>>> 'autostart' feature of libvirt will continue to work fine.
>>>
>>> 3. On my server, the 'managedsave' operation takes around 20 seconds
>>> per guest, and it is more predictable in duration than a 'shutdown',
>>> which depends on the OS in the guest, whether it is installing
>>> updates, etc.
>>>
>>> 4. The 'managedsave' command doesn't return immediately, so we can
>>> wait for each guest in turn and know when it is done. Alternatively,
>>> the command could be run in the background with '&', followed by a
>>> looped check on 'virsh list' lower down, waiting for the 'managedsave'
>>> on all guests to finish.
>>>
>>> 5. If my understanding is correct, the 'shutdown' command depends on
>>> the guest correctly implementing and acting on ACPI commands, while a
>>> 'managedsave' seems to be under the complete control of libvirt -
>>> which means the script would work correctly out of the box with a
>>> much wider variety of OSes and configurations on the guest side.
>>>
>>> If necessary, I suppose the script could also implement a
>>> 'reboot_guests' separate command, for people to use it to reboot all
>>> running guests - as the "stop" command wouldn't be performing a full
>>> shutdown any more.
>>>
>>> Just a suggestion in case it helps. I'm already very happy with the
>>> current script, as I used to have to write my own to shut down
>>> kvm/qemu guests :-)
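
[A minimal sketch of the sequential 'managedsave' idea from point 4 above. It is not taken from the actual rc.libvirt script: the function name save_running_guests is hypothetical, and it assumes a virsh that supports 'list --name --state-running'.]

```shell
#!/bin/sh
# Sketch only: save_running_guests is a hypothetical helper, not part of
# the shipped rc.libvirt script.

save_running_guests() {
  # 'virsh list --name --state-running' prints one running guest per line,
  # followed by a blank line, which is skipped below.
  virsh list --name --state-running | while read -r guest; do
    [ -n "$guest" ] || continue
    # managedsave blocks until the guest state has been written to disk,
    # so guests are handled strictly one after another.
    # stdin is redirected so virsh cannot swallow the rest of the list.
    virsh managedsave "$guest" </dev/null \
      || echo "managedsave failed for $guest" >&2
  done
}
```

Saving guests one at a time keeps the duration roughly proportional to the number of guests, at the cost of not overlapping the disk writes.
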
>>
>>
>> This seems like a reasonable change. Can you push it to a git
>> repo/branch somewhere and send me a link, or else send me a
>> patch(set) to merge?
> 
> With some delay, please find attached the suggested changes to the
> rc.libvirt script. I have made the following amendments:
> 
> 1. Increased the default timeout to 5 minutes (300 seconds). I think 
> guests should be destroyed only as a last resort - as the risk of 
> corrupting data is very high. If a quicker shutdown is needed - one 
> should look at optimising or troubleshooting things, not destroying 
> guests. I am half-tempted to suggest increasing this to 10 minutes 
> actually.
> 
> 2. Changed the default behaviour of the "stop" option to do a managedsave
> on all running and paused guests - as this is safer for all guest
> operating systems - and the duration is more predictable (no Windows
> automatic updates to install during shutdown etc.)
> 
> 3. Added a guests_shutdown option, for those preferring to do a full 
> shutdown instead of a managedsave (not enough space on hdd to save ram 
> images, for example). So rc.local_shutdown would contain:
> 
>      /etc/rc.d/rc.libvirt guests_shutdown
>      /etc/rc.d/rc.libvirt stop
> 
> 4. Added a guests_reboot option, to reboot all running guests. I find 
> this really useful, as, for example, Windows guests need regular reboots 
> to stop them from going bananas  - so an option to reboot all of them at 
> once seems like a useful addition. This doesn't wait and doesn't have a 
> timeout - just issues the command and exits. The idea is that this 
> wouldn't be used during a reboot, so there is no need to check when the 
> command has finished executing.
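
[A rough sketch of how the 'guests_reboot' and 'guests_shutdown' requests described above might be issued. The helper names are hypothetical and not taken from the attached patch, and the waiting and timeout handling that 'guests_shutdown' performs is left out here.]

```shell
#!/bin/sh
# Sketch only: these helper names are assumptions, not the patch itself.

# Run one virsh subcommand (e.g. reboot, shutdown) on every running guest.
for_each_running_guest() {
  action=$1
  virsh list --name --state-running | while read -r guest; do
    [ -n "$guest" ] || continue        # skip the trailing blank line
    virsh "$action" "$guest" </dev/null
  done
}

# Fire and forget: issue the reboot requests and return immediately.
reboot_running_guests() {
  for_each_running_guest reboot
}

# Only sends the ACPI shutdown request; paused guests are not touched.
shutdown_running_guests() {
  for_each_running_guest shutdown
}
```
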
> 
> Note: when issuing the managedsave command, the guests first go through
> a 'paused' state, then they show up as 'shut off' in 'virsh list --all'.
> Thus it is impossible to distinguish between the guests which were
> already paused initially and the ones in transition to the managedsave
> state, while waiting for the command to complete. Because of this, the
> only way to proceed is to issue a 'managedsave' on both running and
> already paused guests.
> 
> On the other hand, 'guests_shutdown' only acts on running guests - an 
> ACPI shutdown command is unlikely to be acted on by a paused guest anyway.
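
[A minimal sketch of the wait-then-destroy step this note implies, assuming the save or shutdown requests have already been issued. The function names and the one-second polling interval are assumptions, not the attached patch. Plain 'virsh list --name' reports both running and paused guests, so the loop simply waits until nothing is active.]

```shell
#!/bin/sh
# Sketch only: count_active_guests and wait_or_destroy are hypothetical.

# Number of guests that are still active (running or paused).
count_active_guests() {
  # grep -c . counts non-empty lines; it exits non-zero on a count of 0,
  # hence the '|| true'.
  virsh list --name | grep -c . || true
}

# Wait up to $1 seconds (default 300) for all guests to go away, then
# destroy whatever is left as a last resort.
wait_or_destroy() {
  timeout=${1:-300}
  waited=0
  while [ "$(count_active_guests)" -gt 0 ] && [ "$waited" -lt "$timeout" ]; do
    sleep 1
    waited=$((waited + 1))
  done
  virsh list --name | while read -r guest; do
    [ -n "$guest" ] || continue
    echo "Timeout reached, destroying $guest" >&2
    virsh destroy "$guest" </dev/null
  done
}
```

Called as 'wait_or_destroy 600', this would give guests ten minutes before any of them is destroyed.
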
> 
> I also suggest adding the following notes to libvirt-info:
> 
> "The enclosed rc.libvirt script will do a 'managedsave' on all running
> and paused guests when issuing 'rc.libvirt stop'. Please note that this
> saves the RAM of each guest to the host hdd (by default under
> /var/lib/libvirt/qemu/save) - so make sure enough space is available. If
> you prefer to perform a full shutdown on all running guests instead,
> issue a 'rc.libvirt guests_shutdown' followed by 'rc.libvirt stop'.
> 
> By default 'rc.libvirt stop' and 'rc.libvirt guests_shutdown' will wait
> a maximum of 5 minutes for all guests to shut down, after which any
> guests still running will be destroyed. Adjust this to a suitable value
> for your system, as destroying a running guest carries a high risk of
> data loss!
> 
> There is also a 'guests_reboot' option for rebooting all running guests."

Has anybody had a chance to see/test this?