Date: Tue, 23 Apr 2019 06:57:30 -0700
From: Jason Barbier <kusuriya@serversave.us>
To: Victor Sudakov <vas@mpeks.tomsk.su>
Cc: freebsd-virtualization@freebsd.org
Subject: Re: [vm-bhyve] Windows 2012 and 2016 servers guests would not stop
Message-ID: <3E2402FD-CA3E-4471-B7A9-6D6B0CB1B900@serversave.us>
In-Reply-To: <20190423041358.GA2992@admin.sibptus.ru>
References: <20190421154616.GA59283@admin.sibptus.ru>
 <201904211708.x3LH8DiK028282@gndrsh.dnsmgr.net>
 <20190423024301.GA940@admin.sibptus.ru>
 <d33ea04e-4f4f-253a-b658-e6ecfd2308a6@redbarn.org>
 <20190423041358.GA2992@admin.sibptus.ru>
> On Apr 22, 2019, at 21:13, Victor Sudakov <vas@mpeks.tomsk.su> wrote:
>
> Paul Vixie wrote:
>>
>> Victor Sudakov wrote on 2019-04-22 19:43:
>> ...
>>>> And the implementation is pretty brutal:
>>>> # 'vm stopall'
>>>> # stop all bhyve instances
>>>> # note this will also stop instances not started by vm-bhyve
>>>> #
>>>> core::stopall(){
>>>>     local _pids=$(pgrep -f 'bhyve:')
>>>>
>>>>     echo "Shutting down all bhyve virtual machines"
>>>>     killall bhyve
>>>>     sleep 1
>>>>     killall bhyve
>>>>     wait_for_pids ${_pids}
>>>> }
>>
>> yow.

Eew no that is painful to read!

>
> To be sure, I was unable to find the above code (as is) in
> /usr/local/lib/vm-bhyve/vm-* (the vm-bhyve port 1.3.0). It may be that
> something more intelligent is happening in a more recent version, like a
> sequential shutdown. However, "kill $pid; sleep 1; kill $pid" seems to
> be still present.
>
>>
>>>>
>>>> I wonder what the effect of the second kill is,
>>>> that seems odd.
>>>
>>> Indeed.
>>
>> the first killall will cause each client OS to see a soft shutdown
>> signal. the sleep 1 gives them some time to flush their buffers. the
>> second killall says, time's up, just stop.
>>
>> i think this is worse than brutal, it's wrong. consider freebsd's own
>> work flow when trying to comply with the first soft shutdown it got:
>>
>> https://github.com/freebsd/freebsd/blob/master/sbin/reboot/reboot.c#L220
>>
>> this has bitten me more than once, because using "pageins" as a proxy
>> for "my server processes are busy trying to synchronize their user mode
>> state" is inaccurate. i think _any_ continuing I/O should be reason to
>> wait the full 60 seconds.
>
> Would it be beneficial to just hack /usr/local/lib/vm-bhyve/vm-* ?
>>
>> and so i think the "sleep 1" above should be a "sleep 65".

I would echo this and say it should probably be done in a way that allows a sliding window, since some servers and services are not very fault tolerant on their own. The example that springs to mind for me is the busy AD domain controller I manage. It takes 15 minutes to flush the disk buffer; if I kill it before the buffer flushes I will have a bad day, as my domain at best loses a few transactions and at worst is corrupted.
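
To make the sliding-window idea concrete, here is a rough sketch of a per-guest grace period in plain sh. It is not vm-bhyve code: the function name stop_with_grace and its arguments are made up for illustration, the 65 second default is simply borrowed from the "sleep 65" suggestion quoted above, and escalating to SIGKILL once the window closes is my own assumption about what "time's up" should mean.

```sh
#!/bin/sh
# Hypothetical helper, NOT part of vm-bhyve.
# Usage: stop_with_grace PID [TIMEOUT_SECONDS]
# Returns 0 if the process exited within the window, 1 if it had to be forced.
stop_with_grace() {
    _pid=$1
    _timeout=${2:-65}   # assumed default, taken from the "sleep 65" suggestion
    _waited=0

    # First signal: request a soft shutdown. If the process is already gone,
    # there is nothing left to do.
    kill -TERM "$_pid" 2>/dev/null || return 0

    # Poll once per second until the process exits or the window closes.
    while kill -0 "$_pid" 2>/dev/null; do
        if [ "$_waited" -ge "$_timeout" ]; then
            # Window expired: force the stop (assumption: SIGKILL is acceptable here).
            kill -KILL "$_pid" 2>/dev/null || true
            return 1
        fi
        sleep 1
        _waited=$((_waited + 1))
    done
    return 0
}
```

With something like this, a quick guest can keep a short window while my domain controller could be given stop_with_grace "$pid" 900, and a guest that finishes early does not hold up the rest for the full timeout.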