Date:      Sat, 14 Nov 2020 13:03:56 +0100
From:      Mateusz Piotrowski <0mp@FreeBSD.org>
To:        junchoon@dec.sakura.ne.jp, freebsd-current@freebsd.org
Subject:   Re: Shutdown errors and timeout
Message-ID:  <7316979e-1a87-791a-075c-7f3d7a75f43f@FreeBSD.org>
In-Reply-To: <20201114091951.4888878c686d07ad73e55da8@dec.sakura.ne.jp>
References:  <65b1ff51-a946-61d0-79d9-104c1e053554@gmail.com> <20201113.200459.520180046556100070.yasu@utahime.org> <20201114091951.4888878c686d07ad73e55da8@dec.sakura.ne.jp>

Hi,

On 11/14/20 1:19 AM, Tomoaki AOKI wrote:
> On Fri, 13 Nov 2020 20:04:59 +0900 (JST)
> Yasuhiro KIMURA <yasu@utahime.org> wrote:
>
>> From: Johan Hendriks <joh.hendriks@gmail.com>
>>
>>> Hello all, I have two FreeBSD 13 machines, one bare metal and one
>>> VirtualBox machine, both of which I update about once a week.
>>>
>>> The virtual machine seems to fail to stop something and times out
>>> after 90 seconds.
>>>
>>> The console ends with
>>>
>>> Writing entropy file: .
>>> Writing early boot entropy file: .
>>>
>>> 90 second watchdog timeout expired. Shutdown terminated.
>>> Fri Nov13 11:20:40 CEST 2020
>>> Nov 13 11:20:40 test-head init[1]: /etc/rc.shutdown terminated
>>> abnormally, going to single user mode
>>> ...
>>>
>>> On the bare metal machine I see the following:
>>> Writing entropy file: .
>>> Writing early boot entropy file: .
>>> cannot unmount '/var/run': umount failed
>>> cannot unmount '/var/log': umount failed
>>> cannot unmount '/var': umount failed
>>> cannot unmount '/usr/home': umount failed
>>> cannot unmount '/usr': umount failed
>>> cannot unmount '/': umount failed
>>>
>> (snip)
>>> The pools have not been upgraded after the latest OpenZFS import;
>>> maybe that is related?
>>>
>>> FreeBSD test-freebsd-head 13.0-CURRENT FreeBSD 13.0-CURRENT #2
>>> r367585:
>>>
>>> I first noticed this about a week ago.
>> I'm facing the same problem with 13.0-CURRENT amd64 r367487 on
>> VirtualBox. In my case I use autofs to mount a remote file system from
>> a 12.2-RELEASE amd64 server over NFSv4. If a file system mounted by
>> autofs is still present, the watchdog timeout occurs during shutdown.
>> The watchdog timeout can be worked around by executing `automount -fu`
>> before shutting down, but the 'cannot unmount ...' error messages are
>> still displayed.
>>
>> I added 'rc_debug="YES"' to /etc/rc.conf and checked which rc script
>> causes this message. It is displayed when the following `zfs_stop`
>> function of /etc/rc.d/zfs is executed.
>>
>> ----------------------------------------------------------------------
>> zfs_stop_main()
>> {
>> 	zfs unshare -a
>> 	zfs unmount -a
>> }
>> ----------------------------------------------------------------------
>>
>> At this point the syslog process is still running and has some files
>> open under /var/log, so it makes sense that `zfs unmount -a` results in
>> the message.
>>
>> The order in which rc scripts are executed at shutdown should probably
>> be changed so that `/etc/rc.d/zfs faststop` is executed after all
>> processes other than `init` have exited.
> This happens on stable/12, too.
> As a workaround, reverting r367291 on head (r367546 on stable/12)
> would stop the issue until this is really fixed.
>
> If you have a shared dataset or jail(s) mounting a dataset, the
> workaround is discouraged. Read the commit message for details.

I've committed r367291 and r367546.

I am not sure I can come up with a proper fix for the described issues,
so I think the best course is to revert those changes for now, until we
figure out how to do it properly.
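
In the meantime, here is a rough sketch of what could be run by hand
before shutting down. It is not a fix. `automount -fu` is the workaround
quoted above; using `fstat -f` to list processes that still hold files
open on the file system containing /var/log is my own suggestion. The
helper names and the DRY_RUN variable are hypothetical and exist only
for this illustration.

```sh
#!/bin/sh
# Illustrative sketch only; function names and DRY_RUN are hypothetical.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pre_shutdown_workaround() {
    # Force-unmount autofs-mounted file systems (workaround from this thread).
    run automount -fu
    # List processes with files open on the file system containing /var/log,
    # which are likely what makes `zfs unmount -a` fail.
    run fstat -f /var/log
}
```

After sourcing the file, running `pre_shutdown_workaround` with DRY_RUN=1
set only prints the two commands, so it can be checked safely before
running it for real as root.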

Sorry for the regression.

Best,

Mateusz



