From: Walter Hop
Date: Thu, 4 Dec 2014 17:51:11 +0100
To: freebsd-stable@FreeBSD.org
Subject: Re: System hang on shutdown when running freebsd-update
Message-Id: <20FF25C9-B2D3-490C-BD01-6F834017BDA1@spam.lifeforms.nl>
In-Reply-To: <5B600B90-9967-4031-AB9B-40ADDBE56CAF@spam.lifeforms.nl>
References: <2B4EEDA7-C3D9-465A-B0C9-B5728D438077@spam.lifeforms.nl> <5B600B90-9967-4031-AB9B-40ADDBE56CAF@spam.lifeforms.nl>

Another update. I have narrowed the issue down to /sbin/init being
replaced. This might be the magic that freebsd-update does to make the
hang happen. (Although there might be more situations that cause a
hang.)

I can completely reliably trigger the hang on a default 10.1-RELEASE
install on UFS2 in VMware Fusion with the following procedure:

  # chflags noschg /sbin/init
  # cp -Rp /sbin/init /sbin/init2
  # rm -f /sbin/init
  # mv /sbin/init2 /sbin/init
  # chflags schg /sbin/init
  # reboot
  => Hang after "All buffers synced."

This looks useful because we don’t have to do a full freebsd-update to
get the hang now.

I’d be interested to see if others can reproduce it, because for me in
VMware it happens 100% of the time.

It doesn’t happen on a 10.0 kernel, nor on 10.1 with ZFS, nor on 10.1
with UFS2 and softupdates disabled.

We updated 15 machines to 10.1 with a modified upgrade procedure (first
disable softupdates, then upgrade to 10.1, then re-enable softupdates).
Without softupdates there’s no lockup.

So: 10.1 + UFS2 + softupdates + replacing /sbin/init = hang + 100% CPU
on the next reboot, root unmount, or remount of root as read-only.

I don’t know if we can research this further. I wonder what would
happen on CURRENT, but I don’t have time to build it right now...

Cheers,
WH

> On 29 Nov 2014, at 13:17, Walter Hop wrote:
>
> I’m revisiting this issue, since unfortunately I still have it more often than not when upgrading to 10.1-RELEASE.
>
> As Kevin Oberman suspected earlier in the thread, the issue seems to lie in unmounting. The same hang occurs when dropping to single user mode and trying to re-mount root as readonly.
>
> I’ve also had another unmount issue after upgrading to 10.1-RELEASE:
>
> All buffers synced.
> softdep_waitidle: Failed to flush worklist for 0xfffff800027b4330
> unmount of / failed (BUSY)
>
> I’ve created a PR with the information I have: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195458
>
> With the EOL date of FreeBSD 10.0 on the horizon, it’s making me a little skittish.
> Any ideas of experiments I can do to get more info out of a problematic box, or other options to take?
>
> Thanks!
> WH
>
>> On 28 Oct 2014, at 23:09, Walter Hop wrote:
>>
>> I noticed this same hang after upgrading from 10.0-RELEASE to 10.1-RC3 in a VM running under VMware Fusion, so the problem appears still present.
>>
>> I could only make it happen in the single uptime just after the system was freebsd-updated from FreeBSD 10.0 to 10.1-RC3.
>>
>> Here is a screenshot: http://lf.ms/wait-for-reboot.png
>>
>> It did not make any progress after 2 hours of waiting. When restarting the VM, the disk was dirty.
>>
>> Some interesting facts:
>> - Note "swapoff: /dev/da0p2: Cannot allocate memory" in the screenshot which might pose a clue. I haven’t seen this normally.
>> - FreeBSD does respond to ping while it is busy, so it is not a complete "freeze".
>> - The VM is at 100% CPU while this is going on.
>
> --
> Walter Hop | PGP key: https://lifeforms.nl/pgp

--
Walter Hop | PGP key: https://lifeforms.nl/pgp