Date: Sun, 30 Oct 2016 11:20:59 -0700
From: David Wolfskill <david@catwhisker.org>
To: stable@freebsd.org
Subject: (Circumvented) insta-panic from "pkg upgrade" stable/11 @r308090
Message-ID: <20161030182059.GC1203@albert.catwhisker.org>

Summary: I've worked around this -- at least, for now -- but a process
I've been using every Sunday since July 2015 on a pair of machines
suddenly failed this morning (on just one of the machines).

For background (if you're interested):
* <http://www.catwhisker.org/~david/FreeBSD/upgrade.html>
* <http://www.catwhisker.org/~david/FreeBSD/convert_i386_amd64.html>
* <http://www.catwhisker.org/~david/FreeBSD/history/>

So... this morning, the update from:

FreeBSD albert.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #95 r307797M/307819:1100505: Sun Oct 23 03:52:44 PDT 2016 root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT amd64

to:

FreeBSD albert.catwhisker.org 11.0-STABLE FreeBSD 11.0-STABLE #102 r308090M/308101:1100506: Sun Oct 30 04:09:05 PDT 2016 root@freebeast.catwhisker.org:/common/S1/obj/usr/src/sys/ALBERT amd64

Just Worked -- as usual.

I rebooted both "production" machines, logged in, fired up tmux on
each, rotated my typescript files, then fired up script and ran the csh
command alias I use on both machines to update the installed ports (from
the locally-built packages that reside on my build machine; the
production machines access them via NFS).

For one of the machines ("bats"), things Just Worked (again).

For the other ("albert"), I lost contact. Eventually (after I actually
got up and went to the room where the machines are), I found that it had
rebooted.

Further experimentation showed that in the command sequence:

	mount -u -w / && \
	mount -u -w /usr && \
	( cd /etc/mail && make stop-mta ) && \
	service dovecot stop && \
	service apache24 stop && \
	pkg upgrade

it got through "service apache24 stop" OK, but when I issued "pkg
upgrade" -- the screen blanked, and the machine started rebooting.

On reboot, the /var file system (UFS2+soft updates) showed the
typescript files from before the above efforts -- not even the
"rotation" (mentioned above) was reflected. (The typescript files in
question reside in /var/tmp on the machine.)

Oh: and the initial "fsck -p" for /var indicated that fsck needed to be
re-run (so when I booted to single-user mode, I did just that).

There was no hint in the logs of why the reboot (panic?) occurred.

One point that may be at issue is that for bats (where things still
worked), I manually mount the package repository from the build machine
to bats:/mnt, while for albert (where things failed), I have depended on
autofs to handle the mounting as needed (since I need albert to run
autofs anyway, and bats does not). E.g.:

bats(11.0-S)[1] cat /usr/local/etc/pkg/repos/custom.conf
custom: {
#  url: file:///net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home
  url: file:///mnt
  enabled: yes,
}
bats(11.0-S)[2]

vs.:

albert(11.0-S)[10] cat /usr/local/etc/pkg/repos/custom.conf
custom: {
  url: file:///net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home
  enabled: yes,
}
albert(11.0-S)[11]

In the process of finally(!) getting albert's "pkg upgrade" working, I
did 2 things differently:

* I did not run under tmux. I can't imagine that this contributed, but
  I cite it for completeness.

* Prior to invoking "pkg upgrade", I issued
  "ls /net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home"
  (and got a sane result), so the mount was satisfied prior to "pkg
  upgrade" being run.

I note, too, that one of the times I logged in to albert, the login
seemed to "hang" for a while. When I hit ^T (several times), it was
apparent that the process was trying to use autofs to mount my home
directory (from the FreeNAS box, "grundoon")... and that effort timed
out -- I ended up with the whine about inability to find my home
directory.
And then when I logged out & back in again,
/net/grundoon/mnt/tank/homedirs showed up 3 times in the output of "df".

So perhaps there's something involving autofs and timing... though that
doesn't seem like much to go on.

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Those who would murder in the name of God or prophet are blasphemous cowards.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
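
For anyone wanting to script the workaround described in the message, here is a minimal sketch, not the author's actual alias. It assumes that simply listing the repository directory is enough to make autofs complete the mount before "pkg upgrade" starts, as the message reports. The function names premount and count_mounts are hypothetical; only the repository path and the "ls first" idea come from the message. The second helper counts how often a mount point appears in df-style output, which may help spot the repeated /net/grundoon/mnt/tank/homedirs entries mentioned above.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Repository path as shown in albert's custom.conf (reached via autofs /net).
REPO_DIR="/net/freebeast/tank/poudriere/poudriere/data/packages/11amd64-ports-home"

# List the directory so that autofs performs the mount now, rather than in
# the middle of "pkg upgrade". Returns success only if the listing works.
premount() {
    ls "$1" >/dev/null 2>&1
}

# Read df-style text on stdin and print how many lines have the given
# mount point as their last field. Assumes mount points without spaces.
count_mounts() {
    awk -v mp="$1" '$NF == mp { n++ } END { print n + 0 }'
}

# Possible usage (left commented out so that nothing runs by accident):
#   premount "$REPO_DIR" && pkg upgrade
#   df | count_mounts /net/grundoon/mnt/tank/homedirs
```

Chaining with && means "pkg upgrade" is skipped if the directory cannot be listed. That only avoids the apparent trigger; it does not explain the reboot.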