Subject: Re: head -r338804 boots threadripper 1950X fine; head -r338810+ do not; -r338807 seems implicated
From: Mark Millard
Date: Sun, 21 Oct 2018 23:24:43 -0700
To: Warner Losh
Cc: Konstantin Belousov, FreeBSD Current, FreeBSD-STABLE Mailing List
Message-Id: <50C22D2F-0D72-4485-9AE2-E22EC336F8CB@yahoo.com>
References: <79973E2B-F5C4-4E7C-B92B-1C8D4441C7D1@yahoo.com>
List-Id: Production branch of FreeBSD source code

On 2018-Oct-21, at 8:30 PM, Warner Losh wrote:

> On Sun, Oct 21, 2018 at 9:28 PM Warner Losh wrote:
>
> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable wrote:
>> [I built based on WITHOUT_ZFS= for other reasons. But,
>> after installing the build, Hyper-V based boots are
>> working.]
>>
>> On 2018-Oct-20, at 2:09 AM, Mark Millard wrote:
>>
>> > On 2018-Oct-20, at 1:39 AM, Mark Millard wrote:
>> >
>> >> I attempted to jump from head -r334014 to -r339076
>> >> on a threadripper 1950X board and the boot fails.
>> >> This is both native booting and under Hyper-V,
>> >> same machine and root file system in both cases.
>> >
>> > I did my investigation under Hyper-V after seeing
>> > a boot failure native.
>> >
>> > Looks like the native failure is even earlier,
>> > before db> is even possible, possibly during
>> > early loader activity.
>> >
>> > So this report is really for running under
>> > Hyper-V: -r338804 boots and -r338810 does
>> > not. By contrast -r334804 does not boot native.
>> > (But I've little information for that context.)
>> >
>> > Sorry for the confusion. I rushed the report
>> > in hopes of getting to sleep. It was not to be.
>> >
>> >> It fails just after the FreeBSD/SMP lines,
>> >> reporting "kernel trap 9 with interrupts disabled".
>> >>
>> >> It fails in pmap_force_invalidate_cache_range at
>> >> a clflush (%rax) instruction that produces a
>> >> "Fatal trap 9: general protection fault while
>> >> in kernel mode". cpuid=0 apic id=00
>> >>
>> >> I used kernel.txz files from:
>> >>
>> >> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/
>> >>
>> >> to narrow the range of kernel builds for working -> failing
>> >> and got:
>> >>
>> >> -r338804 boots fine
>> >> (no amd64 kernel builds between to try)
>> >> -r338810+ fails (any that I tried, anyway)
>> >>
>> >> In that range is -r338807 :
>> >>
>> >> QUOTE
>> >> Author: kib
>> >> Date: Wed Sep 19 19:35:02 2018
>> >> New Revision: 338807
>> >> URL: https://svnweb.freebsd.org/changeset/base/338807
>> >>
>> >> Log:
>> >>   Convert x86 cache invalidation functions to ifuncs.
>> >>
>> >>   This simplifies the runtime logic and reduces the number of
>> >>   runtime-constant branches.
>> >>
>> >>   Reviewed by: alc, markj
>> >>   Sponsored by: The FreeBSD Foundation
>> >>   Approved by: re (gjb)
>> >>   Differential revision: https://reviews.freebsd.org/D16736
>> >>
>> >> Modified:
>> >>   head/sys/amd64/amd64/pmap.c
>> >>   head/sys/amd64/include/pmap.h
>> >>   head/sys/dev/drm2/drm_os_freebsd.c
>> >>   head/sys/dev/drm2/i915/intel_ringbuffer.c
>> >>   head/sys/i386/i386/pmap.c
>> >>   head/sys/i386/i386/vm_machdep.c
>> >>   head/sys/i386/include/pmap.h
>> >>   head/sys/x86/iommu/intel_utils.c
>> >> END QUOTE
>> >>
>> >> There do seem to be changes associated with
>> >> clflush(...) use. Looking at:
>> >>
>> >> https://svnweb.freebsd.org/base/head/sys/amd64/amd64/pmap.c?annotate=339432
>> >>
>> >> it appears that pmap_force_invalidate_cache_range has not
>> >> changed since -r338807.
>> >>
>> >> It seems that -r338806 and -r338810 would be unlikely
>> >> contributors.
>> >
>>
>> I went after my native-boot loader problem first because I
>> could switch kernels via the loader for booting FreeBSD under
>> Hyper-V. Switching loaders is more of a problem.
>>
>> In order to avoid the loader-time crash I switched to building
>> and installing based on WITHOUT_ZFS= . I've had no active use of
>> ZFS in years. (The old official-build loaders that worked were
>> non-ZFS ones.)
>>
>> This took care of the native-boot loader-crash --and, to my
>> surprise, also the Hyper-V-boot kernel-time crash.
>>
>> My private builds now boot the 1950X in both contexts just
>> fine.
>>
>> During my early investigation I did pick up specific changes
>> from after -r339076 that seemed to be tied to Ryzen and such.
>> (They made no difference to the boot problems at the time
>> but I saw no reason to remove them.)
>>
>> # uname -apKU
>> FreeBSD FBSDFSSD 12.0-ALPHA8 FreeBSD 12.0-ALPHA8 #5 r339076:339432M: Sun Oct 21 16:44:25 PDT 2018 markmi@FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1200084 1200084
>>
>> (stupid gmail)
>
> The phrase "no active use" bothers me. What does that mean? Are there
> any ZFS pools or any disks that have any whiff of a ZFSish thing on
> them at all? Clearly, there's something in the zfs boot loader that's
> freaking out by something on your system, but absent that information
> I can't help you.

No ZFS pools: Strictly UFS for FreeBSD file systems for the
last few years, UFS before I had access to the 1950X system.

I've never before bothered to use WITHOUT_ZFS= in my builds.
So the system had the ZFS support, such as kernel modules,
over all the time that this system had been in use. Prior to
the recent versions I saw no such problems. But the default
loader was not ZFS capable.

As seen in the under-Hyper-V use-context:

# gpart show -p
=>       40  937703008    da0  GPT  (447G)
         40       1024  da0p1  freebsd-boot  (512K)
       1064  746586112  da0p2  freebsd-ufs  (356G)
  746587176   31457280  da0p3  freebsd-swap  (15G)
  778044456  159383552  da0p4  freebsd-swap  (76G)
  937428008     275040         - free -  (134M)

=>       40  937703008    da1  GPT  (447G)
         40       1024  da1p1  freebsd-boot  (512K)
       1064  369098752  da1p2  freebsd-ufs  (176G)
  369099816  406846424  da1p3  freebsd-swap  (194G)
  775946240  130024488         - free -  (62G)
  905970728   31457280  da1p4  freebsd-swap  (15G)
  937428008     275040         - free -  (134M)

=>       40  419430320    da2  GPT  (200G)
         40       4056         - free -  (2.0M)
       4096  419426263  da2p1  freebsd-ufs  (200G)
  419430359          1         - free -  (512B)

=>        40  2000409184    da3  GPT  (954G)
          40        1024  da3p1  freebsd-boot  (512K)
        1064  2000408159  da3p2  freebsd-ufs  (954G)
  2000409223           1         - free -  (512B)

So no ZFS pools. The above context never had the ZFS-capable
loader problem but did have the kernel problem. I was booting
the 356G freebsd-ufs partition: the only one that I have
updated the FreeBSD version on so far.
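
A check like the one above (reading gpart show -p output and confirming
that no partition is typed for ZFS) can also be scripted. The following
is only a minimal sketch, not something used in this thread: the helper
name partition_types and the SAMPLE text (the da0 rows from the listing
above) are made up for illustration. It assumes every partition row has
the form "start size name type (human-size)", and it only sees
partition-table types, so it would not notice a pool created on a whole
disk without a partition table.

```python
from collections import Counter

# Sample input: the da0 rows from the Hyper-V listing above.
SAMPLE = """\
=>       40  937703008    da0  GPT  (447G)
         40       1024  da0p1  freebsd-boot  (512K)
       1064  746586112  da0p2  freebsd-ufs  (356G)
  746587176   31457280  da0p3  freebsd-swap  (15G)
  778044456  159383552  da0p4  freebsd-swap  (76G)
  937428008     275040         - free -  (134M)
"""


def partition_types(gpart_text):
    """Count partition types found in `gpart show -p` style text.

    Hypothetical helper: assumes partition rows look like
    "start size name type (human-size)".
    """
    counts = Counter()
    for line in gpart_text.splitlines():
        fields = line.split()
        # Skip blank lines, the "=>" header row of each disk,
        # and "- free -" gaps (their third field is "-").
        if len(fields) < 5 or fields[0] == "=>" or fields[2] == "-":
            continue
        counts[fields[3]] += 1
    return counts


counts = partition_types(SAMPLE)
print(sorted(counts.items()))
print("freebsd-zfs present:", "freebsd-zfs" in counts)
```

Feeding it the full output from both boot contexts and comparing the
resulting sets of types would give the list of partition types that
show up only in the native boot.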

With FreeBSD booted natively, more drives are seen in gpart show,
some not from/for FreeBSD. But the above drives are present
and I was booting from the same partition of the same drive:
the 356G freebsd-ufs partition. Still no ZFS pools anywhere:

# gpart show -p
=>        34  4000797293    nvd0  GPT  (1.9T)
          34      262144  nvd0p1  ms-reserved  (128M)
      262178        2014          - free -  (1.0M)
      264192  3600451584  nvd0p2  ms-basic-data  (1.7T)
  3600715776   400081551          - free -  (191G)

=>       40  937703008    nvd1  GPT  (447G)
         40       1024  nvd1p1  freebsd-boot  (512K)
       1064  746586112  nvd1p2  freebsd-ufs  (356G)
  746587176   31457280  nvd1p3  freebsd-swap  (15G)
  778044456  159383552  nvd1p4  freebsd-swap  (76G)
  937428008     275040          - free -  (134M)

=>       40  937703008    nvd2  GPT  (447G)
         40       1024  nvd2p1  freebsd-boot  (512K)
       1064  369098752  nvd2p2  freebsd-ufs  (176G)
  369099816  406846424  nvd2p3  freebsd-swap  (194G)
  775946240  130024488          - free -  (62G)
  905970728   31457280  nvd2p4  freebsd-swap  (15G)
  937428008     275040          - free -  (134M)

=>        34  2000409197    nvd3  GPT  (954G)
          34        2014          - free -  (1.0M)
        2048     1021952  nvd3p1  ms-recovery  (499M)
     1024000      202752  nvd3p2  efi  (99M)
     1226752       32768  nvd3p3  ms-reserved  (16M)
     1259520  1859119104  nvd3p4  ms-basic-data  (886G)
  1860378624   140030607          - free -  (67G)

=>        40  2000409184    nvd4  GPT  (954G)
          40        1024  nvd4p1  freebsd-boot  (512K)
        1064  2000408159  nvd4p2  freebsd-ufs  (954G)
  2000409223           1          - free -  (512B)

=>        63  2000409201    ada0  MBR  (954G)
          63        1985          - free -  (993K)
        2048        4096  ada0s1  linux-data  (2.0M)
        6144     2093056          - free -  (1.0G)
     2099200  1998309376  ada0s2  linux-lvm  (953G)
  2000408576         688          - free -  (344K)

=>        34  2000409197    ada1  GPT  (954G)
          34      262144  ada1p1  ms-reserved  (128M)
      262178  2000147053          - free -  (954G)

=>        34  2000409197    ada2  GPT  (954G)
          34      262144  ada2p1  ms-reserved  (128M)
      262178  2000147053          - free -  (954G)

=>        34  1953497022    da0  GPT  (932G)
          34      262144  da0p1  ms-reserved  (128M)
      262178        2014         - free -  (1.0M)
      264192  1953230848  da0p2  ms-basic-data  (931G)
  1953495040        2016         - free -  (1.0M)

=>       1  60062499    da1  MBR  (29G)
         1        31         - free -  (16K)
        32  60062468  da1s1  fat32lba  (29G)

The 356G freebsd-ufs partition is the only one of the
freebsd-ufs partitions updated so far. This is the context
that had the problem with the ZFS-capable loaders --but no
later kernel problem when a not-ZFS-capable loader was used
(via copying over an older one --until I did the WITHOUT_ZFS=
build/install).

As for the ZFS-capable loader: Maybe it has problems when it
sees one or more of:

ms-reserved (on GPT)
ms-basic-data (on GPT) (NTFS file system)
ms-recovery (on GPT)
efi (on GPT)
linux-data (on MBR)
linux-lvm (on MBR)
fat32lba (on MBR)

(given that none of these is available in the Hyper-V context
as the virtual machine has been configured).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)