References: <79973E2B-F5C4-4E7C-B92B-1C8D4441C7D1@yahoo.com>
From: Warner Losh
Date: Sun, 21 Oct 2018 21:30:03 -0600
Subject: Re: head -r338804 boots threadripper 1950X fine; head -r338810+ do not; -r338807 seems implicated
To: Mark Millard
Cc: Konstantin Belousov, FreeBSD Current, FreeBSD-STABLE Mailing List

On Sun, Oct 21, 2018 at 9:28 PM Warner Losh wrote:

>
> On Sun, Oct 21, 2018 at 8:57 PM Mark Millard via freebsd-stable <
> freebsd-stable@freebsd.org> wrote:
>
>> [I built based on WITHOUT_ZFS= for other reasons. But,
>> after installing the build, Hyper-V based boots are
>> working.]
>>
>> On 2018-Oct-20, at 2:09 AM, Mark Millard wrote:
>>
>> > On 2018-Oct-20, at 1:39 AM, Mark Millard wrote:
>> >
>> >> I attempted to jump from head -r334014 to -r339076
>> >> on a threadripper 1950X board and the boot fails.
>> >> This is both native booting and under Hyper-V,
>> >> same machine and root file system in both cases.
>> >
>> > I did my investigation under Hyper-V after seeing
>> > a boot failure native.
>> >
>> > Looks like the native failure is even earlier,
>> > before db> is even possible, possibly during
>> > early loader activity.
>> >
>> > So this report is really for running under
>> > Hyper-V: -r338804 boots and -r338810 does
>> > not. By contrast -r334804 does not boot native.
>> > (But I've little information for that context.)
>> >
>> > Sorry for the confusion. I rushed the report
>> > in hopes of getting to sleep. It was not to be.
>> >
>> >> It fails just after the FreeBSD/SMP lines,
>> >> reporting "kernel trap 9 with interrupts disabled".
>> >>
>> >> It fails in pmap_force_invalidate_cache_range at
>> >> a clflushl (%rax) instruction that produces a
>> >> "Fatal trap 9: general protection fault while
>> >> in kernel mode". cpuid=0 apic id=00
>> >>
>> >> I used kernel.txz files from:
>> >>
>> >> https://artifact.ci.freebsd.org/snapshot/head/r*/amd64/amd64/
>> >>
>> >> to narrow the range of kernel builds for working -> failing
>> >> and got:
>> >>
>> >> -r338804 boots fine
>> >> (no amd64 kernel builds between to try)
>> >> -r338810+ fails (any that I tried, anyway)
>> >>
>> >> In that range is -r338807 :
>> >>
>> >> QUOTE
>> >> Author: kib
>> >> Date: Wed Sep 19 19:35:02 2018
>> >> New Revision: 338807
>> >> URL: https://svnweb.freebsd.org/changeset/base/338807
>> >>
>> >> Log:
>> >> Convert x86 cache invalidation functions to ifuncs.
>> >>
>> >> This simplifies the runtime logic and reduces the number of
>> >> runtime-constant branches.
>> >>
>> >> Reviewed by: alc, markj
>> >> Sponsored by: The FreeBSD Foundation
>> >> Approved by: re (gjb)
>> >> Differential revision: https://reviews.freebsd.org/D16736
>> >>
>> >> Modified:
>> >> head/sys/amd64/amd64/pmap.c
>> >> head/sys/amd64/include/pmap.h
>> >> head/sys/dev/drm2/drm_os_freebsd.c
>> >> head/sys/dev/drm2/i915/intel_ringbuffer.c
>> >> head/sys/i386/i386/pmap.c
>> >> head/sys/i386/i386/vm_machdep.c
>> >> head/sys/i386/include/pmap.h
>> >> head/sys/x86/iommu/intel_utils.c
>> >> END QUOTE
>> >>
>> >> There do seem to be changes associated with
>> >> clflush(...) use.
>> >> Looking at:
>> >>
>> >> https://svnweb.freebsd.org/base/head/sys/amd64/amd64/pmap.c?annotate=339432
>> >>
>> >> it appears that pmap_force_invalidate_cache_range has not
>> >> changed since -r338807.
>> >>
>> >> It seems that -r338806 and -r338810 would be unlikely
>> >> contributors.
>> >
>>
>> I went after my native-boot loader problem first because I
>> could switch kernels via the loader for booting FreeBSD under
>> Hyper-V. Switching loaders is more of a problem.
>>
>> In order to avoid the loader-time crash I switched to building
>> and installing based on WITHOUT_ZFS= . I've had no active use of
>> ZFS in years. (The old official-build loaders that worked were
>> non-ZFS ones.)
>>
>> This took care of the native-boot loader-crash --and, to my
>> surprise, also the Hyper-V-boot kernel-time crash.
>>
>> My private builds now boot the 1950X in both contexts just
>> fine.
>>
>> During my early investigation I did pick up specific changes
>> from after -r339076 that seemed to be tied to Ryzen and such.
>> (They made no difference to the boot problems at the time
>> but I saw no reason to remove them.)
>>
>> # uname -apKU
>> FreeBSD FBSDFSSD 12.0-ALPHA8 FreeBSD 12.0-ALPHA8 #5 r339076:339432M: Sun Oct 21 16:44:25 PDT 2018 markmi@FBSDFSSD:/usr/obj/amd64_clang/amd64.amd64/usr/src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1200084 1200084
>

(stupid gmail)

The phrase "no active use" bothers me. What does that mean? Are there
any ZFS pools, or any disks that have any whiff of a ZFSish thing on
them at all? Clearly, there's something in the ZFS boot loader that's
freaking out over something on your system, but absent that
information I can't help you.

Warner