From: Warner Losh
Date: Sat, 21 Apr 2018 11:02:40 -0600
Subject: Re: RFC: Hiding per-CPU kernel output behind bootverbose
To: Colin Percival
Cc: Konstantin Belousov, Conrad Meyer, "freebsd-hackers@freebsd.org"
List-Id: Technical Discussions relating to FreeBSD

On Fri, Apr 20, 2018 at 6:11 PM, Colin Percival wrote:

> On 04/19/18 14:45, Konstantin Belousov wrote:
> > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote:
> >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote:
> >>> The 'CPU XX Launched' messages are very useful for initial diagnostic
> >>> of the SMP startup failures. You need to enable bootverbose to see the
> >>> hang details, but for initial hint they are required. Unfortunately, AP
> >>> startup hangs occur too often to pretend that this can be delegated to
> >>> very specific circumstances.
> >>
> >> Really? I don't know that I've ever seen an AP startup hang. How
> >> often do they occur?
> >
> > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS
> > supplier and its interaction with the x2APIC enablement, see madt.c:170
> > and below.
> >
> > There were several recent reports of the issue with Broadwell Xeon
> > machines, no additional data or resolution.
> >
> > There are sporadic reports of the problem, where I do not see
> > a clear commonality.
>
> Would it be sufficient for debugging purposes if I change the !bootverbose
> case from printing many lines of
>
> SMP: AP CPU #N Launched!
>
> to instead have a single
>
> SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 113 36
> 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 63 18 67
> 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 37 22
> 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 115 16
> 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 25 108
> 106 122 117 114 26 110 59 76 124 119
>
> ? (With each AP printing its number as it reaches the appropriate point?)
>
> This yields almost the same gain as silencing the launch messages completely,
> while still allowing you to see each CPU announcing itself.

The trouble is that you've got N CPUs that are all doing output at the
same time, so you'll need to synchronize them somehow. And how do you
know when the last one is done, especially if one of the CPUs doesn't
start? It looks great in theory, but I'm not sure how you'd make it work
in practice.

The other stuff (cpu and per-cpu stuff) is actually easy to pare down
entirely inside of newbus. I'll share a patch to do that.

Warner
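
The two problems raised above (serializing output from N CPUs, and knowing when the last one is done even if one never starts) can be illustrated with a small userspace simulation. This is only a sketch under stated assumptions, not FreeBSD kernel code and not anyone's proposed patch: Python threads stand in for APs, a lock stands in for whatever would serialize console output, and the boot processor waits with a timeout so that a hung AP cannot stall it forever. The names launch_aps and format_line, the timeout value, and the "(no response from: ...)" suffix are all hypothetical.

```python
import threading


def launch_aps(cpu_ids, stuck=(), timeout=0.5):
    """Simulate a boot processor waiting for APs to announce themselves.

    Returns (order, missing): the CPU IDs in the order they checked in,
    and the sorted IDs that had not checked in when the wait ended.
    """
    order = []                       # announcement order, guarded by lock
    lock = threading.Lock()          # stands in for console serialization
    all_in = threading.Condition(lock)

    def ap_entry(cpu_id):
        if cpu_id in stuck:
            return                   # models an AP that hangs before announcing
        with all_in:                 # only one AP "prints" at a time
            order.append(cpu_id)
            if len(order) == len(cpu_ids):
                all_in.notify()      # last AP wakes the waiting boot processor

    threads = [
        threading.Thread(target=ap_entry, args=(cpu,), daemon=True)
        for cpu in cpu_ids
    ]
    for t in threads:
        t.start()

    with all_in:
        # Bounded wait: returns early once every AP has checked in,
        # otherwise gives up after `timeout` seconds.
        all_in.wait_for(lambda: len(order) == len(cpu_ids), timeout=timeout)
        seen = list(order)

    missing = sorted(set(cpu_ids) - set(seen))
    return seen, missing


def format_line(order, missing):
    """Build the single summary line; the 'no response' suffix is hypothetical."""
    line = "SMP: Launching AP CPUs: " + " ".join(str(c) for c in order)
    if missing:
        line += " (no response from: " + " ".join(str(c) for c in missing) + ")"
    return line


if __name__ == "__main__":
    order, missing = launch_aps(range(1, 8), stuck={3}, timeout=0.2)
    print(format_line(order, missing))
```

The order of the IDs varies from run to run, much like the shuffled list in the quoted example. The point of the bounded wait is that the summary line can still be finished when an AP never checks in, at the cost of choosing a timeout; whether that trade-off is acceptable in the real SMP startup path is exactly the open question.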