From owner-freebsd-hackers@freebsd.org Sat Apr 21 20:19:47 2018 Return-Path: Delivered-To: freebsd-hackers@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id E8630F96D2B for ; Sat, 21 Apr 2018 20:19:46 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 6B8CF68949; Sat, 21 Apr 2018 20:19:46 +0000 (UTC) (envelope-from kib@freebsd.org) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id w3LKJTQL068050 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Sat, 21 Apr 2018 23:19:32 +0300 (EEST) (envelope-from kib@freebsd.org) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua w3LKJTQL068050 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id w3LKJTYW068049; Sat, 21 Apr 2018 23:19:29 +0300 (EEST) (envelope-from kib@freebsd.org) X-Authentication-Warning: tom.home: kostik set sender to kib@freebsd.org using -f Date: Sat, 21 Apr 2018 23:19:29 +0300 From: Konstantin Belousov To: Warner Losh Cc: gljennjohn@gmail.com, "freebsd-hackers@freebsd.org" , Colin Percival , Conrad Meyer Subject: Re: RFC: Hiding per-CPU kernel output behind bootverbose Message-ID: <20180421201929.GP6887@kib.kiev.ua> References: <01000162df15f856-1e5d2641-2a72-4250-8d8e-adcd47bc5db4-000000@email.amazonses.com> <20180419204405.GE6887@kib.kiev.ua> <20180419214550.GF6887@kib.kiev.ua> <01000162e58a466e-98f0305b-1723-467a-bc49-342c3fa9fc5b-000000@email.amazonses.com> <20180421092049.GM6887@kib.kiev.ua> <20180421123544.56d7e690@ernst.home> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.5 (2018-04-13) X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.25 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 21 Apr 2018 20:19:47 -0000 On Sat, Apr 21, 2018 at 01:52:46PM -0600, Warner Losh wrote: > On Sat, Apr 21, 2018 at 4:35 AM, Gary Jennejohn > wrote: > > > On Sat, 21 Apr 2018 12:20:49 +0300 > > Konstantin Belousov wrote: > > > > > On Sat, Apr 21, 2018 at 12:11:07AM +0000, Colin Percival wrote: > > > > On 04/19/18 14:45, Konstantin Belousov wrote: > > > > > On Thu, Apr 19, 2018 at 02:37:56PM -0700, Conrad Meyer wrote: > > > > >> On Thu, Apr 19, 2018 at 1:44 PM, Konstantin Belousov wrote: > > > > >>> The 'CPU XX Launched' messages are very useful for initial > > diagnostic > > > > >>> of the SMP startup failures. You need to enable bootverbose to see > > the > > > > >>> hang details, but for initial hint they are required. > > Unfortunately, AP > > > > >>> startup hangs occur too often to pretend that this can be > > delegated to > > > > >>> very specific circumstances. > > > > >> > > > > >> Really? I don't know that I've ever seen an AP startup hang. How > > > > >> often do they occur? > > > > > > > > > > It was epidemic with Sandy Bridge, mostly correlated to specific BIOS > > > > > supplier and its interaction with the x2APIC enablement, see > > madt.c:170 > > > > > and below. > > > > > > > > > > There were several recent reports of the issue with Broadwell Xeon > > > > > machines, no additional data or resolution. > > > > > > > > > > There are sporadic reports of the problem, where I do not see > > > > > a clear commonality. > > > > > > > > Would it be sufficient for debugging purposes if I change the > > !bootverbose > > > > case from printing many lines of > > > > > > > > SMP: AP CPU #N Launched! > > > > > > > > to instead have a single > > > > > > > > SMP: Launching AP CPUs: 86 73 111 21 8 77 100 28 57 42 10 60 87 88 41 > > 113 36 > > > > 19 72 46 92 52 24 81 90 3 107 96 9 14 80 118 29 121 62 74 56 55 1 12 > > 63 18 67 > > > > 13 45 102 33 94 69 68 93 83 48 31 30 32 51 89 71 78 64 84 123 61 40 47 > > 37 22 > > > > 54 101 38 4 97 44 17 109 104 5 85 43 2 99 39 65 95 53 58 66 91 125 23 > > 115 16 > > > > 35 79 112 103 82 7 75 11 6 98 15 126 127 20 70 34 105 27 50 116 120 49 > > 25 108 > > > > 106 122 117 114 26 110 59 76 124 119 > > > > > > > > ? (With each AP printing its number as it reaches the appropriate > > point?) > > > > > > > > This yields almost the same gain as silencing the launch messages > > completely, > > > > while still allowing you to see each CPU announcing itself. > > > I am fine with the behaviour, but I am not sure how would you implement > > > this. printf(9) buffers the output, you need to flush it somehow. > > > > > > > And printf(9) calls vprintf(9) which in turn calls _vprintf which > > will allocate a buffer on the stack (bad idea) if the option > > PRINTF_BUFR_SIZE is set in the kernel config file. So it seems > > that output may even be double buffered. > > > > I don't think either of these is true (the double buffered part,. nor the > buffered part). Well, they are both kinda true, but they don't matter. It's > all to make dmesg and friends work. There's other issues, but at this stage > of the boot we're single threaded and the CPUs that are launching take out > a spin lock. > > We always call putchar for each character that kvprintf(). So any buffering > that's done is in addition to putchar going to the console. > > diff --git a/sys/x86/x86/mp_x86.c b/sys/x86/x86/mp_x86.c > index 3fcf7aa25152..06973b1ea6d5 100644 > --- a/sys/x86/x86/mp_x86.c > +++ b/sys/x86/x86/mp_x86.c > @@ -1020,7 +1020,7 @@ init_secondary_tail(void) > smp_cpus++; > > CTR1(KTR_SMP, "SMP: AP CPU #%d Launched", cpuid); > - printf("SMP: AP CPU #%d Launched!\n", cpuid); > + printf("%d%s", cpuid, smp_cpus == mp_ncpus ? "\n" : " "); With this, my concern is to have the cpu ids printed immediately, instead of being buffered until \n is printed as well. Please ensure that this is the case. > > /* Determine if we are a logical CPU. */ > if (cpu_info[PCPU_GET(apic_id)].cpu_hyperthread) > > is enough to give them all to me on a single line. We'd need another printf > that says 'Starting APs: ' since otherwise we see: > > andom: unblocking device. > ioapic0 irqs 0-23 on motherboard > ioapic1 irqs 24-47 on motherboard > 1 8 10 20 2 19 11 12 22 4 14 23 16 13 3 18 5 17 21 9 15 6 7 > Timecounter "TSC-low" frequency 1350029824 Hz quality 1000 > > which doesn't make a lot of sense. > > Comments? > > Warner