From owner-freebsd-stable@freebsd.org Wed Oct 10 19:07:06 2018
Date: Wed, 10 Oct 2018 12:07:03 -0700 (PDT)
From: Don Lewis
Subject: Re: early boot netisr_init() panic on older AMD SMP machine with recent 11-STABLE
To: freebsd-stable@FreeBSD.org, jhb@FreeBSD.org
cc: nwhitehorn@FreeBSD.org
List-Id: Production branch of FreeBSD source code

On 9 Oct, Don Lewis wrote:
> My desktop machine has an older AMD SMP CPU and tracks 11-STABLE.  For
> about six months or so it frequently panics early in boot.  If I retry a
> sufficient number of times I can get a successful boot, but this is
> rather annoying.
>
> A normal boot looks like this:
>
> Copyright (c) 1992-2018 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 11.2-STABLE #16 r339017M: Sat Sep 29 19:18:41 PDT 2018
>     dl@mousie.catspoiler.org:/usr/obj/usr/src/sys/GENERICDDB amd64
> FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)
> WARNING: WITNESS option enabled, expect reduced performance.
> VT(vga): resolution 640x480
> CPU: AMD Athlon(tm) II X3 450 Processor (3214.60-MHz K8-class CPU)
>   Origin="AuthenticAMD"  Id=0x100f53  Family=0x10  Model=0x5  Stepping=3
>   Features=0x178bfbff MOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x802009
>   AMD Features=0xee500800>
>   AMD Features2=0x37ff SKINIT,WDT>
>   SVM: NP,NRIP,NAsids=64
>   TSC: P-state invariant
> real memory  = 34359738368 (32768 MB)
> avail memory = 33275473920 (31733 MB)
> Event timer "LAPIC" quality 100
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
> FreeBSD/SMP: 1 package(s) x 3 core(s)
> ioapic0: Changing APIC ID to 2
> ioapic0 irqs 0-23 on motherboard
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> Timecounter "TSC-low" frequency 1607298818 Hz quality 800
> random: entropy device external interface
> [SNIP]
>
> An unsuccessful boot looks like this (hand transcribed):
> [SNIP]
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
> FreeBSD/SMP: 1 package(s) x 3 core(s)
> ioapic0: Changing APIC ID to 2
> ioapic0 irqs 0-23 on motherboard
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #1 Launched!
> Timecounter "TSC-low" frequency 1607298818 Hz quality 800
> panic: netisr_init: not on CPU 0
> cpuid = 2
> KDB: stack backtrace:
> db_trace_selfwrapper() ...
> vpanic() ...
> doadump() ...
> netisr_init() ...
> mi_startup() ...
> btext() ...
>
> This problem may be silently occurring on many other machines.  This
> machine is running a custom kernel with INVARIANTS and WITNESS.  The
> panic is coming from a KASSERT(), which is only checked when the kernel
> is built with INVARIANTS.
>
> This KASSERT was removed from 12.0-CURRENT with this commit:
>   https://svnweb.freebsd.org/base/head/sys/net/netisr.c?r1=301270&r2=302595
>
> Revision 302595
> Modified Mon Jul 11 21:25:28 2016 UTC (2 years, 2 months ago) by nwhitehorn
> File length: 44729 byte(s)
> Diff to previous 301270
>
> Remove assumptions in MI code that the BSP is CPU 0.
>
> Perhaps this should be MFC'ed, but it seems odd that the BSP is
> non-deterministic.

I now wonder if this panic is a side effect of EARLY_AP_STARTUP, which
was enabled by default in 11-STABLE GENERIC back in May, so the
timeframe fits.

Since the panic only happens with INVARIANTS enabled, most users are
unlikely to encounter this problem.
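
To illustrate why only INVARIANTS kernels trip over this, here is a
minimal user-space sketch of the general pattern: an assertion macro
that compiles to nothing unless INVARIANTS is defined.  It is not the
kernel's actual KASSERT() definition and not the real netisr_init();
every sketch_* / SKETCH_* name is hypothetical, and sketch_panic() only
records the failure instead of halting.

```c
#include <stdio.h>

/* Hypothetical stand-in for panic(): counts and reports, does not halt. */
static int sketch_panics = 0;

static void
sketch_panic(const char *msg)
{
	sketch_panics++;
	fprintf(stderr, "panic: %s\n", msg);
}

/*
 * Assumed shape of a KASSERT-style macro: the check exists only when
 * the code is compiled with -DINVARIANTS, otherwise it expands to an
 * empty statement and the condition is never evaluated.
 */
#ifdef INVARIANTS
#define SKETCH_CHECKS_ENABLED 1
#define SKETCH_KASSERT(exp, msg) do {					\
	if (!(exp))							\
		sketch_panic(msg);					\
} while (0)
#else
#define SKETCH_CHECKS_ENABLED 0
#define SKETCH_KASSERT(exp, msg) do { } while (0)
#endif

/*
 * Hypothetical model of an init routine that assumes it runs on CPU 0.
 * Returns the number of (simulated) panics the call triggered.
 */
static int
sketch_netisr_init(int curcpu)
{
	int before = sketch_panics;

	(void)curcpu;	/* unused when the check is compiled out */
	SKETCH_KASSERT(curcpu == 0, "netisr_init: not on CPU 0");
	return (sketch_panics - before);
}
```

Built without -DINVARIANTS, sketch_netisr_init(2) returns 0 because the
check is compiled out; built with -DINVARIANTS, the same call reports
one simulated panic.  That matches the observation above that a kernel
without INVARIANTS would not notice the condition.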