Date: Tue, 9 Oct 2018 21:30:29 -0700 (PDT)
From: Don Lewis
Subject: early boot netisr_init() panic on older AMD SMP machine with recent 11-STABLE
To: freebsd-stable@FreeBSD.org
cc: nwhitehorn@FreeBSD.org

My desktop machine has an older AMD SMP CPU and tracks 11-STABLE. For about the last six months it has frequently panicked early in boot. If I retry enough times I can get a successful boot, but this is rather annoying.

A normal boot looks like this:

Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.2-STABLE #16 r339017M: Sat Sep 29 19:18:41 PDT 2018
    dl@mousie.catspoiler.org:/usr/obj/usr/src/sys/GENERICDDB amd64
FreeBSD clang version 6.0.1 (tags/RELEASE_601/final 335540) (based on LLVM 6.0.1)
WARNING: WITNESS option enabled, expect reduced performance.
VT(vga): resolution 640x480
CPU: AMD Athlon(tm) II X3 450 Processor (3214.60-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x100f53  Family=0x10  Model=0x5  Stepping=3
  Features=0x178bfbff
  Features2=0x802009
  AMD Features=0xee500800
  AMD Features2=0x37ff
  SVM: NP,NRIP,NAsids=64
  TSC: P-state invariant
real memory  = 34359738368 (32768 MB)
avail memory = 33275473920 (31733 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
FreeBSD/SMP: 1 package(s) x 3 core(s)
ioapic0: Changing APIC ID to 2
ioapic0 irqs 0-23 on motherboard
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
Timecounter "TSC-low" frequency 1607298818 Hz quality 800
random: entropy device external interface
[SNIP]

An unsuccessful boot looks like this (hand transcribed):

[SNIP]
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
FreeBSD/SMP: 1 package(s) x 3 core(s)
ioapic0: Changing APIC ID to 2
ioapic0 irqs 0-23 on motherboard
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
Timecounter "TSC-low" frequency 1607298818 Hz quality 800
panic: netisr_init: not on CPU 0
cpuid = 2
KDB: stack backtrace:
db_trace_selfwrapper() ...
vpanic() ...
doadump() ...
netisr_init() ...
mi_startup() ...
btext() ...

This problem may be silently occurring on many other machines. This machine is running a custom kernel with INVARIANTS and WITNESS. The panic comes from a KASSERT(), which is only checked when the kernel is built with INVARIANTS.
This KASSERT was removed from 12.0-CURRENT with this commit:

https://svnweb.freebsd.org/base/head/sys/net/netisr.c?r1=301270&r2=302595

  Revision 302595
  Modified Mon Jul 11 21:25:28 2016 UTC (2 years, 2 months ago) by nwhitehorn
  File length: 44729 byte(s)
  Diff to previous 301270

  Remove assumptions in MI code that the BSP is CPU 0.

Perhaps this should be MFC'ed, but it seems odd that the BSP is non-deterministic.
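
To illustrate why only INVARIANTS kernels would notice this, here is a minimal user-space sketch of an assertion that is compiled in only when INVARIANTS is defined. It is not the FreeBSD source: SKETCH_KASSERT, sketch_panic(), sketch_netisr_init() and sketch_demo() are hypothetical names made up for this example, and the "panic" merely records a flag instead of stopping the machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Comment this out to model a kernel built without INVARIANTS. */
#define INVARIANTS 1

/* Hypothetical stand-in for panic(): report and remember the failure. */
static bool sketch_panicked = false;

static void sketch_panic(const char *func, const char *msg)
{
    fprintf(stderr, "panic: %s: %s\n", func, msg);
    sketch_panicked = true;
}

/*
 * Hypothetical stand-in for KASSERT(): the check exists only in
 * INVARIANTS builds; otherwise the expression is never evaluated.
 */
#ifdef INVARIANTS
#define SKETCH_KASSERT(exp, msg) \
    do { if (!(exp)) sketch_panic(__func__, (msg)); } while (0)
#else
#define SKETCH_KASSERT(exp, msg) \
    do { (void)sizeof(exp); } while (0)
#endif

/*
 * Models an init routine that assumes it runs on CPU 0.
 * Returns true if initialization proceeded without a "panic".
 */
bool sketch_netisr_init(int curcpu)
{
    sketch_panicked = false;
    SKETCH_KASSERT(curcpu == 0, "not on CPU 0");
    return !sketch_panicked;
}

/* Usage, valid while INVARIANTS is defined above. */
void sketch_demo(void)
{
    assert(sketch_netisr_init(0));   /* boot CPU is 0: check passes */
    assert(!sketch_netisr_init(2));  /* boot CPU is 2: check fires  */
}
```

With the #define removed, sketch_netisr_init(2) returns true and nothing is reported, which is the sense in which the same condition could go unnoticed on kernels built without INVARIANTS.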