From owner-freebsd-hackers@FreeBSD.ORG Fri Nov 16 07:54:24 2012 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DCA7A971 for ; Fri, 16 Nov 2012 07:54:24 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 02F468FC12 for ; Fri, 16 Nov 2012 07:54:23 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA12848; Fri, 16 Nov 2012 09:54:21 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1TZGkn-00014Z-1O; Fri, 16 Nov 2012 09:54:21 +0200 Message-ID: <50A5F12C.1050902@FreeBSD.org> Date: Fri, 16 Nov 2012 09:54:20 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:16.0) Gecko/20121030 Thunderbird/16.0.2 MIME-Version: 1.0 To: Ryan Stone Subject: Re: stop_cpus_hard when multiple CPUs are panicking from an NMI References: In-Reply-To: X-Enigmail-Version: 1.4.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Nov 2012 07:54:25 -0000 on 16/11/2012 00:58 Ryan Stone said the following: > At work we have some custom watchdog hardware that sends an NMI upon > expiry. We've modified the kernel to panic when it receives the watchdog > NMI. I've been trying the "stop_scheduler_on_panic" mode, and I've > discovered that when my watchdog expires, the system gets completely > wedged. After some digging, I've discovered is that I have multiple CPUs > getting the watchdog NMI and trying to panic concurrently. One of the CPUs > wins, and the rest spin forever in this code: > > /* > * We don't want multiple CPU's to panic at the same time, so we > * use panic_cpu as a simple spinlock. We have to keep checking > * panic_cpu if we are spinning in case the panic on the first > * CPU is canceled. > */ > if (panic_cpu != PCPU_GET(cpuid)) > while (atomic_cmpset_int(&panic_cpu, NOCPU, > PCPU_GET(cpuid)) == 0) > while (panic_cpu != NOCPU) > ; /* nothing */ > > The system wedges when stop_cpus_hard() is called, which sends NMIs to all > of the other CPUs and waits for them to acknowledge that they are stopped > before returning. However the CPU will not deliver an NMI to a CPU that is > already handling an NMI, so the other CPUs that got a watchdog NMI and are > spinning will never go into the NMI handler and acknowledge that they are > stopped. I thought about this issue and fixed (in my tree) in a different way: http://people.freebsd.org/~avg/cpu_stop-race.diff The need for spinlock_enter in the patch in not entirely clear. The main idea is that a CPU which calls cpu_stop and loses a race should voluntary enter cpustop_handler. I am also not sure about MI-cleanness of this patch. P.S. I also have the hard stop and the "soft" stop separate in my tree. Just in case there is a "simultaneous" occurrence of the soft stop happening for whatever reason and the hard stop for panic or kdb entry, I always want the hard stop to override the soft stop. > I've been able to work around this with the following hideous hack: > > --- kern_shutdown.c 2012-08-17 10:25:02.000000000 -0400 > +++ kern_shutdown.c 2012-11-15 17:04:10.000000000 -0500 > @@ -658,11 +658,15 @@ > * panic_cpu if we are spinning in case the panic on the first > * CPU is canceled. > */ > - if (panic_cpu != PCPU_GET(cpuid)) > + if (panic_cpu != PCPU_GET(cpuid)) { > while (atomic_cmpset_int(&panic_cpu, NOCPU, > - PCPU_GET(cpuid)) == 0) > + PCPU_GET(cpuid)) == 0) { > + atomic_set_int(&stopped_cpus, PCPU_GET(cpumask)); > while (panic_cpu != NOCPU) > ; /* nothing */ > + } > + atomic_clear_int(&stopped_cpus, PCPU_GET(cpumask)); > + } > > if (stop_scheduler_on_panic) { > if (panicstr == NULL && !kdb_active) > > > But I'm hoping that somebody has some ideas on a better way to fix this > kind of problem. -- Andriy Gapon