From owner-freebsd-current@FreeBSD.ORG Thu Apr 8 11:59:54 2004
Date: Thu, 8 Apr 2004 11:59:49 -0700
From: Marcel Moolenaar
To: Robert Watson
Message-ID: <20040408185949.GA22954@dhcp01.pn.xcllnt.net>
References: <20040408154004.GA22500@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="n8g4imXOkfNTN/H1"
Content-Disposition: inline
User-Agent: Mutt/1.4.2.1i
cc: current@freebsd.org
Subject: Re: panic on one cpu leaves others running...
List-Id: Discussions about the use of FreeBSD-current

--n8g4imXOkfNTN/H1
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Apr 08, 2004 at 11:51:24AM -0400, Robert Watson wrote:
>
> > > Presumably in large part because I'm in code that doesn't require Giant,
> > > so there are no lock conflicts.
> >
> > I don't think that's the case. It think we're just not stopping the CPUs
> > or keep them stopped.
>
> I agree with that interpretation -- I was suggesting that the reason this
> problem might not be noticed is that a lot of our code paths require
> Giant, and it's only when you panic in code without Giant that

Ah, ok. The thing that strikes me as odd, if not wrong, is that we use
PCPU(CPUID) to update the stopped_cpus mask, while we should be using
PCPU(CPUMASK) for that. See attached patch (untested).

Am I off-base here?

--
 Marcel Moolenaar         USPA: A-39004          marcel@xcllnt.net

--n8g4imXOkfNTN/H1
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="apic.diff"

Index: apic_vector.s
===================================================================
RCS file: /home/ncvs/src/sys/i386/i386/apic_vector.s,v
retrieving revision 1.97
diff -u -r1.97 apic_vector.s
--- apic_vector.s	3 Feb 2004 22:00:41 -0000	1.97
+++ apic_vector.s	8 Apr 2004 18:58:31 -0000
@@ -336,7 +336,7 @@
 	call	CNAME(savectx)		/* Save process context */
 	addl	$4, %esp
-	movl	PCPU(CPUID), %eax
+	movl	PCPU(CPUMASK), %eax
 	lock
 	btsl	%eax, CNAME(stopped_cpus) /* stopped_cpus |= (1<