From owner-freebsd-hackers@FreeBSD.ORG Mon Dec 5 18:54:49 2005
Return-Path:
X-Original-To: freebsd-hackers@freebsd.org
Delivered-To: freebsd-hackers@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9237916A420
	for ; Mon, 5 Dec 2005 18:54:49 +0000 (GMT)
	(envelope-from jhb@freebsd.org)
Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0811343D5E
	for ; Mon, 5 Dec 2005 18:54:46 +0000 (GMT)
	(envelope-from jhb@freebsd.org)
Received: from server.baldwin.cx (unverified [66.23.211.162])
	by speedfactory.net (SurgeMail 3.5b3) with ESMTP id 3200096
	for multiple; Mon, 05 Dec 2005 13:52:18 -0500
Received: from localhost (john@localhost [127.0.0.1])
	by server.baldwin.cx (8.13.1/8.13.1) with ESMTP id jB5Is6L2040561;
	Mon, 5 Dec 2005 13:54:06 -0500 (EST)
	(envelope-from jhb@freebsd.org)
From: John Baldwin
To: Jacques Fourie
Date: Mon, 5 Dec 2005 13:22:32 -0500
User-Agent: KMail/1.8.2
References:
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200512051322.32993.jhb@freebsd.org>
X-Spam-Status: No, score=-2.8 required=4.2 tests=ALL_TRUSTED autolearn=failed
	version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on server.baldwin.cx
X-Server: High Performance Mail Server - http://surgemail.com r=1653887525
Cc: freebsd-hackers@freebsd.org
Subject: Re: 4.11 SMP issues on Intel SE7501CW2
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 05 Dec 2005 18:54:49 -0000

On Monday 05 December 2005 11:29 am, Jacques Fourie wrote:
> On 12/1/05, Jacques Fourie wrote:
> > Hi John,
> >
> > I booted a 6.0-RELEASE CD and the same thing (panic that freezes the
> > machine) happens. Can you think of any way in which to reliably reboot
> > the machine if this situation occurs?
> >
> > regards,
> > jacques
> >
> > On 12/1/05, John Baldwin wrote:
> > > On Thursday 01 December 2005 08:20 am, Jacques Fourie wrote:
> > > > Hi,
> > > >
> > > > With reference to the following thread :
> > > > http://groups.google.com/group/mailing.freebsd.smp/browse_thread/thread/bd45afab721e1a85/f66c8476272952af?lnk=st&q=%2Bfreebsd+%2B%22failed!%22+%2Bpanic&rnum=80#f66c8476272952af
> > > >
> > > > I am seeing the same issue on an Intel SE7501CW2 dual Xeon machine.
> > > > 6.0 as well as -current exhibits the same behaviour. Various postings
> > > > to the above thread suggests that this may be due to the APIC ID that
> > > > the BIOS claims is assigned to the CPU not being the actual APIC ID
> > > > assigned to the CPU. Does anyone have any new information on this
> > > > issue? If the subsequent panic succeeded in rebooting the machine
> > > > this would not be a big issue for me but unfortunately the machine
> > > > hangs after pressing 'y' to the "panic [y/n]" prompt. Is there a way
> > > > in which to initiate a hard reset in software?
> > >
> > > No, there hasn't been any recent info on this and I haven't had any
> > > recent reports of these problems, at least not on 5.x or 6.x. Can you
> > > try booting up a 5.4 or 6.0 CD to see if they boot up ok?
> > >
> > > --
> > > John Baldwin <>< http://www.FreeBSD.org/~jhb/
> > > "Power Users Use the Power to Serve" = http://www.FreeBSD.org
> > > _______________________________________________
> > > freebsd-hackers@freebsd.org mailing list
> > > http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> > > To unsubscribe, send any mail to
> > > "freebsd-hackers-unsubscribe@freebsd.org"
>
> Hi John,
>
> In the end a workaround that "solved" the issue for me (on 4.11) was
> to call cpu_reset() instead of panic() when failing to start an AP.
> This causes the box to reboot reliably instead of freezing and after
> the reboot all AP's also start without any issues. On FreeBSD 6.0
> (and -current) the panic() call successfully reboots the box so
> although the original problem of failing to start the AP is present on
> these platforms the problem is not that severe.
>
> In case anyone is interested in how to reproduce the problem (on 4.11,
> 6.0 or -current) - just cycle through a few soft reboot cycles (I
> placed a /sbin/reboot line in /etc/rc.local).

Hmm, weird. I have no idea why the CPU is failing to startup the first time.
Maybe it needs a longer timeout. You can try increasing the last DELAY() in
start_ap() in sys/i386/i386/mp_machdep.c.

--
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org
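
For anyone following along, the "longer timeout" idea can be illustrated with a rough userland sketch. This is not the kernel code from mp_machdep.c: sim_clock, sim_delay(), sim_ap and wait_for_ap() are hypothetical names invented for this example, and the 1 ms polling step and the timeout values are illustrative assumptions, not figures taken from the FreeBSD source. It only shows why a processor that comes up late is reported as a failure under a short wait and as a success under a longer one.

```c
#include <stdbool.h>

/* Hypothetical simulated clock; stands in for real elapsed time. */
struct sim_clock {
	unsigned long elapsed_us;
};

/* Hypothetical stand-in for a busy-wait delay: just advances the clock. */
static void
sim_delay(struct sim_clock *clk, unsigned long us)
{
	clk->elapsed_us += us;
}

/* Hypothetical AP that only reports in after ready_after_us microseconds. */
struct sim_ap {
	unsigned long ready_after_us;
};

static bool
ap_is_running(const struct sim_ap *ap, const struct sim_clock *clk)
{
	return (clk->elapsed_us >= ap->ready_after_us);
}

/*
 * Poll once per simulated millisecond until the AP reports in or
 * timeout_ms polls have been made. Returns 1 on success, 0 on failure.
 * The 1 ms step is an assumption made for this sketch.
 */
static int
wait_for_ap(const struct sim_ap *ap, struct sim_clock *clk,
    unsigned int timeout_ms)
{
	unsigned int ms;

	for (ms = 0; ms < timeout_ms; ms++) {
		if (ap_is_running(ap, clk))
			return (1);
		sim_delay(clk, 1000);	/* wait 1 ms before polling again */
	}
	return (0);
}
```

With an AP that needs 7 simulated seconds to come up, wait_for_ap() fails with a 5000 ms limit and succeeds with a 10000 ms limit. Whether the real hardware behaves this way is exactly what raising the delay would test.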