From: Jacques Fourie
To: John Baldwin
Cc: freebsd-hackers@freebsd.org
Date: Mon, 5 Dec 2005 18:29:02 +0200
Subject: Re: 4.11 SMP issues on Intel SE7501CW2
References: <200512011133.10441.jhb@freebsd.org>

On 12/1/05, Jacques Fourie wrote:
> Hi John,
>
> I booted a 6.0-RELEASE CD and the same thing (panic that freezes the
> machine) happens.
> Can you think of any way in which to reliably reboot the machine if
> this situation occurs?
>
> regards,
> jacques
>
> On 12/1/05, John Baldwin wrote:
> > On Thursday 01 December 2005 08:20 am, Jacques Fourie wrote:
> > > Hi,
> > >
> > > With reference to the following thread :
> > > http://groups.google.com/group/mailing.freebsd.smp/browse_thread/thread/bd45afab721e1a85/f66c8476272952af?lnk=st&q=%2Bfreebsd+%2B%22failed!%22+%2Bpanic&rnum=80#f66c8476272952af
> > >
> > > I am seeing the same issue on an Intel SE7501CW2 dual Xeon machine. 6.0 as
> > > well as -current exhibits the same behaviour. Various postings to the
> > > above thread suggests that this may be due to the APIC ID that the BIOS
> > > claims is assigned to the CPU not being the actual APIC ID assigned to the
> > > CPU. Does anyone have any new information on this issue? If the subsequent
> > > panic succeeded in rebooting the machine this would not be a big issue for
> > > me but unfortunately the machine hangs after pressing 'y' to the "panic
> > > [y/n]" prompt. Is there a way in which to initiate a hard reset in
> > > software?
> >
> > No, there hasn't been any recent info on this and I haven't had any recent
> > reports of these problems, at least not on 5.x or 6.x. Can you try booting
> > up a 5.4 or 6.0 CD to see if they boot up ok?
> >
> > --
> > John Baldwin <>< http://www.FreeBSD.org/~jhb/
> > "Power Users Use the Power to Serve" = http://www.FreeBSD.org

Hi John,

In the end a workaround that "solved" the issue for me (on 4.11) was to call cpu_reset() instead of panic() when failing to start an AP.
This causes the box to reboot reliably instead of freezing, and after the reboot all APs start without any issues. On FreeBSD 6.0 (and -current) the panic() call successfully reboots the box, so although the original problem of failing to start the AP is still present on these versions, it is not as severe there.

In case anyone is interested in reproducing the problem (on 4.11, 6.0 or -current): just cycle through a few soft reboots (I placed a /sbin/reboot line in /etc/rc.local).

regards,
jacques
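
For anyone who wants to picture the workaround, here is a rough userland sketch of the decision it changes. It is not the actual kernel diff: start_ap_stub(), handle_ap_start() and the two enums are hypothetical names made up for illustration, and the printed message is not the kernel's real text. In the kernel, AP_PANIC would correspond to the existing panic() call and AP_RESET to calling cpu_reset() instead.

```c
#include <stdbool.h>
#include <stdio.h>

/* Outcome of trying to start one application processor (AP). */
enum ap_action { AP_OK, AP_PANIC, AP_RESET };

/* Hypothetical policy knob: stock behaviour vs. the workaround. */
enum ap_policy { POLICY_PANIC, POLICY_RESET };

/*
 * Hypothetical stand-in for the kernel's attempt to start an AP.
 * 'responds' simulates whether the AP came up.
 */
static bool start_ap_stub(int cpu, bool responds)
{
    (void)cpu;
    return responds;
}

/*
 * Decide what to do after an AP start attempt.
 * With POLICY_PANIC a failure maps to panic(); with POLICY_RESET
 * (the workaround) it maps to cpu_reset(), i.e. an immediate reboot.
 */
static enum ap_action handle_ap_start(int cpu, bool responds,
                                      enum ap_policy policy)
{
    if (start_ap_stub(cpu, responds))
        return AP_OK;

    printf("AP #%d failed to start\n", cpu);
    return policy == POLICY_RESET ? AP_RESET : AP_PANIC;
}
```

Calling handle_ap_start(1, false, POLICY_RESET) models the patched 4.11 path: the failure is reported and the result is a reset rather than a panic that may hang.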