From: Peter Holm
To: John Baldwin
Cc: Kostik Belousov, freebsd-stable@freebsd.org, Mark Kirkwood, James Seward
Date: Sat, 24 May 2008 16:34:53 +0200
Subject: Re: BTX loader hangs after version info
Message-ID: <20080524143453.GA51069@peter.osted.lan>
In-Reply-To: <200805231811.01936.jhb@freebsd.org>
List-Id: Production branch of FreeBSD source code

On Fri, May 23, 2008 at 06:11:01PM -0400,
John Baldwin wrote:
> On Friday 23 May 2008 09:26:45 am Kostik Belousov wrote:
> > On Fri, May 23, 2008 at 08:29:09AM -0400, John Baldwin wrote:
> > > On Friday 23 May 2008 07:53:11 am Kostik Belousov wrote:
> > > > On Fri, May 23, 2008 at 01:22:55PM +1200, Mark Kirkwood wrote:
> > > > > James Seward wrote:
> > > > > >Hello,
> > > > > >
> > > > > >Two days ago I csup'd my desktop at home, which was running RELENG_7
> > > > > >from about 7.0-RELEASE time, to bring it up-to-date (still on
> > > > > >RELENG_7). I followed my usual buildkernel/world procedure (the usual
> > > > > >one) which has worked fine all the way since 5.x. After installing
> > > > > >kernel and restarting in single user, it was working fine. However,
> > > > > >following installworld it will not boot.
> > > > > >
> > > > > >It stops immediately after "BTX loader 1.00 BTX version 1.02", but
> > > > > >with the cursor on the line *above* the first "B". Nothing further
> > > > > >happens, but the system responds to Ctrl-Alt-Del.
> > > > > >
> > > > > >I have managed to start it using the install CD and csup'd back to a
> > > > > >version just before the commit to BTX that moved it to 1.02 (March
> > > > > >18th, I think). However, that version too hangs after "BTX loader 1.00
> > > > > >BTX version 1.01".
> > > > > >
> > > > > >My desktop is currently building RELENG_7_0 to see if that will work,
> > > > > >but I won't know that until later as I'm at work and it is at home :)
> > > > > >
> > > > > >The install CD (BTX 1.00/1.01) boots fine. Nothing else changed on my
> > > > > >system between the last successful boot and the unsuccessful one.
> > > > > >
> > > > > >Any suggestions/advice for what I can try next, or what I can do to
> > > > > >help the troubleshooting process?
> > > > > >
> > > > > >My desktop is an Athlon64 but I am using i386, on an Asus A8V-E Deluxe
> > > > > >board.
> > > > >
> > > > > FWIW - I am seeing this too, on a Supermicro P3TDDE. 7-STABLE src from
> > > > > 28-Feb is fine, but Mar, Apr, May code all hangs after printing "loading
> > > > > /boot/defaults/loader.conf" - presumably reading my /boot/loader.conf?
> > > > >
> > > > > Interestingly I can usually get it to boot by escaping to the loader
> > > > > prompt and then just pressing return.
> > > > >
> > > > > Oddly some other machines (Supermicro P3TDER and Asus PRO31J Laptop)
> > > > > behave normally with src from Mar->May.
> > > > >
> > > > > In all cases the canonical procedure from UPDATING was used (buildworld,
> > > > > kernel, reboot single, mergemaster -p, installworld, delete-old,
> > > > > mergemaster, reboot).
> > > > >
> > > > > I am happy to help collect some debug info (how do you switch this on for
> > > > > the loader?), tho the machine exhibiting the problem is my workstation
> > > > > (of course)!
> > > >
> > > > Try to install new bootblock.
> > >
> > > I would be wary of that as it might make things worse? These problems are all
> > > from starting /boot/loader. boot2 is still working fine and thus there is
> > > still the possibility of using boot2 to load /boot/loader.old as a workaround.
> > > If you update boot2 and it breaks you can't fix that w/o booting off of some
> > > other media such as a CD.
> > >
> > > Debugging these hangs is not easy to do remotely. If you know assembly then
> > > there are some things you can play with. For example, in the case where it
> > > hangs after printing out the BTX version (from btxldr.S) you could start
> > > adding debugging to btx.S to print out '.' characters in various places and
> > > see how many get printed out before it hangs. However, doing this requires
> > > familiarity with assembly and is a lot easier with physical access to a box.
> >
> > When I worked on my version of the realbtx, I sometimes experienced hangs when
> > vm86 btx ran before real-mode btx. I did not investigate it then, only noted
> > the issue.
>
> Try this patch. I'm not 100% certain this will fix it as I can't reproduce
> the issue, but I think it might help. Specifically, when the boot code makes
> a v86 call, the loader/boot2/whatever swaps in/out a new set of registers via
> the v86 structure including the eflags register. However, none of the boot
> programs actually initialized the v86 structure. Thus, the BIOS routines
> would start off running with whatever garbage was in v86.efl when each boot
> program started. This meant that we could end up invoking BIOS routines with
> interrupts disabled, and I think this might explain a hard hang (if a BIOS
> routine was waiting for an interrupt the interrupt would never fire). The
> patch fixes all the boot programs to initialize v86 to a better known state.
> At the least it sets v86.efl to a sane value (0x202) rather than random. (The
> random might have always been 0x0 BTW, not sure on that one.)
>

I can confirm that this patch fixes the loader problem seen with my old
Tyan S2720 MB.

- Peter
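
For readers following the explanation of v86.efl above, here is a minimal C sketch of the idea: give the register block a known state before any BIOS call so that the call starts with interrupts enabled. This is not the patch from the thread. The struct layout and the names v86_regs, v86_regs_init and v86_interrupts_enabled are hypothetical and only illustrate the point; the only facts taken from the thread and the x86 architecture are the value 0x202 and the meaning of the two EFLAGS bits that make it up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86 EFLAGS bits (architectural). */
#define EFL_RESERVED1 0x002u	/* bit 1: reserved, always 1 */
#define EFL_IF        0x200u	/* bit 9: interrupt enable flag */

/*
 * Hypothetical, simplified register block handed to a v86 call.
 * This is an assumption for illustration, NOT the real BTX layout.
 */
struct v86_regs {
	uint32_t eax, ebx, ecx, edx;
	uint32_t efl;
};

/* Put the block into a known state: all zero, interrupts enabled. */
static void
v86_regs_init(struct v86_regs *r)
{
	assert(r != NULL);
	memset(r, 0, sizeof(*r));
	r->efl = EFL_RESERVED1 | EFL_IF;	/* 0x202, as in the thread */
}

/* Nonzero if code entered with these flags can receive interrupts. */
static int
v86_interrupts_enabled(const struct v86_regs *r)
{
	return ((r->efl & EFL_IF) != 0);
}
```

With efl left at 0x0 (the value John suspects the uninitialized field may have had), v86_interrupts_enabled() returns 0, which matches the hang he describes: a BIOS routine waiting for an interrupt would never see it. After v86_regs_init() the same check returns nonzero.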