From owner-freebsd-stable@FreeBSD.ORG Thu Apr 10 21:23:24 2014
From: Doug Hardie
Date: Thu, 10 Apr 2014 14:23:08 -0700
Subject: Re: 9.2 Boot Problem
To: freebsd-stable@freebsd.org
Cc: "dteske@FreeBSD.org Teske", Chris H
Message-Id: <89290759-E5C2-4991-B644-A82648BEDD52@lafn.org>
In-Reply-To: <981CAA9F-1E67-4E56-A119-BA6D1D29F383@lafn.org>
References: <175D3755-BB9B-4EAD-BDAD-06E9670E06AB@lafn.org>
 <186472F9-A97B-4863-81BC-67BE788D5E9A@lafn.org>
 <791C8200-023A-4ACB-9B6F-F5A8B0E170F4@lafn.org>
 <5bfb4fb619954c3dfbd3499aafa98917.authenticated@ultimatedns.net>
 <4F983E6A-0A7D-403C-AFAA-9CCCCB05716F@lafn.org>
 <0f3f01cf5439$13cf8570$3b6e9050$@FreeBSD.org>
 <981CAA9F-1E67-4E56-A119-BA6D1D29F383@lafn.org>
List-Id: Production branch of FreeBSD source code

On 9 April 2014, at 16:53, Doug Hardie wrote:

>
> On 9 April 2014, at 14:17, dteske@FreeBSD.org wrote:
>
>>
>>> -----Original Message-----
>>> From: Chris H [mailto:bsd-lists@bsdforge.com]
>>> Sent: Wednesday, April 9, 2014 2:03 PM
>>> To: Doug Hardie
>>> Cc: freebsd-stable@freebsd.org List
>>> Subject: Re: 9.2 Boot Problem
>>>
>>>>
>>>> On 9 April 2014, at 13:49, "Chris H" wrote:
>>>>
>>>>>>
>>>>>> On 9 April 2014, at 11:29, "Chris H" wrote:
>>>>>>
>>>>>>>>
>>>>>>>> On 4 April 2014, at 21:08, Doug Hardie wrote:
>>>>>>>>
>>>>>>>>> I put this out on Questions, but got no responses. Hopefully
>>>>>>>>> someone here has some ideas.
>>>>>>>>>
>>>>>>>>> FreeBSD 9.2. All of my systems are hanging during boot right
>>>>>>>>> after the screen that has the picture. Its as if someone hit a
>>>>>>>>> space on the keyboard. However, these systems have no keyboard.
>>>>>>>>> If I plug one in, or use the serial console, and enter a return,
>>>>>>>>> the boot continues properly.
>>>>>>>>>
>>>>>>>>> The boot menu is displayed along with Beastie. However, the line
>>>>>>>>> that says Autoboot in n seconds… never appears. It just stops
>>>>>>>>> there. These are all new installs from CD systems.
>>>>>>>>> I just used freebsd-update to take a toy server from 9.1 to 9.2
>>>>>>>>> and it doesn't exhibit this behavior. It boots properly. I have
>>>>>>>>> updated one of the production servers with the latest 9.2 changes
>>>>>>>>> and it still has the issue. I first thought that some config
>>>>>>>>> file did not get updated properly on the CD. I have dug around
>>>>>>>>> through the 4th files and don't see anything obvious that would
>>>>>>>>> cause this. I have now verified that all the 4th files in boot
>>>>>>>>> are identical (except for the version number. They are slightly
>>>>>>>>> different). I don't believe this is a BIOS setting issue as
>>>>>>>>> FreeBSD 7.2 didn't exhibit this behavior.
>>>>>>>>> All 4 systems are on totally different motherboards.
>>>>>>>>>
>>>>>>>>> I tried setting loader_logo="none" in /boot/config.rc and that
>>>>>>>>> eliminated the menu and Beastie. I think the system completed
>>>>>>>>> booting, but the serial console was then dead. It did not
>>>>>>>>> respond or output anything. I had to remove that and
>>>>>>>>> reboot to get the console back again.
>>>>>>>>>
>>>>>>>>> I need to get this fixed as these are production servers that are
>>>>>>>>> essentially unmanned so its difficult to get them back up again.
>>>>>>>>
>>>>>>>>
>>>>>>>> No response here either. Surely someone must know the loader. I
>>>>>>>> have been digging through the code, and can't find any
>>>>>>>> differences between the systems that work and those that don't.
>>>>>>>> Is there any way to debug this? Is there a way to find out where
>>>>>>>> the loader is sitting waiting on input from the terminal. That
>>>>>>>> might give a clue as to why it didn't autoboot.
>>>>>>>>
>>>>>>> OK. This is the first I've seen of your post. I'm not going to profess
>>>>>>> being an expert. But I might suggest adding the following to
>>>>>>> loader.conf(5)
>>>>>>>
>>>>>>> verbose_loading="YES"
>>>>>>> boot_verbose="YES"
>>>>>>>
>>>>>>> This raises the "noise level". Maybe that will help to provide you with
>>>>>>> a bit more information, as to what, or if, your booting. DO have a look
>>>>>>> through /boot/defaults/loader.conf for more hints, as to what, and how
>>>>>>> you can control the boot process. As well as /etc/defaults/rc.conf.
>>>>>>> In fact, you can pre-decide what, and how, to boot. Even passing by the
>>>>>>> boot menu entirely.
>>>>>>
>>>>>> Thanks Chris. I did that and here is what I get:
>>>>>>
>>>>>> Rebooting...
>>>>>> cpu_reset: Stopping other CPUs
>>>>>> /boot.config: -Dh
>>>>>> Consoles: internal video/keyboard serial port
>>>>>> BIOS drive A: is disk0
>>>>>> BIOS drive C: is disk1
>>>>>> BIOS 640kB/2087360kB available memory
>>>>>>
>>>>>> FreeBSD/x86 bootstrap loader, Revision 1.1
>>>>>> (doug@zool.lafn.org, Tue Apr 8 20:30:20 PDT 2014)
>>>>>> Loading /boot/defaults/loader.conf
>>>>>> Warning: unable to open file /boot/loader.conf.local
>>>>>> /boot/kernel/kernel text=0xdb3171 data=0xf3c04+0xbb770 syms=[0x4+0xeda80+0x4+0x1b8ebf]
>>>>>> zpool_cache...failed!
>>>>>> \
>>>>>> H[Esc]ape to loader prompt_ _____ _____
>>>>>> | ____| | _ \ / ____| __ \
>>>>>> | |___ _ __ ___ ___ | |_) | (___ | | | |
>>>>>> | ___| '__/ _ \/ _ \| _ < \___ \| | | |
>>>>>> | | | | | __/ __/| |_) |____) | |__| |
>>>>>> | | | | | | || | | |
>>>>>> |_| |_| \___|\___||____/|_____/|_____/ ``` `
>>>>>> s` `.....---.......--.``` -/
>>>>>> + Welcome to FreeBSD + +o .--` /y:` +.
>>>>>> | | yo`:. :o `+-
>>>>>> | 1. Boot Multi User [Enter] | y/ 3;46H /
>>>>>> | 2.-- / |
>>>>>> | |
>>>>>> | 4. Reboot | `: :`
>>>>>> | | `: :`
>>>>>> | Options: / /
>>>>>> | 5. Configure Boot [O]ptions... .- -.
>>>>>> | -- -.
>>>>>> | `:` `:`
>>>>>> | .-- `--.
>>>>>> | .---.....----.
>>>>>> +-----------------------------------------+
>>>>>>
>>>>>> FreeBSD `Nakatomi Socrates' 9.2
>>>>>>
>>>>>>
>>>>>> Now it waits for a return. I have tried changing the logo, setting
>>>>>> the autoboot timeout and a couple others. The only thing that did
>>>>>> anything different was setting the logo to an invalid value.
>>>>>> Basically the console was dead after that, but the system did
>>>>>> boot. I never see the Auto Boot in n seconds message. Its also
>>>>>> interesting that the list of options above appears incomplete.
>>>>>> On the working system, items 1 through 5 are all present.
>>>>>> I have now checked all the cksum's for all the files in /boot and
>>>>>> they are all the same.
>>>>>>
>>>>> Hmmm. Looks like you're going to make me do all your research, for you. ;)
>>>>> You /did/ read the contents of /boot/defaults/loader.conf. Yes? I'm guessing
>>>>> that you've also already read loader.4th(8), and the other related info.
>>>>> Now this is pure supposition; as it appears that you're looking for a serial
>>>>> console. I'd /speculate/ that you want to turn all that NASTY ANSI stuff OFF
>>>>> That's why your not seeing the complete menu -- hear that Devin!
>>>>> I'm going to post just this much for now, just to get you started. I know
>>>>> what else you need/are looking for. But need to find the /correct/ syntax --
>>>>> paraphrasing, just won't get it. :)\
>>>>
>>>> Setting loader_color="NO" (from man page) does give back the full
>>>> menu. Still waits for return after the version name. I haven't
>>>> found in the forth where it is reading the keyboard. Yes, I have
>>>> to use a serial console. These machines are about 100 miles away.
>>>> Something is stopping the autoboot from even starting.
>>>
>>> See my reply to this. I think I've given you the hints you need -- fingers
>>> crossed. :)
>>>
>>
>> He's using console=comconsole (serial boot).
>> When that is the case, loader_color is automatically set to NO.
>> There's no reason to set both loader_color=NO and
>> console=comconsole. The code that does this is here:
>>
>> http://svnweb.freebsd.org/base/release/9.2.0/sys/boot/forth/color.4th?revision=255898&view=markup
>> Line 48 within the loader_color? function:
>> boot_serial? if FALSE else TRUE then
>>
>> As for answering the quandary of where the keyboard is polled
>> during the timeout countdown, that's the getkey function in here:
>>
>> http://svnweb.freebsd.org/base/release/9.2.0/sys/boot/forth/menu.4th?revision=255898&view=markup
>> --
>
>
>
> I commented out the 3 cursor positions in menu-timeout-update. It
> does not appear that word is being used. The Autoboot message never
> appeared. Obviously getkey is being used as it does respond properly to
> a return. I am beginning to suspect that menu_timeout_enabled is zero.
> I believe adding a line after getkey's begin with
>
> s"menu_timeout_enabled = " type menu_timeout_enabled @ . 10 spaces
>
> will tell me.

There is a missing space after the first " above. However, that does
confirm my suspicion that menu_timeout_enabled is set to 0. It is only
displayed once. On a working system the value is 1 and that message is
output numerous times until the 10 seconds expire and then the boot
begins. Now to figure out how that value is getting set incorrectly.
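
For reference, a minimal sketch of the corrected debug line, with the space after s" restored so that s" is parsed as its own word. The variable declaration and the word name show-timeout-flag are stand-ins added here so the fragment can be tried on its own in a Forth interpreter; in menu.4th the variable already exists, and only the body of the definition would go after getkey's begin.

```forth
\ Stand-in for illustration only: menu.4th already defines this variable.
variable menu_timeout_enabled
0 menu_timeout_enabled !

\ Hypothetical helper name; prints the flag, then pads with 10 spaces
\ so repeated output on the same console line stays readable.
: show-timeout-flag ( -- )
  s" menu_timeout_enabled = " type
  menu_timeout_enabled @ .
  10 spaces ;

show-timeout-flag
```

If the timeout is enabled, the value printed should be 1 and the line should repeat on each pass through the getkey loop until the countdown ends.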