From owner-freebsd-stable@FreeBSD.ORG Sun Apr 13 02:08:01 2014
From: Doug Hardie
Date: Sat, 12 Apr 2014 19:07:43 -0700
Subject: Re: 9.2 Boot Problem
To: freebsd-stable@freebsd.org
Cc: "dteske@FreeBSD.org Teske", Chris H
Message-Id: <1D50A38D-8919-4034-A4E5-EEF8E78E638D@lafn.org>
In-Reply-To: <89290759-E5C2-4991-B644-A82648BEDD52@lafn.org>
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Sun, 13 Apr 2014
02:08:01 -0000

On 10 April 2014, at 14:23, Doug Hardie wrote:

>
> On 9 April 2014, at 16:53, Doug Hardie wrote:
>
>>
>> On 9 April 2014, at 14:17, dteske@FreeBSD.org wrote:
>>
>>>
>>>> -----Original Message-----
>>>> From: Chris H [mailto:bsd-lists@bsdforge.com]
>>>> Sent: Wednesday, April 9, 2014 2:03 PM
>>>> To: Doug Hardie
>>>> Cc: freebsd-stable@freebsd.org List
>>>> Subject: Re: 9.2 Boot Problem
>>>>
>>>>>
>>>>> On 9 April 2014, at 13:49, "Chris H" wrote:
>>>>>
>>>>>>>
>>>>>>> On 9 April 2014, at 11:29, "Chris H" wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4 April 2014, at 21:08, Doug Hardie wrote:
>>>>>>>>>
>>>>>>>>>> I put this out on Questions, but got no responses. Hopefully
>>>>>>>>>> someone here has some ideas.
>>>>>>>>>>
>>>>>>>>>> FreeBSD 9.2. All of my systems are hanging during boot right
>>>>>>>>>> after the screen that has the picture. It's as if someone hit a
>>>>>>>>>> space on the keyboard. However, these systems have no keyboard.
>>>>>>>>>> If I plug one in, or use the serial console, and enter a return,
>>>>>>>>>> the boot continues properly.
>>>>>>>>>>
>>>>>>>>>> The boot menu is displayed along with Beastie. However, the line
>>>>>>>>>> that says Autoboot in n seconds... never appears. It just stops
>>>>>>>>>> there. These are all new installs from CD.
>>>>>>>>>> I just used freebsd-update to take a toy server from 9.1 to 9.2
>>>>>>>>>> and it doesn't exhibit this behavior. It boots properly. I have
>>>>>>>>>> updated one of the production servers with the latest 9.2 changes
>>>>>>>>>> and it still has the issue. I first thought that some config
>>>>>>>>>> file did not get updated properly on the CD. I have dug around
>>>>>>>>>> through the 4th files and don't see anything obvious that would
>>>>>>>>>> cause this. I have now verified that all the 4th files in boot
>>>>>>>>>> are identical (except for the version number. They are slightly
>>>>>>>>>> different).
>>>>>>>>>> I don't believe this is a BIOS setting issue as FreeBSD 7.2
>>>>>>>>>> didn't exhibit this behavior. All 4 systems are on totally
>>>>>>>>>> different motherboards.
>>>>>>>>>>
>>>>>>>>>> I tried setting loader_logo="none" in /boot/config.rc and that
>>>>>>>>>> eliminated the menu and Beastie. I think the system completed
>>>>>>>>>> booting, but the serial console was then dead. It did not
>>>>>>>>>> respond or output anything. I had to remove that and reboot
>>>>>>>>>> to get the console back again.
>>>>>>>>>>
>>>>>>>>>> I need to get this fixed as these are production servers that are
>>>>>>>>>> essentially unmanned, so it's difficult to get them back up again.
>>>>>>>>>
>>>>>>>>> No response here either. Surely someone must know the loader. I
>>>>>>>>> have been digging through the code, and can't find any differences
>>>>>>>>> between the systems that work and those that don't. Is there any
>>>>>>>>> way to debug this? Is there a way to find out where the loader is
>>>>>>>>> sitting waiting on input from the terminal? That might give a clue
>>>>>>>>> as to why it didn't autoboot.
>>>>>>>>>
>>>>>>>> OK. This is the first I've seen of your post. I'm not going to
>>>>>>>> profess being an expert. But I might suggest adding the following
>>>>>>>> to loader.conf(5)
>>>>>>>>
>>>>>>>> verbose_loading="YES"
>>>>>>>> boot_verbose="YES"
>>>>>>>>
>>>>>>>> This raises the "noise level". Maybe that will help to provide you
>>>>>>>> with a bit more information, as to what, or if, you're booting. DO
>>>>>>>> have a look through /boot/defaults/loader.conf for more hints, as
>>>>>>>> to what, and how you can control the boot process. As well as
>>>>>>>> /etc/defaults/rc.conf. In fact, you can pre-decide what, and how,
>>>>>>>> to boot. Even passing by the boot menu entirely.
>>>>>>>
>>>>>>> Thanks Chris.
>>>>>>> I did that and here is what I get:
>>>>>>>
>>>>>>> Rebooting...
>>>>>>> cpu_reset: Stopping other CPUs
>>>>>>> /boot.config: -Dh
>>>>>>> Consoles: internal video/keyboard serial port
>>>>>>> BIOS drive A: is disk0
>>>>>>> BIOS drive C: is disk1
>>>>>>> BIOS 640kB/2087360kB available memory
>>>>>>>
>>>>>>> FreeBSD/x86 bootstrap loader, Revision 1.1
>>>>>>> (doug@zool.lafn.org, Tue Apr 8 20:30:20 PDT 2014)
>>>>>>> Loading /boot/defaults/loader.conf
>>>>>>> Warning: unable to open file /boot/loader.conf.local
>>>>>>> /boot/kernel/kernel text=0xdb3171 data=0xf3c04+0xbb770 syms=[0x4+0xeda80+0x4+0x1b8ebf]
>>>>>>> zpool_cache...failed!
>>>>>>> \
>>>>>>> H[Esc]ape to loader prompt_ _____ _____
>>>>>>> | ____| | _ \ / ____| __ \
>>>>>>> | |___ _ __ ___ ___ | |_) | (___ | | | |
>>>>>>> | ___| '__/ _ \/ _ \| _ < \___ \| | | |
>>>>>>> | | | | | __/ __/| |_) |____) | |__| |
>>>>>>> | | | | | | || | | |
>>>>>>> |_| |_| \___|\___||____/|_____/|_____/ ``` `
>>>>>>> s` `.....---.......--.``` -/
>>>>>>> + Welcome to FreeBSD + +o .--` /y:` +.
>>>>>>> | | yo`:. :o `+-
>>>>>>> | 1. Boot Multi User [Enter] | y/ 3;46H /
>>>>>>> | 2.-- / |
>>>>>>> | |
>>>>>>> | 4. Reboot | `: :`
>>>>>>> | | `: :`
>>>>>>> | Options: / /
>>>>>>> | 5. Configure Boot [O]ptions... .- -.
>>>>>>> | -- -.
>>>>>>> | `:` `:`
>>>>>>> | .-- `--.
>>>>>>> | .---.....----.
>>>>>>> +-----------------------------------------+
>>>>>>>
>>>>>>> FreeBSD `Nakatomi Socrates' 9.2
>>>>>>>
>>>>>>>
>>>>>>> Now it waits for a return. I have tried changing the logo, setting
>>>>>>> the autoboot timeout and a couple of others. The only thing that did
>>>>>>> anything different was setting the logo to an invalid value.
>>>>>>> Basically the console was dead after that, but the system did boot.
>>>>>>> I never see the Auto Boot in n seconds message. It's also
>>>>>>> interesting that the list of options above appears incomplete.
>>>>>>> On the working system, items 1 through 5 are all present. I have
>>>>>>> now checked the cksums for all the files in /boot and they are all
>>>>>>> the same.
>>>>>>>
>>>>>> Hmmm. Looks like you're going to make me do all your research, for
>>>>>> you. ;)
>>>>>> You /did/ read the contents of /boot/defaults/loader.conf. Yes? I'm
>>>>>> guessing that you've also already read loader.4th(8), and the other
>>>>>> related info. Now this is pure supposition; as it appears that you're
>>>>>> looking for a serial console. I'd /speculate/ that you want to turn
>>>>>> all that NASTY ANSI stuff OFF. That's why you're not seeing the
>>>>>> complete menu -- hear that Devin!
>>>>>> I'm going to post just this much for now, just to get you started. I
>>>>>> know what else you need/are looking for. But need to find the
>>>>>> /correct/ syntax -- paraphrasing, just won't get it. :)
>>>>>
>>>>> Setting loader_color="NO" (from man page) does give back the full
>>>>> menu. Still waits for return after the version name. I haven't found
>>>>> in the forth where it is reading the keyboard. Yes, I have to use a
>>>>> serial console. These machines are about 100 miles away. Something is
>>>>> stopping the autoboot from even starting.
>>>>
>>>> See my reply to this. I think I've given you the hints you need --
>>>> fingers crossed. :)
>>>>
>>>
>>> He's using console=comconsole (serial boot).
>>> When that is the case, loader_color is automatically set to NO.
>>> There's no reason to set both loader_color=NO and console=comconsole.
>>> The code that does this is here:
>>>
>>> http://svnweb.freebsd.org/base/release/9.2.0/sys/boot/forth/color.4th?revision=255898&view=markup
>>> Line 48 within the loader_color? function:
>>>     boot_serial?
>>>     if FALSE else TRUE then
>>>
>>> As for answering the quandary of where the keyboard is polled
>>> during the timeout countdown, that's the getkey function in here:
>>>
>>> http://svnweb.freebsd.org/base/release/9.2.0/sys/boot/forth/menu.4th?revision=255898&view=markup
>>> --
>>
>> I commented out the 3 cursor positions in menu-timeout-update. It does
>> not appear that word is being used. The Autoboot message never
>> appeared. Obviously getkey is being used as it does respond properly to
>> a return. I am beginning to suspect that menu_timeout_enabled is zero.
>> I believe adding a line after getkey's begin with
>>
>> s"menu_timeout_enabled = " type menu_timeout_enabled @ . 10 spaces
>>
>> will tell me.
>
> There is a missing space after the first " above. However, that does
> confirm my suspicion that menu_timeout_enabled is set to 0. It is only
> displayed once. On a working system the value is 1 and that message is
> output numerous times until the 10 seconds expires and then the boot
> begins.
>
> Now to figure out how that value is getting set incorrectly.
>

After much digging, I now know what is going on, but not why. When
getkey is called the first time, menu_timeout_enabled is set to one.
However, it is set to zero on every check after that. In getkey, after
the comment "Was a key pressed", there is a check of key to see if a key
was pressed. It is returning a decimal 7 (BEL). That then clears
menu_timeout_enabled, and it then sits there waiting for a valid key
input. There is no keyboard plugged into the system. I have no idea
how that BEL is being generated or even how to prevent it. Could it be
possible that it comes from the serial console? I tend to doubt that's
the case since the system hangs during boot when the serial console is
not connected.
I suppose that I could put in a test for a key value that is not a
control character, but that would only work until the next system
update. I'd have to remember to put it back in each time. That's not
likely to happen. My memory is not that good. What's interesting is
that I have 4 systems (i386) doing this and 1 system (i386) and 2
systems (amd64) not doing it. The only common thread is that the 4
systems doing it are about 100 miles from me and the working ones are
here.
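
To make the behavior described above easier to reason about, here is a minimal sketch in Python (not the loader's Forth) of a menu wait loop in which any pending key cancels the countdown. It is only a model built from the description in this message: the names run_menu, is_control, polls and ignore_control are hypothetical and are not taken from menu.4th. The ignore_control switch stands in for the idea of testing for a key value that is not a control character.

```python
BEL = 7      # the stray key code reported in this message
ENTER = 13   # carriage return, which lets the boot continue


def is_control(code):
    """Return True for ASCII control codes other than Enter."""
    return (code < 32 or code == 127) and code != ENTER


def run_menu(polls, timeout_ticks=10, ignore_control=False):
    """Model a menu wait loop (hypothetical, not the real getkey).

    polls holds one entry per tick: a key code, or None if no key
    is pending. Returns "autoboot", "boot-on-key" or "waiting".
    """
    timeout_enabled = True
    remaining = timeout_ticks
    for code in polls:
        if code is not None and ignore_control and is_control(code):
            code = None  # assumed workaround: drop the stray byte
        if code is not None:
            timeout_enabled = False  # any key press cancels the countdown
            if code == ENTER:
                return "boot-on-key"
            continue  # not a valid selection: keep waiting for input
        if timeout_enabled:
            remaining -= 1
            if remaining == 0:
                return "autoboot"
    return "waiting"


if __name__ == "__main__":
    stray = [BEL] + [None] * 20
    print(run_menu(stray))                       # countdown cancelled
    print(run_menu(stray, ignore_control=True))  # stray byte ignored
```

In this model a single BEL on the first poll is enough to leave the loop waiting indefinitely, while filtering control characters lets the countdown finish. Whether the real loader behaves the same way would have to be checked against menu.4th itself.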