Date:      Mon, 28 Apr 2014 22:25:53 -0700
From:      Doug Hardie <bc979@lafn.org>
To:        dteske@FreeBSD.org
Cc:        freebsd-stable@freebsd.org, 'Chris H' <bsd-lists@bsdforge.com>
Subject:   Re: 9.2 Boot Problem
Message-ID:  <A5176856-EF74-40CD-8F77-C05260D9F722@lafn.org>
In-Reply-To: <117a01cf56eb$6f989e50$4ec9daf0$@FreeBSD.org>
References:  <175D3755-BB9B-4EAD-BDAD-06E9670E06AB@lafn.org> <186472F9-A97B-4863-81BC-67BE788D5E9A@lafn.org> <a865b8f2ccb9ad4918544bad3d49554d.authenticated@ultimatedns.net> <791C8200-023A-4ACB-9B6F-F5A8B0E170F4@lafn.org> <5bfb4fb619954c3dfbd3499aafa98917.authenticated@ultimatedns.net> <4F983E6A-0A7D-403C-AFAA-9CCCCB05716F@lafn.org> <feeca307c8da9ca3b385cf47d75904a7.authenticated@ultimatedns.net> <0f3f01cf5439$13cf8570$3b6e9050$@FreeBSD.org> <981CAA9F-1E67-4E56-A119-BA6D1D29F383@lafn.org> <89290759-E5C2-4991-B644-A82648BEDD52@lafn.org> <1D50A38D-8919-4034-A4E5-EEF8E78E638D@lafn.org> <117a01cf56eb$6f989e50$4ec9daf0$@FreeBSD.org>


On 13 April 2014, at 00:38, dteske@FreeBSD.org wrote:

>
>
>> -----Original Message-----
>> From: Doug Hardie [mailto:bc979@lafn.org]
>> Sent: Saturday, April 12, 2014 7:08 PM
>> To: freebsd-stable@freebsd.org
>> Cc: dteske@FreeBSD.org Teske; Chris H
>> Subject: Re: 9.2 Boot Problem
>>
>>
>> On 10 April 2014, at 14:23, Doug Hardie <bc979@lafn.org> wrote:
>>
>>>
>>> On 9 April 2014, at 16:53, Doug Hardie <bc979@lafn.org> wrote:
>>>
>>>>
>>>> On 9 April 2014, at 14:17, dteske@FreeBSD.org wrote:
>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Chris H [mailto:bsd-lists@bsdforge.com]
>>>>>> Sent: Wednesday, April 9, 2014 2:03 PM
>>>>>> To: Doug Hardie
>>>>>> Cc: freebsd-stable@freebsd.org List
>>>>>> Subject: Re: 9.2 Boot Problem
>>>>>>
>>>>>>>
>>>>>>> On 9 April 2014, at 13:49, "Chris H" <bsd-lists@bsdforge.com> wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9 April 2014, at 11:29, "Chris H" <bsd-lists@bsdforge.com> wrote:
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 4 April 2014, at 21:08, Doug Hardie <bc979@lafn.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I put this out on Questions, but got no responses. Hopefully
>>>>>>>>>>>> someone here has some ideas.
>>>>>>>>>>>>
>>>>>>>>>>>> FreeBSD 9.2.  All of my systems are hanging during boot right
>>>>>>>>>>>> after the screen that has the picture.  It's as if someone hit
>>>>>>>>>>>> a space on the keyboard.  However, these systems have no keyboard.
>>>>>>>>>>>> If I plug one in, or use the serial console, and enter a
>>>>>>>>>>>> return, the boot continues properly.
>>>>>>>>>>>>
>>>>>>>>>>>> The boot menu is displayed along with Beastie.  However, the
>>>>>>>>>>>> line that says Autoboot in n seconds never appears.  It just
>>>>>>>>>>>> stops there.  These are all new installs from CD systems.
>>>>>>>>>>>> I just used freebsd-update to take a toy server from 9.1 to
>>>>>>>>>>>> 9.2 and it doesn't exhibit this behavior.  It boots properly.
>>>>>>>>>>>> I have updated one of the production servers with the latest
>>>>>>>>>>>> 9.2 changes and it still has the issue.  I first thought that
>>>>>>>>>>>> some config file did not get updated properly on the CD.  I
>>>>>>>>>>>> have dug around through the 4th files and don't see anything
>>>>>>>>>>>> obvious that would cause this.  I have now verified that all
>>>>>>>>>>>> the 4th files in boot are identical (except for the version
>>>>>>>>>>>> number.  They are slightly different).  I don't believe this
>>>>>>>>>>>> is a BIOS setting issue as FreeBSD 7.2 didn't exhibit this
>>>>>>>>>>>> behavior.  All 4 systems are on totally different motherboards.
>>>>>>>>>>>>
>>>>>>>>>>>> I tried setting loader_logo="none" in /boot/config.rc and
>>>>>>>>>>>> that eliminated the menu and Beastie.  I think the system
>>>>>>>>>>>> completed booting, but the serial console was then dead.  It
>>>>>>>>>>>> did not respond or output anything.  I had to remove that and
>>>>>>>>>>>> reboot to get the console back again.
>>>>>>>>>>>>
>>>>>>>>>>>> I need to get this fixed as these are production servers that
>>>>>>>>>>>> are essentially unmanned so it's difficult to get them back up again.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> No response here either.  Surely someone must know the loader.
>>>>>>>>>>> I have been digging through the code, and can't find any
>>>>>>>>>>> differences between the systems that work and those that don't.
>>>>>>>>>>> Is there any way to debug this?  Is there a way to find out
>>>>>>>>>>> where the loader is sitting waiting on input from the terminal?
>>>>>>>>>>> That might give a clue as to why it didn't autoboot.
>>>>>>>>>>>
>>>>>>>>>> OK. This is the first I've seen of your post. I'm not going to
>>>>>>>>>> profess being an expert. But I might suggest adding the
>>>>>>>>>> following to loader.conf(5)
>>>>>>>>>>
>>>>>>>>>> verbose_loading="YES"
>>>>>>>>>> boot_verbose="YES"
>>>>>>>>>>
>>>>>>>>>> This raises the "noise level". Maybe that will help to provide
>>>>>>>>>> you with a bit more information, as to what, or if, you're
>>>>>>>>>> booting. DO have a look through /boot/defaults/loader.conf for
>>>>>>>>>> more hints, as to what, and how you can control the boot
>>>>>>>>>> process. As well as /etc/defaults/rc.conf.
>>>>>>>>>> In fact, you can pre-decide what, and how, to boot. Even
>>>>>>>>>> passing by the boot menu entirely.
>>>>>>>>>
>>>>>>>>> Thanks Chris.  I did that and here is what I get:
>>>>>>>>>
>>>>>>>>> Rebooting...
>>>>>>>>> cpu_reset: Stopping other CPUs
>>>>>>>>> /boot.config: -Dh
>>>>>>>>> Consoles: internal video/keyboard  serial port
>>>>>>>>> BIOS drive A: is disk0
>>>>>>>>> BIOS drive C: is disk1
>>>>>>>>> BIOS 640kB/2087360kB available memory
>>>>>>>>>
>>>>>>>>> FreeBSD/x86 bootstrap loader, Revision 1.1
>>>>>>>>> (doug@zool.lafn.org, Tue Apr  8 20:30:20 PDT 2014)
>>>>>>>>> Loading /boot/defaults/loader.conf
>>>>>>>>> Warning: unable to open file /boot/loader.conf.local
>>>>>>>>> /boot/kernel/kernel text=0xdb3171 data=0xf3c04+0xbb770 syms=[0x4+0xeda80+0x4+0x1b8ebf]
>>>>>>>>> zpool_cache...failed!
>>>>>>>>> \
>>>>>>>>> H[Esc]ape to loader prompt_   _____ _____
>>>>>>>>> |  ____|             |  _ \ / ____|  __ \
>>>>>>>>> | |___ _ __ ___  ___ | |_) | (___ | |  | |
>>>>>>>>> |  ___| '__/ _ \/ _ \|  _ < \___ \| |  | |
>>>>>>>>> | |   | | |  __/  __/| |_) |____) | |__| |
>>>>>>>>> | |   | | |    |    ||     |      |      |
>>>>>>>>> |_|   |_|  \___|\___||____/|_____/|_____/    ```                        `
>>>>>>>>>                                         s` `.....---.......--.```   -/
>>>>>>>>> +            Welcome to FreeBSD           + +o   .--`         /y:`      +.
>>>>>>>>> |                                         |  yo`:.            :o      `+-
>>>>>>>>> |  1. Boot Multi User [Enter]             |   y/        3;46H /
>>>>>>>>> |  2.--  /                                |
>>>>>>>>> |                                         |
>>>>>>>>> |  4. Reboot                              | `:                          :`
>>>>>>>>> |                                         | `:                          :`
>>>>>>>>> |  Options:                                  /                          /
>>>>>>>>> |  5. Configure Boot [O]ptions...            .-                        -.
>>>>>>>>> |                                             --                      -.
>>>>>>>>> |                                              `:`                  `:`
>>>>>>>>> |                                                .--             `--.
>>>>>>>>> |                                                   .---.....----.
>>>>>>>>> +-----------------------------------------+
>>>>>>>>>
>>>>>>>>>                                            FreeBSD `Nakatomi Socrates' 9.2
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Now it waits for a return.  I have tried changing the logo,
>>>>>>>>> setting the autoboot timeout and a couple others.  The only
>>>>>>>>> thing that did anything different was setting the logo to an
>>>>>>>>> invalid value.  Basically the console was dead after that, but
>>>>>>>>> the system did boot.  I never see the Auto Boot in n seconds
>>>>>>>>> message.  It's also interesting that the list of options above
>>>>>>>>> appears incomplete.  On the working system, items 1 through 5
>>>>>>>>> are all present.  I have now checked all the cksums for all
>>>>>>>>> the files in /boot and they are all the same.
>>>>>>>>>
>>>>>>>> Hmmm. Looks like you're going to make me do all your research, for you. ;)
>>>>>>>> You /did/ read the contents of /boot/defaults/loader.conf. Yes?
>>>>>>>> I'm guessing that you've also already read loader.4th(8), and
>>>>>>>> the other related info.
>>>>>>>> Now this is pure supposition; as it appears that you're looking
>>>>>>>> for a serial console. I'd /speculate/ that you want to turn all
>>>>>>>> that NASTY ANSI stuff OFF.
>>>>>>>> That's why you're not seeing the complete menu -- hear that Devin!
>>>>>>>> I'm going to post just this much for now, just to get you
>>>>>>>> started. I know what else you need/are looking for. But need to
>>>>>>>> find the /correct/ syntax -- paraphrasing, just won't get it. :)\
>>>>>>>
>>>>>>> Setting loader_color="NO" (from man page) does give back the
>>>>>>> full menu.  Still waits for return after the version name.  I
>>>>>>> haven't found in the Forth where it is reading the keyboard.
>>>>>>> Yes, I have to use a serial console.  These machines are about
>>>>>>> 100 miles away.
>>>>>>> Something is stopping the autoboot from even starting.
>>>>>>
>>>>>> See my reply to this. I think I've given you the hints you need --
>>>>>> fingers crossed. :)
>>>>>>
>>>>>
>>>>> He's using console=comconsole (serial boot).
>>>>> When that is the case, loader_color is automatically set to NO.
>>>>> There's no reason to set both loader_color=NO and console=comconsole.
>>>>> The code that does this is here:
>>>>>
>>>>> http://svnweb.freebsd.org/base/release/9.2.0/sys/boot/forth/color.4th?revision=255898&view=markup
>>>>> Line 48 within the loader_color? function:
>>>>> 	boot_serial? if FALSE else TRUE then
>>>>>
>>>>> As for answering the quandary of where the keyboard is polled during
>>>>> the timeout countdown, that's the getkey function in here:
>>>>>
>>>>> http://svnweb.freebsd.org/base/release/9.2.0/sys/boot/forth/menu.4th?revision=255898&view=markup
>>>>> --
>>>>
>>>>
>>>>
>>>> I commented out the 3 cursor positions in menu-timeout-update.  It
>>>> does not appear that word is being used.  The Autoboot message never
>>>> appeared.  Obviously getkey is being used as it does respond properly
>>>> to a return.  I am beginning to suspect that menu_timeout_enabled is
>>>> zero.  I believe adding a line after getkey's begin with
>>>>
>>>>      s"menu_timeout_enabled = " type menu_timeout_enabled @ . 10 spaces
>>>>
>>>> will tell me.
>>>
>>>
>>>
>>> There is a missing space after the first " above.  However, that does
>>> confirm my suspicion that menu_timeout_enabled is set to 0.  It is only
>>> displayed once.  On a working system the value is 1 and that message is
>>> output numerous times until the 10 seconds expires and then the boot begins.
>>>
>>> Now to figure out how that value is getting set incorrectly.
>>>
>>
>> After much digging, I now know what is going on, but not why.  When getkey
>> is called the first time, menu_timeout_enabled is set to one.  However, it
>> is set to zero on every check after that.  In getkey after the comment "Was
>> a key pressed" is a check of key to see if a key was pressed.  It is
>> returning a decimal 7 (BEL).  That then clears menu_timeout_enabled and it
>> then sits there waiting for a valid key input.  There is no keyboard
>> plugged into the system.  I have no idea how that BEL is being generated
>> or even how to prevent it.  Could it be possible that it comes from the
>> serial console?  I tend to doubt that's the case since the system hangs
>> during boot when the serial console is not connected.  I suppose that I
>> could put in a test for a key value that is not a control character, but
>> that would only work until the next system update.  I'd have to remember
>> to put it back in each time.  That's not likely to happen.  My memory is
>> not that good.  What's interesting is that I have 4 systems (i386) doing
>> this and 1 system (i386) and 2 systems (amd64) not doing it.  The only
>> common thread is the 4 systems doing it are about 100 miles from me and
>> the working ones are here.
>>
>
> Based on that feedback, I've developed the attached patch.txt.
> Can you give it a whirl and let me know how it works?

The patch works properly.  However, in the process of testing it, I
discovered that the cause of the "bell" is actually the terminal
emulator echoing that character back from something earlier in the
reboot process.  Why that particular character is echoed is not
understood.  Hence, the real problem lies in a hardware "failure"
outside the motherboard.  So I don't know if you want to put that
patch into the system or not.  It seems like a good idea to ignore
anything that's a control character, or to clear out the input at the
start of the process anyway.
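
For anyone following along, here is a minimal sketch of the "ignore
control characters" idea.  It is written in Python purely for
illustration; it is not Devin's patch and it is not the loader's Forth.
The names is_deliberate_key and countdown are made up, and the set of
keys treated as deliberate (Enter, Escape, Backspace, and printable
ASCII) is only an assumption about what the menu needs.

```python
# Hypothetical sketch: decide whether a byte read during the autoboot
# countdown should count as a deliberate keypress.

ENTER_CR, ENTER_LF, ESCAPE, BACKSPACE = 0x0D, 0x0A, 0x1B, 0x08


def is_deliberate_key(code):
    """True for Enter, Escape, Backspace, or printable ASCII (0x20-0x7E)."""
    if code in (ENTER_CR, ENTER_LF, ESCAPE, BACKSPACE):
        return True
    return 0x20 <= code <= 0x7E


def countdown(polls):
    """Simulate the countdown loop.

    polls holds one entry per poll of the console: the byte that was
    read, or None if nothing was waiting.  Returns "menu" as soon as a
    deliberate key arrives, otherwise "autoboot" once the polls run out.
    """
    for code in polls:
        if code is not None and is_deliberate_key(code):
            return "menu"      # a person pressed something; stop the timer
        # stray control bytes such as BEL (7) are simply dropped here
    return "autoboot"          # timer expired with no deliberate input
```

Under this rule a lone BEL (decimal 7) leaves the countdown running,
while a return still stops it.  Clearing out pending input before the
countdown starts would be a separate, complementary measure.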

In my case, I need the patch and will keep it in my systems.

Thanks for all the help.

-- Doug



