Date:      Fri, 11 Nov 2011 13:11:18 +0100
From:      Stefan Esser <se@freebsd.org>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: [amd64] Reproducible cold boot failure (reboot succeeds) in -CURRENT
Message-ID:  <4EBD10E6.9000302@freebsd.org>
In-Reply-To: <CAJ-FndBqwhS_Ez_2JV81LCE68edAHqWHTseBY5TzM_T%2B%2BS5xWw@mail.gmail.com>
References:  <4EBB885E.9060908@freebsd.org> <CAJ-FndBqwhS_Ez_2JV81LCE68edAHqWHTseBY5TzM_T%2B%2BS5xWw@mail.gmail.com>

On 10.11.2011 11:32, Attilio Rao wrote:
> 2011/11/10 Stefan Esser<se@freebsd.org>:
>> I can produce further debug output on demand, but I do not have a serial or
>> firewire console setup for debugging.
>>
>> Is anybody else affected by this boot problem?
>
> Can you setup a videocamera or a simple serial console?
> Did you try to boot with both -s and -v on?
>
> Attilio

I should be able to attach a serial console.

Booting with -s should make no difference (since booting fails during a 
very early initialization stage).

I tried -v, but found that I cannot reproduce the cold boot problem
unless the system has been in S5 for at least several hours (just
switching off power and waiting a few minutes did not suffice, but this
morning the first boot attempt failed again). This behavior obviously
limits the rate at which I can run tests ...


It looks as if the memory holding the loaded kernel and/or modules is
corrupted before the kernel is reloaded and started, as indicated by
this morning's boot failure:

kldload: unexpected relocation type 268435457
kldload: unexpected relocation type 67108865
Fatal trap 12: ...

The rest of the panic message and the backtrace are identical to the
trap 12 panic details in my previous message.


It really looks as if the loaded kernel image is corrupted at random
positions, leading to random panics (often trap 12, i.e. a page fault
in kernel mode) when execution reaches damaged code or data.
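
As a rough sanity check of the corruption hypothesis, the two relocation
type values reported by kldload can be decoded bit by bit. The sketch
below is only an illustration: it assumes (this is not confirmed by the
boot messages) that the intended relocation type was 1, which is
R_X86_64_64 in the amd64 ELF ABI. The names REPORTED, ASSUMED_VALID and
flipped_bits are made up for this example.

```python
# Decode the relocation type values from the kldload messages.
# ASSUMPTION: the uncorrupted type would have been 1 (R_X86_64_64);
# the boot messages themselves do not say which type was intended.

REPORTED = [268435457, 67108865]  # values copied from the boot messages
ASSUMED_VALID = 1                 # hypothetical intended relocation type


def flipped_bits(value, expected):
    """Return the bit positions in which value and expected differ."""
    diff = value ^ expected
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


for value in REPORTED:
    bits = flipped_bits(value, ASSUMED_VALID)
    print(f"{value} = {value:#010x}, differs from {ASSUMED_VALID} in bit(s) {bits}")
```

Under that assumption each value differs from a valid type in exactly
one bit (0x10000001 and 0x04000001), which would be consistent with
isolated bit errors in the loaded image rather than with a wholesale
overwrite.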

A reboot succeeded without any problem as in all prior cases ...

Any ideas?

Regards, STefan


