Date:      Tue, 9 Jun 2020 09:54:39 +0200
From:      Guido Falsi <mad@madpilot.net>
To:        Andriy Gapon <avg@FreeBSD.org>, freebsd-fs@freebsd.org
Subject:   Re: ZFS panic on boot (head)
Message-ID:  <6547bdf1-2332-ba2f-b7f6-51dbeed92d63@madpilot.net>
In-Reply-To: <5c6933c3-7996-694e-91e2-3dda125926e7@FreeBSD.org>
References:  <8a092b2f-ae13-ef3a-0086-9d4aaaf8466b@madpilot.net> <5c6933c3-7996-694e-91e2-3dda125926e7@FreeBSD.org>

On 09/06/20 07:53, Andriy Gapon wrote:
> On 08/06/2020 21:20, Guido Falsi via freebsd-fs wrote:
>> Hi,
>>
>> On a laptop running head r361728, after a normal reboot I'm
>> experiencing a hard panic on every boot.
>>
>> Booting a kernel with kdb I get:
>>
>> panic: Solaris(panic): zfs: allocating allocated
>> segment(offset=433818959872 size=4096) of (offset=433818021200 size=32768)
>>
>> I can provide a screenshot of the stack trace if required.
>>
>> Is there a way to recover? Or should I simply reinstall the machine?
>>
>> What actually happened? Maybe a disk data error? (it's a laptop with
>> only one disk)
>>
>> Or something misbehaved in ZFS?
>>
>> Thanks in advance for any information/help.
> 
> What you see is inconsistent information in a space map.
> There are two elements that describe overlapping ranges of disk space.
> That is something that must never happen.
> I do not know how that can be repaired; the safest solution is to copy / restore
> the data to a fresh pool.

I see. Being a laptop with no important data on it, I think I'll just
reinstall/reconfigure the system.
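
For illustration, the condition the panic complains about (two segments claiming the same disk space) comes down to an interval overlap test. This is only a minimal Python sketch, not ZFS code; the function name and the sample offsets are hypothetical and are not taken from my pool.

```python
def segments_overlap(off_a, size_a, off_b, size_b):
    """Return True if [off_a, off_a + size_a) and [off_b, off_b + size_b)
    share at least one byte (half-open intervals)."""
    return off_a < off_b + size_b and off_b < off_a + size_a


# Hypothetical values: a 32 KiB segment that is already allocated,
# and a 4 KiB allocation request that falls inside it.
existing = (1_000_000, 32_768)
request = (1_008_192, 4_096)

# An allocator that finds a conflict like this has inconsistent bookkeeping.
conflict = segments_overlap(*request, *existing)
```

Two segments that merely touch (one starts exactly where the other ends) do not count as overlapping in this sketch.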

> 
> As to why, this could be a result of a bug in ZFS.  If you google for the panic
> messages (sans concrete numbers) you can find some bugs that were fixed in the
> past.  Also, and it once happened to me, it could be a hardware problem.  I used
> non-ECC memory, (at least) one bit got flipped and it was a bit describing a
> space map entry.  After the corruption the good entry turned into a conflicting
> entry.

Again, since the hardware does not have ECC memory (and I suspect no
support for it), this could be the explanation for what happened. I bet
I've been very unlucky!
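
To make the bit-flip scenario concrete, here is a minimal Python sketch of how a single flipped bit in a stored offset can turn a valid entry into a conflicting one. The offsets, sizes, bit position, and helper names are made up for the example, and the real space map encoding is more involved than a plain (offset, size) pair.

```python
def flip_bit(value, bit):
    """Return value with the given bit position inverted."""
    return value ^ (1 << bit)


def segments_overlap(off_a, size_a, off_b, size_b):
    """True if the two half-open byte ranges share at least one byte."""
    return off_a < off_b + size_b and off_b < off_a + size_a


# Two hypothetical adjacent 32 KiB segments: no overlap.
first = (0x10000, 0x8000)
second = (0x18000, 0x8000)
before = segments_overlap(*first, *second)

# Flip bit 15 of the second segment's offset: 0x18000 becomes 0x10000,
# so the second entry now describes the same range as the first.
corrupted = (flip_bit(second[0], 15), second[1])
after = segments_overlap(*first, *corrupted)
```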

> 
> A disk error can probably be ruled out.  First, ZFS stores 3 copies of all
> important metadata including space maps, so all three would have to be
> corrupted.  Second, even if all copies were corrupted then you would get a
> checksum error (this is ZFS).
> 

At least I can have faith in the disk hardware :)
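
The argument about redundant copies plus checksums can be sketched as follows. This is a rough Python illustration under my own assumptions, not how ZFS implements it: SHA-256 stands in for whatever checksum the pool uses, and `read_with_redundancy` is a hypothetical helper.

```python
import hashlib


def read_with_redundancy(copies, expected_digest):
    """Return the first copy whose SHA-256 digest matches the expected one.

    Raises IOError if every copy fails verification, which corresponds
    to a checksum error being reported instead of bad data being used.
    """
    for data in copies:
        if hashlib.sha256(data).digest() == expected_digest:
            return data
    raise IOError("checksum error: no valid copy found")


good = b"space map block"
digest = hashlib.sha256(good).digest()

# Hypothetical situation: the first of three copies was damaged on disk.
copies = [b"space map blocK", good, good]
recovered = read_with_redundancy(copies, digest)
```

With one or two damaged copies the read still succeeds, and with all three damaged it fails loudly, so a silently accepted bad space map is unlikely to come from the disk.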


Thanks for taking the time to give me this information!

-- 
Guido Falsi <mad@madpilot.net>


