Date:      Sat, 28 Mar 2015 22:22:26 +1000
From:      Da Rock <freebsd-fs@herveybayaustralia.com.au>
To:        Kirk McKusick <mckusick@mckusick.com>, Benjamin Kaduk <kaduk@MIT.EDU>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: Delete a directory, crash the system
Message-ID:  <55169D02.8090107@herveybayaustralia.com.au>
In-Reply-To: <201503251712.t2PHC1R8090290@chez.mckusick.com>
References:  <201503251712.t2PHC1R8090290@chez.mckusick.com>

On 26/03/2015 03:12, Kirk McKusick wrote:
>> Date: Wed, 25 Mar 2015 00:25:19 -0400 (EDT)
>> From: Benjamin Kaduk <kaduk@MIT.EDU>
>> To: Da Rock <freebsd-fs@herveybayaustralia.com.au>
>> Subject: Re: Delete a directory, crash the system
>> Cc: freebsd-fs@freebsd.org, mckusick@freebsd.org
>>
>> On Tue, 24 Mar 2015, Da Rock wrote:
>>
>>> On 03/25/15 00:16, Benjamin Kaduk wrote:
>>> Not precisely, but the message is just a flash and there is no copying of it.
>>> Anyway, inode 4 is the .sujournal file as expected; this means there is an
>>> issue with the softupdates. Could this be narrowing it down (the OP to this
>>> was also in this age of enlightenment, SU came in with 8.x didn't it?)?
>> Ah, SU+J could be quite relevant.  Soft-update journalling was enabled by
>> default for a period of time, but I believe it was disabled because there
>> were some scenarios where it was destabilizing.  CC-ing Kirk to improve on
>> my lousy memory.
> As far as I know SU+J is still on by default.
>
>> Do you remember what version was used to install the system in question
>> (i.e., create the filesystem in question)?  Please show the output of
>> 'tunefs -p <filesystem>'
>>
>>> So I did some fiddling with fsck, fsdb, find and stat; and got nowhere. I ran
>>> fsck again and it gave me not much again. It did hint at some files in the
>>> ports tree, so I cleaned up the ports tree to fresh install point, ran fsck
>>> again and rebooted. So far so good, but I'm keeping my fingers crossed still.
>> It is probably important to note that 'fsck -F' and saying 'no' to "USE
>> JOURNAL?" is the most relevant fsck invocation.
>>
>>> This doesn't help the panics - they're still a pita when they happen. It does
>>> help me resolve the issue this time though. But initiating this error in
>>> testing is damn near impossible. What can we document here as a way to gather
>>> data to determine how to resolve this issue? Given my luck with this, it's
>>> bound to happen again at some point :)
>> I think actual diagnosis is beyond my expertise/time commitment at the
>> moment.  I suspect that using tunefs to disable soft-update journalling
>> will be a workaround, if that is what you are really interested in.
>>
>> I'll let Kirk decide if he wants to debug more, but the answer may well be
>> "no" if you're not running the latest ufs from -current.
>>
>> -Ben
> The suggestion to disable journalling is a good one. Journalling fixes
> only consistency errors that it knows about and cannot handle media errors.
> The sorts of panics you are getting are usually caused by media errors.
> So disabling journalling and checking all metadata after crashes (which is
> what fsck does) should minimize your problems.
So my only options for journalling are gjournal (slow) or ZFS (memory hog)
to maintain consistency; is that it? Incidentally, why keep SU+J on by
default, then? Wouldn't this still be considered a bug?


