Date: Sat, 28 Mar 2015 22:22:26 +1000
From: Da Rock <freebsd-fs@herveybayaustralia.com.au>
To: Kirk McKusick <mckusick@mckusick.com>, Benjamin Kaduk <kaduk@MIT.EDU>
Cc: freebsd-fs@freebsd.org
Subject: Re: Delete a directory, crash the system
Message-ID: <55169D02.8090107@herveybayaustralia.com.au>
In-Reply-To: <201503251712.t2PHC1R8090290@chez.mckusick.com>
References: <201503251712.t2PHC1R8090290@chez.mckusick.com>
On 26/03/2015 03:12, Kirk McKusick wrote:
>> Date: Wed, 25 Mar 2015 00:25:19 -0400 (EDT)
>> From: Benjamin Kaduk <kaduk@MIT.EDU>
>> To: Da Rock <freebsd-fs@herveybayaustralia.com.au>
>> Subject: Re: Delete a directory, crash the system
>> Cc: freebsd-fs@freebsd.org, mckusick@freebsd.org
>>
>> On Tue, 24 Mar 2015, Da Rock wrote:
>>
>>> On 03/25/15 00:16, Benjamin Kaduk wrote:
>>> Not precisely, but the message is just a flash and there is no copying of it.
>>> Anyway, inode 4 is the .sujournal file as expected; this means there is an
>>> issue with the softupdates. Could this be narrowing it down (the OP to this
>>> was also in this age of enlightenment, SU came in with 8.x didn't it?)?
>> Ah, SU+J could be quite relevant. Soft-update journalling was enabled by
>> default for a period of time, but I believe it was disabled because there
>> were some scenarios where it was destabilizing. CC-ing Kirk to improve on
>> my lousy memory.
> As far as I know SU+J is still on by default.
>
>> Do you remember what version was used to install the system in question
>> (i.e., create the filesystem in question)? Please show the output of
>> 'tunefs -p <filesystem>'
>>
>>> So I did some fiddling with fsck, fsdb, find and stat; and got nowhere. I ran
>>> fsck again and it gave me not much again. It did hint at some files in the
>>> ports tree, so I cleaned up the ports tree to fresh install point, ran fsck
>>> again and rebooted. So far so good, but I'm keeping my fingers crossed still.
>> It is probably important to note that 'fsck -F' and saying 'no' to "USE
>> JOURNAL?" is the most relevant fsck invocation.
>>
>>> This doesn't help the panics - they're still a pita when they happen. It does
>>> help me resolve the issue this time though. But initiating this error in
>>> testing is damn near impossible. What can we document here as a way to gather
>>> data to determine how to resolve this issue? Given my luck with this, it's
>>> bound to happen again at some point :)
>> I think actual diagnostic is beyond my expertise/time commitment at the
>> moment. I suspect that using tunefs to disable softupdate journalling
>> will be a workaround, if that is what you are really interested in.
>>
>> I'll let Kirk decide if he wants to debug more, but the answer may well be
>> "no" if you're not running the latest ufs from -current.
>>
>> -Ben
> The suggestion to disable journalling is a good one. Journalling fixes
> only consistency errors that it knows about and cannot handle media errors.
> The sorts of panics you are getting are usually caused by media errors.
> So disabling journalling and checking all metadata after crashes (which is
> what fsck does) should minimize your problems.

So my only option for journal is gjournal (slow) or zfs (memory hog) to
maintain consistency; is that it? Incidentally, why keep SU+J on as
default then? Wouldn't this be considered a bug still, then?
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?55169D02.8090107>