Date: Fri, 1 Feb 2002 23:31:47 -0500
From: Brian T.Schellenberger <bts@babbleon.org>
To: BOUWSMA Beery <freebsd-user@dcf77-zeit.netscum.dyndns.dk>, questions@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: Again Softupdates on 4.5
Message-ID: <20020202043147.2DDCF406A@i8k.babbleon.org>
In-Reply-To: <200201311332.g0VDWvb01491@beerswilling.netscum.dyndns.dk>
References: <20020130193145.5ED395D0B@ptavv.es.net> <20020131021257.193F44078@i8k.babbleon.org> <200201311332.g0VDWvb01491@beerswilling.netscum.dyndns.dk>
On Thursday 31 January 2002 08:32 am, BOUWSMA Beery wrote:
> Moin, moin!
> %s wrote on %.3s, %lld Sep 1993
>
> > > > Does 4.5 also leave write-caching on by default? If so, I think
> > > > that's a terrible mistake. Would I be correct in assuming it's way
> > > > too late to get this reconsidered?
> > >
> > > Yes, write-cache is enabled by default on 4.5 (as it was on 4.4).
> > >
> > > The debate on this has been long and often mis-informed. There is a
> > > real risk of metadata corruption with write caching and softupdates,
> > > but it appears to be EXTREMELY small. So far no case of it has
> > > actually been confirmed. There is a significant chance of data loss in
> > > recently updated files with write-cache, but that is also true without
> > > softupdates. The only totally safe way to deal with this is to run
> > > fully synchronous with write-cache disabled.
> >
> > My experience is that combining the two of them greatly magnifies the
> > risk of losing recent updates, and that in fact data can be lost even
> > without any system crash or other problems when using them together.
> > Indeed, I have a very reproducible case of this on my system--
> >
> > If I enable softupdates + write cache and then I do
> >
> > cd /some-big-directory
> > rm -r *
> > shutdown -p now
> >
> > then the file system will be corrupted on reboot.
> >
> > I find this as default behavior pretty ridiculous, and it comes about
> > *only* as a result of having both enabled together.
>
> And, I'm just guessing here, only because the delay before poweroff
> isn't quite enough for the disk's write cache to drain. Just like
> if you were to yank out the power cord after giving the `shutdown -p'
> (poweroff) command.
>
> If you look at the source in /usr/src/sys/kern/kern_shutdown.c you'll
> see the following:
>
> /*
>  * Support for poweroff delay.
>  */
> #ifndef POWEROFF_DELAY
> # define POWEROFF_DELAY 5000
> #endif
> static int poweroff_delay = POWEROFF_DELAY;
>
> SYSCTL_INT(_kern_shutdown, OID_AUTO, poweroff_delay, CTLFLAG_RW,
> 	&poweroff_delay, 0, "");
>
> static void
> poweroff_wait(void *junk, int howto)
> {
> 	if(!(howto & RB_POWEROFF) || poweroff_delay <= 0)
> 		return;
> 	DELAY(poweroff_delay * 1000);
> }
>
> And you'll see a commit log message:
> | Change the default poweroff delay from 0 to 5 seconds. This seems to be
> | adequate for the IDE disks that I have available for testing. Most seem
> | to wait between 1 and 3 seconds before flushing their caches.
> |
> | Add the ability to override the delay at compile time via the
> | undocumented option POWEROFF_DELAY. The delay can still be set via
> | sysctl as it was originally implemented.
>
> It sounds as if the default five seconds isn't always enough time for
> your disk to do its job. (I've only done poweroff on an idle system so
> I haven't run into such a problem myself.)
>
> I don't see that it would hurt anything for this default to be increased
> to help with this problem. But what value would be good?
>
> As shown in the k0deZ plus the note, there's the sysctl tunable
> kern.shutdown.poweroff_delay: 5000
>
> Perhaps if you were to bump this up to 10000 (ten seconds) and then
> do your test, you wouldn't see this problem. Perhaps it would need to
> be higher; maybe something between five and ten seconds would suffice.
>
> Could you try testing this out with your particular hardware, for which
> five seconds isn't enough with your test, and see if it helps? If it
> does, then there would be good cause to bump up the delay for safety.
> If not, then it looks like the disk activity I see some number of
> seconds after such an `rm' doesn't get forced by the shutdown process,
> which I hope wouldn't be the case.
>
>
> I'm sure we'd all be happy to hear how things work, since not all the
> possible hardware combinations can be tested, and assumptions such as
> were made when selecting the value above may later become obsolete.
>
> As another possibility, the runtime value of the poweroff delay that
> is used could remain the default 5 sec if caching is disabled, or
> less (whatever works and is safe), and higher (some multiple of 5?)
> if caching is enabled, or if the kernel could tell there's a lot of
> data to be dumped to disk. Not that I'd know if it's possible...

Well, naturally, though I was *very* easily able to reproduce this last
fall, I can't get it to happen now. Did the default timeout value here get
increased sometime over the lifetime of FreeBSD 4.4 or something? I'm at a
loss . . . it was very reproducible, and disabling the write cache fixed it.
And with softupdates there's enough of a performance boost that I haven't
felt terribly put out by having the cache disabled, either . . . but I can't
get the darn bug to reproduce now. I'm afraid I'm not quite ready to try
downgrading to a fall-era system just to test this, though . . .
>
> Just thoughts, feel free to flame
>
>
> barry bouwsma, netscum
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message
--
Brian T. Schellenberger . . . . . . . bts@wnt.sas.com (work)
Brian, the man from Babble-On . . . . bts@babbleon.org (personal)
ME --> http://www.babbleon.org
http://www.eff.org <-- GOOD GUYS --> http://www.programming-freedom.org
