Date: Fri, 1 Feb 2002 23:31:47 -0500
From: Brian T.Schellenberger <bts@babbleon.org>
To: BOUWSMA Beery <freebsd-user@dcf77-zeit.netscum.dyndns.dk>, questions@FreeBSD.ORG, hackers@FreeBSD.ORG
Subject: Re: Again Softupdates on 4.5
Message-ID: <20020202043147.2DDCF406A@i8k.babbleon.org>
In-Reply-To: <200201311332.g0VDWvb01491@beerswilling.netscum.dyndns.dk>
References: <20020130193145.5ED395D0B@ptavv.es.net> <20020131021257.193F44078@i8k.babbleon.org> <200201311332.g0VDWvb01491@beerswilling.netscum.dyndns.dk>

On Thursday 31 January 2002 08:32 am, BOUWSMA Beery wrote:
> Moin, moin!
> %s wrote on %.3s, %lld Sep 1993
>
> > > > Does 4.5 also leave write-caching on by default?  If so, I think
> > > > that's a terrible mistake.  Would I be correct in assuming it's way
> > > > too late to get this reconsidered?
> > >
> > > Yes, write-cache is enabled by default on 4.5 (as it was on 4.4).
> > >
> > > The debate on this has been long and often mis-informed. There is a
> > > real risk of metadata corruption with write caching and softupdates,
> > > but it appears to be EXTREMELY small. So far no case of it has
> > > actually been confirmed. There is a significant chance of data loss in
> > > recently updated files with write-cache, but that is also true without
> > > softupdates. The only totally safe way to deal with this is to run
> > > fully synchronous with write-cache disabled.
> >
> > My experience is that combining the two of them greatly magnifies the
> > risk of losing recent updates, and that in fact data can be lost even
> > without any system crash or other problems when using them together.
> > Indeed, I have a very reproducible case of this on my system--
> >
> > If I enable softupdates + write cache and then I do
> >
> >   cd /some-big-directory
> >   rm -r *
> >   shutdown -p now
> >
> > then the file system will be corrupted on reboot.
> >
> > I find this as default behavior pretty ridiculous, and it comes about
> > *only* as a result of having both enabled together.
>
> And, I'm just guessing here, only because the delay before poweroff
> isn't quite enough for the disk's write cache to drain.  Just like
> if you were to yank out the power cord after giving the `shutdown -p'
> (poweroff) command.
>
> If you look at the source in /usr/src/sys/kern/kern_shutdown.c you'll
> see the following:
>
> /*
>  * Support for poweroff delay.
>  */
> #ifndef POWEROFF_DELAY
> # define POWEROFF_DELAY 5000
> #endif
> static int poweroff_delay = POWEROFF_DELAY;
>
> SYSCTL_INT(_kern_shutdown, OID_AUTO, poweroff_delay, CTLFLAG_RW,
>         &poweroff_delay, 0, "");
>
> static void
> poweroff_wait(void *junk, int howto)
> {
>         if(!(howto & RB_POWEROFF) || poweroff_delay <= 0)
>                 return;
>         DELAY(poweroff_delay * 1000);
> }
>
> And you'll see a commit log message:
> | @Change the default poweroff delay from 0 to 5 seconds.  This seems to be
> | adequate for the IDE disks that I have available for testing.  Most seem
> | to wait between 1 and 3 seconds before flushing their caches.
> |
> | Add the ability to override the delay at compile time via the
> | undocumented option POWEROFF_DELAY.  The delay can still be set via
> | sysctl as it was originally implemented.
>
> It sounds as if the default five seconds isn't always enough time for
> your disk to do its job.  (I've only done poweroff on an idle system so
> I haven't run into such a problem myself.)
>
> I don't see it would hurt anything for this default to be increased to
> help out this problem.  But what value would be good?
>
> As shown in the k0deZ plus the note, there's the sysctl tunable
> kern.shutdown.poweroff_delay: 5000
>
> Perhaps if you were to bump this up to 10000 (ten seconds) and then
> do your test, you wouldn't see this problem.  Perhaps it would need to
> be higher; maybe something between five and ten seconds would suffice.
>
> Could you try testing this out with your particular hardware, for which
> five seconds isn't enough with your test, and see if it helps?  If it
> does, then there would be good cause to bump up the delay for safety.
> If not, then it looks like the disk activity I see some number of
> seconds after such an `rm' doesn't get forced by the shutdown process,
> which I hope wouldn't be the case.
>
>
> I'm sure we'd all be happy to hear how things work, since not all the
> possible hardware combinations can be tested, and assumptions such as
> were made when selecting the value above may later become obsolete.
>
> As another possibility, the runtime value of the poweroff delay that
> is used could remain the default 5 sec if caching is disabled, or
> less (whatever works and is safe), and higher (some multiple of 5?)
> if caching is enabled, or if the kernel could tell there's a lot of
> data to be dumped to disk.  Not that I'd know if it's possible...

Well, naturally, though I was *very* easily able to reproduce this this
past fall I can't get it to happen now.  Did the default timeout value
here get increased sometime over the lifetime of FreeBSD 4.4 or something?

I'm at a loss . . . it was very reproducible and disabling the write cache
fixed it.  And with softupdates there's enough of a performance boost that
I haven't felt terribly put out by having the cache disabled, either . . .
but I can't get the darn bug to reproduce now.

I'm afraid I'm not quite ready to try downgrading to a fall-era system
just to test this, though . . .

>
> Just thoughts, feel free to flame
>
>
> barry bouwsma, netscum
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-questions" in the body of the message

-- 
Brian T. Schellenberger . . . . . . .   bts@wnt.sas.com (work)
Brian, the man from Babble-On . . . .   bts@babbleon.org (personal)
                                ME -->  http://www.babbleon.org
http://www.eff.org   <-- GOOD GUYS -->  http://www.programming-freedom.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
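
For anyone following the units in the quoted poweroff_wait(): the sysctl
value is in milliseconds and is multiplied by 1000 before being handed to
DELAY(), so 5000 means five seconds and 10000 means ten.  What follows is
only a userland sketch of that arithmetic, not the kernel code itself.  The
names sketch_poweroff_wait_usec, sketch_delay_fits_int and
SKETCH_RB_POWEROFF are made up for illustration, and the flag value is an
assumed placeholder; the real RB_POWEROFF comes from the kernel headers.

```c
#include <limits.h>

/*
 * Hypothetical stand-in for the kernel's RB_POWEROFF flag.  The value is
 * an assumption made for this sketch only.
 */
#define SKETCH_RB_POWEROFF 0x4000

/*
 * Mirrors the early-return test in the quoted poweroff_wait(): returns the
 * number of microseconds that would be passed to DELAY(), or 0 when the
 * function would return without waiting.
 */
long long
sketch_poweroff_wait_usec(int howto, int poweroff_delay_ms)
{
	if (!(howto & SKETCH_RB_POWEROFF) || poweroff_delay_ms <= 0)
		return 0;
	/* Widen before multiplying so the sketch itself cannot overflow. */
	return (long long)poweroff_delay_ms * 1000;
}

/*
 * The quoted code does the multiplication in plain int arithmetic, so a
 * very large delay could overflow.  Returns 1 when ms * 1000 still fits
 * in an int, 0 otherwise.
 */
int
sketch_delay_fits_int(int poweroff_delay_ms)
{
	return poweroff_delay_ms <= INT_MAX / 1000;
}
```

With the default of 5000 and the poweroff flag set, the first helper
returns 5000000 (five seconds); with 10000 it returns 10000000.  Without
the poweroff flag, or with a delay of zero or less, no wait happens at all.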