Date: Mon, 29 Aug 2005 19:59:10 -0700 (PDT)
From: Jon Dama <jd@ugcs.caltech.edu>
To: Matthias Buelow <mkb@incubus.de>
Cc: mckusick@mckusick.com, freebsd-stable@freebsd.org, Mark Kirkwood <markir@paradise.net.nz>
Subject: Re: Sysinstall automatic filesystem size generation.
Message-ID: <Pine.LNX.4.53.0508291847480.20467@riyal.ugcs.caltech.edu>
In-Reply-To: <20050830011632.GG1462@drjekyll.mkbuelow.net>
References: <200508291836.j7TIaVEk013147@gw.catspoiler.org>
 <20050829185933.GB1462@drjekyll.mkbuelow.net>
 <431362ED.9030800@mac.com>
 <20050829204714.GC1462@drjekyll.mkbuelow.net>
 <43137AFB.9060304@mac.com>
 <20050829215613.GD1462@drjekyll.mkbuelow.net>
 <431390A0.5080007@mac.com>
 <20050830002051.GE1462@drjekyll.mkbuelow.net>
 <4313AB8D.4010807@paradise.net.nz>
 <Pine.LNX.4.53.0508291750030.20467@riyal.ugcs.caltech.edu>
 <20050830011632.GG1462@drjekyll.mkbuelow.net>
Well, I think one issue is that it destroys one of the fundamental
advantages of softupdates, which was that you could interleave streams of
strongly ordered metadata writes without demanding a sequence for the
streams collectively.

By using request barriers, you are effectively forcing an additional
synchronization requirement. The secret will be not forcing us all the way
back to having effectively synchronous metadata writes (see below).

As I understand it, metadata operations are only added to the WORKLIST when
their dependents have already been "completed", i.e., at the lowest level
have had biodone called to mark the write operation completed. I am not
sure how ffs_softdeps checks this property.

It seems you need to add a layer of indirection (owing to biodone being
called merely when the drive has cached the request).

What you know is that those operations marked completed by biodone are in
fact done only after a (costly) flush cache operation is executed.
Therefore you want to delay this operation for as long as possible, in fact
until you actually depend on biodone being honest, i.e., at the time
another operation is inserted into the WORKLIST.

The secret, I think, is to keep track of which bp's marked B_DONE by
biodone have been certified by a flush cache, thus permitting you to avoid
some cache flushes. Furthermore, the softdep code has to be responsible for
invoking the flush cache operation when it notices that the B_DONE flag
that it cares about does not have a matching B_REALLY_DONE flag, which
every block that had B_DONE set before the flush cache operation happened
should have.

I do not really know how GEOM has changed this situation. biodone seems to
have been stripped of many of its old responsibilities?

-Jon

I'd guess that it belongs

On Tue, 30 Aug 2005, Matthias Buelow wrote:

> Jon Dama wrote:
>
> >Ironically, phk backed out the underlying support for this safety fix
> >from the FreeBSD kernel b.c. it wasn't integrated into the softupdates
> >code
> >whereas in reality the proper course of action would have been to hook it
> >in. :-/
>
> Can it be put into softupdates at all? From what I understand (which
> is probably a rather sketchy idea of the matter), write barriers
> work because they are only used here to separate journal writes
> from data writes (i.e., to make sure the log is written, by flushing
> the cache, before any filesystem data hits the platters). I've read
> the softupdates paper some time ago and haven't found similar
> sequence points where one could insert such flushing. One would
> have to "flush" all the time, either continuously or in very short
> intervals, in order to keep the ordering, which then would amount
> to the same effects as if one simply disabled the cache. But probably
> I'm wrong here (I hope).
>
> mkb.
>
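The lazy-flush bookkeeping proposed in the reply above (track which B_DONE buffers have been certified by a cache flush, and flush only when a dependency actually needs certifying) might look roughly like the following. This is a minimal userland sketch, not kernel code: struct sbuf, struct disk and the sim_* functions are hypothetical stand-ins invented for illustration, the flag values are arbitrary, and B_REALLY_DONE is only the flag name suggested in the message, not an existing kernel flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>

/* Illustrative flag values only; B_REALLY_DONE is the hypothetical flag
 * proposed in the message, not something the kernel defines. */
#define B_DONE        0x1u  /* drive acknowledged the write (maybe cached) */
#define B_REALLY_DONE 0x2u  /* a cache flush has completed since B_DONE */

#define MAXPENDING 16

struct sbuf {                           /* toy stand-in for struct buf */
    unsigned flags;
};

struct disk {                           /* toy stand-in for a drive */
    struct sbuf *pending[MAXPENDING];   /* B_DONE but not yet certified */
    size_t npending;
    unsigned flushes;                   /* costly cache flushes issued */
};

/* Simulated cache flush: certifies every buffer acknowledged so far. */
static void sim_flush_cache(struct disk *d)
{
    assert(d != NULL);
    for (size_t i = 0; i < d->npending; i++)
        d->pending[i]->flags |= B_REALLY_DONE;
    d->npending = 0;
    d->flushes++;
}

/* Simulated biodone(): the drive has accepted the write into its cache. */
static void sim_biodone(struct disk *d, struct sbuf *bp)
{
    assert(d != NULL && bp != NULL);
    if (d->npending == MAXPENDING)      /* table full: certify, make room */
        sim_flush_cache(d);
    bp->flags |= B_DONE;
    bp->flags &= ~B_REALLY_DONE;        /* a rewrite needs a fresh flush */
    d->pending[d->npending++] = bp;
}

/* Called when a new operation is about to rely on dep being on the
 * platters. Flushes only if dep is B_DONE but not yet certified. */
static bool sim_dep_stable(struct disk *d, struct sbuf *dep)
{
    assert(d != NULL && dep != NULL);
    if (!(dep->flags & B_DONE))
        return false;                   /* write not even acknowledged */
    if (!(dep->flags & B_REALLY_DONE))
        sim_flush_cache(d);             /* the one flush we cannot avoid */
    return (dep->flags & B_REALLY_DONE) != 0;
}
```

In this sketch one flush certifies every buffer acknowledged before it, so checking a second dependency from the same batch costs nothing further; that is the saving the message is after. Whether the real softdep code has a place to hang such a check is exactly the open question raised above.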