Date: Thu, 10 May 2001 21:56:07 -0700
From: Kirk McKusick <mckusick@mckusick.com>
To: Alfred Perlstein <bright@wintelcom.net>
Cc: hackers@freebsd.org
Subject: Re: utilizing write caching
Message-ID: <200105110456.VAA14681@beastie.mckusick.com>
In-Reply-To: Your message of "Thu, 19 Apr 2001 00:07:12 PDT." <20010419000712.C976@fw.wintelcom.net>

Sorry for the slow response. I only read my freebsd.org email very occasionally.

Soft updates does do most of its writes asynchronously, but it still needs to know when the data has really hit stable store. With SCSI disks, we can use tag queuing to reliably get this information. With IDE disks, the only way to get this information is to disable write caching.

Most failure scenarios allow IDE disks to write out their caches: software crashes, plug pulled out of the wall, etc. Where they cannot write out their caches are instances where the power drops nearly instantly, such as a power supply failure or the battery being pulled out of a laptop. We could decide that we are willing to lump those sorts of failures in with media failure as a class of problems that we choose not to protect against, but I think that should be a decision that users have to take an active role to make (much as they can choose to mount their filesystems async).

So, I agree with the decision to turn off write caching by default, though there should be an easy way to reenable it for those users who want to run the associated risks.

	Kirk McKusick

=-=-=-=-=-=

Date: Thu, 19 Apr 2001 00:07:12 -0700
From: Alfred Perlstein <bright@wintelcom.net>
To: hackers@freebsd.org
Cc: Kirk McKusick <mckusick@freebsd.org>
Subject: utilizing write caching

I'm sure you guys remember the recent discussion wrt write caching on disks possibly causing inconsistencies for UFS and just about any filesystem or program that expects things like fsync() to actually work.

The result of the discussion was that write caching was disabled for all disks.

I really think this is suboptimal. I mean _really_ suboptimal; my laptop disk is a pig since the default went in for ata disks. Or maybe it's just a pig anyway, but I'd like to take a look at this.

The most basic fix to gain performance back would be to have the device examine the B_ASYNC flag and decide there whether or not to perform write caching.

However, I have this strange feeling that softupdates is actually able to issue the meta-data writes with B_ASYNC set. Kirk, is this true? If so, would it be possible to tag the buffer with yet another flag saying "yes, write me async, but safely" when doing softdep disk io?

If softupdates doesn't use B_ASYNC, then it seems trivial to make DEV_STRATEGY propagate B_ASYNC into the bio request (BIO_STRATEGY) via OR'ing something like BIO_CACHE so that the device driver could then choose to activate write caching.

This is still suboptimal because we'll be turning off caching when the buffer system is experiencing a shortage and issuing sync writes in order not to deadlock, but it's still better IMO than turning it off completely.

If, on the other hand, Kirk can figure out a quick hack to flag buffers that need completely stable storage (including fsync(2)* ops), then I think we've got a solution.

(*) I'll look at fsync and physio if the scope of fixing those seems to be too much wrt time available.

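As a rough illustration of the "async, but safely" idea above, here is a minimal userland sketch in C of the decision a driver might make. The flag names XB_ASYNC and XB_STABLE and the function may_use_write_cache() are hypothetical and exist only for this example; they are not kernel definitions, and a real driver would have more to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags for illustration only; not the kernel's definitions. */
#define XB_ASYNC	0x1u	/* caller does not wait for completion */
#define XB_STABLE	0x2u	/* assumed "write me async, but safely" tag */

/* Return true if the drive's write cache could absorb this write. */
bool
may_use_write_cache(uint32_t flags)
{
	if ((flags & XB_ASYNC) == 0)
		return false;	/* sync write: caller expects stable storage */
	if ((flags & XB_STABLE) != 0)
		return false;	/* async, but must still reach stable storage */
	return true;		/* plain async write: caching is acceptable */
}

/* Usage: the three cases discussed in the message. */
void
write_cache_policy_examples(void)
{
	assert(!may_use_write_cache(0));			/* sync write */
	assert(!may_use_write_cache(XB_ASYNC | XB_STABLE));	/* e.g. softdep io */
	assert(may_use_write_cache(XB_ASYNC));			/* ordinary async */
}
```

Note that the sync-write case also covers the buffer-shortage situation mentioned above, which is why caching would still be lost there.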
If softupdates doesn't use B_ASYNC, something like this:

Index: sys/bio.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/bio.h,v
retrieving revision 1.104
diff -u -r1.104 bio.h
--- sys/bio.h	2001/01/14 18:48:42	1.104
+++ sys/bio.h	2001/04/19 06:53:52
@@ -91,6 +91,7 @@
 #define BIO_ERROR	0x00000001
 #define BIO_ORDERED	0x00000002
 #define BIO_DONE	0x00000004
+#define BIO_ASYNC	0x00000008	/* Device may choose to write cache */
 #define BIO_FLAG2	0x40000000	/* Available for local hacks */
 #define BIO_FLAG1	0x80000000	/* Available for local hacks */

Index: sys/conf.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/conf.h,v
retrieving revision 1.126
diff -u -r1.126 conf.h
--- sys/conf.h	2001/03/26 12:41:26	1.126
+++ sys/conf.h	2001/04/19 06:52:08
@@ -157,6 +157,8 @@
 		(bp)->b_io.bio_offset = (bp)->b_offset;	\
 	else	\
 		(bp)->b_io.bio_offset = dbtob((bp)->b_blkno);	\
+	if ((bp)->b_flags & B_ASYNC)	\
+		(bp)->b_io.bio_flags |= BIO_ASYNC;	\
 	(bp)->b_io.bio_done = bufdonebio;	\
 	(bp)->b_io.bio_caller2 = (bp);	\
 	BIO_STRATEGY(&(bp)->b_io, dummy);	\

could do the trick, no?

-- 
-Alfred Perlstein - [alfred@freebsd.org]
Instead of asking why a piece of software is using "1970s technology,"
start asking why software is ignoring 30 years of accumulated wisdom.

----- End forwarded message -----

-- 
-Alfred Perlstein - [alfred@freebsd.org]
Daemon News Magazine in your snail-mail! http://magazine.daemonnews.org/
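
For readers without a kernel tree at hand, the conf.h change in the patch can be modeled in userland. This is only a sketch: struct buf_model, struct bio_model, the *_MODEL flag names and propagate_async() are stand-ins invented for the example, and only the 0x00000008 value is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel definitions; names and layout are assumptions. */
#define B_ASYNC_MODEL	0x00000004u	/* value chosen for this sketch */
#define BIO_ASYNC_MODEL	0x00000008u	/* value proposed in the patch */

struct bio_model {
	uint32_t bio_flags;		/* flags seen by the device driver */
};

struct buf_model {
	uint32_t b_flags;		/* flags set by the buffer cache */
	struct bio_model b_io;		/* embedded request handed to the driver */
};

/* Copy the async hint from the buffer into its embedded bio request. */
void
propagate_async(struct buf_model *bp)
{
	assert(bp != NULL);
	if ((bp->b_flags & B_ASYNC_MODEL) != 0)
		bp->b_io.bio_flags |= BIO_ASYNC_MODEL;
}
```

A synchronous buffer passes through with its bio flags untouched, so a driver that checks for the new flag would only consider caching writes the caller already marked as asynchronous.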