Date: Thu, 23 Mar 2000 15:58:13 -0800
From: Greg Lehey <grog@lemis.com>
To: Dan Nelson <dnelson@emsphone.com>
Cc: Poul-Henning Kamp <phk@critter.freebsd.dk>, Alfred Perlstein <bright@wintelcom.net>, Matthew Dillon <dillon@apollo.backplane.com>, current@FreeBSD.ORG
Subject: Write clustering (was: patches for test / review)
Message-ID: <20000323155812.F9318@mojave.worldwide.lemis.com>
In-Reply-To: <20000323174438.B59166@dan.emsphone.com>; from dnelson@emsphone.com on Thu, Mar 23, 2000 at 05:44:38PM -0600
References: <20000320115902.C14789@fw.wintelcom.net>
    <20211.953581241@critter.freebsd.dk>
    <20000320152330.A48212@dan.emsphone.com>
    <20000323152718.C9318@mojave.worldwide.lemis.com>
    <20000323174438.B59166@dan.emsphone.com>

On Thursday, 23 March 2000 at 17:44:38 -0600, Dan Nelson wrote:
> In the last episode (Mar 23), Greg Lehey said:
>>
>> Agreed.  This is on the Vinum wishlist, but it comes at the expense of
>> reliability (how long do you wait to cluster?  What happens if the
>> system fails in between?).  In addition, for Vinum it needs to be done
>> before entering the hardware driver.
>
> For the simplest case, you can choose to optimize only when the user
> sends a single huge write().

We discussed that.  Since the optimum band size is much larger than
MAXPHYS, this can't happen on a correctly configured system.

> That way you don't have to worry about caching dirty pages in vinum.
> This is basically what the hardware RAIDs I have do.

Right, but that seriously degrades normal non-band writes.

> They'll only do the write optimization (they call it "pipelining")
> if you actually send a single SCSI write request large enough to
> span all the disks.  I don't know what would be required to get our
> kernel to even be able to write blocks this big (what's the upper
> limit on MAXPHYS)?

MAXPHYS is currently 128 kB.  I recommend stripes of 256 kB to 512 kB,
so with a 9 disk RAID we're talking about bands of 2 to 4 MB.

My current idea is to set a flag on each volume specifying that it's
prepared to wait up to n seconds for write clustering.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
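
The band-size arithmetic in the message can be checked with a short sketch. It assumes that one block per band holds parity, so a 9 disk RAID carries 8 data stripes per band. That is an inference from the 2 to 4 MB figures, not something the message states. The names band_size_kb and requests_per_band are hypothetical helpers for illustration and are not part of Vinum or the FreeBSD kernel.

```python
# Illustrative arithmetic only; not Vinum code.
MAXPHYS_KB = 128  # MAXPHYS value quoted in the message


def band_size_kb(stripe_kb, disks, parity_disks=1):
    """Data payload of one full band, in kB.

    Assumption: parity_disks blocks per band hold parity, the rest hold data.
    """
    if disks <= parity_disks:
        raise ValueError("need more disks than parity blocks")
    return stripe_kb * (disks - parity_disks)


def requests_per_band(stripe_kb, disks, parity_disks=1, maxphys_kb=MAXPHYS_KB):
    """Minimum number of MAXPHYS-sized requests needed to cover one band."""
    band = band_size_kb(stripe_kb, disks, parity_disks)
    return -(-band // maxphys_kb)  # ceiling division


if __name__ == "__main__":
    for stripe_kb in (256, 512):
        band = band_size_kb(stripe_kb, 9)
        count = requests_per_band(stripe_kb, 9)
        print(f"stripe {stripe_kb} kB: band {band} kB, {count} requests")
```

Under these assumptions a full band needs at least 16 to 32 separate requests of MAXPHYS size, which is why a single write() cannot cover a band and some form of clustering across requests would be required.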