Date:      Thu, 23 Mar 2000 15:58:13 -0800
From:      Greg Lehey <grog@lemis.com>
To:        Dan Nelson <dnelson@emsphone.com>
Cc:        Poul-Henning Kamp <phk@critter.freebsd.dk>, Alfred Perlstein <bright@wintelcom.net>, Matthew Dillon <dillon@apollo.backplane.com>, current@FreeBSD.ORG
Subject:   Write clustering (was: patches for test / review)
Message-ID:  <20000323155812.F9318@mojave.worldwide.lemis.com>
In-Reply-To: <20000323174438.B59166@dan.emsphone.com>; from dnelson@emsphone.com on Thu, Mar 23, 2000 at 05:44:38PM -0600
References:  <20000320115902.C14789@fw.wintelcom.net> <20211.953581241@critter.freebsd.dk> <20000320152330.A48212@dan.emsphone.com> <20000323152718.C9318@mojave.worldwide.lemis.com> <20000323174438.B59166@dan.emsphone.com>

On Thursday, 23 March 2000 at 17:44:38 -0600, Dan Nelson wrote:
> In the last episode (Mar 23), Greg Lehey said:
>>
>> Agreed.  This is on the Vinum wishlist, but it comes at the expense of
>> reliability (how long do you wait to cluster?  What happens if the
>> system fails in between?).  In addition, for Vinum it needs to be done
>> before entering the hardware driver.
>
> For the simplest case, you can choose to optimize only when the user
> sends a single huge write().

We discussed that.  Since the optimum band size is much larger than
MAXPHYS, this can't happen on a correctly configured system.

> That way you don't have to worry about caching dirty pages in vinum.
> This is basically what the hardware RAIDs I have do.

Right, but that seriously degrades normal non-band writes.

> They'll only do the write optimization (they call it "pipelining")
> if you actually send a single SCSI write request large enough to
> span all the disks.  I don't know what would be required to get our
> kernel to even be able to write blocks this big (what's the upper
> limit on MAXPHYS?).

MAXPHYS is currently 128 kB.  I recommend stripes of 256 kB to 512 kB,
so with a 9-disk RAID we're talking about bands of 2 to 4 MB.  My
current idea is to set a flag on each volume specifying that it's
prepared to wait up to n seconds for write clustering.
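
To make the arithmetic above concrete, here is a minimal sketch in C.
It assumes a RAID-5 layout in which one stripe in each band holds
parity, so a 9-disk array carries 8 data stripes per band.  The names
MAXPHYS_KB, band_size_kb and requests_per_band are hypothetical and
chosen for illustration; none of this is taken from Vinum.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: MAXPHYS as quoted in this message (128 kB), in kB. */
#define MAXPHYS_KB 128

/*
 * Data size in kB of one full band, assuming a RAID-5 layout where
 * one stripe per band holds parity and the rest hold data.
 */
static size_t
band_size_kb(size_t ndisks, size_t stripe_kb)
{
	assert(ndisks >= 3);	/* RAID-5 needs at least three disks */
	return (ndisks - 1) * stripe_kb;
}

/* Number of MAXPHYS-sized requests needed to cover one full band. */
static size_t
requests_per_band(size_t ndisks, size_t stripe_kb)
{
	return (band_size_kb(ndisks, stripe_kb) + MAXPHYS_KB - 1) / MAXPHYS_KB;
}
```

Under these assumptions, band_size_kb(9, 256) is 2048 kB and
band_size_kb(9, 512) is 4096 kB, so a full band takes 16 to 32
requests of MAXPHYS size.  That is why a single write() cannot cover a
whole band, and why the requests would have to be collected over time.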

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers