From owner-freebsd-arch  Mon Feb  5 13: 2:30 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from critter.freebsd.dk (flutter.freebsd.dk [212.242.40.147])
	by hub.freebsd.org (Postfix) with ESMTP
	id 909E337B491; Mon,  5 Feb 2001 13:02:09 -0800 (PST)
Received: from critter (localhost [127.0.0.1])
	by critter.freebsd.dk (8.11.1/8.11.1) with ESMTP id f15L1fB28620;
	Mon, 5 Feb 2001 22:01:41 +0100 (CET)
	(envelope-from phk@critter.freebsd.dk)
To: Alfred Perlstein <bright@wintelcom.net>
Cc: "Justin T. Gibbs" <gibbs@scsiguy.com>,
	Randell Jesup <rjesup@wgate.com>,
	Matt Dillon <dillon@earth.backplane.com>,
	Matthew Jacob <mjacob@feral.com>, Mike Smith <msmith@FreeBSD.ORG>,
	Dag-Erling Smorgrav <des@ofug.org>,
	Dan Nelson <dnelson@emsphone.com>,
	Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>, arch@FreeBSD.ORG
Subject: Re: Bumping up {MAX,DFLT}*PHYS (was Re: Bumping up {MAX,DFL}*SIZ in i386) 
In-Reply-To: Your message of "Mon, 05 Feb 2001 12:47:07 PST."
             <20010205124707.Y26076@fw.wintelcom.net> 
Date: Mon, 05 Feb 2001 22:01:41 +0100
Message-ID: <28618.981406901@critter>
From: Poul-Henning Kamp <phk@critter.freebsd.dk>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In message <20010205124707.Y26076@fw.wintelcom.net>, Alfred Perlstein writes:

>One of the suggestions that Poul-Henning made was to have the device
>somehow specify an optimal clustering strategy, being able to specify
>bounds and sizes.
>
>[...]
>
>Currently (i think) we only cluster based on logical file offsets,
>it would be interesting to allow drivers to do callbacks into the
>FS to ask for blocks physically adjacent to the blocks being written.

I've been playing with various ideas in this area, and to be frank,
totally failed to come up with a breakthrough.

Given methods like striping and RAID-5, it becomes nontrivial to
find a specification language for the driver to say "it would be
quick to write the following blocks also" and it would be even
slower to determine if this was indeed feasible.

"feasible" covers not only "do we have it in RAM", but also "is it
already scheduled for writing", "is it dirty" and not least
"would softupdates take a fit if we wrote it".

The best I have been able to come up with so far is for the
device driver to specify the following quantities:

	(M) maximum request size
	(R) preferred request size
	(B) preferred request sector boundary 

The clustering code would then try to increase the request to:

	N * R sectors starting at X
	where X mod B == 0
	and N * R <= M
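
As a rough illustration of the rule above, here is a minimal C sketch
of how clustering code might grow a request. The names clust_hint and
clust_grow are hypothetical and do not exist in the tree, and the
sketch assumes the grown request must still cover the sectors that
were originally asked for:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device hints; all values are in sectors. */
struct clust_hint {
	uint64_t max_sectors;	/* (M) maximum request size */
	uint64_t pref_sectors;	/* (R) preferred request size */
	uint64_t boundary;	/* (B) preferred request sector boundary */
};

/*
 * Try to grow the range [first, last] to N * R sectors starting at X,
 * where X mod B == 0 and N * R <= M.  On success, store X in *startp
 * and N * R in *countp and return true.  Return false if no such
 * request covers the range, so the caller can fall back to the
 * unclustered request.
 */
static bool
clust_grow(const struct clust_hint *h, uint64_t first, uint64_t last,
    uint64_t *startp, uint64_t *countp)
{
	uint64_t x, need, n;

	if (h->pref_sectors == 0 || h->boundary == 0 || last < first)
		return (false);
	/* Round the start down to the preferred boundary. */
	x = first - (first % h->boundary);
	/* Sectors needed to get from X up to and including 'last'. */
	need = last - x + 1;
	/* Round the length up to a whole number of R-sized units. */
	n = (need + h->pref_sectors - 1) / h->pref_sectors;
	if (n * h->pref_sectors > h->max_sectors)
		return (false);
	*startp = x;
	*countp = n * h->pref_sectors;
	return (true);
}
```

With M = 256, R = 64 and B = 64, a request for sectors 70 to 130 would
grow to 128 sectors starting at sector 64.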

Having found a cluster opportunity, the cluster code will
issue the read/write request specifying:

	(E) First possible sector in request
	(S) First mandatory sector in request
	(L) Last mandatory sector in request
	(F) Last possible sector in request
	(B) Sector address of (S) on media.

The driver has to process the data from [S ... L],
and can optionally process [E...S[ and ]L...F] if
that seems convenient.
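
To make the request layout concrete, here is a minimal C sketch of
such a request and of one way a striping driver might use the optional
sectors. Everything in it is an assumption for illustration: the names
clust_req, clust_req_valid and clust_req_pick are hypothetical, the
"stripe" parameter stands in for whatever the driver considers
convenient, and the optional sectors are assumed to be contiguous with
the mandatory ones on the media:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clustered request; all values are sector numbers. */
struct clust_req {
	uint64_t first_possible;	/* (E) */
	uint64_t first_mandatory;	/* (S) */
	uint64_t last_mandatory;	/* (L) */
	uint64_t last_possible;		/* (F) */
	uint64_t media_addr;		/* address of (S) on media */
};

/* A request is well formed when E <= S <= L <= F. */
static bool
clust_req_valid(const struct clust_req *r)
{
	return (r->first_possible <= r->first_mandatory &&
	    r->first_mandatory <= r->last_mandatory &&
	    r->last_mandatory <= r->last_possible);
}

/*
 * Choose the range the driver will actually process.  [S ... L] is
 * always included; each end is widened to a stripe boundary on the
 * media only if the optional sectors reach that far.  The request is
 * assumed to have passed clust_req_valid().
 */
static void
clust_req_pick(const struct clust_req *r, uint64_t stripe,
    uint64_t *firstp, uint64_t *lastp)
{
	uint64_t lead, tail, media_last;

	*firstp = r->first_mandatory;
	*lastp = r->last_mandatory;
	if (stripe == 0)
		return;
	/* Sectors between the previous stripe boundary and (S). */
	lead = r->media_addr % stripe;
	if (lead <= r->first_mandatory - r->first_possible)
		*firstp = r->first_mandatory - lead;
	/* Sectors between (L) and the end of its stripe. */
	media_last = r->media_addr +
	    (r->last_mandatory - r->first_mandatory);
	tail = stripe - 1 - (media_last % stripe);
	if (tail <= r->last_possible - r->last_mandatory)
		*lastp = r->last_mandatory + tail;
}
```

A driver with no use for the optional sectors can ignore E and F
entirely and just process [S ... L].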

If somebody is looking for a good project, benchmarking
the performance of our current clustering and playing
around with various changes would not be the worst 
way to spend some winter evenings.  Playing with FFS/UFS
options (block/fragment etc.) at the same time may be
worthwhile.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

