Date:      Wed, 5 Oct 2005 13:17:32 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc:        performance@freebsd.org
Subject:   Re: dd(1) performance when copiing a disk to another (fwd)
Message-ID:  <20051005111732.GB17298@garage.freebsd.pl>
In-Reply-To: <205.1128428088@critter.freebsd.dk>
References:  <20051004122459.E69774@fledge.watson.org> <205.1128428088@critter.freebsd.dk>

On Tue, Oct 04, 2005 at 02:14:48PM +0200, Poul-Henning Kamp wrote:
+> Second issue: issuing intelligently sized/aligned requests.
+>
+> Notwithstanding the above, it makes sense to issue requests that
+> work as efficiently as possible further down the GEOM mesh.
+>
+> The chopping is one case, and it can (and will) be solved by
+> propagating a non-mandatory size-hint upwards.  physio will
+> be able to use this to send down requests that require minimal
+> chopping later on.
+>
+> But the other issue is alignment.  For a RAID-5 implementation it
+> is paramount for performance that requests try to align themselves
+> with the stripe size.  Other transformations have similar
+> requirements, striping and (gbde) encryption for instance.

That's true. When I worked on gstripe, I wondered for a moment about
an additional class which would cut the I/Os for me, so that gstripe
would only decide where to send all the pieces. In that case it was
overkill, of course.

On the other hand, I implemented a 'fast' mode in gstripe which is
intended to work fast even for very small stripe sizes, i.e. when the
stripe size is equal to 1kB and we receive a 128kB request, we don't
send 128 requests down, but only as many requests as we have
components, and do all the shuffle magic after the read is done (or
before the write).
I'm not sure how that can be achieved when some upper layer splits the
request for me. How can I avoid sending 128 requests then?
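
To make the arithmetic concrete, here is a minimal userland sketch of
the 'fast' mode bookkeeping (this is not the gstripe code itself; the
1kB stripe size is from the example above and the four components are
a made-up number):

#include <sys/types.h>
#include <stdio.h>

#define STRIPESIZE	1024	/* 1kB stripe, the worst case above */
#define NCOMPONENTS	4	/* hypothetical number of disks */

int
main(void)
{
	off_t offset = 0;		/* offset of the incoming request */
	size_t length = 128 * 1024;	/* the 128kB request */
	size_t nchunks = length / STRIPESIZE;

	/* Naive splitting: one request per stripe-size chunk. */
	printf("naive: %zu requests\n", nchunks);	/* 128 */

	/*
	 * 'fast' mode: each component gets a single contiguous request
	 * covering all of its chunks; the data is shuffled into the
	 * right order after the read completes (or before the write).
	 */
	printf("fast:  %zu requests\n",
	    nchunks < (size_t)NCOMPONENTS ? nchunks : (size_t)NCOMPONENTS);

	/* Which component owns chunk i? (first 8 of the 128 shown) */
	for (size_t i = 0; i < 8; i++)
		printf("chunk %zu -> disk %zu\n", i,
		    (size_t)((offset / STRIPESIZE + i) % NCOMPONENTS));
	return (0);
}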

+> Third issue: The problem extends all the way up to sysinstall.
+>
+> Currently we do systematically shoot RAID-5 performance down by our
+> strict adherence to MBR formatting rules.  We reserve the first
+> track of typically 63 sectors to the MBR.
+>
+> The first slice therefore starts at sector number 63.  All partitions
+> in that slice inherit that alignment, and therefore, unless the RAID-5
+> implementation has a stripe size of 63 sectors, a (too) large
+> fraction of the requests will have one sector in one raid-stripe
+> and the rest in another, which they often fail to fill by exactly
+> one sector.

Just to be sure I understand it correctly: you're talking about hardware
RAID, so basically the exported provider is attached to a rank#1 geom?

If so, this is not the case for software implementations, which are
configured on top of slices/partitions instead of raw disks.

As a workaround we can configure the 'a' partition to start at an
offset equal to the stripe size, right? Of course it's not a solution
to anything, but I want to be sure I've got things right.
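
To check my own understanding, a toy calculation (the 64-sector, i.e.
32kB, stripe size is an assumption of mine, not a number from your
mail):

#include <stdio.h>

#define STRIPESECS	64	/* hypothetical stripe size in 512-byte sectors */

/*
 * How many stripes does a stripe-sized request at offset 0 of a
 * partition starting at sector 'pstart' touch?
 */
static void
check(long pstart)
{
	long first = pstart / STRIPESECS;
	long last = (pstart + STRIPESECS - 1) / STRIPESECS;

	printf("partition at sector %3ld: stripes %ld..%ld (%s)\n",
	    pstart, first, last, first == last ? "aligned" : "split");
}

int
main(void)
{
	check(63);	/* the MBR rule: the request spans two stripes */
	check(64);	/* start at a stripe boundary: exactly one stripe */
	return (0);
}

So with the 'a' partition shifted to a stripe boundary, every
stripe-sized request stays within a single stripe.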

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
