Date:      Tue, 04 Oct 2005 14:14:48 +0200
From:      "Poul-Henning Kamp" <phk@phk.freebsd.dk>
To:        performance@freebsd.org
Subject:   Re: dd(1) performance when copiing a disk to another (fwd) 
Message-ID:  <205.1128428088@critter.freebsd.dk>
In-Reply-To: Your message of "Tue, 04 Oct 2005 12:25:21 BST." <20051004122459.E69774@fledge.watson.org> 

Robert forwarded this message.

>---------- Forwarded message ----------
>Date: Tue, 4 Oct 2005 10:48:48 +1000 (EST)
>From: Bruce Evans <bde@zeta.org.au>
>To: Tulio Guimarães da Silva <tuliogs@pgt.mpt.gov.br>
>Cc: freebsd-performance@FreeBSD.org
>Subject: Re: dd(1) performance when copiing a disk to another

I raised this subject early in the GEOM era but got very little
feedback, so I decided to sit back and wait until it came up again,
and that seems to be now.

First issue: chopping requests.

In the future we will have even larger I/O requests because (at
least we hope) bio requests will get rid of the antique requirement
to be mapped into sequentially mapped kernel VM.

That means that somebody will have to cut I/O requests up somewhere,
and it stands to reason that this should happen as far down as
possible, for reasons of memory management and workload avoidance.

So in the future, device drivers will have to accept bio requests
that are, for all practical purposes, infinite in size, and service
them in pieces as best they can.
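
A minimal sketch of what "service them in pieces" could look like,
in plain userland C.  Everything here is hypothetical: struct io_req
is a stand-in and not the kernel's struct bio, and chop_request(),
piece_fn and max_piece are names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a request; NOT the kernel's struct bio. */
struct io_req {
	uint64_t offset;	/* byte offset on the device */
	uint64_t length;	/* byte count */
};

/* Hypothetical callback that services one piece of a request. */
typedef void piece_fn(const struct io_req *piece, void *arg);

/*
 * Service an arbitrarily large request in pieces of at most
 * max_piece bytes.  Returns the number of pieces issued.
 */
static size_t
chop_request(const struct io_req *req, uint64_t max_piece, piece_fn *fn,
    void *arg)
{
	struct io_req piece;
	uint64_t done;
	size_t n;

	assert(max_piece > 0);
	n = 0;
	for (done = 0; done < req->length; done += piece.length) {
		piece.offset = req->offset + done;
		piece.length = req->length - done;
		if (piece.length > max_piece)
			piece.length = max_piece;
		if (fn != NULL)
			fn(&piece, arg);
		n++;
	}
	return (n);
}
```

The caller never needs to know the limit; only the code at the
bottom, which knows what the hardware can take, does the cutting.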

In addition to chopping, drivers/classes which need to access the
data in the I/O request will need to request VM mapping of it.


Second issue: issuing intelligently sized/aligned requests.

Notwithstanding the above, it makes sense to issue requests that
work as efficiently as possible further down the GEOM mesh.

The chopping is one case, and it can (and will) be solved by
propagating a non-mandatory size-hint upwards.  physio will
be able to use this to send down requests that require minimal
chopping later on.

But the other issue is alignment.  For a RAID-5 implementation it
is paramount for performance that requests try to align themselves
with the stripe size.  Other transformations have similar
requirements, striping and (gbde) encryption for instance.

Therefore, in addition to the size hint, a stripe width hint and a
stripe alignment hint need to be passed up, and then physio can start
to send requests that not only have the right size, but also
the right alignment for downstream processing.
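
As an illustration only, here is one way an upper layer might use
such hints to size its next request.  The struct io_hints layout, its
field names and next_piece_len() are assumptions made up for this
sketch; they are not an existing GEOM or physio interface.  A zero
field means "no hint", which keeps the hints advisory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical advisory hints passed upwards; 0 means "no hint". */
struct io_hints {
	uint64_t max_size;	/* preferred maximum request size, bytes */
	uint64_t stripe_size;	/* stripe width, bytes */
	uint64_t stripe_offset;	/* byte offset of a stripe boundary */
};

/*
 * Pick the length of the next request starting at offset, with
 * remaining bytes still to transfer.  An unaligned start is trimmed
 * to end on the next stripe boundary; an aligned start is rounded
 * down to a whole number of stripes when it covers more than one.
 */
static uint64_t
next_piece_len(uint64_t offset, uint64_t remaining,
    const struct io_hints *h)
{
	uint64_t len, pos;

	assert(remaining > 0);
	len = remaining;
	if (h->max_size != 0 && len > h->max_size)
		len = h->max_size;
	if (h->stripe_size == 0)
		return (len);

	/* Position of offset within its stripe. */
	pos = (offset + h->stripe_size -
	    h->stripe_offset % h->stripe_size) % h->stripe_size;
	if (pos != 0) {
		if (len > h->stripe_size - pos)
			len = h->stripe_size - pos;
	} else if (len > h->stripe_size) {
		len -= len % h->stripe_size;
	}
	return (len);
}
```

With a 64 kB stripe and a 128 kB size hint, a transfer starting at
byte 32256 (sector 63) would first issue a short 33280-byte request
to reach the stripe boundary, after which every request is aligned.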

The outline of this was committed to src/sys/geom/notes around
2½ years ago and the only thing that has changed is that after
some consideration I have concluded that the hints should be
non-binding for performance reasons.


Third issue: The problem extends all the way up to sysinstall.

Currently we systematically shoot RAID-5 performance down by our
strict adherence to MBR formatting rules.  We reserve the first
track, typically 63 sectors, for the MBR.

The first slice therefore starts at sector number 63.  All partitions
in that slice inherit that alignment, and therefore, unless the RAID-5
implementation has a stripe size of 63 sectors, a (too) large
fraction of the requests will have one sector in one RAID stripe
and the rest in another, which they often fail to fill by exactly
one sector.
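
The arithmetic can be checked with a small sketch.  The numbers used
with it (a 64-sector stripe and 64-sector requests laid out back to
back from the start of the slice) are assumptions picked for the
example, and the function names are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Does a request of len sectors at start cross a stripe boundary? */
static int
straddles(uint64_t start, uint64_t len, uint64_t stripe)
{

	assert(len > 0 && stripe > 0);
	return (start / stripe != (start + len - 1) / stripe);
}

/* Sectors of the request that land in its first stripe. */
static uint64_t
head_sectors(uint64_t start, uint64_t len, uint64_t stripe)
{
	uint64_t room;

	assert(len > 0 && stripe > 0);
	room = stripe - start % stripe;
	return (len < room ? len : room);
}

/*
 * Count how many of nreqs back-to-back requests of req sectors each,
 * laid out from slice_start, cross a stripe boundary.
 */
static uint64_t
count_straddling(uint64_t slice_start, uint64_t req, uint64_t stripe,
    uint64_t nreqs)
{
	uint64_t i, n;

	n = 0;
	for (i = 0; i < nreqs; i++)
		if (straddles(slice_start + i * req, req, stripe))
			n++;
	return (n);
}
```

With the slice at sector 63, every one of those stripe-sized requests
straddles: one sector goes into one stripe and the other 63 into the
next.  Moving the slice start to sector 64 makes the count drop to
zero.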


-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.
