From owner-freebsd-hackers  Wed May  1 12:32:38 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id MAA11833
          for hackers-outgoing; Wed, 1 May 1996 12:32:38 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id MAA11796
          Wed, 1 May 1996 12:32:32 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id MAA10011; Wed, 1 May 1996 12:21:44 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199605011921.MAA10011@phaeton.artisoft.com>
Subject: Re: lmbench IDE anomaly
To: luigi@labinfo.iet.unipi.it (Luigi Rizzo)
Date: Wed, 1 May 1996 12:21:44 -0700 (MST)
Cc: terry@lambert.org, msmith@atrad.adelaide.edu.au, koshy@india.hp.com,
        hackers@FreeBSD.org, current@FreeBSD.org
In-Reply-To: <199605010947.LAA08814@labinfo.iet.unipi.it> from "Luigi Rizzo" at May 1, 96 11:47:43 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

> > > That's about right.  The SCSI disk gets the chance to sort the I/O to suit
> > > itself, optimising its performance.  The IDE disk only gets to look at one
> > > transaction at a time, so it's at the mercy of the disksorting code in
> > > the operating system.  I don't know that FreeBSD's disksort stuff is 
> > > terribly wonderful, but I'd happily stand corrected.
> > 
> > The disksort stuff is pessimal.  Contact mday@elbereth.org for details.
> 
> The original poster was doing a couple of lmdd from /dev/rwd0a, the raw
> device. Does disksort get in the way in this case ?

No.  But the original poster wasn't doing much concurrency testing
either (1 and 2 concurrent readers, vs. 1/2/4/8/16/32...).

> BTW, running a number of dd on /dev/wd0a works *much* better, you get
> almost n times the bandwidth at least with up to 5 instances, what I
> tried.

This has to do with interleaved I/O.

You will get the same effect with "team" or "ddd", or with a single
process if we support async I/O or the more general async system
call trap vector.

You *won't* get the same effect with pthreads, because it is a
threading *environment*, not a threading *system*.  The pthreads
I/O's are still consecutive.
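
As a rough sketch of the multi-process case (several dd's, "team",
"ddd"), the following forks N readers that each take every Nth chunk
of the same file or device, so several requests can be outstanding at
once.  The names read_stripes, interleaved_read and CHUNK are made up
for illustration, and the 64K chunk size is an arbitrary assumption;
this is not how team or ddd are actually implemented.

```python
import os
import struct

CHUNK = 64 * 1024  # bytes per read; arbitrary illustrative size


def read_stripes(path, worker, nworkers, chunk=CHUNK):
    """Read every nworkers-th chunk of path, starting at chunk `worker`."""
    fd = os.open(path, os.O_RDONLY)
    total = 0
    offset = worker * chunk
    try:
        while True:
            data = os.pread(fd, chunk, offset)  # positioned read, no shared seek
            if not data:                        # empty result means past EOF
                break
            total += len(data)
            offset += nworkers * chunk          # skip the other workers' chunks
    finally:
        os.close(fd)
    return total


def interleaved_read(path, nworkers):
    """Fork nworkers readers; return the total bytes read by all of them."""
    children = []
    for worker in range(nworkers):
        rfd, wfd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: read our stripes, report the byte count, then exit
            # without running any of the parent's cleanup code.
            try:
                os.close(rfd)
                count = read_stripes(path, worker, nworkers)
                os.write(wfd, struct.pack("!Q", count))
            finally:
                os._exit(0)
        os.close(wfd)
        children.append((pid, rfd))

    total = 0
    for pid, rfd in children:
        total += struct.unpack("!Q", os.read(rfd, 8))[0]
        os.close(rfd)
        os.waitpid(pid, 0)
    return total
```

Timing interleaved_read(path, n) for n = 1, 2, 4, 8, ... is the kind
of concurrency sweep mentioned above.  Separate processes are used
deliberately, so each read is an independent system call that can
block on its own.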


Probably a more interesting bogosity to eliminate, if the goal is higher
throughput, would be the soft clustering code (again, mday@elbereth.org).

Even then, the disksort code is hard to stomach with ZBR'ed drives,
no matter what optimizations you make elsewhere.  Avoiding seeks
only works if you know what you are doing.
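
For reference, here is a sketch of the generic one-way elevator idea
behind disksort-style ordering.  It is a textbook model, not the
FreeBSD code, and elevator_order and seek_distance are names invented
for the example.  Note that it measures "distance" in logical block
numbers, which is exactly the assumption that stops matching real head
movement on a ZBR'ed drive, where sectors per track vary by zone.

```python
def elevator_order(pending, head):
    """Order block numbers for a one-way sweep starting at block `head`.

    Requests at or beyond the head are served in ascending order; the
    rest wait for the next sweep, which restarts from the lowest block.
    """
    ahead = sorted(b for b in pending if b >= head)
    behind = sorted(b for b in pending if b < head)
    return ahead + behind


def seek_distance(order, head):
    """Total head travel, in block numbers, to serve `order` from `head`."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]
    print(seek_distance(queue, 53))                      # arrival order
    print(seek_distance(elevator_order(queue, 53), 53))  # elevator order
```

The sort only pays off to the extent that block-number distance is a
good stand-in for seek time.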

Most writes want to be in reverse sector order anyway, unless the
drive controller is smart as well (in which case reversing them would
cause you to shoot yourself in the foot).

Anyway, the results showing SCSI being better than IDE are certainly
valid.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.