Date: Sat, 22 Apr 1995 22:04:43 -0700 (PDT)
From: "Rodney W. Grimes" <rgrimes@gndrsh.aac.dev.com>
To: terry@cs.weber.edu (Terry Lambert)
Cc: jgreco@brasil.moneng.mei.com, freebsd-hackers@FreeBSD.org
Subject: Re: large filesystems/multiple disks [RAID]
Message-ID: <199504230504.WAA03102@gndrsh.aac.dev.com>
In-Reply-To: <9504230355.AA10300@cs.weber.edu> from "Terry Lambert" at Apr 22, 95 09:55:59 pm
[Good feedback indicating my numbers are telling me what I was pretty
sure they were telling me deleted]

> > > Having recently seen Solaris' Online: DiskSuite, which suffers from fairly
> > > significant performance degradations, I'm curious to see what a real
> > > operating system can do.  ;-)
> >
> > It will be at least another week, but you'll know I have made serious
> > progress when you see a cvs commit message for the import of
> > sys/dev/concat.
>
> For truly random stripe placement, there will be a potential for
> performance degradation based on the file system mechanism used to
> address the blocks themselves, and whether it is a high percentage
> of the overhead on the attempted I/O.

Yes, and since I am working with the raw device and dd here, my initial
``will it work'' kind of testing does not take any of this into
consideration :-(.

> The other consideration is that you are not typically going to see
> the performance increase unless you either split the drives between
> SCSI controllers or actually get command queueing working.

Agreed, and since controllers are cheap for me, I'll create a hardware
queue by going to one drive per controller for now, then go hammer on
the SCSI drivers when I need more :-) :-)

We do have folks working on tagged queueing out to the drives (aic7xxx)
that will help, but I have not had reliable enough file systems when
using this controller yet to convert my stripe development machine to
it, and they are high $$$$ items :-(

> Typically, I would expect that spindle sync would do nothing for
> you unless your stripe lengths are on the order of single cluster
> size and you divide the actual rotational latency of the drive by
> the number of synced spindles before using it, and then scale it
> by the relative sync notification time added to the rotational
> period.  Adjusting all this for possible ZBR variations in the
> effective rotational period based on distance from the spindle.
> Then use round-robin allocation of sequential blocks from disk
> to disk to ensure linear ordering of the distribution.

The interleave modulus is tunable (right now it is a crock compiled
into ioconf.c; in ilv.c you can set it with an ioctl at LV creation
time; see the sketches in the P.S. below).  If you keep this modulus
smaller than the size of a cylinder for the smallest ZBR cylinder, you
should not see the above problems.  Also remember, I always use drives
that have write-behind caches in them, so as long as I can deliver
data to the drive at or near the drive data rate this issue also dies.

With sync'ed spindles I suspect that setting the interleave modulus to
DEV_BSIZE will end up optimal, so that even file system frags have a
chance of hitting 2 drives, and an 8K block has a chance to hit 16 :-)
Though the overhead of splitting the I/O and taking all those
interrupts may nullify any advantage :-(.

> Or, you could get complicated.  8^).

Nope, don't want to do that, I'll use complicated hardware instead :-)

-- 
Rod Grimes                                   rgrimes@gndrsh.aac.dev.com
Accurate Automation Company                Custom computers for FreeBSD
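P.S.  For anyone who wants to picture the interleave mapping I keep
talking about, here is a rough sketch in C.  This is *not* the code
from ilv.c; the names (ilv_softc, ilv_map) are made up for
illustration, and the real driver also has to cope with buf splitting,
partition offsets, and the like:

	/*
	 * Map a logical block number on the logical volume to a
	 * (drive, physical block) pair.  "interleave" is the modulus
	 * in DEV_BSIZE blocks; "ndrives" is the number of component
	 * drives.  Blocks rotate round-robin across the drives in
	 * runs of "interleave" blocks each.
	 */
	struct ilv_softc {
		int	ndrives;	/* component drives */
		long	interleave;	/* stripe unit, in blocks */
	};

	static void
	ilv_map(struct ilv_softc *sc, long lblkno, int *drive, long *pblkno)
	{
		long stripe = lblkno / sc->interleave;	/* which stripe unit */
		long off = lblkno % sc->interleave;	/* offset within it */

		*drive = (int)(stripe % sc->ndrives);
		*pblkno = (stripe / sc->ndrives) * sc->interleave + off;
	}

With interleave = 1 (one DEV_BSIZE block) and 16 drives, an 8K
transfer is 16 blocks, so logical blocks 0..15 land one per drive;
that is where the "an 8K block has a chance to hit 16" figure above
comes from, and also where all those extra per-drive interrupts come
from.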
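The ioctl for setting the modulus at LV creation time might look
something like this (again just a sketch; ILVIOCSET and struct
ilv_args are invented names, not what is actually in ilv.c):

	#include <sys/ioccom.h>

	struct ilv_args {
		int	ndrives;	/* number of component drives */
		long	interleave;	/* stripe unit, DEV_BSIZE blocks */
	};

	#define	ILVIOCSET	_IOW('I', 1, struct ilv_args)

	/*
	 * Userland side, at LV creation time:
	 *
	 *	struct ilv_args ia = { 16, 1 };
	 *
	 *	if (ioctl(fd, ILVIOCSET, &ia) < 0)
	 *		err(1, "ILVIOCSET");
	 */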