Date:      Sat, 8 Apr 95 21:47 WET DST
From:      pete@pelican.com (Pete Carah)
To:        current@FreeBSD.org
Subject:   Re: Disk performance
Message-ID:  <m0rxotz-000K0iC@pelican.com>
In-Reply-To: <Pine.BSI.3.91.950408171607.787F-100000@aries.ibms.sinica.edu.tw>

In article <Pine.BSI.3.91.950408171607.787F-100000@aries.ibms.sinica.edu.tw> you write:
>
>On Sat, 8 Apr 1995, Rodney W. Grimes wrote:
>> 
....

RG Especially without detailed information about how the kernel implements
RG the buffer copies to and from user land.  I'm pretty sure the SGI boxes
RG use page flipping,

T    Hmmm, the only page flipping I know of is a double-buffered
T animation technique used on the Apple II hi-res graphics screen.  :)

SGI does indeed do page flipping for the disk buffers *IF* the read is
a multiple of the page size and is page-aligned (and they arrange for
stdio to do so).  I've been told by several SGI folk that, at least in
4.0.x, if either of these is not true, then a bcopy is done instead.  This
implies that kernel buffers are always page-aligned ...  They also do this
for network buffers where it applies (the MTU for FDDI is around 5k,
so at 4k they can do it).  They do this for raw disk I/O too.  They went
to a lot of trouble to do this so that video could be transferred to/from
disk in real time (remember the "graphics" in the name)...
(Supposedly read performance is maxed out when the read request is
16k; they have a DMA limit that isn't much bigger than ours, so they can't
usually go over 64k in a single (raw) read.  Longer filesystem reads
do work.)  Note that their filesystem looks like a UFS to the user,
but it is really extent-based.
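
The eligibility rule described above can be sketched as a small predicate.
This only illustrates the two conditions as stated here; it is not SGI's
kernel code. The name `can_page_flip` and the 4096-byte default page size
are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size; the real value is machine-dependent


def can_page_flip(buf_addr: int, nbytes: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True if a read could be satisfied by remapping pages.

    Both conditions from the text must hold: the user buffer starts on a
    page boundary and the length is a whole number of pages.  If either
    fails, the kernel would fall back to copying the data (bcopy).
    """
    if nbytes <= 0:
        return False
    aligned = buf_addr % page_size == 0
    whole_pages = nbytes % page_size == 0
    return aligned and whole_pages
```

A 16k read into a page-aligned buffer passes both checks; the same read
into a buffer offset by 16 bytes does not.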

I had made a comment to JD about that but at the time they were having
a big problem just getting the unified page/disk cache to work at all...
Also you have to bite the bullet and make the stdio buffers 4k or more
and waste up to 4k more to get malloc page-aligned, before you get much
(any visible) advantage...  Page-flip buffering would make iozone
VERY fast since long streams are almost the best kind of I/O for that
buffer scheme.  Don't know how much help it would be for "normal"
apps.
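
The "waste up to 4k more" cost comes from over-allocating and rounding the
start address up to a page boundary. A minimal sketch of that arithmetic,
assuming a 4096-byte page and an allocator that returns arbitrarily aligned
addresses (both function names are hypothetical):

```python
def page_aligned_start(raw_addr: int, page_size: int = 4096) -> int:
    """Round raw_addr up to the next page boundary.

    Assumes page_size is a power of two, so masking works.
    """
    return (raw_addr + page_size - 1) & ~(page_size - 1)


def alignment_waste(raw_addr: int, page_size: int = 4096) -> int:
    """Bytes skipped at the front of an over-sized allocation to reach alignment."""
    return page_aligned_start(raw_addr, page_size) - raw_addr
```

In the worst case the allocator returns an address one byte past a page
boundary, and page_size - 1 bytes are skipped; an allocation that is already
aligned skips none.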

Also the Onyx and Challenge use a 256-bit bus with a 21ns (bus) cycle
and a cache line of 4x that...  Handles 200-250MHz processors just fine.
*That* is a memory bus.  Do remember that D1 video runs at 280Mbits/sec...
(and the memory board is 2- or 4-way interleaved depending on SIMM mix)
Their trick for cache consistency is cute, too.  Then again, you can't
buy one for $1k, either.
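
For a sense of scale, the bus figures above can be turned into a rough peak
bandwidth. This is back-of-the-envelope arithmetic only: it assumes one
full-width transfer per bus cycle and ignores arbitration and refresh, and
the D1 rate is simply the figure quoted in the text.

```python
def peak_bus_bandwidth(bus_bits: int, cycle_ns: float) -> float:
    """Peak bytes/sec, assuming one full-width transfer per bus cycle."""
    return (bus_bits / 8) / (cycle_ns * 1e-9)


bus_bytes_per_sec = peak_bus_bandwidth(256, 21)  # 256-bit bus, 21ns cycle
d1_bytes_per_sec = 280e6 / 8                     # D1 rate as quoted above

# How many D1 streams' worth of bandwidth the bus could carry at peak.
headroom = bus_bytes_per_sec / d1_bytes_per_sec
```

Under those assumptions the bus peaks at roughly 1.5 Gbytes/sec, against
35 Mbytes/sec for one D1 stream.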

T    About 5MB/sec slower than the 486's.  A damn waste of CPU too... all
T I see people do on that machine is read news and mail.  :(

Well, if it runs both sendmail and inn and does both in high volumes,
you want lots of disk bandwidth, and lots of ram and that bandwidth too...
I'd rather see an SGI or HP there (though HP messed up the OS enough
that a lot of the software is hard to port).  486s and Pentiums aren't
all that bad for networking.  Our 386-25 gets half the ftp rate that the
mid-sized SGIs do (about 450kbytes/sec, to/from IDE drives, on 1.1.5, to
or from an SGI).  Not bad for a hand-me-down slow motherboard and $100
ethernet boards.  This is over twice as fast as SVR4 with Lachman TCP
on the same machine, with UFS filesystems, too.  The Indigos and 4D35s
do about 850-900kbytes/sec; the 380, Crimson, and Onyx will ftp at
1.1-1.2Mbytes/sec, just below the ethernet maximum.  They didn't want to
spring for a 486 for the firewall; with a 56k net link you don't really
need it either...
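
The ftp rates above can be compared against the wire with a little
arithmetic. This sketch assumes 10 Mbit/sec Ethernet (the text only says
"the ethernet maximum") and uses the raw signalling rate, ignoring framing
and protocol overhead; `wire_fraction` is a hypothetical helper.

```python
ETHERNET_BITS_PER_SEC = 10e6  # assumed: 10 Mbit/sec Ethernet
LINK_BITS_PER_SEC = 56e3      # the 56k net link mentioned in the text


def wire_fraction(bytes_per_sec: float, link_bits_per_sec: float = ETHERNET_BITS_PER_SEC) -> float:
    """Fraction of a link's raw capacity used by a given transfer rate."""
    return bytes_per_sec / (link_bits_per_sec / 8)


big_sgi_low = wire_fraction(1.1e6)    # 380/Crimson/Onyx, low end
big_sgi_high = wire_fraction(1.2e6)   # 380/Crimson/Onyx, high end
i386_share = wire_fraction(450e3)     # the 386-25

# How many times over the 386-25's ftp rate would fill the 56k link.
firewall_margin = wire_fraction(450e3, LINK_BITS_PER_SEC)
```

On those assumptions the larger SGIs use about 88-96% of the raw Ethernet
rate and the 386-25 about 36%, which is still dozens of times more than a
56k link can carry.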

RG I expect to be able, towards the end of next week, to show some really
RG amazing numbers on a Pentium system with respect to memory speeds.
RG Let's all hope that the new EDO and Pipelined Burst SRAM stand up
RG to the theories and we start to see 100++MB/sec main memory speeds on
RG a Pentium like we should.

Agreed...  Looking forward to it.  PCI specs are pretty good but PC
motherboards have never been very good in the memory interface dept.

Is the idea of EDO that you hit RAS for the next cycle while the output
data is still current?  Bank interleave probably helps more, and for less
money, if you have/need enough memory.  Eeek.  Here I go back to my old
mainframe days :-)

-- Pete


