Date:      Sat, 28 Feb 2004 13:47:56 -0500
From:      Scott W <wegster@mindcore.net>
To:        Charles Swiger <cswiger@mac.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: NFS server usage
Message-ID:  <4040E25C.60608@mindcore.net>
In-Reply-To: <9F44162A-68AD-11D8-870A-003065ABFD92@mac.com>
References:  <478667A6-6892-11D8-A5DD-00039367611E@obfuscated.net> <5FCEDFA8-68A3-11D8-870A-003065ABFD92@mac.com> <CE6F38ED-68A6-11D8-A5DD-00039367611E@obfuscated.net> <9F44162A-68AD-11D8-870A-003065ABFD92@mac.com>

Charles Swiger wrote:

> On Feb 26, 2004, at 4:57 PM, Michael Conlen wrote:
>
>> [ ... ]
>> The production system will use dual channel U320 RAID controllers 
>> with 12 disks per channel, so disk shouldn't be an issue, and it will 
>> connect with GigE, so network is plenty fine, now I'm on to CPU.
>
>
> Sounds like you've gotten nice hardware.  Four or so years ago, I 
> built out a roughly comparable fileserver [modulo the progress in 
> technology since then] on a Sun E450, which housed 10 SCA-form-factor 
> disks over 5 UW SCSI channels (using 64-bit PCI and backplane, 
> though), and could have held a total of 20 disks if I'd filled it.  I 
> mention this because...
>
>> Low volume tests with live data indicate low CPU usage, however when 
>> I best fit the graph it's difficult to tell how linear (or non-linear) 
>> the data is. [ ... ] Does that kind of curve look accurate to you 
>> (anyone)?
>
>
> ...even under stress testing on the faster four-disk RAID-10 volume 
> using SEAGATE-ST336752LC drives (15K RPM, 8MB cache), each on a 
> separate channel, with ~35 client machines bashing away, the 
> fileserver would bottleneck on disk I/O without more than maybe 10% or 
> 15% CPU load, and that was using a 400MHz CPU.
>
> The notion that an NFS fileserver is going to end up CPU-bound simply 
> doesn't match my experience or my expectations.  If you have 
> single-threaded sequential I/O patterns (like running dd, or maybe a 
> database), you'll bottleneck on the interface or maximum disk 
> throughput, otherwise even with ~3.5 ms seek times, multi-threaded I/O 
> from a buncha clients will require the disk heads to move around so 
> much that you bottleneck at a certain number of I/O operations per 
> second per disk, rather than a given bandwidth per disk.
>
Just to add my $.02: experience has shown me pretty much the same as 
mentioned above.  A while back I did more fileserving performance 
benchmarks than I care to count for a company that was building a new 
fileserver 'appliance', something like a low-end to midrange NetApp.  
Once network bandwidth was taken care of (meaning enough bandwidth to 
handle the incoming requests), the bottleneck was inevitably disk I/O.  
Note that the fix was not always simply adding more disks: if you have 
a few dozen disks hanging off a dual-channel SCSI or RAID card, the 
real bottleneck can be the bus the card is plugged into, or that bus's 
speed/bandwidth, so splitting the load across multiple cards (and 
multiple buses if possible) can be the answer rather than more disk.
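 
A quick back-of-envelope check (rough theoretical numbers, your exact 
hardware will differ): plain 32-bit/33MHz PCI tops out around 
133MB/sec, 64-bit/66MHz PCI around 533MB/sec, while a single U320 
channel is 320MB/sec on paper.  So a dual-channel U320 card (640MB/sec 
combined, in theory) can already outrun the slot it sits in unless 
it's on a fast PCI-X bus, which is why spreading channels across 
cards and buses can pay off.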

Other things worth looking at are buffer sizes, both system and 
TCP/IP, as well as the mount options on the NFS shares.  If your NFS 
server has battery-backed cache and is also on a UPS, you definitely 
want async in your mount options to speed things up significantly.  
As for read and write buffer sizes, somewhere in the 32k-64k range 
(the rsize/wsize client options) seems to do best these days (a huge 
generalization, but it holds for the different systems and *NIX OSes 
I currently have on hand).
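 
To make that concrete, on a FreeBSD client the knobs look something 
like this (server name and paths made up purely for illustration; 
-r/-w set the read/write sizes, -T uses TCP, -3 forces NFSv3):

    mount_nfs -3 -T -r 32768 -w 32768 bigserver:/export/home /mnt/home

On most other *NIXes the equivalent is -o rsize=32768,wsize=32768 to 
mount.  Whether async belongs in the client mount options or in the 
server's exports varies by OS, so check the local man pages.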

One other thing that may be worth checking is the disk throughput 
itself: on a U320 interface, if you're loaded up with 15 disks per 
channel, you _may_ be bottlenecking the U320 bus at that point.  I 
don't have current numbers for realistic sustained U320 throughput 
handy (it can be googled easily enough), but I'd expect sustained 
transfer on the order of ~160MB/sec, which is fairly likely to be 
saturated by 10 or fewer disks.
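 
As a very rough sanity check (my estimates, check your drives' data 
sheets): 15K SCSI drives of this vintage manage maybe 40-60MB/sec 
sustained on sequential reads, so at ~160MB/sec usable you only need 
three or four drives streaming flat out to fill the channel, and even 
against the 320MB/sec theoretical ceiling it's only six to eight.  
With a dozen-plus disks per channel, the channel rather than the 
spindles is the ceiling for sequential work; as the previous poster 
noted, random I/O is a different story.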

Lastly, if you can afford the hardware, you're almost always better 
off handling different types of access via different controllers.  In 
other words, if you are going to be serving mail, web, user home 
directories, and a database over NFS or SMB, break them up into 
individual filesystems, preferably each on its own channel and disks, 
as opposed to combining them.  (This ignores the fact that mail, 
apache, and databases should really be served from local disk, but it 
makes the point.)  It's really just a restatement of the previous 
poster's comment about disk I/O from many clients moving the heads 
around, but it's certainly true.
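 
To put a rough number on that last point (back-of-envelope, assuming 
~3.5-4ms seeks and 15K RPM, so about 2ms average rotational latency): 
a single drive gets very roughly 150 or so random ops per second, and 
at 8K per NFS op that's only on the order of 1MB/sec per disk, a tiny 
fraction of what the same disk does sequentially.  That's why lots of 
clients pushing random I/O hit an ops/sec limit long before a 
bandwidth limit.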

Scott



