Date:      Tue, 6 Aug 2013 22:48:18 -0400
From:      J David <j.david.lists@gmail.com>
To:        freebsd-questions@freebsd.org
Subject:   Terrible disk performance with LSI / FreeBSD 9.2-RC1
Message-ID:  <CABXB=RSRnB41yjq5Qcbiz-JCRssNwx2AatJ2Dn%2BHhuD9GaBh%2Bw@mail.gmail.com>

We have a machine running 9.2-RC1 that's getting terrible disk I/O
performance.  Its performance has always been pretty bad, but it
didn't really become clear how bad until we did a zpool replace on one
of the drives and realized it was going to take 3 weeks to rebuild a
<1TB drive.

The hardware specs are:
- 2 x Xeon L5420
- 32 GiB RAM
- LSI Logic SAS 1068E
- 2 x 32GB SSDs
- 6 x 1TB Western Digital RE3 7200RPM SATA

The LSI controller has the most recent firmware I'm aware of
(6.36.00.00 / 1.33.00.00 dated 2011.08.24), is in IT mode, and appears
to be working fine:

mpt0 Adapter:
       Board Name: USASLP-L8i
   Board Assembly: USASLP-L8i
        Chip Name: C1068E
    Chip Revision: B3
      RAID Levels: none

mpt0 Configuration: 0 volumes, 8 drives
    drive da0 (30G) ONLINE <FTM32GL25H 10> SATA
    drive da1 (29G) ONLINE <SSDSA2SH032G1GN 8860> SATA
    drive da2 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da3 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da4 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da5 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da6 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA
    drive da7 (931G) ONLINE <WDC WD1002FBYS-0 0C05> SATA

The eight drives are configured as a ZIL and an L2ARC on the two SSDs,
plus a six-drive raidz2 on the spinning disks.

We did a zpool replace on the last drive in the line, and the resilver
is proceeding at roughly 800 KB/s.

                        extended device statistics
device     r/s   w/s    kr/s    kw/s qlen svc_t  %b
da0        0.0   0.0     0.0     0.1    0   0.9   0
da1        0.0   8.2     0.0    19.9    0   0.1   0
da2      125.6  23.0   768.2    40.5    4  33.0  88
da3      126.6  23.1   769.0    41.3    4  32.3  89
da4      126.0  24.0   768.5    42.7    4  32.1  88
da5      125.9  22.0   768.2    40.1    4  31.6  87
da6      124.0  22.0   766.6    39.9    5  31.4  84
da7        0.0 136.9     0.0   801.3    0   0.6   4
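
A rough sketch of the arithmetic behind these figures, in Python.  It
is an illustration only, not something run on the machine: the helper
names are made up for this example, it assumes kr/s and kw/s are
kilobytes per second, and it assumes the worst case where the whole
931G drive (treated as GiB) has to be rewritten at the current rate.

```python
# Rows copied from the iostat output above: (device, r/s, w/s, kr/s, kw/s).
ROWS = [
    ("da2", 125.6, 23.0, 768.2, 40.5),
    ("da3", 126.6, 23.1, 769.0, 41.3),
    ("da4", 126.0, 24.0, 768.5, 42.7),
    ("da5", 125.9, 22.0, 768.2, 40.1),
    ("da6", 124.0, 22.0, 766.6, 39.9),
    ("da7", 0.0, 136.9, 0.0, 801.3),
]

SECONDS_PER_DAY = 86400


def avg_kb_per_op(ops_per_sec, kb_per_sec):
    """Average transfer size per request, or None if there were no requests."""
    if ops_per_sec == 0:
        return None
    return kb_per_sec / ops_per_sec


def days_to_copy(size_kb, kb_per_sec):
    """Days needed to move size_kb at a steady rate of kb_per_sec."""
    return size_kb / kb_per_sec / SECONDS_PER_DAY


for dev, r, w, kr, kw in ROWS:
    read_size = avg_kb_per_op(r, kr)
    write_size = avg_kb_per_op(w, kw)
    read_txt = "n/a" if read_size is None else f"{read_size:.1f} KB"
    write_txt = "n/a" if write_size is None else f"{write_size:.1f} KB"
    print(f"{dev}: avg read {read_txt}, avg write {write_txt}")

# Assumption: a completely full 931 GiB drive, written at da7's current kw/s.
full_drive_kb = 931 * 1024 * 1024
resilver_days = days_to_copy(full_drive_kb, 801.3)
print(f"worst-case resilver time: {resilver_days:.1f} days")
```

On these numbers the source drives average about 6 KB per read, and
rewriting a full drive at da7's write rate would take about two weeks.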

The system has plenty of free RAM, is 99.7% idle, has nothing else
going on, and runs like a one-legged dog.

There are no error messages or any sign of a problem anywhere, other
than the really terrible performance.  (When not rebuilding, it does
light NFS duty.  That performance is similarly bad, but has never
really mattered.)

Similar systems running Solaris deliver 10x these numbers while
reporting 30% busy instead of 90% busy.

Does anyone have any suggestions for how I could troubleshoot this
further?  At this point, I'm kind of at a loss as to where to go from
here.  My goal is to try to phase out the Solaris machines, but this
is kind of a roadblock.

Thanks for any advice!


