Date:      Tue, 29 May 2012 16:39:13 -0400
From:      Gary Palmer <gpalmer@freebsd.org>
To:        Kees Jan Koster <kjkoster@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: FreeBSD 9.0 hangs on heavy I/O
Message-ID:  <20120529203913.GB92444@in-addr.com>
In-Reply-To: <BD5D6BB6-8CFF-456A-B03E-05454EB03AB6@gmail.com>
References:  <BD5D6BB6-8CFF-456A-B03E-05454EB03AB6@gmail.com>

On Tue, May 29, 2012 at 09:26:32PM +0200, Kees Jan Koster wrote:
> Dear All,
> 
> I seem to have a problem where really heavy disk I/O is drowning my machine. I see hangs in the shell where I am logged on using ssh. Network connections get dropped for no apparent reason and some HTTP requests are served really slowly. Profiling the app code shows that the hangs are in completely random places. Operations that are no more than a few lines of code apart suddenly take seconds to complete.
> 
> In my search I seem to find that my machine is quite slow on the disk. I find that rather odd, given that the device in question is an SSD drive and it is a good bit faster than the WD drive that used to carry the data set that is accessed heavily. This drive is doing 1.5 times the throughput, but the hangs have not gone away.
> 
> To clarify, the data set used to live on ada2 (see the devlist below), which is a spinning disk. When I experienced intermittent hangs I plugged in an SSD (ada3 on the devlist) and moved the data there. This improved the MB per second being written (it is mostly-write data) but has not changed the hangs. If anything, they have gotten worse since.
> 
> Using gstat I notice that I/O service time is quite high. From the gstat below you can see that it takes just over 2s to serve the requests. The L(q) seems to never drop far below 100 and %busy hovers around 100% all day long. Can someone please help me troubleshoot that further? What can I do to make the underlying problem visible?
> 
> I should mention all data is referenced through cross-mountpoint symlinks, would that make a difference? Should I use canonical paths in the code instead?
> 
> All file systems are mounted "noatime, soft-updates".
> 
> Details:
> 
> # uname -a 
> FreeBSD cumin.java-monitor.com 9.0-STABLE FreeBSD 9.0-STABLE #0: Mon Mar 26 14:30:19 UTC 2012     kjkoster@cumin.java-monitor.com:/usr/obj/usr/src/sys/CUMIN  amd64
> # gstat -f 'ada[0-3]$' -b
> dT: 1.001s  w: 1.000s  filter: ada[0-3]$
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>     0      0      0      0    0.0      0      0    0.0    0.0  ada0
>     0      0      0      0    0.0      0      0    0.0    0.0  ada1
>     0      0      0      0    0.0      0      0    0.0    0.0  ada2
>   103    273      0      0    0.0    273  34630   2062  121.9  ada3
> # camcontrol devlist
> <WDC WD740ADFD-00NLR1 20.07P20>    at scbus1 target 0 lun 0 (pass0,ada0)
> <WDC WD740GD-00FLC0 33.08F33>      at scbus2 target 0 lun 0 (pass1,ada1)
> <WDC WD740GD-00FLC0 33.08F33>      at scbus3 target 0 lun 0 (pass2,ada2)
> <OCZ SUMMIT VBM1801Q>              at scbus4 target 0 lun 0 (pass3,ada3)
> <PepperC Virtual Disc 1 0.01>      at scbus7 target 0 lun 0 (pass4,cd0)
> <PepperC Virtual Disc 2 0.01>      at scbus8 target 0 lun 0 (pass5,cd1)


Check the SSD for its internal block size and make sure your filesystem
and partitions are aligned with the disk block size.  Unless there
is something wrong with your SATA controller I'd expect a lot more than
273 IOPS and ~34 MByte/sec of writes from an SSD.
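
As a rough illustration of what "aligned" means here: a partition is
aligned when its starting byte offset is an exact multiple of the
device's internal block size.  The sketch below is only that, a sketch.
The function name and all of the numbers in it are assumptions chosen
for illustration, not values read from this machine or from the OCZ
drive.  The real starting sector should come from something like
`gpart show ada3`, and the internal block size from the vendor's
documentation.

```python
def is_aligned(offset_bytes: int, block_bytes: int) -> bool:
    """Return True if a partition start offset is a multiple of the block size."""
    if block_bytes <= 0:
        raise ValueError("block_bytes must be positive")
    return offset_bytes % block_bytes == 0


# All values below are hypothetical examples, not measurements.
SECTOR_BYTES = 512               # assumed logical sector size
PAGE_BYTES = 4096                # assumed SSD internal page size
LEGACY_START_SECTOR = 63         # classic MBR-style first partition start
ONE_MIB_START_SECTOR = 2048      # partition starting on a 1 MiB boundary

for name, start_sector in (("legacy", LEGACY_START_SECTOR),
                           ("1MiB", ONE_MIB_START_SECTOR)):
    offset = start_sector * SECTOR_BYTES
    print(name, offset, is_aligned(offset, PAGE_BYTES))
```

A partition starting at sector 63 fails the check against a 4096-byte
block, while one starting on a 1 MiB boundary passes for any
power-of-two block size up to 1 MiB.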

Regards,

Gary


