Date: Fri, 22 Mar 2013 14:24:46 -0600
From: Josh Beard <josh@signalboxes.net>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS + NFS poor performance after restarting from 100 day uptime
Message-ID: <CAHDrHSvpcnnEf5_ys67rF4md7fDdKZ4f+3bDsndy7_hnmofrWg@mail.gmail.com>
In-Reply-To: <12CCA57CCC7E4F16A1147F8422F5F151@multiplay.co.uk>
References: <CAHDrHSsCunt9eQKjMy9epPBYTmaGs5HNgKV2+UKuW0RQZPpw+A@mail.gmail.com>
            <D763F64A24B54755BBF716E91D646F6A@multiplay.co.uk>
            <CAHDrHSvXCu+v+ps3ctg=T0qtHjKGkXxvnn_EaNrt_eenkJ9dbQ@mail.gmail.com>
            <12CCA57CCC7E4F16A1147F8422F5F151@multiplay.co.uk>
On Fri, Mar 22, 2013 at 1:07 PM, Steven Hartland <killing@multiplay.co.uk> wrote:
>
> ----- Original Message ----- From: Josh Beard
>>
>>>> A snip of gstat:
>>>>
>>>> dT: 1.002s  w: 1.000s
>>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>>> ...
>>>>     4    160    126   1319   31.3     34    100    0.1  100.3| da1
>>>>     4    146    110   1289   33.6     36     98    0.1   97.8| da2
>>>>     4    142    107   1370   36.1     35    101    0.2  101.9| da3
>>>>     4    121     95   1360   35.6     26     19    0.1   95.9| da4
>>>>     4    151    117   1409   34.0     34    102    0.1  100.1| da5
>>>>     4    141    109   1366   35.9     32    101    0.1   97.9| da6
>>>>     4    136    118   1207   24.6     18     13    0.1   87.0| da7
>>>>     4    118    102   1278   32.2     16     12    0.1   89.8| da8
>>>>     4    138    116   1240   33.4     22     55    0.1  100.0| da9
>>>>     4    133    117   1269   27.8     16     13    0.1   86.5| da10
>>>>     4    121    102   1302   53.1     19     51    0.1  100.0| da11
>>>>     4    120     99   1242   40.7     21     51    0.1   99.7| da12
>>>
>>> Your ops/s are maxing your disks. You say "only", but ~190 ops/s is
>>> what HDs will peak at, so whatever your machine is doing is causing
>>> it to max the available IO for your disks.
>>>
>>> If you boot back to your previous kernel, does the problem go away?
>>>
>>> If so, you could look at the changes between the two kernel revisions
>>> for possible causes and, if needed, do a binary chop with kernel
>>> builds to narrow down the cause.
>>
>> Thanks for your response. I booted with the old kernel (9.1-RC3) and
>> the problem disappeared!
>> We're getting 3x the performance with the previous kernel compared to
>> the 9.1-RELEASE-p1 kernel:
>>
>> Output from gstat:
>>
>>     1    362      0      0    0.0    345  20894    9.4   52.9| da1
>>     1    365      0      0    0.0    348  20893    9.4   54.1| da2
>>     1    367      0      0    0.0    350  20920    9.3   52.6| da3
>>     1    362      0      0    0.0    345  21275    9.5   54.1| da4
>>     1    363      0      0    0.0    346  21250    9.6   54.2| da5
>>     1    359      0      0    0.0    342  21352    9.5   53.8| da6
>>     1    347      0      0    0.0    330  20486    9.4   52.3| da7
>>     1    353      0      0    0.0    336  20689    9.6   52.9| da8
>>     1    355      0      0    0.0    338  20669    9.5   53.0| da9
>>     1    357      0      0    0.0    340  20770    9.5   52.5| da10
>>     1    351      0      0    0.0    334  20641    9.4   53.1| da11
>>     1    362      0      0    0.0    345  21155    9.6   54.1| da12
>>
>> The kernels were compiled identically using GENERIC with no
>> modification. I'm no expert, but none of the stuff I've seen looking
>> at svn commits looks like it would have any impact on this. Any clues?
>
> You're seeing a totally different profile there, Josh: all writes and
> no reads, whereas before you were seeing mainly reads and some writes.
>
> So I would ask if you're sure you're seeing the same workload, or has
> something external changed too?
>
> Might be worth rebooting back to the new kernel and seeing if you
> still see the issue ;-)
>
> Regards
> Steve

Steve,

You're absolutely right. I didn't catch that, but the total ops/s is
reaching quite a bit higher. Things are certainly more responsive than
they have been, for what it's worth, so it "feels right." I'm also not
seeing the disks consistently railed at 100% busy like I was before
with similar testing (that was 50 machines just pushing data with dd).
I won't be able to get a good comparison until Monday, when our
students come back (this is a file server for a public school district
and is used for network homes).

Josh
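[For reference, the "binary chop with kernel builds" Steve suggests can
be sketched as a small sh script. The revision numbers and the
kernel_ok stub below are hypothetical placeholders, not values from
this thread; in practice the stub would check out the revision, build
and install the kernel, reboot, and re-run the NFS workload.]

```shell
#!/bin/sh
# Sketch of a binary chop over SVN kernel revisions.
# GOOD/BAD are hypothetical example revisions, not from this thread:
GOOD=242324   # e.g. a revision whose kernel performs well (9.1-RC3 era)
BAD=245363    # e.g. a revision whose kernel shows the regression

# Stub: replace with "svn checkout/update to $1, buildkernel,
# installkernel, reboot, run the workload, return 0 if performance
# is acceptable".  Here we pretend the regression landed at r244000.
kernel_ok() {
    [ "$1" -lt 244000 ]
}

# Classic bisection: halve the GOOD..BAD range until it is one step wide.
while [ $((BAD - GOOD)) -gt 1 ]; do
    MID=$(( (GOOD + BAD) / 2 ))
    if kernel_ok "$MID"; then
        GOOD=$MID
    else
        BAD=$MID
    fi
done
echo "first bad revision: r$BAD"   # prints "first bad revision: r244000"
```

With roughly 3000 revisions between the two kernels, this converges in
about a dozen build/boot cycles rather than thousands.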