Date: Fri, 22 Mar 2013 12:17:45 -0600
From: Josh Beard <josh@signalboxes.net>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS + NFS poor performance after restarting from 100 day uptime
Message-ID: <CAHDrHSvXCu+v+ps3ctg=T0qtHjKGkXxvnn_EaNrt_eenkJ9dbQ@mail.gmail.com>
In-Reply-To: <D763F64A24B54755BBF716E91D646F6A@multiplay.co.uk>
References: <CAHDrHSsCunt9eQKjMy9epPBYTmaGs5HNgKV2+UKuW0RQZPpw+A@mail.gmail.com> <D763F64A24B54755BBF716E91D646F6A@multiplay.co.uk>
On Thu, Mar 21, 2013 at 10:14 AM, Steven Hartland <killing@multiplay.co.uk> wrote:
>
> ----- Original Message ----- From: "Josh Beard" <josh@signalboxes.net>
> To: <freebsd-fs@freebsd.org>
> Sent: Thursday, March 21, 2013 3:53 PM
> Subject: ZFS + NFS poor performance after restarting from 100 day uptime
>
>> Hello,
>>
>> I have a system with 12 disks spread between 2 raidz1. I'm using the
>> native ("new") NFS to export a pool on this. This has worked very well
>> all along, but since a reboot it has performed horribly - unusably under
>> load.
>>
>> The system was running 9.1-RC3 and I upgraded it to 9.1-RELEASE-p1
>> (GENERIC kernel) after ~110 days of running (with zero performance
>> issues). After rebooting from the upgrade, I'm finding the disks seem
>> constantly slammed. gstat reports 90-100% busy most of the day with only
>> ~100-130 ops/s.
>>
>> I didn't change any settings in /etc/sysctl.conf or /boot/loader. No ZFS
>> tuning, etc. I've looked at the commits between 9.1-RC3 and
>> 9.1-RELEASE-p1 and I can't see any reason why simply upgrading it would
>> cause this.
>
> ...
>
>> A snip of gstat:
>> dT: 1.002s  w: 1.000s
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>     0      0      0      0    0.0      0      0    0.0    0.0| cd0
>>     0      1      0      0    0.0      1     32    0.2    0.0| da0
>>     0      0      0      0    0.0      0      0    0.0    0.0| da0p1
>>     0      1      0      0    0.0      1     32    0.2    0.0| da0p2
>>     0      0      0      0    0.0      0      0    0.0    0.0| da0p3
>>     4    160    126   1319   31.3     34    100    0.1  100.3| da1
>>     4    146    110   1289   33.6     36     98    0.1   97.8| da2
>>     4    142    107   1370   36.1     35    101    0.2  101.9| da3
>>     4    121     95   1360   35.6     26     19    0.1   95.9| da4
>>     4    151    117   1409   34.0     34    102    0.1  100.1| da5
>>     4    141    109   1366   35.9     32    101    0.1   97.9| da6
>>     4    136    118   1207   24.6     18     13    0.1   87.0| da7
>>     4    118    102   1278   32.2     16     12    0.1   89.8| da8
>>     4    138    116   1240   33.4     22     55    0.1  100.0| da9
>>     4    133    117   1269   27.8     16     13    0.1   86.5| da10
>>     4    121    102   1302   53.1     19     51    0.1  100.0| da11
>>     4    120     99   1242   40.7     21     51    0.1   99.7| da12
>
> Your ops/s are maxing your disks. You say "only", but ~190 ops/s is what
> HDs will peak at, so whatever your machine is doing is causing it to max
> the available IO for your disks.
>
> If you boot back to your previous kernel, does the problem go away?
>
> If so, you could look at the changes between the two kernel revisions
> for possible causes and, if needed, do a binary chop with kernel builds
> to narrow down the cause.
>
> Regards
> Steve

Steve,

Thanks for your response. I booted with the old kernel (9.1-RC3) and the
problem disappeared! We're getting 3x the performance with the previous
kernel compared to the 9.1-RELEASE-p1 kernel.

Output from gstat:
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    1    362      0      0    0.0    345  20894    9.4   52.9| da1
    1    365      0      0    0.0    348  20893    9.4   54.1| da2
    1    367      0      0    0.0    350  20920    9.3   52.6| da3
    1    362      0      0    0.0    345  21275    9.5   54.1| da4
    1    363      0      0    0.0    346  21250    9.6   54.2| da5
    1    359      0      0    0.0    342  21352    9.5   53.8| da6
    1    347      0      0    0.0    330  20486    9.4   52.3| da7
    1    353      0      0    0.0    336  20689    9.6   52.9| da8
    1    355      0      0    0.0    338  20669    9.5   53.0| da9
    1    357      0      0    0.0    340  20770    9.5   52.5| da10
    1    351      0      0    0.0    334  20641    9.4   53.1| da11
    1    362      0      0    0.0    345  21155    9.6   54.1| da12

The kernels were compiled identically using GENERIC with no modification.
I'm no expert, but none of the stuff I've seen looking at svn commits looks
like it would have any impact on this.

Any clues?
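
For reference, below is a minimal Python sketch of the "binary chop with
kernel builds" that Steve suggests above; it is not something from the
original thread. Given the last-known-good and first-known-bad svn revisions
of the 9.1 source tree, it prints the midpoint revision to test together
with the stock buildkernel/installkernel commands, and you rerun it with
tightened bounds after each test boot. The script name, the /usr/src
checkout location, and the GENERIC config are assumptions.

#!/usr/bin/env python3
"""Sketch of a kernel-build bisection helper -- an illustration of the
"binary chop" approach, not part of the original thread.  It only prints
the next svn revision to test and the usual FreeBSD build commands;
/usr/src and GENERIC are assumed."""
import sys

SRC = "/usr/src"       # assumed location of the releng/9.1 svn checkout
KERNCONF = "GENERIC"   # both kernels in the thread were GENERIC

def next_revision(good, bad):
    """Midpoint of the untested range; None once good and bad are adjacent."""
    return None if bad - good <= 1 else (good + bad) // 2

def main():
    if len(sys.argv) != 3:
        sys.exit("usage: kernel-bisect.py <last-good-rev> <first-bad-rev>")
    good, bad = int(sys.argv[1]), int(sys.argv[2])
    if good >= bad:
        sys.exit("the known-good revision must be older (lower) than the bad one")
    rev = next_revision(good, bad)
    if rev is None:
        print("Done: the regression came in with r%d" % bad)
        return
    # Print the commands to run by hand for this midpoint revision.
    print("# %d candidate revision(s) left; test r%d next:" % (bad - good - 1, rev))
    print("cd %s && svn update -r %d" % (SRC, rev))
    print("make buildkernel KERNCONF=%s && make installkernel KERNCONF=%s"
          % (KERNCONF, KERNCONF))
    print("shutdown -r now")
    print("# fast again after reboot?  rerun: kernel-bisect.py %d %d" % (rev, bad))
    print("# still slow?               rerun: kernel-bisect.py %d %d" % (good, rev))

if __name__ == "__main__":
    main()

Because the untested range is halved on every run, roughly log2(N) kernel
builds and reboots are enough to pin down the offending commit among N
candidate revisions.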