Date: Thu, 11 Apr 2013 08:40:15 -0600
From: Josh Beard <josh@signalboxes.net>
To: freebsd-fs@freebsd.org
Subject: Re: ZFS + NFS poor performance after restarting from 100 day uptime
Message-ID: <CAHDrHSvA+782vNfPt4iERfCh_C_HnRBpBKGh1rhND9Ea=Lsh2g@mail.gmail.com>
In-Reply-To: <CAHDrHSvpcnnEf5_ys67rF4md7fDdKZ4f+3bDsndy7_hnmofrWg@mail.gmail.com>
References: <CAHDrHSsCunt9eQKjMy9epPBYTmaGs5HNgKV2+UKuW0RQZPpw+A@mail.gmail.com>
 <D763F64A24B54755BBF716E91D646F6A@multiplay.co.uk>
 <CAHDrHSvXCu+v+ps3ctg=T0qtHjKGkXxvnn_EaNrt_eenkJ9dbQ@mail.gmail.com>
 <12CCA57CCC7E4F16A1147F8422F5F151@multiplay.co.uk>
 <CAHDrHSvpcnnEf5_ys67rF4md7fDdKZ4f+3bDsndy7_hnmofrWg@mail.gmail.com>
I wanted to give a follow-up to this in case someone else stumbles upon
this thread through a search. I was wrong about the original (9.1-RC3)
kernel performing better: it exhibited the same behavior under "real
world" conditions. Real world for this server is 100-200 Mac clients
connecting to network homes via NFS.

I haven't completely confirmed anything, but disabling Spotlight
indexing (a Mac client feature) helped *significantly*. It's still
curious that Spotlight indexing was never an issue before the reboot I
mentioned. I'm also unsure why the RAID controller's verifications have
been intermittently slow since that reboot. In any event, based on
various benchmarks, which show the expected performance, I don't think
this is a ZFS or FreeBSD issue.

Thanks.
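For anyone finding this in the archives: we disabled indexing per
volume on the clients with mdutil. This is only a sketch -- the mount
path below is an example, and yours will differ:

    sudo mdutil -i off /Volumes/network_home

Running "mdutil -s /Volumes/network_home" afterward reports the
indexing status for that volume, if you want to confirm it stuck.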
On Fri, Mar 22, 2013 at 2:24 PM, Josh Beard <josh@signalboxes.net> wrote:
>
> On Fri, Mar 22, 2013 at 1:07 PM, Steven Hartland
> <killing@multiplay.co.uk> wrote:
>
>> ----- Original Message ----- From: Josh Beard
>>
>>>>> A snip of gstat:
>>>>>
>>>>> dT: 1.002s  w: 1.000s
>>>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>>>> ...
>>>>>     4    160    126   1319   31.3     34    100    0.1  100.3| da1
>>>>>     4    146    110   1289   33.6     36     98    0.1   97.8| da2
>>>>>     4    142    107   1370   36.1     35    101    0.2  101.9| da3
>>>>>     4    121     95   1360   35.6     26     19    0.1   95.9| da4
>>>>>     4    151    117   1409   34.0     34    102    0.1  100.1| da5
>>>>>     4    141    109   1366   35.9     32    101    0.1   97.9| da6
>>>>>     4    136    118   1207   24.6     18     13    0.1   87.0| da7
>>>>>     4    118    102   1278   32.2     16     12    0.1   89.8| da8
>>>>>     4    138    116   1240   33.4     22     55    0.1  100.0| da9
>>>>>     4    133    117   1269   27.8     16     13    0.1   86.5| da10
>>>>>     4    121    102   1302   53.1     19     51    0.1  100.0| da11
>>>>>     4    120     99   1242   40.7     21     51    0.1   99.7| da12
>>>>
>>>> Your ops/s are maxing your disks. You say "only", but ~190 ops/s is
>>>> what HDs will peak at, so whatever your machine is doing is causing
>>>> it to max the available IO for your disks.
>>>>
>>>> If you boot back to your previous kernel, does the problem go away?
>>>>
>>>> If so, you could look at the changes between the two kernel
>>>> revisions for possible causes and, if needed, do a binary chop with
>>>> kernel builds to narrow down the cause.
>>>
>>> Thanks for your response. I booted with the old kernel (9.1-RC3) and
>>> the problem disappeared! We're getting 3x the performance with the
>>> previous kernel compared to the 9.1-RELEASE-p1 kernel:
>>>
>>> Output from gstat:
>>>
>>>     1    362      0      0    0.0    345  20894    9.4   52.9| da1
>>>     1    365      0      0    0.0    348  20893    9.4   54.1| da2
>>>     1    367      0      0    0.0    350  20920    9.3   52.6| da3
>>>     1    362      0      0    0.0    345  21275    9.5   54.1| da4
>>>     1    363      0      0    0.0    346  21250    9.6   54.2| da5
>>>     1    359      0      0    0.0    342  21352    9.5   53.8| da6
>>>     1    347      0      0    0.0    330  20486    9.4   52.3| da7
>>>     1    353      0      0    0.0    336  20689    9.6   52.9| da8
>>>     1    355      0      0    0.0    338  20669    9.5   53.0| da9
>>>     1    357      0      0    0.0    340  20770    9.5   52.5| da10
>>>     1    351      0      0    0.0    334  20641    9.4   53.1| da11
>>>     1    362      0      0    0.0    345  21155    9.6   54.1| da12
>>>
>>> The kernels were compiled identically using GENERIC with no
>>> modifications. I'm no expert, but none of the changes I've seen
>>> looking at the svn commits looks like it would have any impact on
>>> this. Any clues?
>>
>> You're seeing a totally different profile there, Josh: all writes and
>> no reads, whereas before you were seeing mainly reads and some writes.
>>
>> So I would ask whether you're sure you're seeing the same workload, or
>> has something external changed too?
>>
>> Might be worth rebooting back to the new kernel and seeing if you
>> still see the issue ;-)
>>
>> Regards
>> Steve
>
> Steve,
>
> You're absolutely right. I didn't catch that, but the total ops/s is
> reaching quite a bit higher. Things are certainly more responsive than
> they have been, for what it's worth, so it "feels right." I'm also not
> seeing the disks consistently railed at 100% busy like I was before
> under similar testing (that was 50 machines just pushing data with dd).
> I won't be able to get a good comparison until Monday, when our
> students come back (this is a file server for a public school district,
> used for network homes).
>
> Josh
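P.S. For the record, the dd test mentioned above was each lab machine
writing into its NFS home, roughly along these lines (the path and
sizes are only examples):

    dd if=/dev/zero of=/net/homes/testuser/ddtest bs=1m count=1024

while we watched the disks on the server with gstat.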