From owner-freebsd-fs@FreeBSD.ORG Thu Apr 11 14:40:18 2013
Date: Thu, 11 Apr 2013 08:40:15 -0600
Subject: Re: ZFS + NFS poor performance after restarting from 100 day uptime
From: Josh Beard <josh@signalboxes.net>
To: freebsd-fs@freebsd.org

I wanted to give a follow-up to this in case someone else stumbles upon
this thread via a search.

I was wrong about the original (9.1-RC3) kernel performing better.  It
exhibited the same behavior under "real world" conditions.  Real world
for this server is 100-200 Mac clients connecting to network homes via
NFS.

I haven't completely confirmed anything, but disabling Spotlight
indexing (a Mac client feature) helped *significantly*.  It's still
unclear why Spotlight indexing was never an issue prior to the reboot I
mentioned.  I'm also unsure why the RAID controller's verifications
have been intermittently slow since that reboot.  In any event, I don't
think this is a ZFS or FreeBSD issue, based on various benchmarks,
which show the expected performance.

Thanks.
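P.S. For anyone wanting to try the same thing: a minimal sketch of one
way to turn indexing off, assuming OS X clients where mdutil(1) is
available (the /Volumes/homes path below is only an example -- adjust
to wherever your network homes actually mount):

    # disable Spotlight indexing on every volume on this client
    sudo mdutil -a -i off

    # or only on the NFS-mounted home volume (example path)
    sudo mdutil -i off /Volumes/homes

    # check indexing status for all volumes
    mdutil -s -a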
On Fri, Mar 22, 2013 at 2:24 PM, Josh Beard <josh@signalboxes.net> wrote:
>
> On Fri, Mar 22, 2013 at 1:07 PM, Steven Hartland wrote:
>>
>>>> ----- Original Message ----- From: Josh Beard
>>>>
>>>>> A snip of gstat:
>>>>>
>>>>> dT: 1.002s  w: 1.000s
>>>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>>>>     4    160    126   1319   31.3     34    100    0.1   100.3| da1
>>>>>     4    146    110   1289   33.6     36     98    0.1    97.8| da2
>>>>>     4    142    107   1370   36.1     35    101    0.2   101.9| da3
>>>>>     4    121     95   1360   35.6     26     19    0.1    95.9| da4
>>>>>     4    151    117   1409   34.0     34    102    0.1   100.1| da5
>>>>>     4    141    109   1366   35.9     32    101    0.1    97.9| da6
>>>>>     4    136    118   1207   24.6     18     13    0.1    87.0| da7
>>>>>     4    118    102   1278   32.2     16     12    0.1    89.8| da8
>>>>>     4    138    116   1240   33.4     22     55    0.1   100.0| da9
>>>>>     4    133    117   1269   27.8     16     13    0.1    86.5| da10
>>>>>     4    121    102   1302   53.1     19     51    0.1   100.0| da11
>>>>>     4    120     99   1242   40.7     21     51    0.1    99.7| da12
>>>>
>>>> Your ops/s are maxing your disks.  You say "only" but ~190 ops/s
>>>> is what HDs will peak at, so whatever your machine is doing is
>>>> causing it to max the available IO for your disks.
>>>>
>>>> If you boot back to your previous kernel does the problem go away?
>>>>
>>>> If so you could look at the changes between the two kernel
>>>> revisions for possible causes and, if needed, do a binary chop
>>>> with kernel builds to narrow down the cause.
>>>
>>> Thanks for your response.  I booted with the old kernel (9.1-RC3)
>>> and the problem disappeared!  We're getting 3x the performance with
>>> the previous kernel compared to the 9.1-RELEASE-p1 kernel:
>>>
>>> Output from gstat:
>>>
>>>     1    362      0      0    0.0    345  20894    9.4    52.9| da1
>>>     1    365      0      0    0.0    348  20893    9.4    54.1| da2
>>>     1    367      0      0    0.0    350  20920    9.3    52.6| da3
>>>     1    362      0      0    0.0    345  21275    9.5    54.1| da4
>>>     1    363      0      0    0.0    346  21250    9.6    54.2| da5
>>>     1    359      0      0    0.0    342  21352    9.5    53.8| da6
>>>     1    347      0      0    0.0    330  20486    9.4    52.3| da7
>>>     1    353      0      0    0.0    336  20689    9.6    52.9| da8
>>>     1    355      0      0    0.0    338  20669    9.5    53.0| da9
>>>     1    357      0      0    0.0    340  20770    9.5    52.5| da10
>>>     1    351      0      0    0.0    334  20641    9.4    53.1| da11
>>>     1    362      0      0    0.0    345  21155    9.6    54.1| da12
>>>
>>> The kernels were compiled identically using GENERIC with no
>>> modifications.  I'm no expert, but nothing I've seen looking
>>> through the svn commits looks like it would have any impact on
>>> this.  Any clues?
>>
>> You're seeing a totally different profile there Josh, as in all
>> writes and no reads, whereas before you were seeing mainly reads and
>> some writes.
>>
>> So I would ask if you're sure you're seeing the same workload, or
>> has something external changed too?
>>
>> Might be worth rebooting back to the new kernel and seeing if you
>> still see the issue ;-)
>>
>> Regards
>> Steve
>
> Steve,
>
> You're absolutely right.  I didn't catch that, but the total ops/s is
> reaching quite a bit higher.  Things are certainly more responsive
> than they have been, for what it's worth, so it "feels right."  I'm
> also not seeing the disks consistently railed to 100% busy like I was
> before with similar testing (that was 50 machines just pushing data
> with dd).  I won't be able to get a good comparison until Monday,
> when our students come back (this is a file server for a public
> school district, used for network homes).
>
> Josh
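P.S. For anyone who ends up doing the binary chop Steve suggested
above: a rough sketch of how that looks on a 9.x box built from svn
sources.  NNNNNN is a placeholder -- pick the midpoint of your
known-good/known-bad revision range -- and KERNCONF is whatever config
you actually build:

    # /usr/src is assumed to be an svn checkout of the branch in question
    cd /usr/src
    svn update -r NNNNNN        # midpoint revision (placeholder)

    # rebuild and install only the kernel, then reboot into it
    make buildkernel KERNCONF=GENERIC
    make installkernel KERNCONF=GENERIC
    shutdown -r now

    # after reboot, re-run the workload and watch gstat; if the problem
    # shows up, the offending change is at or before NNNNNN, otherwise
    # it's after -- halve the remaining range and repeat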