From owner-freebsd-fs@FreeBSD.ORG Fri Mar 22 18:17:48 2013
Date: Fri, 22 Mar 2013 12:17:45 -0600
From: Josh Beard
To: Steven Hartland
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS + NFS poor performance after restarting from 100 day uptime

On Thu, Mar 21, 2013 at 10:14 AM, Steven Hartland wrote:

> ----- Original Message ----- From: "Josh Beard"
> To:
> Sent: Thursday, March 21, 2013 3:53 PM
> Subject: ZFS + NFS poor performance after restarting from 100 day uptime
>
>> Hello,
>>
>> I have a system with 12 disks spread between two raidz1 vdevs. I'm using
>> the native ("new") NFS server to export a pool on this. This has worked
>> very well all along, but since a reboot it has performed horribly -
>> unusably under load.
>>
>> The system was running 9.1-RC3 and I upgraded it to 9.1-RELEASE-p1
>> (GENERIC kernel) after ~110 days of uptime (with zero performance
>> issues). After rebooting from the upgrade, I'm finding the disks seem
>> constantly slammed: gstat reports 90-100% busy most of the day with only
>> ~100-130 ops/s.
>>
>> I didn't change any settings in /etc/sysctl.conf or /boot/loader.conf.
>> No ZFS tuning, etc.
>> I've looked at the commits between 9.1-RC3 and 9.1-RELEASE-p1 and I
>> can't see any reason why simply upgrading would cause this.
>
> ...
>
>> A snip of gstat:
>> dT: 1.002s  w: 1.000s
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>>     0      0      0      0    0.0      0      0    0.0    0.0| cd0
>>     0      1      0      0    0.0      1     32    0.2    0.0| da0
>>     0      0      0      0    0.0      0      0    0.0    0.0| da0p1
>>     0      1      0      0    0.0      1     32    0.2    0.0| da0p2
>>     0      0      0      0    0.0      0      0    0.0    0.0| da0p3
>>     4    160    126   1319   31.3     34    100    0.1  100.3| da1
>>     4    146    110   1289   33.6     36     98    0.1   97.8| da2
>>     4    142    107   1370   36.1     35    101    0.2  101.9| da3
>>     4    121     95   1360   35.6     26     19    0.1   95.9| da4
>>     4    151    117   1409   34.0     34    102    0.1  100.1| da5
>>     4    141    109   1366   35.9     32    101    0.1   97.9| da6
>>     4    136    118   1207   24.6     18     13    0.1   87.0| da7
>>     4    118    102   1278   32.2     16     12    0.1   89.8| da8
>>     4    138    116   1240   33.4     22     55    0.1  100.0| da9
>>     4    133    117   1269   27.8     16     13    0.1   86.5| da10
>>     4    121    102   1302   53.1     19     51    0.1  100.0| da11
>>     4    120     99   1242   40.7     21     51    0.1   99.7| da12
>
> Your ops/s are maxing your disks. You say "only", but ~190 ops/s is about
> what HDs will peak at, so whatever your machine is doing is causing it to
> max the available IO for your disks.
>
> If you boot back to your previous kernel, does the problem go away?
>
> If so, you could look at the changes between the two kernel revisions for
> possible causes and, if needed, do a binary chop with kernel builds to
> narrow down the cause.
>
> Regards
> Steve

Steve,

Thanks for your response. I booted with the old kernel (9.1-RC3) and the
problem disappeared! We're getting 3x the performance with the previous
kernel compared to the 9.1-RELEASE-p1 kernel.

Output from gstat:
    1    362      0      0    0.0    345  20894    9.4   52.9| da1
    1    365      0      0    0.0    348  20893    9.4   54.1| da2
    1    367      0      0    0.0    350  20920    9.3   52.6| da3
    1    362      0      0    0.0    345  21275    9.5   54.1| da4
    1    363      0      0    0.0    346  21250    9.6   54.2| da5
    1    359      0      0    0.0    342  21352    9.5   53.8| da6
    1    347      0      0    0.0    330  20486    9.4   52.3| da7
    1    353      0      0    0.0    336  20689    9.6   52.9| da8
    1    355      0      0    0.0    338  20669    9.5   53.0| da9
    1    357      0      0    0.0    340  20770    9.5   52.5| da10
    1    351      0      0    0.0    334  20641    9.4   53.1| da11
    1    362      0      0    0.0    345  21155    9.6   54.1| da12

Both kernels were compiled identically from GENERIC with no modifications.
I'm no expert, but none of the stuff I've seen looking at the svn commits
looks like it would have any impact on this. Any clues?
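
P.S. In case it's useful to anyone following the thread: switching back to
the old kernel for a single boot can be done safely with nextboot(8). This
is just a sketch, assuming the upgrade preserved the previous kernel in
/boot/kernel.old:

    # boot the old kernel once, without changing the default
    nextboot -k kernel.old
    shutdown -r now

Since nextboot(8) only affects the next boot, a bad kernel can't leave the
box unbootable - a plain reboot afterwards falls back to the default kernel.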
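And if reading the diffs turns up nothing, the binary chop Steve suggested
would go roughly like this - a sketch only, assuming an svn checkout of the
releng/9.1 branch in /usr/src; 245000 below is a placeholder revision, not
a real suspect:

    # move the tree to the midpoint between the known-good and known-bad revisions
    svn update -r 245000 /usr/src
    cd /usr/src
    make buildkernel KERNCONF=GENERIC
    make installkernel KERNCONF=GENERIC
    shutdown -r now
    # after reboot, re-run the NFS load and watch gstat, then halve the
    # remaining revision range toward whichever side reproduces the slowdown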