From owner-freebsd-stable@FreeBSD.ORG Mon Jun 3 20:32:09 2013
Date: Mon, 3 Jun 2013 13:31:46 -0700
From: Jeremy Chadwick <jdc@koitsu.org>
To: Ross Alexander
Cc: freebsd-stable@freebsd.org
Subject: Re: 9.1-current disk throughput stalls ?
Message-ID: <20130603203146.GB49602@icarus.home.lan>
User-Agent: Mutt/1.5.21 (2010-09-15)
List-Id: Production branch of FreeBSD source code

On Mon, Jun 03, 2013 at 09:38:45AM -0600, Ross Alexander wrote:
> I wonder if anyone here has insight on a disk throughput problem
> that's come up over the last week or two.  I habitually run an
> 'svn up' and then rebuild world + kernel every Saturday morning on
> the home machines.  It's all scripted and logged; I've been doing
> this for years and the process is very cut and dried.  Saturday AM
> I started it as usual; today it was still running, but only about
> 15% done.  Normally it completes in 39 minutes, +/- 1 minute.
>
> What I've noticed is that disk performance on disk-intensive stuff
> has gotten very flaky over the last two or three weeks.  A
> buildworld, to pick an example, will run nicely for three to five
> minutes and then bog down.  The disks stay busy, but forward
> progress slows to a crawl and then apparently stops.  Individual
> cleandirs are taking five to ten seconds each on an otherwise
> unloaded machine.  It feels like a VAX-11/780 with 30 users and
> RA-80s, if anyone here remembers those days :).
>
> Here's a 'systat -vm':
>
>     5 users    Load  0.30  0.30  0.27                   Jun  3 09:07
>
> Mem:KB    REAL            VIRTUAL                      VN PAGER   SWAP PAGER
>         Tot   Share      Tot    Share    Free           in   out    in   out
> Act   84032   13908  1949112    40736  15071k  count
> All  671192   16300 1076410k    61416          pages
> Proc:                                                           Interrupts
>   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt       cow    630 total
>             113      3573   29  113  630   83   26    26 zfod        hdac1 16
>                                                          ozfod       xhci0 ehci
>  0.9%Sys   0.2%Intr  0.3%User  0.0%Nice 98.6%Idle       %ozfod       ohci0 ohci
> |    |    |    |    |    |    |    |    |    |           daefr     93 emu10kx0
> +                                                        prcfr    178 hpet0:t0
>                                          dtbuf       596 totfr        hdac0 259
> Namei     Name-cache   Dir-cache  329578 desvn           react    359 ahci0 260
>    Calls     hits    %    hits  %  17505 numvn           pdwak        re0 261
>      475      294   62            14841 frevn            pdpgs
>                                                          intrn
> Disks  ada0  ada1 pass0 pass1             796676 wire
> KB/t   5.42  5.96  0.00  0.00              65484 act
> tps     197   192     0     0              45332 inact
> MB/s   1.04  1.12  0.00  0.00                    cache
> %busy    74    82     0     0           15071692 free
>                                                  buf
>
> This is taken during the early stages of a buildworld.  The cleandir
> job steps are *crawling* along.  Rattling the keyboard (USB or
> serial, although an SSH session seems to work sometimes as well)
> gets the buildworld doing some useful work again.  Meanwhile, the
> apps load (two instances of WSPR, an instance of baudline, KDE, and
> a vncserver), which is soundcard-I/O bound and does little to no
> disk I/O, runs along perfectly happily.
>
> The oldest kernel I have that shows the syndrome is -
>
>   FreeBSD aukward.bogons 9.1-STABLE FreeBSD 9.1-STABLE #59 r250498:
>   Sat May 11 00:03:15 MDT 2013
>   toor@aukward.bogons:/usr/obj/usr/src/sys/GENERIC amd64
>
> H/W info -
>
>   hw.machine: amd64
>   hw.model: AMD Phenom(tm) II X4 965 Processor
>   hw.ncpu: 4
>   hw.physmem: 16883937280
>   hw.clockrate: 3411
>   kern.sched.name: ULE
>
>   ahci0: port 0xa000-0xa007,0x9000-0x9003,\
>       0x8000-0x8007,0x7000-0x7003,0x6000-0x600f mem 0xfe6ffc00-0xfe6fffff \
>       irq 19 at device 17.0 on pci0
>   ahci0: AHCI v1.20 with 6 6Gbps ports, Port Multiplier supported
>   ahcich0: at channel 0 on ahci0
>   [...]
>   ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>   ada0: ATA-6 SATA 1.x device
>   ada0: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
>   ada0: 114473MB (234441648 512 byte sectors: 16H 63S/T 16383C)
>   ada0: Previously was known as ad4
>   ada1 at ahcich2 bus 0 scbus2 target 0 lun 0
>   ada1: ATA-6 SATA 1.x device
>   ada1: 150.000MB/s transfers (SATA 1.x, UDMA6, PIO 8192bytes)
>   ada1: 114473MB (234441648 512 byte sectors: 16H 63S/T 16383C)
>   ada1: Previously was known as ad8
>
> I'm not paging, I don't have wild interrupt loads (checked with
> 'vmstat -i'), and the ZFS pool is not in the middle of a scrub, but
> the machine has bad interactive response and buildworld doesn't get
> finished.  I am seeing very similar behaviour on three other
> 9.1-current machines, all of which are AHCI/SATA setups, using both
> Seagate and WD disks (of random sizes and ages).  All these boxes
> ran fine a month ago.
>
> BTW, when I do the rattle-keyboard-to-get-disks-going trick, the
> NFS daemon reports that the system clock slews badly - machine time
> drops behind wall clock time.  Something is locking out the clock
> update.
>
> (Hmmm, I see I'm running a pre-5000/feature-flags ZFS pool, FWIW.
> I'll run zpool upgrade, my bad.)

1. There is no such thing as 9.1-CURRENT.  Either you meant 9.1-STABLE
   (what should be called stable/9) or -CURRENT (what should be called
   head).

2. Is there some reason you excluded details of your ZFS setup?
   "zpool status" would be a good start.

3. Do any of your filesystems/pools have ZFS compression enabled, or
   have in the past?

4. Do any of your filesystems/pools have ZFS dedup enabled, or have
   in the past?

5. Does the problem go away after a reboot?

6. Can you provide smartctl -x output for both ada0 and ada1?  You
   will need to install ports/sysutils/smartmontools for this (rough
   mechanics sketched below, after #7).  The reason I'm asking is
   that one of your disks may be causing I/O transactions to stall
   for the entire pool (i.e. a "single point of annoyance").

7. Can you remove ZFS from the picture entirely (use UFS only) and
   re-test?
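(Re #6: the mechanics would be something like the following; this
assumes a stock ports tree, and the device names are taken from your
dmesg above:

    # cd /usr/ports/sysutils/smartmontools
    # make install clean
    # smartctl -x /dev/ada0
    # smartctl -x /dev/ada1

Pay particular attention to the SMART error log and to attributes
such as Reallocated_Sector_Ct and Current_Pending_Sector; a disk
that is quietly retrying reads can stall the whole pool without ever
logging anything to the console.)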
My guess is that this is ZFS behaviour, particularly the ARC being
flushed to disk, combined with your disks being old/slow.  (Meaning:
you have 16GB RAM + a 4-core CPU, but very old disks.)
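Something like the following would show whether the ARC is actually
in play while the machine is bogged down (sysctl names as they
appear on stable/9; the 4G figure is only an example, not a
recommendation):

    # sysctl kstat.zfs.misc.arcstats.size
    # sysctl kstat.zfs.misc.arcstats.c_max
    # sysctl vfs.zfs.arc_max

If arcstats.size sits pinned at c_max during the stalls, capping the
ARC in /boot/loader.conf and rebooting would be the first experiment
I'd try:

    vfs.zfs.arc_max="4G"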
-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.             PGP 4BD6C0CB |