From: Greg Bonett
To: Jeremy Chadwick
Cc: freebsd-stable
Subject: Re: 8.1 amd64 lockup (maybe zfs or disk related)
Date: Mon, 07 Feb 2011 21:34:36 -0800

Thank you for the help. I've implemented your suggested /boot/loader.conf
and /etc/sysctl.conf tunings. Unfortunately, after implementing these
settings, I experienced another lockup. By "lockup" I mean that nothing
responds (sshd, keyboard, num lock) and I had to reset.

I'm trying to isolate the cause of these lockups. I rebooted the system
and tried to simulate a high-load condition WITHOUT mounting my zfs pool.
First I ran many instances of "dd if=/dev/random of=/dev/null bs=4m" to
get high CPU load. The machine ran for many hours under this condition
without locking up.
Then I added a few "dd if=/dev/adX of=/dev/null bs=4m" to simulate some
I/O load. After doing this it locked up immediately. Thinking I had
found the source of the problem, I rebooted and tried to reproduce the
lockup, but was not able to. So far it has been running for two hours
with six "dd if=/dev/adX" commands (one for each disk) and about a dozen
"dd if=/dev/urandom" commands (to keep CPU near 100%). I'll let it keep
running and see if it locks up again without ever mounting zfs.

Any ideas?

On Mon, 2011-02-07 at 00:55 -0800, Jeremy Chadwick wrote:
> On Sun, Feb 06, 2011 at 11:50:41PM -0800, Greg Bonett wrote:
> > Thanks for the response.
> > I have no tunings in /boot/loader.conf
> > according to http://wiki.freebsd.org/ZFSTuningGuide for amd64
> > "FreeBSD 7.2+ has improved kernel memory allocation strategy and no
> > tuning may be necessary on systems with more than 2 GB of RAM. "
> > I have 8GB of ram.
> > do you think this is wrong?
> >
> > Handbook recommends these (but says their test system has 1gb ram):
> > vm.kmem_size="330M"
> > vm.kmem_size_max="330M"
> > vfs.zfs.arc_max="40M"
> > vfs.zfs.vdev.cache.size="5M"
> >
> > what do you recommend?
>
> The Wiki is outdated, I'm sorry to say. Given that you have 8GB RAM, I
> would recommend these settings. Please note that some of these have
> become the defaults in 8.1 (depending on when your kernel was built and
> off of what source date), and in what will soon be 8.2:
>
> /boot/loader.conf :
>
> #
> # ZFS tuning parameters
> # NOTE: Be sure to see /etc/sysctl.conf for additional tunings
> #
>
> # Increase vm.kmem_size to allow for ZFS ARC to utilise more memory.
> vm.kmem_size="8192M"
> vfs.zfs.arc_max="6144M"
>
> # Disable ZFS prefetching
> # http://southbrain.com/south/2008/04/the-nightmare-comes-slowly-zfs.html
> # Increases overall speed of ZFS, but when disk flushing/writes occur,
> # system is less responsive (due to extreme disk I/O).
> # NOTE: Systems with 8GB of RAM or more have prefetch enabled by default.
> vfs.zfs.prefetch_disable="1"
>
> # Disable UMA (uma(9)) for ZFS; amd64 was moved to exclusively use UMA
> # on 2010/05/24.
> # http://lists.freebsd.org/pipermail/freebsd-stable/2010-June/057162.html
> vfs.zfs.zio.use_uma="0"
>
> # Decrease ZFS txg timeout value from 30 (default) to 5 seconds. This
> # should increase throughput and decrease the "bursty" stalls that
> # happen during immense I/O with ZFS.
> # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007343.html
> # http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007355.html
> vfs.zfs.txg.timeout="5"
>
> /etc/sysctl.conf :
>
> #
> # ZFS tuning parameters
> # NOTE: Be sure to see /boot/loader.conf for additional tunings
> #
>
> # Increase number of vnodes; we've seen vfs.numvnodes reach 115,000
> # at times. Default max is a little over 200,000. Playing it safe...
> kern.maxvnodes=250000
>
> # Set TXG write limit to a lower threshold. This helps "level out"
> # the throughput rate (see "zpool iostat"). A value of 256MB works well
> # for systems with 4GB of RAM, while 1GB works well for us w/ 8GB on
> # disks which have 64MB cache.
> vfs.zfs.txg.write_limit_override=1073741824
>
> Be aware that the vfs.zfs.txg.write_limit_override tuning you see above
> may need to be adjusted for your system. It's up to you to figure out
> what works best in your environment.
>
> > I think the ad0: FAILURE - READ_DMA4 errors may be from a bad sata cable
> > (or rather, a 12in sata cable connecting a drive that is one inch away)
> > I'm ordering a new drive bay to improve this, but should a bad cable
> > cause lockups?
>
> Semantic point: it's READ_DMA48, not READ_DMA4. The "48" indicates
> 48-bit LBA addressing. There is no 4-bit LBA addressing mode.
>
> The term "lock up" is also too vague.
> If by "lock up" you mean "the
> system seems alive, hitting NumLock on the console keyboard toggles the
> LED", then the kernel is very likely spending too much of its time
> spinning in something (such as waiting for commands to return from the
> SATA controller, which could also indirectly be the controller waiting
> for the disk to respond to commands). If by "lock up" you mean "the
> system is literally hard locked, nothing responds, I have to hit
> physical Reset or power-cycle the box", then no, a bad cable should not
> be able to cause that.
>
> > #smartctl -a /dev/ad0
> >
> > === START OF INFORMATION SECTION ===
> > Model Family: Western Digital Caviar Green (Adv. Format) family
> > Device Model: WDC WD10EARS-00Y5B1
>
> First thing to note is that this is one of those new 4KB sector drives.
> I have no personal experience with them, but they have been talked about
> on the FreeBSD lists for quite some time, especially with regards to
> ZFS. The discussions involve performance. Just a FYI point.
>
> > SMART Attributes Data Structure revision number: 16
> > Vendor Specific SMART Attributes with Thresholds:
> > ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
> >   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
> >   3 Spin_Up_Time            0x0027   121   121   021    Pre-fail  Always       -       6933
> >   4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       30
> >   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
> >   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
> >   9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       2664
> >  10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
> >  11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
> >  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       28
> > 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       27
> > 193 Load_Cycle_Count        0x0032   135   135   000    Old_age   Always       -       196151
> > 194 Temperature_Celsius     0x0022   125   114   000    Old_age   Always       -       22
> > 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> > 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> > 198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
> > 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
> > 200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
> >
> > SMART Error Log Version: 1
> > No Errors Logged
> >
> > SMART Self-test log structure revision number 1
> > Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> > # 1  Short offline       Completed without error       00%      1536
>
> Your disk looks "almost" fine. There are no indicators of bad blocks or
> CRC errors (which indicate bad SATA cables or physical PCB problems on
> the disk) -- that's the good part.
>
> The bad part: Attribute 193. Your disk is literally "load cycling"
> (which is somewhat equivalent to a power cycle; I'd rather not get into
> explaining what it is, but it's not good) on a regular basis.
> This problem with certain models of Western Digital disks has been
> discussed on the FreeBSD lists before. There have been statements made
> by users that Western Digital has indirectly acknowledged this problem,
> and fixed it in a later drive firmware revision. Please note that in
> some cases WD did not increment/change the firmware revision string in
> their fix, so you can't rely on that to determine anything.
>
> Would this behaviour cause READ_DMAxx and WRITE_DMAxx errors?
> Absolutely, no doubt about it.
>
> My recommendations: talk to Western Digital Technical Support and explain
> the problem, point them to this thread, and get a fixed/upgraded
> firmware from them. If they do not acknowledge the problem or you get
> stonewalled, I recommend replacing the drive entirely with a different
> model (I highly recommend the Caviar Black drives, which do not have
> this problem).
>
> If they give you a replacement firmware, you'll probably need a DOS boot
> disk to accomplish this, and need to make sure your BIOS does not have
> AHCI mode enabled (DOS won't find the disk). You can always re-enable
> AHCI after the upgrade. If you don't have a DOS boot disk, you'll need
> to explain to Western Digital that you need them to give you a bootable
> ISO that can allow you to perform the upgrade.
>
> If you need me to dig up mailing lists posts about this problem I can do
> so, but it will take me some time. The discussions might have been for
> a non-4K-sector Green drive as well, but it doesn't matter, the problem
> is known at this point.
>
> --
> | Jeremy Chadwick                                   jdc@parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.               PGP 4BD6C0CB |
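
For a sense of scale on the Attribute 193 remark, the quoted smartctl
numbers can be turned into an average rate. This is only a rough sketch
in Python, assuming the raw values of attributes 9 and 193 are plain
hour and cycle counts; the function name load_cycles_per_hour is made up
for illustration and is not part of smartmontools.

```python
# Raw values copied from the smartctl output quoted in this thread.
POWER_ON_HOURS = 2664      # attribute 9, Power_On_Hours
LOAD_CYCLE_COUNT = 196151  # attribute 193, Load_Cycle_Count


def load_cycles_per_hour(load_cycles: int, power_on_hours: int) -> float:
    """Average number of head load cycles per powered-on hour.

    Hypothetical helper: assumes both arguments are plain counts taken
    from the RAW_VALUE column.
    """
    if power_on_hours <= 0:
        raise ValueError("power_on_hours must be positive")
    return load_cycles / power_on_hours


rate = load_cycles_per_hour(LOAD_CYCLE_COUNT, POWER_ON_HOURS)
print(f"{rate:.1f} load cycles per powered-on hour")
print(f"about one load cycle every {3600 / rate:.0f} seconds")
```

With the figures above this comes to roughly 74 load cycles per
powered-on hour, or about one every 49 seconds on average, which is the
"on a regular basis" behaviour described in the quoted reply.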