Message-ID: <4A5626FC.5000306@ronner.org>
Date: Thu, 09 Jul 2009 19:21:00 +0200
From: Thomas Ronner
To: freebsd-stable@freebsd.org
Cc: Andriy Gapon
References: <1247085058.6197.18.camel@bugstore> <4A55F96B.6070303@icyb.net.ua> <4A560E2F.1050906@ronner.org>
In-Reply-To: <4A560E2F.1050906@ronner.org>
Subject: Re: ZFS: zpool scrub lockup

Thomas Ronner wrote:
> Hi Andriy,
>
> Andriy Gapon wrote:
>> on 08/07/2009 23:30 Thomas Ronner said the following:
>>> Hello,
>>>
>>> I don't know whether this is the right list; maybe freebsd-fs is more
>>> appropriate. So please redirect me there if this isn't the right place.
>>>
>>> My system (i386, Athlon XP) locks hard when scrubbing a certain pool. It
>>> has been doing this for at least a couple of months now. For this reason
>>> I upgraded to 7.2-STABLE recently as this had the latest ZFS bits, but
>>> this doesn't help. It even makes the problem worse: in previous versions
>>> I just hit the reset button and forgot about it, but now it "remembers"
>>> that it was scrubbing (I presume) and tries to resume at the exact same
>>> place, locking up again. This means I haven't been able to mount these
>>> ZFS volumes successfully: the moment I do a /etc/rc.d/zfs start from
>>> single user mode (I have my /, /var and /usr on UFS) it locks up in a
>>> couple of seconds. And by locks up I really mean locks up. No panic,
>>> nothing. Pressing the reset button on the chassis is the only way to
>>> reboot.
>>
>> You can try adding the SW_WATCHDOG option to your kernel, which might
>> help catch the lockup. Things like INVARIANTS and WITNESS might help
>> the debugging too. A serial console for remote debugging would be very
>> useful too.
>>
>
> I'll definitely try those and report back on this list. Thanks for your
> answer!

I put the following in my kernel config:

# debugging
options KDB
options DDB
options GDB
options BREAK_TO_DEBUGGER
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options WITNESS_KDB
options DEBUG_VFS_LOCKS
options DIAGNOSTIC
options SW_WATCHDOG

When I send a BREAK from my serial console it enters the debugger, so
that works. But when I start ZFS (/etc/rc.d/zfs start) it freezes again,
and BREAK no longer enters the debugger.

I'll try playing with the watchdog now, but I doubt this will help.

Any clues?

Thanks,
Thomas
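
For reference, a minimal sketch of how the software watchdog could be armed for the next test, assuming the kernel above (with SW_WATCHDOG) is the one that boots. SW_WATCHDOG does nothing until watchdogd(8) is running to arm and pat it, so the rc.conf fragment below starts the daemon at boot. The 16-second timeout is only an illustrative choice, not something taken from this thread.

```sh
# /etc/rc.conf fragment (sketch; the timeout value is an assumption)
# Start watchdogd(8) at boot so the SW_WATCHDOG timer is armed.
watchdogd_enable="YES"
# -t sets the watchdog timeout in seconds.
watchdogd_flags="-t 16"
```

In single user mode the equivalent would presumably be to run watchdogd -t 16 by hand before /etc/rc.d/zfs start. If the timer expires on a kernel built with KDB, it should drop to the debugger rather than just panic. As far as I understand, though, the software watchdog is driven by the clock interrupt, so a lockup that stops interrupts entirely may keep it from ever firing.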