Message-ID: <4CB48E38.3080409@digiware.nl>
Date: Tue, 12 Oct 2010 18:35:04 +0200
From: Willem Jan Withagen
To: Pawel Jakub Dawidek
Cc: fs@freebsd.org
Subject: Re: ZFS freeze/livelock
In-Reply-To: <20101012153654.GC2197@garage.freebsd.pl>

On 2010-10-12 17:36, Pawel Jakub Dawidek wrote:
> On Sun, Oct 10, 2010 at 05:34:39PM +0200, Willem Jan
> Withagen wrote:
>> Hi,
>>
>> Just had my FreeBSD freeze on me with what I would think is a sort of
>> livelock....
>>
>> While I was receiving zfs snapshots on my data pool.
>>
>> Top and systat just kept running,
>> but anything getting near a shell (and perhaps disk-io) ended up in:
>>
>> root@zfs.digiware.nl# gpart create -s gpt da6
>> load: 0.00 cmd: csh 12393 [zfsvfs->z_teardown_inactive_lock] 26.12r 0.00u 0.00s 0% 2480k
>> load: 0.10 cmd: csh 12393 [zfsvfs->z_teardown_inactive_lock] 96.01r 0.00u 0.00s 0% 2480k
>>
>> Trying to execute shutdown -r now had no effect whatsoever.
>> Neither did the three-finger salute.
>> (Well, at least not in the 60 sec I was willing to wait.)
>>
>> Only way out of this situation was a hard reset. And I do have to admit I
>> like ZFS for the speed with which it recovers after an unexpected reboot.
>>
>> Too bad there was no alt-ctrl-backspace escape to debugger compiled in.
>> I'll do that with the next kernel, just in case.
>>
>> So the only data point I can give is the ^T output above.
>
> Maybe you would still be able to provide backtraces for all processes with
> 'procstat -kk'?
>
> It looks like a deadlock related to 'zfs recv' or maybe unmounting?

The system has long since been rebooted.
I was no longer able to start any new programs,
probably because the pwd was on the locked volume.

But I'll keep this in mind for next time.

And yes, I was recv-ing snapshots from my other, working ZFS system.
Having played with that for quite some time, I really like that feature,
although it is not yet robust enough, perhaps due to the type of problem
described above.

I'm now running it with all debug/witness flags on, but boy, does that
make it slow...

--WjW