Message-ID: <4CB5DB47.9010904@digiware.nl>
Date: Wed, 13 Oct 2010 18:16:07 +0200
From: Willem Jan Withagen
To: Borja Marcos
Cc: fs@freebsd.org
In-Reply-To: <98AF4752-7881-4C50-8A59-243F1AD55318@sarenet.es>
Subject: Re: ZFS freeze/livelock

On 2010-10-13 13:08, Borja Marcos wrote:
>
> On Oct 10, 2010, at 5:34 PM, Willem Jan Withagen wrote:
>
>> Hi,
>>
>> Just had my FreeBSD freeze on me with what I would think is sort of
>> a livelock....
>>
>> While I was receiving zfs snapshots on my data pool.
>
> There is an (as far as I know) unsolved deadlock situation when
> receiving a snapshot while you read the target dataset.
>
> I found it in a redundant server configuration. I replicate some
> datasets periodically doing an incremental send-receive. It works
> perfectly but it can deadlock if I have a process reading the
> destination dataset on the secondary server. And those things can
> happen if you have, for example, one of the nightly periodic tasks
> running.
>
> Were you doing a similar thing? Or are you sure there was no reading
> activity on the destination dataset?

Well, I think what I did more or less fits your description.
But so far it has not happened again.

And I'm (very slowly) redoing some of these steps, with all debugging
settings enabled in the kernel:

# Debugging for use in -current
options         KDB                     # Enable kernel debugger support.
options         DDB                     # Support DDB.
options         GDB                     # Support remote GDB.
options         INVARIANTS              # Enable calls of extra sanity checking
options         INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
options         WITNESS                 # Enable checks to detect deadlocks and cycles
options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed

Things are real sloooooooooow now.

--WjW