From: Stef Walter
Date: Thu, 15 Oct 2009 14:04:16 -0500
To: freebsd-fs@freebsd.org
Subject: Deadlock after canceled zfs recv
Message-ID: <4AD77230.3030803@memberwebs.com>

I'm running the latest RELENG_8 and have been doing some pre-production stress testing. I can reproducibly trigger a deadlock after a canceled ssh + zfs recv.

Here's how I reproduce the problem:

Do this in a new tank without any data in it. Reboot the system, and make these the first zfs operations performed. The files are available here:

http://memberwebs.com/stef/misc/recv-snapshots-zfs-hang.tbz

Receive a new file system, and then an incremental snapshot:

# cat step-one | zfs recv tank/received
# cat step-two | zfs recv tank/received

At this point it should look like:

# zfs list -t snapshot,filesystem | grep received
tank                    2.35G  16.4G    22K  /tank
tank/received            491M  16.4G   489M  /tank/received
tank/received@justnow   1.32M      -   160M  -
tank/received@later         0      -   489M  -

The third one goes through ssh.
Start the command below, count about three to five seconds (one one thousand, two one thousand, three one thousand) and press Ctrl-C:

# cat step-three | ssh localhost zfs recv tank/received

Then run the 'zfs list' command above. More often than not, parts of the zfs system are hung and remain deadlocked until reboot. If it doesn't happen the first time, try step three + Ctrl-C again.

When run through ssh and canceled with Ctrl-C, it seems like 'zfs recv' doesn't have time to do the cleanup that it normally does when run directly.

FreeBSD zfs8.ws.local 8.0-RC1 FreeBSD 8.0-RC1 #0: Wed Oct 14 16:04:50 UTC 2009 root@zfs8.ws.local:/usr/obj/usr/src/sys/GENERIC i386

I'm available for any further information and want to help nail down this bug.

Cheers,
Stef
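
The hand-timed Ctrl-C in step three could probably be scripted, which may make the timing easier to vary. What follows is only a rough sketch and has not been checked against the hang itself. The helper name interrupt_after is made up for illustration. A non-interactive sh ignores SIGINT in background commands, so the sketch sends SIGTERM to the ssh client instead; that is an approximation of Ctrl-C, not the same thing, and it may or may not leave 'zfs recv' in the same state.

```sh
#!/bin/sh
# interrupt_after DELAY CMD [ARGS...]
# Hypothetical helper: start CMD in the background, wait DELAY seconds,
# then send it SIGTERM and return its exit status.
# SIGTERM is used because a non-interactive sh ignores SIGINT in
# background commands; this only approximates pressing Ctrl-C.
interrupt_after() {
    delay=$1
    shift
    "$@" &                          # start the command asynchronously
    pid=$!
    sleep "$delay"                  # let the transfer run for a while
    kill -TERM "$pid" 2>/dev/null   # ignore the error if it already exited
    wait "$pid"                     # 143 (128+15) if SIGTERM ended it
}

# Assumed usage for step three (the 'exec' makes ssh itself the
# background process, so the signal reaches ssh and not a wrapper shell):
#   interrupt_after 4 sh -c 'exec ssh localhost zfs recv tank/received < step-three'
```

An exit status of 143 means the command was still running when the signal arrived; any other status means it finished or failed on its own first, in which case a shorter delay may be needed.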