From owner-freebsd-stable@FreeBSD.ORG Mon Oct 13 08:06:12 2014
Message-ID: <6E01BBEDA9984CCDA14F290D26A8E14D@multiplay.co.uk>
From: "Steven Hartland"
To: "K. Macy"
Cc: Mark Martinec, "freebsd-fs@FreeBSD.org", FreeBSD Stable
Subject: Re: zfs pool import hangs on [tx->tx_sync_done_cv]
Date: Mon, 13 Oct 2014 09:06:04 +0100

----- Original Message -----
From: "K. Macy"

> A recent quick read of the code would lead me to believe that zio_wait
> not returning there means that the zio never reached the zio_done stage.
> Parent zios seem to yield in a couple of stages in the pipeline if they
> have incomplete children. They determine this by calling
> zio_wait_for_children with zio child types and their corresponding wait
> type. In so doing they set the io_stall to the count of the number of
> waiters of the first non-zero check. This parent I/O will be resumed by
> the last child zio of that type and wait state in zio_notify_parent.
> I'm sure you know all this - but I wrote it to preface asking for the
> following fields of the zio being waited on in dsl_pool_sync_mos:
> io_stall (i.e., which field in io_children is pointed to), *io_stall,
> io_children[*][*], io_child_list (at a first glance just the addresses).
> The other alternative is that reexecuting has gotten it into a bad place
> in the state machine, so io_reexecute as well.

Yer, I would have got the zio details, but typically it's "optimised out"
by the compiler, so it will need some effort to track that down,
unfortunately :(

    Regards
    Steve
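
For readers following the thread, the stall-and-notify behaviour described in the quoted text can be modelled in a few lines of C. This is only a toy sketch, not the ZFS source: `struct toy_zio`, the `toy_*` functions and the enum names are hypothetical stand-ins, and the assumption (taken from the description above) is that io_stall points at one counter inside io_children and that the last child to finish clears it and resumes the parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the zio child and wait types. */
enum toy_child { CHILD_VDEV, CHILD_GANG, CHILD_DDT, CHILD_LOGICAL, CHILD_TYPES };
enum toy_wait  { WAIT_READY, WAIT_DONE, WAIT_TYPES };

/* Toy parent zio: only the fields discussed in the thread. */
struct toy_zio {
	uint64_t  io_children[CHILD_TYPES][WAIT_TYPES]; /* outstanding children */
	uint64_t *io_stall;                             /* counter being waited on, or NULL */
};

/* Parent side: stall on the given counter if it is non-zero.
 * Returns 1 if the parent has to yield, 0 if it may proceed. */
static int
toy_wait_for_children(struct toy_zio *pio, enum toy_child c, enum toy_wait w)
{
	uint64_t *countp = &pio->io_children[c][w];

	if (*countp != 0) {
		pio->io_stall = countp;
		return (1);
	}
	return (0);
}

/* Child side: drop the counter; the last child of that type and wait
 * state clears io_stall. Returns 1 if the parent would be resumed. */
static int
toy_notify_parent(struct toy_zio *pio, enum toy_child c, enum toy_wait w)
{
	uint64_t *countp = &pio->io_children[c][w];

	assert(*countp > 0);
	if (--*countp == 0 && pio->io_stall == countp) {
		pio->io_stall = NULL;
		return (1);
	}
	return (0);
}

/* Debugging aid: work out which io_children slot io_stall points to.
 * Returns 1 and fills *c and *w when stalled, 0 otherwise. */
static int
toy_decode_stall(const struct toy_zio *pio, int *c, int *w)
{
	for (int i = 0; i < CHILD_TYPES; i++) {
		for (int j = 0; j < WAIT_TYPES; j++) {
			if (pio->io_stall == &pio->io_children[i][j]) {
				*c = i;
				*w = j;
				return (1);
			}
		}
	}
	return (0);
}
```

In this model a parent that never resumes shows up as a non-NULL io_stall whose target counter never reaches zero, which is why the quoted request asks for io_stall, *io_stall and the whole io_children array together: comparing the io_stall address against the array tells you which child type and wait state is still outstanding.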