Date: Mon, 13 Oct 2014 12:13:58 -0700
Subject: Re: zfs pool import hangs on
 [tx->tx_sync_done_cv]
From: "K. Macy"
To: Steven Hartland
Cc: "freebsd-fs@FreeBSD.org", FreeBSD Stable

>>> Yer I would have got the zio details but typically its "optimised out"
>>> by the compiler, so will need some effort to track that down
>>> unfortunately :(
>>>
>>
>> Well, let me know if you can. Re-creating a new 10.x VM is taking a while
>> as it's taking me forever to checkout the sources.
>>
>> Things like that need to somehow continue to be accessible.
>
>
> I believe there's some pool corruption here somewhere as every once in a
> while I trip and ASSERT panic:
> panic: solaris assert: size >= SPA_MINBLOCKSIZE ||
> range_tree_space(msp->ms_tree) == 0, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c,
> line: 1636
> <... snip>

You are correct.

(kgdb) p ((zio_t *)$r14)->io_reexecute
$32 = 2 '\002'
(kgdb) p ((zio_t *)$r14)->io_flags
$33 = 0
(kgdb) p ((zio_t *)$r14)->io_spa->spa_suspended
$34 = 1 '\001'

This means zio_suspend has been called from zio_done:

	else if (zio->io_reexecute & ZIO_REEXECUTE_SUSPEND) {
		/*
		 * We'd fail again if we reexecuted now, so suspend
		 * until conditions improve (e.g. device comes online).
		 */
		zio_suspend(spa, zio);
	}

If failure mode were panic we would have panicked when attempting the
import:

void
zio_suspend(spa_t *spa, zio_t *zio)
{
	if (spa_get_failmode(spa) == ZIO_FAILURE_MODE_PANIC)
		fm_panic("Pool '%s' has encountered an uncorrectable I/O "
		    "failure and the failure mode property for this pool "
		    "is set to panic.", spa_name(spa));