Message-ID: <506BF372.1090208@FreeBSD.org>
Date: Wed, 03 Oct 2012 11:12:34 +0300
From: Andriy Gapon
To: Nikolay Denev
Subject: Re: nfs + zfs hangs on RELENG_9

on 02/10/2012 13:26 Nikolay Denev said the following:
> 7 100537 zfskern txg_thread_enter mi_switch+0x186 sleepq_wait+0x42
> _cv_wait+0x121 zio_wait+0x61 dsl_pool_sync+0xe0 spa_sync+0x336
> txg_sync_thread+0x136 fork_exit+0x11f fork_trampoline+0xe

In my past experience, threads stuck in zio_wait have always meant an I/O
operation stuck in a storage controller driver, controller firmware, etc.
Not necessarily the case here, but a possibility.
Perhaps try camcontrol tags -v to see the state of disk queues.

P.S.
It would be nice if, for debugging purposes, we had some place in zio to
record the bio that it depends upon. E.g. something like:

diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
index 80d9336..75b2fcf 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h
@@ -432,6 +432,7 @@ struct zio {
 #ifdef _KERNEL
 	/* FreeBSD only. */
 	struct ostask	io_task;
+	void		*io_bio;
 #endif
 };
diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
index 7d146ff..36bb5ad 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
@@ -684,6 +684,7 @@ vdev_geom_io_intr(struct bio *bp)
 			vd->vdev_delayed_close = B_TRUE;
 		}
 	}
+	zio->io_bio = NULL;
 	g_destroy_bio(bp);
 	zio_interrupt(zio);
 }
@@ -732,6 +733,7 @@ sendreq:
 	}
 	bp = g_alloc_bio();
 	bp->bio_caller1 = zio;
+	zio->io_bio = bp;
 	switch (zio->io_type) {
 	case ZIO_TYPE_READ:
 	case ZIO_TYPE_WRITE:

Then, in a situation like yours, you could use kgdb, switch to the thread in
zio_wait, go to the zio_wait frame and get the bio pointer from the zio.
From there you could try to deduce what is going on with the I/O request.

-- 
Andriy Gapon
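
For illustration only, the back-pointer bookkeeping that the patch proposes can be modelled in plain userspace C. This is a rough sketch under stated assumptions: the mock_bio and mock_zio structs and the mock_* helpers are hypothetical stand-ins invented here, not the kernel's struct bio, struct zio, or any GEOM/ZFS API. It shows only the invariant the patch relies on: io_bio is set when the request is issued and cleared before the bio is destroyed, so a non-NULL io_bio on a waiting zio would point at the outstanding request.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; NOT the kernel definitions. */
struct mock_bio {
	void	*bio_caller1;	/* back-pointer to the owning zio */
	int	 bio_cmd;	/* placeholder for the request type */
};

struct mock_zio {
	int	 io_type;
	void	*io_bio;	/* debug-only: bio this zio is waiting on */
};

/* Issue path: allocate the bio and link it in both directions. */
static struct mock_bio *
mock_issue(struct mock_zio *zio)
{
	struct mock_bio *bp = calloc(1, sizeof(*bp));

	if (bp == NULL)
		return (NULL);
	bp->bio_caller1 = zio;
	bp->bio_cmd = zio->io_type;
	zio->io_bio = bp;
	return (bp);
}

/* Completion path: clear the link before the bio is freed. */
static void
mock_complete(struct mock_bio *bp)
{
	struct mock_zio *zio = bp->bio_caller1;

	zio->io_bio = NULL;
	free(bp);
}

/* The question one would ask in a debugger: is a bio outstanding? */
static int
mock_zio_has_pending_bio(const struct mock_zio *zio)
{
	return (zio->io_bio != NULL);
}
```

In this model, a zio that has been issued but never completed keeps a non-NULL io_bio, which is the pointer one would follow from the zio_wait frame.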