From owner-freebsd-geom@FreeBSD.ORG Tue Mar 23 19:20:03 2010
Return-Path:
Delivered-To: freebsd-geom@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 8C110106566B
    for ; Tue, 23 Mar 2010 19:20:03 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 7A8628FC0A
    for ; Tue, 23 Mar 2010 19:20:03 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o2NJK3G7020124
    for ; Tue, 23 Mar 2010 19:20:03 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o2NJK3CS020123;
    Tue, 23 Mar 2010 19:20:03 GMT
    (envelope-from gnats)
Date: Tue, 23 Mar 2010 19:20:03 GMT
Message-Id: <201003231920.o2NJK3CS020123@freefall.freebsd.org>
To: freebsd-geom@FreeBSD.org
From: Christopher Key
Cc:
Subject: Re: kern/113957: [gmirror] gmirror is intermittently reporting a degraded mirror array upon reboot.
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Christopher Key
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 23 Mar 2010 19:20:03 -0000

The following reply was made to PR kern/113957; it has been noted by GNATS.

From: Christopher Key
To: bug-followup@FreeBSD.org, ayochum@pair.com
Cc:
Subject: Re: kern/113957: [gmirror] gmirror is intermittently reporting a degraded mirror array upon reboot.
Date: Tue, 23 Mar 2010 19:17:59 +0000

I am suffering very similar symptoms. I have two mirrors, gm0 (ad4+ad6) and gm1 (ad8+ad10). On every restart, gm1 gets rebuilt, but never gm0. The problem appears to stem from gm1 not being closed cleanly at shutdown.
I consistently see:

    GEOM_MIRROR: Device gm0: provider mirror/gm0 destroyed
    GEOM_MIRROR: Device gm0 destroyed.

but never a corresponding message for gm1. I think that this is a problem with reference counts, and that something is keeping gm1 open. Immediately before shutdown, I have (trimmed):

    # gmirror list
    Geom name: gm0
    State: COMPLETE
    Components: 2
    Providers:
    1. Name: mirror/gm0
       Mode: r5w5e9

    Geom name: gm1
    State: COMPLETE
    Components: 2
    Balance: round-robin
    Providers:
    1. Name: mirror/gm1
       Mode: r3w3e4

gm0 has 6 GPT partitions. Four have mounted UFS filesystems, one is used for swap, and one is a boot partition. I assume that each mounted partition contributes r1w1e2 to the ref count, and the swap partition r1w1e1, leading to r5w5e9. When shutting down with kern.geom.mirror.debug=2, I get:

    GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-2
    GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-2
    GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-2
    GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-2
    GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-1

which appears to confirm this. After this, the refcount becomes r0w0e0, and the device gets destroyed.

gm1 has 3 MBR slices. Two have mounted UFS filesystems, and the third slice is used for a ZFS pool. Here, I assume that each mounted slice contributes r1w1e1 to the ref count, and the slice used by ZFS contributes r1w1e2, leading to r3w3e4. I'm not sure why UFS mounted from a GPT partition should contribute r1w1e2 while UFS mounted from an MBR slice contributes r1w1e1, but testing with memory disks does appear to confirm that this is the case. When shutting down with kern.geom.mirror.debug=2, I get:

    GEOM_MIRROR[2]: Access request for mirror/gm1: r-1w-1e-1
    GEOM_MIRROR[2]: Access request for mirror/gm1: r-1w-1e-1

That leaves r1w1e2 outstanding, which matches the contribution I assumed for the ZFS slice. This seems to suggest that the problem is ZFS not releasing gm1 at shutdown.
Testing with memory-disk-based mirrors appears to show the same result: ZFS does not release the mirror, so the mirror is not cleanly destroyed.
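
The reference-count arithmetic above can be checked mechanically. What follows is only a minimal Python sketch: the helper names parse_mode and residual are made up for illustration and are not part of GEOM or gmirror. It starts from the Mode value reported by gmirror list, applies each access-request delta from the debug log, and prints what is left. The starting modes and log lines are copied from this message.

```python
import re

# Matches GEOM access-count strings such as "r5w5e9" and deltas such as "r-1w-1e-2".
MODE_RE = re.compile(r"r(-?\d+)w(-?\d+)e(-?\d+)")


def parse_mode(text):
    """Return the (read, write, exclusive) counts from the first mode string in text."""
    match = MODE_RE.search(text)
    if match is None:
        raise ValueError(f"no access mode found in {text!r}")
    return tuple(int(group) for group in match.groups())


def residual(mode, log_lines):
    """Apply every logged access-request delta to the starting mode (hypothetical helper)."""
    counts = list(parse_mode(mode))
    for line in log_lines:
        # Only look at the text after the last ": ", which holds the delta.
        delta = parse_mode(line.rsplit(": ", 1)[-1])
        counts = [count + change for count, change in zip(counts, delta)]
    return "r{}w{}e{}".format(*counts)


# Log lines as quoted in this message (kern.geom.mirror.debug=2).
gm0_log = ["GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-2"] * 4 + [
    "GEOM_MIRROR[2]: Access request for mirror/gm0: r-1w-1e-1"
]
gm1_log = ["GEOM_MIRROR[2]: Access request for mirror/gm1: r-1w-1e-1"] * 2

print("gm0:", residual("r5w5e9", gm0_log))  # prints "gm0: r0w0e0"
print("gm1:", residual("r3w3e4", gm1_log))  # prints "gm1: r1w1e2"
```

The non-zero result for gm1 equals the r1w1e2 that I assumed ZFS contributes. It is consistent with that guess, but it does not prove which consumer still holds the mirror open.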