Date: Tue, 06 Jul 2010 14:04:43 +0100
From: Karl Pielorz
To: freebsd-hackers@freebsd.org
Subject: 7.3-STABLE 'zfs attach' results in geom guid mismatch?
Message-ID: <518F0331CD800BA31E0236EB@HexaDeca64.dmpriest.net.uk>

Hi All,

This is related to a post I made the other day in freebsd-fs, which didn't get any replies. I'm a bit desperate, as I need to replace a failing drive on the system and so need to attach a spare; apologies for the cross-post.

I'm running 7.3-STABLE on amd64, with 10 GB of RAM and 2 x dual-core Opteron 285s.

When I do a 'zpool attach', the system hangs with no I/O: everything that touches ZFS hangs. Digging around with ZFS debug turned on, I see:

host# zpool attach vol ad34 ad40
"
vdev_geom_attach:112[1]: Attaching to ad40.
vdev_geom_attach:153[1]: Created consumer for ad40.
vdev_geom_read_guid:334[1]: guid for ad40 is 13247785578180267154
vdev_geom_detach:173[1]: Closing access to ad40.
vdev_geom_detach:177[1]: Destroyed consumer to ad40.
vdev_geom_open_by_path:472[1]: guid mismatch for provider /dev/ad40: 835553262974889329 != 13247785578180267154.
vdev_geom_open_by_guid:430[1]: Searching by guid [835553262974889329].
"

Should I be worried about that first "guid mismatch for provider"?

It then seems to iterate through all the disk devices on the system (including some ZFS 'volumes') before appearing to hang on one of them. With GEOM debug turned on, the end of the output is:

"
...
Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/scanned), 1, 0, 0)
Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff000e1fd000(zvol/vol2/zfs_backups/scanned)
Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol2/zfs_backups/scanned), -1, 0, 0)
Jul 5 19:42:50 host kernel: open delta:[r-1w0e0] old:[r1w0e0] provider:[r1w0e0] 0xffffff000e1fd000(zvol/vol2/zfs_backups/scanned)
Jul 5 19:42:50 host kernel: g_detach(0xffffff0035015380)
Jul 5 19:42:50 host kernel: g_access(0xffffff0035015380(zvol/vol/scanned@1237495449), 1, 0, 0)
Jul 5 19:42:50 host kernel: open delta:[r1w0e0] old:[r0w0e0] provider:[r0w0e0] 0xffffff000e60b300(zvol/vol/scanned@1237495449)
[hangs here]
"

'ps axl' at that point shows:

"
0 2250 2004 0 -8 0 14460 2044 g_wait D+ p0 0:00.01 zpool attach vol ad34 ad40
"

So it appears to be hung in 'g_wait'. Any suggestions?

-Karl
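
For anyone wading through a longer capture of the same debug output, the two things picked out by eye above (the guid mismatch line, and the last provider opened before the hang) could be extracted with a small script. This is only a rough sketch: the function names are made up for illustration, the regular expressions are written against the exact line shapes quoted in this message rather than any documented log format, and the labels "wanted" and "on_disk" are an interpretation of the log (the first number is the one ZFS then searches for, the second is the one reported as read from ad40). It sums the access deltas that were requested; it cannot tell whether a request ever completed.

```python
import re

# Shape of the vdev_geom "guid mismatch" line as quoted in the message.
MISMATCH_RE = re.compile(
    r"guid mismatch for provider (?P<provider>[^\s:]+): "
    r"(?P<wanted>\d+) != (?P<on_disk>\d+)"
)

# Shape of the GEOM trace line: g_access(<consumer>(<provider>), dr, dw, de)
ACCESS_RE = re.compile(
    r"g_access\(0x[0-9a-f]+\((?P<provider>[^)]+)\), "
    r"(?P<dr>-?\d+), (?P<dw>-?\d+), (?P<de>-?\d+)\)"
)


def guid_mismatches(text):
    """Return (provider, wanted_guid, on_disk_guid) for each mismatch line."""
    return [
        (m["provider"], int(m["wanted"]), int(m["on_disk"]))
        for m in MISMATCH_RE.finditer(text)
    ]


def providers_left_open(text):
    """Sum the requested r/w/e deltas per provider; return the non-zero ones.

    A provider whose deltas do not cancel out was opened in the trace
    without a matching close, which is where a hang would show up.
    """
    counts = {}
    for m in ACCESS_RE.finditer(text):
        r, w, e = counts.get(m["provider"], (0, 0, 0))
        counts[m["provider"]] = (
            r + int(m["dr"]),
            w + int(m["dw"]),
            e + int(m["de"]),
        )
    return {p: c for p, c in counts.items() if c != (0, 0, 0)}


if __name__ == "__main__":
    import sys

    log = sys.stdin.read()
    for provider, wanted, on_disk in guid_mismatches(log):
        print(f"{provider}: wanted {wanted}, found {on_disk}")
    for provider, (r, w, e) in providers_left_open(log).items():
        print(f"still open: {provider} r{r}w{w}e{e}")
```

Fed the output quoted above on standard input, it should flag /dev/ad40 as the mismatched provider and zvol/vol/scanned@1237495449 as the one opened but never closed.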