From owner-freebsd-geom@FreeBSD.ORG Sun Nov 18 21:44:56 2007
Message-ID: <4740AF69.8070003@hicom.net>
Date: Sun, 18 Nov 2007 16:32:25 -0500
From: Juergen Heberling
To: freebsd-geom@freebsd.org
Subject: GMIRROR uses wrong drive/provider after a problem with a drive/provider
List-Id: GEOM-specific discussions and implementations

Hi all,

I'm not sure what I'm doing wrong. The scenario in the subject line has happened to me twice already, and I would appreciate any suggestions.

I'm running FreeBSD 6.1 with 4 mirrors. Each mirror consists of 2 drives, and the two drives of a mirror are on different SCSI channels (da0, da2, da4 and da6 are on channel A; da1, da3, da5 and da7 are on channel B). da0 and da1 are the providers for mirror "gm0", which is the "system"/"boot" mirror; I boot from da0. da2 and da3 are the providers for mirror "mail".

I'm also running with:

    kern.geom.mirror.disconnect_on_failure=0

My notes for this option say "so a component does not get disconnected in case of an error", but I do not remember what originally prompted me to use it.

So here is the scenario:

1. First I get a write error on da2 (mail). After a couple of retries I get:

       GEOM_MIRROR: Device mail: provider da2 disconnected.

   OK, now I'm happy that I'm running a mirror, and mail is running fine on da3.

2. About 1 minute later I get a write error on da0 (gm0). After a couple of retries I get:

       GEOM_MIRROR: Cannot update metadata on disk da0 (error=16).

   and finally

       GEOM_MIRROR: Device gm0: provider da0 disconnected.

   So gm0 is now running solely on da1. OK so far.

Later in the day (about 18 hours later), I try to rebuild the mirrors, for which I have to reboot (since the device entries were invalidated, FreeBSD needs to find the drives again). I try to do this in single-user mode.

Problem: Booting into single-user mode, gm0 detects and uses da0 (the outdated drive!) and decides NOT to use da1, the "current" drive, in the mirror (that was actually good, since now I can try to recover gm0 from da1). (I expected FreeBSD to boot from da0, try to detect the various mirrors and, when detecting mirror gm0, detect the providers of gm0, noting that da0 needs updating from da1.) It does the same for the mail mirror (uses da2, the outdated drive, and does NOT use da3, the current drive).

Question: What do I need to do so that GEOM/GMIRROR uses the most current version of the data in the mirrors?

Note: prior to booting into single-user mode, I could not access the defunct drives to clear the GMIRROR/GEOM metadata. I had tried camcontrol reset against the defunct drives but did not want to risk camcontrol rescan.

TIA
Juergen
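
For illustration only, the start-up behaviour the message says it expected (detect both providers of a mirror, then treat the one that missed writes as stale) could be sketched roughly as below. This is not gmirror's actual code. The `Component` class, the `split_by_freshness` function and the counter values are hypothetical, and the sketch assumes each component records a freshness counter that is bumped on the surviving components when a peer is disconnected (gmirror's on-disk metadata does contain a `syncid` field, but how gmirror evaluates it at boot is not shown here).

```python
from dataclasses import dataclass


@dataclass
class Component:
    provider: str  # device name, e.g. "da0"
    syncid: int    # hypothetical freshness counter read from the component


def split_by_freshness(components):
    """Split a mirror's components into (current, stale) lists.

    Components holding the highest counter are treated as current;
    any component with a lower counter is treated as stale and would
    need to be resynchronized from a current one.
    """
    if not components:
        raise ValueError("mirror has no components")
    newest = max(c.syncid for c in components)
    current = [c for c in components if c.syncid == newest]
    stale = [c for c in components if c.syncid < newest]
    return current, stale


# Hypothetical state for gm0 after da0 was disconnected: da1 kept
# receiving writes, so its counter is ahead of da0's.
gm0 = [Component("da0", syncid=1), Component("da1", syncid=2)]
current, stale = split_by_freshness(gm0)
print("current:", [c.provider for c in current])
print("stale:", [c.provider for c in stale])
```

With these made-up values the sketch reports da1 as current and da0 as stale, which is the outcome the message expected. A comparison like this only works if every provider of the mirror is visible when the comparison is made.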