Date: Mon, 4 Oct 2004 12:06:37 +0100
From: Chris Elsworth
To: Pawel Jakub Dawidek
cc: freebsd-geom@freebsd.org
Subject: Re: SCSI disk getting disconnected on boot
Message-ID: <20041004110637.GA72685@shagged.org>
In-Reply-To: <20041004102411.GC73767@darkness.comp.waw.pl>

On Mon, Oct 04, 2004 at 12:24:11PM +0200, Pawel Jakub Dawidek wrote:
> On Mon, Oct 04, 2004 at 10:15:09AM +0100, Chris Elsworth wrote:
> +> GEOM_MIRROR[1]: Disk da0 state changed from NEW to ACTIVE (device gm).
> +> GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
> +> GEOM_MIRROR[0]: Cannot update metadata on disk da0 (error=1).
>
> I haven't seen this error before.
> Similar race was reported earlier. Could you try this patch:
>
>   http://people.freebsd.org/~pjd/patches/gmirror.patch
>
> (You need to recompile your kernel and geom_mirror.ko module.)

Hello Pawel,

Oh dear - this seems to have made it worse :(
My boot procedure is now as follows (started from the first GEOM_MIRROR
output)

GEOM_MIRROR[2]: Tasting fd0.
acd0: CDROM at ata0-master PIO4
Waiting 5 seconds for SCSI devices to settle
GEOM_MIRROR[2]: Tasting acd0.
da0 at ahc0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabd
da0: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da1 at ahc0 bus 0 target 1 lun 0
da1: Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabd
da1: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
GEOM_MIRROR[2]: Tasting da0.
SMP: AP CPU #1 Launched!
magic: GEOM::MIRROR
version: 1
name: gm
mid: 2253826535
did: 2995324107
all: 2
syncid: 5
priority: 0
slice: 4096
balance: split
mediasize: 36420074496
sectorsize: 512
syncoffset: 0
mflags: NONE
dflags: NONE
hcprovider:
MD5 hash: 0cc1a692117f8e6afef48b4c45452382
GEOM_MIRROR[1]: Creating device gm (id=2253826535).
GEOM_MIRROR[0]: Device gm created (id=2253826535).
GEOM_MIRROR[1]: Adding disk da0 to gm.
GEOM_MIRROR[2]: Adding disk da0.
GEOM_MIRROR[2]: Disk da0 connected.
GEOM_MIRROR[1]: Disk da0 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da0 detected.
GEOM_MIRROR[2]: Tasting da1.
magic: GEOM::MIRROR
version: 1
name: gm
mid: 2253826535
did: 1391052059
all: 2
syncid: 6
priority: 0
slice: 4096
balance: split
mediasize: 36420074496
sectorsize: 512
syncoffset: 0
mflags: NONE
dflags: NONE
hcprovider:
MD5 hash: 30f8adf515872347230b383c5af68b4f
GEOM_MIRROR[1]: Adding disk da1 to gm.
GEOM_MIRROR[2]: Adding disk da1.
GEOM_MIRROR[2]: Disk da1 connected.
GEOM_MIRROR[1]: Disk da1 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da1 detected.
GEOM_MIRROR[1]: Device gm state changed from STARTING to RUNNING.
GEOM_MIRROR[1]: Disk da1 state changed from NEW to ACTIVE (device gm).
GEOM_MIRROR[2]: Tasting da0a.
GEOM_MIRROR[2]: Tasting da0b.
GEOM_MIRROR[2]: Tasting da0c.
GEOM_MIRROR[2]: Tasting da0d.
GEOM_MIRROR[2]: Tasting da0e.
GEOM_MIRROR[2]: Tasting da0f.
GEOM_MIRROR[2]: Tasting da0g.
GEOM_MIRROR[2]: Tasting da0h.
GEOM_MIRROR[2]: Tasting da1a.
GEOM_MIRROR[2]: Tasting da1b.
GEOM_MIRROR[2]: Tasting da1c.
GEOM_MIRROR[2]: Tasting da1d.
GEOM_MIRROR[2]: Tasting da1e.
GEOM_MIRROR[2]: Tasting da1f.
GEOM_MIRROR[2]: Tasting da1g.
GEOM_MIRROR[2]: Tasting da1h.
Mounting root from ufs:/dev/mirror/gma
setrootbyname failed
ffs_mountroot: can't find rootvp
Root mount failed: 6
Mounting root from ufs:mirror/gma
setrootbyname failed
ffs_mountroot: can't find rootvp
Root mount failed: 6

Manual root filesystem specification:
  : Mount using filesystem
    eg. ufs:da0s1a
  ? List valid disk boot devices
  Abort manual input

mountroot> ?

List of GEOM managed disk devices:
  da1h da1g da1f da1e da1d da1c da1b da1a da0h da0g da0f da0e da0d da0c da0b da0

..

So, at this point, it's not actually started the mirror? If I choose one
of the underlying drives just to try and get it booted:

..

mountroot> ufs:da0a
Mounting root from ufs:da0a
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Access da1 r0w-1e-1 = 0
GEOM_MIRROR[2]: Metadata on da1 updated.
GEOM_MIRROR[0]: Device gm: provider da1 activated.
GEOM_MIRROR[1]: Device gm: syncid bumped to 7.
GEOM_MIRROR[2]: Tasting da1a.
GEOM_MIRROR[2]: Tasting da1b.
GEOM_MIRROR[2]: Tasting da1c.
GEOM_MIRROR[2]: Tasting da1d.
GEOM_MIRROR[2]: Tasting da1e.
GEOM_MIRROR[2]: Tasting da1f.
GEOM_MIRROR[2]: Tasting da1g.
GEOM_MIRROR[2]: Tasting da1h.
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Access da1 r0w-1e-1 = 0
GEOM_MIRROR[2]: Metadata on da1 updated.
GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
GEOM_MIRROR[0]: Device gm: provider mirror/gm launched.
GEOM_MIRROR[0]: Device gm: rebuilding provider da0.
GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
GEOM_MIRROR[1]: Disk da0 state changed from SYNCHRONIZING to DISCONNECTED (devi.
GEOM_MIRROR[0]: Device gm: provider da0 disconnected.
GEOM_MIRROR[2]: Disk da0 disconnected.
GEOM_MIRROR[2]: Consumer da0 destroyed.

.. and from here it boots. But it looks like the mirror/gm provider
isn't launched until after I choose a root device to mount.
df after this boot looks like:

# df -k
Filesystem      1K-blocks    Used    Avail Capacity  Mounted on
/dev/da0a          126702   50226    66340    43%    /
devfs                   1       1        0   100%    /dev
/dev/mirror/gmd   1012974       6   931932     0%    /tmp
/dev/mirror/gme   1012974   24622   907316     3%    /var
/dev/mirror/gmf   8122126 1801154  5671202    24%    /usr
/dev/mirror/gmg   2026030       4  1863944     0%    /jail
/dev/mirror/gmh  18066100       4 16620808     0%    /dump

So the mirror is still working, using da1 as its only disk.
A gmirror list just for the record..

# gmirror list
Geom name: gm
State: DEGRADED
Components: 2
Balance: split
Slice: 4096
Flags: NONE
SyncID: 7
ID: 2253826535
Providers:
1. Name: mirror/gm
   Mediasize: 36420074496 (34G)
   Sectorsize: 512
   Mode: r6w6e1
Consumers:
1. Name: da1
   Mediasize: 36420075008 (34G)
   Sectorsize: 512
   Mode: r6w6e2
   State: ACTIVE
   Priority: 0
   Flags: DIRTY
   SyncID: 7
   ID: 1391052059

Geom name: gm.sync

If I shut it down and boot with the old kernel now, it comes up fine,
although it disconnects da0 again in the process:

GEOM_MIRROR[1]: Adding disk da1 to gm.
GEOM_MIRROR[2]: Adding disk da1.
GEOM_MIRROR[2]: Disk da1 connected.
GEOM_MIRROR[1]: Disk da1 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da1 detected.
GEOM_MIRROR[1]: Device gm state changed from STARTING to RUNNING.
GEOM_MIRROR[1]: Disk da1 state changed from NEW to ACTIVE (device gm).
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Tasting da0a.
GEOM_MIRROR[2]: Access da1 r0w-1e-1 = 0
GEOM_MIRROR[2]: Metadata on da1 updated.
GEOM_MIRROR[0]: Device gm: provider da1 activated.
GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
GEOM_MIRROR[0]: Device gm: provider mirror/gm launched.
GEOM_MIRROR[0]: Device gm: rebuilding provider da0.
GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
GEOM_MIRROR[1]: Disk da0 state changed from SYNCHRONIZING to DISCONNECTED (devi.
GEOM_MIRROR[0]: Device gm: provider da0 disconnected.
GEOM_MIRROR[2]: Disk da0 disconnected.
GEOM_MIRROR[2]: Consumer da0 destroyed.

-- 
Chris