Date: Mon, 4 Oct 2004 09:44:42 +0100
From: Chris Elsworth <chris@shagged.org>
To: freebsd-geom@freebsd.org
Subject: SCSI disk getting disconnected on boot
Message-ID: <20041004084442.GA65504@shagged.org>

Hello,

After having a two-way gmirror happily working for a few days, upon
rebooting both machines, they both seem to have lost half the mirror.
Here's the debug output from bootup on one of them:

Waiting 5 seconds for SCSI devices to settle
GEOM_MIRROR[2]: Tasting acd0.
da0 at ahc0 bus 0 target 0 lun 0
da0: <QUANTUM ATLAS10K3_36_SCA 120G> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabd
da0: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <QUANTUM ATLAS10K3_36_SCA 120G> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabd
da1: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
GEOM_MIRROR[2]: Tasting da0.
SMP: AP CPU #1 Launched!
magic: GEOM::MIRROR
version: 1
name: gm
mid: 1573691141
did: 1965364196
all: 2
syncid: 3
priority: 0
slice: 4096
balance: split
mediasize: 36420074496
sectorsize: 512
syncoffset: 12766412800
mflags: NONE
dflags: DIRTY SYNCHRONIZING
hcprovider: da0
MD5 hash: ad3dd443dde332bde5d63b262571dcc9
GEOM_MIRROR[1]: Creating device gm (id=1573691141).
GEOM_MIRROR[0]: Device gm created (id=1573691141).
GEOM_MIRROR[1]: Adding disk da0 to gm.
GEOM_MIRROR[2]: Adding disk da0.
GEOM_MIRROR[2]: Disk da0 connected.
GEOM_MIRROR[1]: Disk da0 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da0 detected.
GEOM_MIRROR[2]: Tasting da1.
magic: GEOM::MIRROR
version: 1
name: gm
mid: 1573691141
did: 4008348218
all: 2
syncid: 3
priority: 0
slice: 4096
balance: split
mediasize: 36420074496
sectorsize: 512
syncoffset: 0
mflags: NONE
dflags: NONE
hcprovider: da1
MD5 hash: e6584ea109907134ce7285853c7bbcb1
GEOM_MIRROR[1]: Adding disk da1 to gm.
GEOM_MIRROR[2]: Adding disk da1.
GEOM_MIRROR[2]: Disk da1 connected.
GEOM_MIRROR[1]: Disk da1 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da1 detected.
GEOM_MIRROR[1]: Device gm state changed from STARTING to RUNNING.
GEOM_MIRROR[1]: Disk da1 state changed from NEW to ACTIVE (device gm).
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Tasting da0a.
GEOM_MIRROR[2]: Access da1 r0w-1e-1 = 0
GEOM_MIRROR[2]: Metadata on da1 updated.
GEOM_MIRROR[0]: Device gm: provider da1 activated.
GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
GEOM_MIRROR[0]: Device gm: provider mirror/gm launched.
GEOM_MIRROR[0]: Device gm: rebuilding provider da0.
GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
GEOM_MIRROR[1]: Disk da0 state changed from SYNCHRONIZING to DISCONNECTED (devi.
GEOM_MIRROR[0]: Device gm: provider da0 disconnected.
GEOM_MIRROR[2]: Disk da0 disconnected.
GEOM_MIRROR[2]: Consumer da0 destroyed.
GEOM_MIRROR[2]: Tasting da0b.
GEOM_MIRROR[2]: Tasting da0c.
GEOM_MIRROR[2]: Tasting da0d.
GEOM_MIRROR[2]: Tasting da0e.
GEOM_MIRROR[2]: Tasting da0f.
GEOM_MIRROR[2]: Tasting da0g.
GEOM_MIRROR[2]: Tasting da1a.
GEOM_MIRROR[2]: Tasting da1b.
GEOM_MIRROR[2]: Tasting da1c.
GEOM_MIRROR[2]: Tasting da1d.
GEOM_MIRROR[2]: Tasting da1e.
GEOM_MIRROR[2]: Tasting da1f.
GEOM_MIRROR[2]: Tasting da1g.
GEOM_MIRROR[2]: Tasting mirror/gm.
GEOM_MIRROR[2]: Access request for mirror/gm: r1w0e0.
GEOM_MIRROR[2]: Access da1 r1w0e1 = 0
GEOM_MIRROR[2]: Access request for mirror/gm: r-1w0e0.
GEOM_MIRROR[2]: Access da1 r-1w0e-1 = 0
...

After this there's lots of access requests for various da1 partitions,
all of which succeed. The system boots normally from here, using the
gmirror device with just one provider left. I have to activate da0 in
order to get it to resync.

You'll notice that in this particular case, da0 was still resyncing when
I rebooted the machine, but this is reproducible even if both halves of
the mirror are synced. I'd expected that even in the case of rebooting
during a resync, the resync should restart after a boot, not disconnect
the drive.
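
For anyone wanting to pick the interesting lines out of a boot log like the one above, here is a minimal sketch in Python. It assumes the two message shapes seen in this output ("Disk X state changed from A to B" and "Access X rNwNeN = N"); the function names and the SAMPLE excerpt are made up for illustration and are not part of gmirror. It lists each disk's state transitions and any access attempt whose result is non-zero, which in this log is the single da0 attempt just before the disconnect.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
# The trailing "(device ...)" part is not required, because the console
# truncated it on one line of the log.
DISK_STATE = re.compile(
    r"GEOM_MIRROR\[\d+\]: Disk (\S+) state changed from (\w+) to (\w+)"
)

# Matches lines such as:
#   GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
# "Access request for ..." lines do not match, since they carry no result.
ACCESS = re.compile(
    r"GEOM_MIRROR\[\d+\]: Access (\S+) (r-?\d+w-?\d+e-?\d+) = (\d+)"
)


def disk_transitions(log_text):
    """Return {disk: [(old_state, new_state), ...]} in log order."""
    transitions = defaultdict(list)
    for match in DISK_STATE.finditer(log_text):
        disk, old, new = match.groups()
        transitions[disk].append((old, new))
    return dict(transitions)


def failed_accesses(log_text):
    """Return [(provider, counts, result)] for accesses with a non-zero result."""
    failures = []
    for match in ACCESS.finditer(log_text):
        provider, counts, result = match.groups()
        if int(result) != 0:
            failures.append((provider, counts, int(result)))
    return failures


# Excerpt copied from the boot log (hypothetical variable name).
SAMPLE = """\
GEOM_MIRROR[1]: Disk da0 state changed from NONE to NEW (device gm).
GEOM_MIRROR[1]: Disk da1 state changed from NONE to NEW (device gm).
GEOM_MIRROR[1]: Disk da1 state changed from NEW to ACTIVE (device gm).
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Access da1 r0w-1e-1 = 0
GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
GEOM_MIRROR[1]: Disk da0 state changed from SYNCHRONIZING to DISCONNECTED (devi.
GEOM_MIRROR[2]: Access request for mirror/gm: r1w0e0.
GEOM_MIRROR[2]: Access da1 r1w0e1 = 0
"""

if __name__ == "__main__":
    for disk, steps in disk_transitions(SAMPLE).items():
        path = " -> ".join([steps[0][0]] + [new for _, new in steps])
        print(f"{disk}: {path}")
    for provider, counts, result in failed_accesses(SAMPLE):
        print(f"non-zero access result: {provider} {counts} = {result}")
```

The sketch only reports what the log says; it makes no claim about why the da0 access returned a non-zero value.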

The only explanation I could think of - da0 is the boot device; is this
locking the metadata against being updated somehow? Would using mirror
devices of da0s1 and da1s1 get round this?

Appreciate any input :)

--
Chris