Date:      Mon, 4 Oct 2004 09:44:42 +0100
From:      Chris Elsworth <chris@shagged.org>
To:        freebsd-geom@freebsd.org
Subject:   SCSI disk getting disconnected on boot
Message-ID:  <20041004084442.GA65504@shagged.org>

Hello,

After a two-way gmirror had been working happily for a few days on each
of two machines, I rebooted both, and both seem to have lost half the mirror.

Here's the debug output from bootup on one of them:

Waiting 5 seconds for SCSI devices to settle
GEOM_MIRROR[2]: Tasting acd0.
da0 at ahc0 bus 0 target 0 lun 0
da0: <QUANTUM ATLAS10K3_36_SCA 120G> Fixed Direct Access SCSI-3 device 
da0: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabd
da0: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <QUANTUM ATLAS10K3_36_SCA 120G> Fixed Direct Access SCSI-3 device 
da1: 160.000MB/s transfers (80.000MHz, offset 127, 16bit), Tagged Queueing Enabd
da1: 34732MB (71132959 512 byte sectors: 255H 63S/T 4427C)
GEOM_MIRROR[2]: Tasting da0.
SMP: AP CPU #1 Launched!
     magic: GEOM::MIRROR
   version: 1
      name: gm
       mid: 1573691141
       did: 1965364196
       all: 2
    syncid: 3
  priority: 0
     slice: 4096
   balance: split
 mediasize: 36420074496
sectorsize: 512
syncoffset: 12766412800
    mflags: NONE
    dflags: DIRTY SYNCHRONIZING
hcprovider: da0
  MD5 hash: ad3dd443dde332bde5d63b262571dcc9
GEOM_MIRROR[1]: Creating device gm (id=1573691141).
GEOM_MIRROR[0]: Device gm created (id=1573691141).
GEOM_MIRROR[1]: Adding disk da0 to gm.
GEOM_MIRROR[2]: Adding disk da0.
GEOM_MIRROR[2]: Disk da0 connected.
GEOM_MIRROR[1]: Disk da0 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da0 detected.
GEOM_MIRROR[2]: Tasting da1.
     magic: GEOM::MIRROR
   version: 1
      name: gm
       mid: 1573691141
       did: 4008348218
       all: 2
    syncid: 3
  priority: 0
     slice: 4096
   balance: split
 mediasize: 36420074496
sectorsize: 512
syncoffset: 0
    mflags: NONE
    dflags: NONE
hcprovider: da1
  MD5 hash: e6584ea109907134ce7285853c7bbcb1
GEOM_MIRROR[1]: Adding disk da1 to gm.
GEOM_MIRROR[2]: Adding disk da1.
GEOM_MIRROR[2]: Disk da1 connected.
GEOM_MIRROR[1]: Disk da1 state changed from NONE to NEW (device gm).
GEOM_MIRROR[0]: Device gm: provider da1 detected.
GEOM_MIRROR[1]: Device gm state changed from STARTING to RUNNING.
GEOM_MIRROR[1]: Disk da1 state changed from NEW to ACTIVE (device gm).
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Tasting da0a.
GEOM_MIRROR[2]: Access da1 r0w-1e-1 = 0
GEOM_MIRROR[2]: Metadata on da1 updated.
GEOM_MIRROR[0]: Device gm: provider da1 activated.
GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
GEOM_MIRROR[0]: Device gm: provider mirror/gm launched.
GEOM_MIRROR[0]: Device gm: rebuilding provider da0.
GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
GEOM_MIRROR[1]: Disk da0 state changed from SYNCHRONIZING to DISCONNECTED (devi.
GEOM_MIRROR[0]: Device gm: provider da0 disconnected.
GEOM_MIRROR[2]: Disk da0 disconnected.
GEOM_MIRROR[2]: Consumer da0 destroyed.
GEOM_MIRROR[2]: Tasting da0b.
GEOM_MIRROR[2]: Tasting da0c.
GEOM_MIRROR[2]: Tasting da0d.
GEOM_MIRROR[2]: Tasting da0e.
GEOM_MIRROR[2]: Tasting da0f.
GEOM_MIRROR[2]: Tasting da0g.
GEOM_MIRROR[2]: Tasting da1a.
GEOM_MIRROR[2]: Tasting da1b.
GEOM_MIRROR[2]: Tasting da1c.
GEOM_MIRROR[2]: Tasting da1d.
GEOM_MIRROR[2]: Tasting da1e.
GEOM_MIRROR[2]: Tasting da1f.
GEOM_MIRROR[2]: Tasting da1g.
GEOM_MIRROR[2]: Tasting mirror/gm.
GEOM_MIRROR[2]: Access request for mirror/gm: r1w0e0.
GEOM_MIRROR[2]: Access da1 r1w0e1 = 0
GEOM_MIRROR[2]: Access request for mirror/gm: r-1w0e0.
GEOM_MIRROR[2]: Access da1 r-1w0e-1 = 0
...

After this there are lots of access requests for various da1 partitions,
all of which succeed. The system boots normally from here, using the
gmirror device with just one provider left. I have to activate da0 in
order to get it to resync.
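
For anyone comparing a similar boot log, here is a minimal Python sketch
that pulls out the disk/device state changes and any access call with a
non-zero result. The two regular expressions are assumptions based only on
the line shapes in the log above, and the names summarize and SAMPLE are
made up for illustration; nothing here is part of gmirror itself.

```python
import re

# Assumed line shapes, copied from the GEOM_MIRROR debug output above.
STATE_RE = re.compile(
    r"GEOM_MIRROR\[\d+\]: (?:Disk|Device) (\S+) state changed from (\w+) to (\w+)"
)
ACCESS_RE = re.compile(
    r"GEOM_MIRROR\[\d+\]: Access (\S+) (r-?\d+w-?\d+e-?\d+) = (-?\d+)"
)


def summarize(log_text):
    """Return (state_changes, failed_accesses) found in a debug log.

    state_changes:   list of (name, old_state, new_state)
    failed_accesses: list of (provider, access_delta, result), result != 0
    """
    changes, failures = [], []
    for line in log_text.splitlines():
        m = STATE_RE.search(line)
        if m:
            changes.append((m.group(1), m.group(2), m.group(3)))
            continue
        m = ACCESS_RE.search(line)
        if m and int(m.group(3)) != 0:
            failures.append((m.group(1), m.group(2), int(m.group(3))))
    return changes, failures


# A few lines taken verbatim from the log above (hypothetical sample name).
SAMPLE = """\
GEOM_MIRROR[1]: Disk da0 state changed from NEW to SYNCHRONIZING (device gm).
GEOM_MIRROR[2]: Access da1 r0w1e1 = 0
GEOM_MIRROR[2]: Access da0 r0w1e1 = 1
GEOM_MIRROR[1]: Disk da0 state changed from SYNCHRONIZING to DISCONNECTED (devi.
"""

if __name__ == "__main__":
    changes, failures = summarize(SAMPLE)
    for name, old, new in changes:
        print(f"{name}: {old} -> {new}")
    for provider, delta, result in failures:
        print(f"access failed on {provider}: {delta} = {result}")
```

Run over the log above, the only access line with a non-zero result is
"Access da0 r0w1e1 = 1", which comes immediately before da0 goes from
SYNCHRONIZING to DISCONNECTED. If that result is an errno value, 1 is
EPERM on FreeBSD, but that reading is an assumption.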

You'll notice that in this particular case, da0 was still resyncing
when I rebooted the machine, but this is reproducible even if both
halves of the mirror are synced. I'd have expected that even after a
reboot in the middle of a resync, the resync would restart on boot
rather than the drive being disconnected.

The only explanation I can think of is that da0 is the boot device: is
this somehow locking the metadata against being updated? Would building
the mirror from da0s1 and da1s1 get round this?

Appreciate any input :)
-- 
Chris


