Date:      Tue, 30 Jan 2007 09:51:14 +0100 (CET)
From:      Oliver Fromme <olli@lurza.secnetix.de>
To:        freebsd-geom@FreeBSD.ORG
Subject:   gmirror or ata problem
Message-ID:  <200701300851.l0U8pEkO005250@lurza.secnetix.de>

Hi,

This is strange.  gmirror just detached one of its disks
for no apparent reason.  I've built a mirror consisting of
the components ad0 and ad1 (both SATA drives).  It has
been running fine.  This is RELENG_6 from 2006-12-20.

Yesterday evening ad1 was detached.  There is no other
error message logged on console or in the logs (i.e. no
I/O error such as a bad sector or anything).  There was
no particularly high load at that time.  In fact, the
machine had been under much higher load before, without
anything bad happening.

This is from the logs:

Jan 29 19:10:13 pluto -- MARK --
Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
Jan 29 19:20:26 pluto kernel: subdisk1: detached
Jan 29 19:20:26 pluto kernel: ad1: detached
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
Jan 29 19:50:13 pluto -- MARK --
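
For reference, the error=6 in the GEOM_MIRROR lines is an errno
value.  On FreeBSD, errno 6 is ENXIO ("Device not configured"),
which fits a disk that vanished from its channel rather than a
media error.  A minimal Python sketch that pulls the codes out of
such lines follows; the function name decode_geom_errors and the
regular expression are illustrative only, not part of any tool:

```python
import errno
import re

# Matches GEOM_MIRROR messages that end in "(error=N)" or
# "(device=NAME, error=N)".  The pattern is an assumption based
# only on the log lines quoted above.
ERROR_RE = re.compile(
    r"GEOM_MIRROR: (?P<msg>.*?)\s*"
    r"\((?:device=(?P<dev>\w+), )?error=(?P<code>\d+)\)"
)

def decode_geom_errors(lines):
    """Return (message, errno name) pairs for lines carrying error=N."""
    results = []
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            code = int(m.group("code"))
            # errno.errorcode maps numeric codes to symbolic names.
            results.append((m.group("msg"), errno.errorcode.get(code, "unknown")))
    return results

SAMPLE = [
    "Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).",
    "Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).",
    "Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.",
]

for msg, name in decode_geom_errors(SAMPLE):
    print(f"{name}: {msg}")
```

The last sample line carries no error code, so it is skipped.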

This almost looks like a typical Windows problem:  Something
reports a "failure", but gives no reason or any other useful
information.  :-(

"atacontrol list" reports for ad1:

    Master:      no device present

After an atacontrol detach/attach cycle, the device is back
again:

    Master:  ad1 <SAMSUNG HD160JJ/WU100-41> Serial ATA II

I inserted it back into the gmirror, and right now it's
synchronizing happily.
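
The recovery described above can be sketched roughly as follows.
This is only an outline under assumptions: the channel name ata3
is taken from the dmesg output (ad1 at ata3-master), the
"gmirror forget" step is an assumption (it drops the stale
component before re-inserting), and the helper names
recovery_plan and run_plan are hypothetical.  The exact commands
that were run are not recorded here.

```python
import subprocess

def recovery_plan(channel="ata3", mirror="gm0", disk="ad1"):
    """Build the command sequence; nothing is executed here."""
    return [
        ["atacontrol", "detach", channel],    # drop the ATA channel
        ["atacontrol", "attach", channel],    # re-probe it, ad1 should reappear
        ["gmirror", "forget", mirror],        # assumed: forget the disconnected component
        ["gmirror", "insert", mirror, disk],  # re-add the disk, resync starts
        ["gmirror", "status", mirror],        # watch synchronization progress
    ]

def run_plan(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)

# Dry run by default: only prints the commands.
run_plan(recovery_plan())
```

Running with dry_run=False needs root and a matching hardware
setup, so the default only prints the sequence for review.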

Can anybody please explain what happened, and -- more
importantly -- how to avoid it in the future?  As far as
I can tell, the disk drives are perfectly OK.

Best regards
   Oliver

PS:  disk-related stuff from dmesg:

atapci0: <VIA 6420 SATA150 controller> port 0xe100-0xe107,0xe200-0xe203,0xe300-0xe307,0xe400-0xe403,0xe500-0xe50f,0xe600-0xe6ff irq 20 at device 15.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
atapci1: <VIA 8237 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe700-0xe70f at device 15.1 on pci0
ata0: <ATA channel 0> on atapci1
ata1: <ATA channel 1> on atapci1
ad0: 152627MB <SAMSUNG HD160JJ WU100-41> at ata2-master SATA150
ad1: 152627MB <SAMSUNG HD160JJ WU100-41> at ata3-master SATA150

The PATA controller (ata[01] on atapci1) is not used.
I have disabled ATA_STATIC_ID, so the disks are named
ad0 and ad1.  I also have atapicam in the kernel, but
it's not actually used and shouldn't make a difference.

This is the SATA-related info from pciconf -lv:

atapci0@pci0:15:0: class=0x010400 card=0x70941462 chip=0x31491106 rev=0x80 hdr=0x00
    vendor   = 'VIA Technologies Inc'
    device   = 'VT8237  VT6410 SATA RAID Controller'
    class    = mass storage
    subclass = RAID

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, USt-Id: DE204219783
Any opinions expressed in this message are personal to the author and may
not necessarily reflect the opinions of secnetix GmbH & Co KG in any way.
FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"C++ is the only current language making COBOL look good."
        -- Bertrand Meyer
