Date: Tue, 30 Jan 2007 09:51:14 +0100 (CET)
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-geom@FreeBSD.ORG, freebsd-geom@FreeBSD.ORG
Subject: gmirror or ata problem
Message-ID: <200701300851.l0U8pEkO005250@lurza.secnetix.de>
Hi,

This is strange. gmirror just detached one of its disks for no apparent reason.

I've built a mirror consisting of the components ad0 and ad1 (both SATA drives). It has been running fine. This is RELENG_6 from 2006-12-20.

Yesterday evening ad1 was detached. There is no other error message logged on console or in the logs (i.e. no I/O error such as a bad sector or anything). There was no particularly high load at that time. In fact, the machine had been under much higher load before, without anything bad happening.

This is from the logs:

Jan 29 19:10:13 pluto -- MARK --
Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
Jan 29 19:20:26 pluto kernel: subdisk1: detached
Jan 29 19:20:26 pluto kernel: ad1: detached
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
Jan 29 19:50:13 pluto -- MARK --

This almost looks like typical Windows problems: Something reports a "failure", but no reason or any other useful information. :-(

"atacontrol list" reports for ad1:

Master: no device present

After an atacontrol detach/attach cycle, the device is back again:

Master: ad1 <SAMSUNG HD160JJ/WU100-41> Serial ATA II

I inserted it back into the gmirror, and right now it's synchronizing happily.

Can anybody please explain what happened, and -- more importantly -- how to avoid it in the future? As far as I can tell, the disk drives are perfectly OK.

Best regards
Oliver

PS: disk-related stuff from dmesg:

atapci0: <VIA 6420 SATA150 controller> port 0xe100-0xe107,0xe200-0xe203,0xe300-0xe307,0xe400-0xe403,0xe500-0xe50f,0xe600-0xe6ff irq 20 at device 15.0 on pci0
ata2: <ATA channel 0> on atapci0
ata3: <ATA channel 1> on atapci0
atapci1: <VIA 8237 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe700-0xe70f at device 15.1 on pci0
ata0: <ATA channel 0> on atapci1
ata1: <ATA channel 1> on atapci1
ad0: 152627MB <SAMSUNG HD160JJ WU100-41> at ata2-master SATA150
ad1: 152627MB <SAMSUNG HD160JJ WU100-41> at ata3-master SATA150

The PATA controller (ata[01] on atapci1) is not used. I have disabled ATA_STATIC_ID, so the disks are named ad0 and ad1. I've also atapicam in the kernel, but it's not actually used and shouldn't make a difference.

This is the SATA-related info from pciconf -lv:

atapci0@pci0:15:0: class=0x010400 card=0x70941462 chip=0x31491106 rev=0x80 hdr=0x00
    vendor   = 'VIA Technologies Inc'
    device   = 'VT8237 VT6410 SATA RAID Controller'
    class    = mass storage
    subclass = RAID

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606, USt-Id: DE204219783
Any opinions expressed in this message are personal to the author and may
not necessarily reflect the opinions of secnetix GmbH & Co KG in any way.
FreeBSD-Dienstleistungen, -Produkte und mehr: http://www.secnetix.de/bsd

"C++ is the only current language making COBOL look good."
        -- Bertrand Meyer