From owner-freebsd-geom@FreeBSD.ORG Tue Jan 30 08:51:22 2007
Return-Path:
X-Original-To: freebsd-geom@FreeBSD.ORG
Delivered-To: freebsd-geom@FreeBSD.ORG
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 55E6316A404
    for ; Tue, 30 Jan 2007 08:51:22 +0000 (UTC)
    (envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (lurza.secnetix.de [83.120.8.8])
    by mx1.freebsd.org (Postfix) with ESMTP id 920F413C4A6
    for ; Tue, 30 Jan 2007 08:51:21 +0000 (UTC)
    (envelope-from olli@lurza.secnetix.de)
Received: from lurza.secnetix.de (ajchob@localhost [127.0.0.1])
    by lurza.secnetix.de (8.13.4/8.13.4) with ESMTP id l0U8pEDc005251
    for ; Tue, 30 Jan 2007 09:51:19 +0100 (CET)
    (envelope-from oliver.fromme@secnetix.de)
Received: (from olli@localhost)
    by lurza.secnetix.de (8.13.4/8.13.1/Submit) id l0U8pEkO005250;
    Tue, 30 Jan 2007 09:51:14 +0100 (CET)
    (envelope-from olli)
Date: Tue, 30 Jan 2007 09:51:14 +0100 (CET)
Message-Id: <200701300851.l0U8pEkO005250@lurza.secnetix.de>
From: Oliver Fromme
To: freebsd-geom@FreeBSD.ORG, freebsd-geom@FreeBSD.ORG
User-Agent: tin/1.8.2-20060425 ("Shillay") (UNIX) (FreeBSD/4.11-STABLE (i386))
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-2.1.2
    (lurza.secnetix.de [127.0.0.1]); Tue, 30 Jan 2007 09:51:19 +0100 (CET)
Cc:
Subject: gmirror or ata problem
X-BeenThere: freebsd-geom@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: GEOM-specific discussions and implementations
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 30 Jan 2007 08:51:22 -0000

Hi,

This is strange.  gmirror just detached one of its disks
for no apparent reason.

I've built a mirror consisting of the components ad0 and
ad1 (both SATA drives).  It has been running fine.  This
is RELENG_6 from 2006-12-20.

Yesterday evening ad1 was detached.  There is no other
error message logged on console or in the logs (i.e.
no I/O error such as a bad sector or anything).  There was
no particularly high load at that time.  In fact, the
machine had been under much higher load before, without
anything bad happening.

This is from the logs:

Jan 29 19:10:13 pluto -- MARK --
Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
Jan 29 19:20:26 pluto kernel: subdisk1: detached
Jan 29 19:20:26 pluto kernel: ad1: detached
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
Jan 29 19:50:13 pluto -- MARK --

This almost looks like typical Windows problems:
Something reports a "failure", but no reason or any
other useful information.  :-(

"atacontrol list" reports for ad1:

    Master:      no device present

After an atacontrol detach/attach cycle, the device is
back again:

    Master:  ad1  Serial ATA II

I inserted it back into the gmirror, and right now it's
synchronizing happily.

Can anybody please explain what happened, and -- more
importantly -- how to avoid it in the future?  As far as
I can tell, the disk drives are perfectly OK.

Best regards
   Oliver

PS:  disk-related stuff from dmesg:

atapci0: port 0xe100-0xe107,0xe200-0xe203,0xe300-0xe307,0xe400-0xe403,0xe500-0xe50f,0xe600-0xe6ff irq 20 at device 15.0 on pci0
ata2: on atapci0
ata3: on atapci0
atapci1: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe700-0xe70f at device 15.1 on pci0
ata0: on atapci1
ata1: on atapci1
ad0: 152627MB at ata2-master SATA150
ad1: 152627MB at ata3-master SATA150

The PATA controller (ata[01] on atapci1) is not used.
I have disabled ATA_STATIC_ID, so the disks are named
ad0 and ad1.  I've also atapicam in the kernel, but it's
not actually used and shouldn't make a difference.
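
For reading the log excerpt above: the "error=6" in the GEOM_MIRROR
lines is an errno value, and errno 6 is ENXIO ("Device not configured"
on FreeBSD).  That would fit with the provider having already vanished
at the ATA layer by the time gmirror tried to write its metadata,
though it does not say why the device went away.  The following is only
a rough sketch for pulling those codes out of such log lines; the
function name and the regular expression are made up for illustration
and assume the message format shown above.

```python
import errno
import re

# Sample lines copied from the log excerpt in this message.
LOG = """\
Jan 29 19:20:26 pluto kernel: ad1: FAILURE - device detached
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot write metadata on ad1 (device=gm0, error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Cannot update metadata on disk ad1 (error=6).
Jan 29 19:20:26 pluto kernel: GEOM_MIRROR: Device gm0: provider ad1 disconnected.
"""

# Assumed format: "GEOM_MIRROR: <message> (error=N)" with an optional
# "device=<name>, " in front of the error code.
ERROR_RE = re.compile(
    r"GEOM_MIRROR: (?P<msg>.*?)\s*\((?:device=\w+, )?error=(?P<code>\d+)\)"
)


def geom_mirror_errors(text):
    """Yield (message, errno number, symbolic name) per GEOM_MIRROR error line."""
    for line in text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            code = int(match.group("code"))
            # errno.errorcode maps numbers to names such as "ENXIO".
            yield match.group("msg"), code, errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    for msg, code, name in geom_mirror_errors(LOG):
        print(f"{msg}: error={code} ({name})")
```

Lines without an "(error=N)" part, such as the "device detached" and
"provider ad1 disconnected" messages, are simply skipped.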

This is the SATA-related info from pciconf -lv:

atapci0@pci0:15:0: class=0x010400 card=0x70941462 chip=0x31491106 rev=0x80 hdr=0x00
    vendor   = 'VIA Technologies Inc'
    device   = 'VT8237 VT6410 SATA RAID Controller'
    class    = mass storage
    subclass = RAID

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Registergericht Muenchen, HRA 74606, VAT ID: DE204219783
Any opinions expressed in this message are personal to the author and may
not necessarily reflect the opinions of secnetix GmbH & Co KG in any way.
FreeBSD services, products and more:  http://www.secnetix.de/bsd

"C++ is the only current language making COBOL look good."
        -- Bertrand Meyer